Asynchronous Operations

Ducky processes most operations asynchronously to ensure fast API responses and efficient resource utilization. Understanding this asynchronous nature is crucial for building reliable applications and managing expectations around processing times.

Why Operations Are Asynchronous

When you index documents, upload files, or delete indexes, Ducky performs complex operations behind the scenes, including content analysis, semantic processing, and search optimization. These operations can take anywhere from seconds to minutes depending on the size and complexity of the content, so the API call returns immediately while processing continues in the background.

Note: Document retrieval is not asynchronous. Search queries return their results directly in the response, typically within milliseconds.

Document Indexing

How It Works

Document indexing follows a two-phase approach:

Phase 1: Immediate Response

from duckyai import DuckyAI

ducky = DuckyAI(api_key="<DUCKYAI_API_KEY>")

# This returns immediately
result = ducky.documents.index(
    index_name="knowledge-base",
    doc_id="large-document",
    content="Very long document content...",
    title="Large Document"
)

print(f"Document {result.doc_id} queued for processing")
# Document is now queued for background processing

import { Ducky } from "duckyai-ts";

const ducky = new Ducky({
  apiKey: process.env["DUCKY_API_KEY"] ?? "",
});

// This returns immediately
const result = await ducky.documents.index({
  indexName: "knowledge-base",
  docId: "large-document",
  content: "Very long document content...",
  title: "Large Document"
});

console.log(`Document ${result.docId} queued for processing`);
// Document is now queued for background processing

Phase 2: Background Processing

After the API returns, Ducky performs complex processing operations to make your document searchable. This involves multiple steps including content analysis, semantic processing, and search optimization, which is why documents may take some time to be ready for retrieval.

Processing Time Expectations

Processing time depends on document complexity, content length, and current system load. Most documents are ready for search within seconds to minutes after indexing.

File Upload Processing

File uploads are particularly resource-intensive and are always processed asynchronously.

File Size Limits

Maximum file size: 60MB
Supported formats: PDF, text files, UTF-8 encoded documents
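
Because an upload is only queued when the call returns, it can be worth rejecting oversized files locally before sending them. The following is a minimal sketch, not part of the Ducky SDK: `check_file_size` and `MAX_FILE_BYTES` are hypothetical names, and reading "60MB" as 60 * 1024 * 1024 bytes is an assumption, since this page does not say whether the limit is in binary or decimal megabytes.

```python
import os

# Assumption: "60MB" is interpreted as 60 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 60 * 1024 * 1024


def check_file_size(path: str, max_bytes: int = MAX_FILE_BYTES) -> int:
    """Return the file size in bytes, or raise ValueError if it exceeds max_bytes."""
    size = os.path.getsize(path)
    if size > max_bytes:
        raise ValueError(
            f"{path} is {size} bytes, over the {max_bytes}-byte upload limit"
        )
    return size
```

Calling `check_file_size("large-manual.pdf")` before `index_file` fails fast on the client instead of waiting for the upload to be rejected.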

Processing Flow

# Large file upload - returns immediately
with open("large-manual.pdf", "rb") as file:
    result = ducky.documents.index_file(
        index_name="manuals",
        doc_id="user-manual",
        file={
            "file_name": "large-manual.pdf",
            "content": file
        }
    )

print(f"File {result.doc_id} queued for processing")
# File is now being processed in the background

import { openAsBlob } from "node:fs";

// Large file upload - returns immediately
const result = await ducky.documents.indexFile({
  indexName: "manuals",
  docId: "user-manual",
  file: await openAsBlob("large-manual.pdf")
});

console.log(`File ${result.docId} queued for processing`);
// File is now being processed in the background

PDF Processing

PDF files require additional processing as each page is treated as a separate document and goes through content extraction and indexing. Most PDFs are ready within 3 minutes, though individual pages may become searchable sooner as they're processed.

Document Deletion

Document deletion is also asynchronous and involves cleanup across multiple systems.

Deletion Process

# Delete document - returns immediately
ducky.documents.delete(
    index_name="knowledge-base",
    doc_id="document-to-delete"
)

print("Document deletion queued")
# Document cleanup happens in background

// Delete document - returns immediately
await ducky.documents.delete({
  indexName: "knowledge-base",
  docId: "document-to-delete"
});

console.log("Document deletion queued");
// Document cleanup happens in background

Document deletion involves removing multiple forms of data across different systems, which is why it takes time to complete. Most documents are fully removed within seconds.
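
Since cleanup finishes after the call returns, an application may want to hide a document on its own side as soon as deletion is requested. The sketch below assumes a deleted document could still appear in search results for a short window, which this page implies but does not state outright. `PendingDeletions`, its methods, and the 60-second grace period are hypothetical and not part of the Ducky SDK.

```python
import time
from typing import Callable, Dict, Iterable, List


class PendingDeletions:
    """Remember recently deleted doc IDs so callers can hide them from results."""

    def __init__(
        self,
        grace_seconds: float = 60.0,  # assumed window, tune for your workload
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._grace = grace_seconds
        self._clock = clock
        self._deleted_at: Dict[str, float] = {}

    def mark_deleted(self, doc_id: str) -> None:
        """Record that a delete request was just sent for doc_id."""
        self._deleted_at[doc_id] = self._clock()

    def visible(self, doc_ids: Iterable[str]) -> List[str]:
        """Return doc_ids minus any deleted within the grace period."""
        now = self._clock()
        # Forget entries older than the grace period.
        self._deleted_at = {
            d: t for d, t in self._deleted_at.items() if now - t < self._grace
        }
        return [d for d in doc_ids if d not in self._deleted_at]
```

Call `mark_deleted(doc_id)` right after `ducky.documents.delete(...)`, then pass the document IDs from each search through `visible(...)` before showing them.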

Index Deletion

Index deletion removes all documents within an index, which means the processing time depends on how many documents need to be deleted. Large indexes with thousands of documents will take longer to process than smaller ones.

# Index deletion - returns immediately but processing time varies by size
ducky.indexes.delete(index_name="knowledge-base")

print("Index deletion started - time depends on number of documents")
# Larger indexes will take longer to fully delete

// Index deletion - returns immediately but processing time varies by size
await ducky.indexes.delete({
  indexName: "knowledge-base"
});

console.log("Index deletion started - time depends on number of documents");
// Larger indexes will take longer to fully delete

Best Practices for Async Operations

1. Design for Asynchronous Processing

import time

# Good - Don't assume immediate availability
result = ducky.documents.index(
    index_name="my-index",
    doc_id="new-doc",
    content="Document content"
)

# Wait before trying to retrieve
time.sleep(5)  # Give processing time

# Then search for the document
results = ducky.documents.retrieve(
    index_name="my-index",
    query="document content",
    top_k=1
)

// Good - Don't assume immediate availability
const result = await ducky.documents.index({
  indexName: "my-index",
  docId: "new-doc",
  content: "Document content"
});

// Wait before trying to retrieve
await new Promise(resolve => setTimeout(resolve, 5000));

// Then search for the document
const results = await ducky.documents.retrieve({
  indexName: "my-index",
  query: "document content",
  topK: 1
});
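
A fixed wait is either too long or too short, depending on the document. One alternative is to poll with exponential backoff until the document shows up. The sketch below is not part of the Ducky SDK: `wait_until`, its parameters, and the default values (including a 180-second timeout loosely based on the 3-minute PDF figure on this page) are illustrative assumptions. It accepts any zero-argument callable that reports readiness, so it assumes nothing about the shape of the `retrieve` response.

```python
import time
from typing import Callable


def wait_until(
    is_ready: Callable[[], bool],
    timeout: float = 180.0,
    initial_delay: float = 1.0,
    max_delay: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_ready() with exponential backoff.

    Returns True as soon as is_ready() succeeds, or False once `timeout`
    seconds have elapsed without success.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        if is_ready():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(delay, remaining))
        # Double the delay each round, capped at max_delay.
        delay = min(delay * 2, max_delay)
```

In practice, `is_ready` would wrap a `ducky.documents.retrieve(...)` call like the one above and return True once the expected document appears in the results. How to read that from the response depends on the SDK's return type, which this page does not show.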

2. Handle Large Operations Appropriately

import os

# For large file uploads
def upload_large_file(file_path, index_name, doc_id):
    with open(file_path, "rb") as file:
        result = ducky.documents.index_file(
            index_name=index_name,
            doc_id=doc_id,
            # Send only the base name, not the full path
            file={"file_name": os.path.basename(file_path), "content": file}
        )

    print(f"Large file {doc_id} queued - processing time depends on file size")
    return result

// For large file uploads
async function uploadLargeFile(filePath: string, indexName: string, docId: string) {
  const file = await openAsBlob(filePath);
  const result = await ducky.documents.indexFile({
    indexName,
    docId,
    file
  });
  
  console.log(`Large file ${docId} queued - processing time depends on file size`);
  return result;
}

3. Plan for Large Operations

Index deletion: Consider timing based on index size
Large file uploads: Larger files take longer to process
Multiple operations: Spread out large operations to manage processing load
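
One simple way to spread out large operations is to run them one at a time with a pause in between. The sketch below is an illustration under assumptions, not a Ducky feature: `run_spaced` and the 2-second default interval are hypothetical, and a suitable interval depends on your workload.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def run_spaced(
    operations: Iterable[Callable[[], T]],
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[T]:
    """Run each operation in order, pausing `interval` seconds between them."""
    results: List[T] = []
    for i, op in enumerate(operations):
        if i > 0:
            sleep(interval)  # pause between operations, not before the first
        results.append(op())
    return results
```

Each operation is a zero-argument callable, for example `functools.partial(upload_large_file, path, "manuals", doc_id)` built from the `upload_large_file` helper in the previous section.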

Summary

Understanding Ducky's asynchronous nature helps you:

Set proper expectations for processing times
Design resilient applications that handle async operations
Plan operations around processing requirements

Remember: Most operations return immediately, but actual processing happens in the background. Documents become searchable within seconds to minutes, with larger files and indexes taking longer. Always design your applications to handle this asynchronous behavior gracefully.