Document Chunks

Ducky automatically breaks larger documents into smaller, searchable pieces called chunks. This enables more precise search results from long documents.

What Are Chunks?

Chunks are smaller segments of your original document content. When you index a long document, Ducky splits it into chunks at natural boundaries such as paragraphs or sentences, so each chunk reads as a coherent passage rather than an arbitrary slice of text.
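To make the idea of splitting at natural boundaries concrete, here is a minimal sketch of paragraph-based chunking in Python. It is not Ducky's actual algorithm, which is not described here; the function name `split_into_chunks` and the `max_chars` limit are hypothetical and chosen only for illustration.

```python
import re


def split_into_chunks(text: str, max_chars: int = 500) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_chars.

    Illustrative only: this is an assumed strategy, not Ducky's implementation.
    """
    # Treat one or more blank lines as a paragraph boundary.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            # The paragraph still fits, so keep growing the current chunk.
            current = candidate
        else:
            # Close the current chunk and start a new one with this paragraph.
            # A paragraph longer than max_chars becomes its own oversized chunk.
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Because paragraphs are never cut in half, each chunk stays readable on its own. A real chunker would likely also fall back to sentence boundaries when a single paragraph exceeds the limit.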

How Chunking Works

Chunking happens automatically when you index documents, with no configuration required. The example below shows the same call in Python, then in TypeScript:

from duckyai import DuckyAI

ducky = DuckyAI(api_key="your-api-key")

# This long document will be automatically chunked
ducky.documents.index(
    index_name="knowledge-base",
    doc_id="long-article",
    content="""
        This is a very long article with multiple sections...
        
        Section 1: Introduction to the topic...
        
        Section 2: Detailed explanation...
        
        Section 3: Advanced concepts...
    """
)

import { Ducky } from "duckyai-ts";

const ducky = new Ducky({
  apiKey: process.env.DUCKY_API_KEY ?? "",
});

// This long document will be automatically chunked
await ducky.documents.index({
  indexName: "knowledge-base",
  docId: "long-article",
  content: `
    This is a very long article with multiple sections...
    
    Section 1: Introduction to the topic...
    
    Section 2: Detailed explanation...
    
    Section 3: Advanced concepts...
  `
});

Chunks in Search Results

When you retrieve documents, you get both document-level and chunk-level results:

{
  "documents": [
    {
      "doc_id": "long-article",
      "content_chunks": [
        "Section 2: Detailed explanation of the main concept..."
      ],
      "title": "Long Article",
      "metadata": {...}
    }
  ],
  "chunks": [
    {
      "chunk_id": "0",
      "content": "Section 2: Detailed explanation of the main concept...",
      "doc_id": "long-article"
    }
  ]
}

documents: Groups chunks by their original document
chunks: Individual pieces that matched your query

This structure lets you work with either the complete document context or the specific relevant sections.
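As a rough illustration of working with both levels, the sketch below groups chunk-level matches by their parent document. It operates on a plain dictionary shaped like the sample response above rather than on a live API call; the helper name `chunks_by_document` is hypothetical, and the `metadata` field is omitted because its contents are not shown.

```python
from collections import defaultdict

# Sample data mirroring the response shape shown above (metadata omitted).
response = {
    "documents": [
        {
            "doc_id": "long-article",
            "content_chunks": [
                "Section 2: Detailed explanation of the main concept..."
            ],
            "title": "Long Article",
        }
    ],
    "chunks": [
        {
            "chunk_id": "0",
            "content": "Section 2: Detailed explanation of the main concept...",
            "doc_id": "long-article",
        }
    ],
}


def chunks_by_document(resp: dict) -> dict[str, list[str]]:
    """Map each doc_id to the text of its matching chunks, in result order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for chunk in resp.get("chunks", []):
        grouped[chunk["doc_id"]].append(chunk["content"])
    return dict(grouped)


# Look up titles from the document-level results, then print each match.
titles = {doc["doc_id"]: doc["title"] for doc in response["documents"]}
for doc_id, contents in chunks_by_document(response).items():
    print(f"{titles.get(doc_id, doc_id)}: {len(contents)} matching chunk(s)")
```

Reading from `chunks` gives you the individual matches, while `documents` supplies the title and other document-level fields to display alongside them.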
