Getting started

Learn the basics of retrieval with Ducky

Overview

Ducky is a fully managed retrieval system designed to accelerate development for Retrieval-Augmented Generation (RAG) applications. It offers a seamless retrieval API that prioritizes high-quality results and robust deployment readiness, allowing developers to focus on building smarter applications with less overhead.

Key features

  • Accurate retrievals: Ducky's retrieval engine is optimized for high precision, delivering relevant results that enhance the performance of your RAG applications.
  • Developer-friendly: Built with developers in mind, Ducky provides an intuitive API and SDK, making setup and integration seamless and straightforward.
  • Fully managed: With Ducky, you get a fully managed solution that’s ready for deployment, eliminating the need for complex infrastructure management and enabling faster scaling.

Getting started

Install the Python client

If you are already working in Python, try out the Python SDK:

python -m pip install duckyai

Sign up and create a Project and Index

Create an account at app.duckyai.dev, follow the getting started guide, and create an Index. An Index is simply a collection of data.

Check out our Core concepts page for information on Projects and Indexes.


Create an API key

The getting started page provides starter code that includes an API key. You can also manage your keys on the API keys page.
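To keep the key out of source control, one option is to read it from an environment variable instead of pasting it into your code. The sketch below is only an illustration: the variable name DUCKYAI_API_KEY and the helper load_api_key are assumptions, not part of the Ducky SDK.

```python
import os


def load_api_key(env_var: str = "DUCKYAI_API_KEY") -> str:
    """Read the API key from the environment, failing loudly if it is missing.

    Both this helper and the default variable name are hypothetical.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before creating the client"
        )
    return api_key
```

The returned string can then be passed as the api_key argument when you construct the client.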


Upload data to the index

To start using Ducky’s retrieval, you need to upload your data to an index. Each piece of data is stored as a document. A document belongs to an index and has a content field that holds the text to be retrieved.

from duckyai import DuckyAI

client = DuckyAI(api_key="<DUCKYAI_API_KEY>")

client.documents.index(
    index_name="cakes",
    content="""
        The Black Forest Cake is named after Kirschwasser,
        a cherry brandy from Germany's Black Forest,
        traditionally used to flavor its chocolate layers.
    """,
)
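If you have several texts to upload, a small loop around the same call may be enough. The following is a minimal sketch, assuming one document per text; index_texts is a hypothetical helper, and collapsing whitespace before upload is a choice made here, not a Ducky requirement. It only uses the client.documents.index(index_name=..., content=...) call shown above, and the dry run uses a stand-in client so it runs without network access.

```python
from types import SimpleNamespace


def index_texts(client, index_name, texts):
    """Index each non-empty text as its own document; return how many were sent."""
    sent = 0
    for text in texts:
        cleaned = " ".join(text.split())  # collapse indentation and newlines
        if not cleaned:
            continue  # skip blank entries
        client.documents.index(index_name=index_name, content=cleaned)
        sent += 1
    return sent


# Dry run with a stand-in client that only records calls (hypothetical, no network).
recorded = []
dry_run_client = SimpleNamespace(
    documents=SimpleNamespace(index=lambda **kwargs: recorded.append(kwargs))
)
count = index_texts(dry_run_client, "cakes", ["  Sponge cake is\n  light.  ", "   "])
print(count, recorded)
```

With a real client, pass the DuckyAI instance created earlier in place of dry_run_client.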

Make your first retrieval request

Now that your data is uploaded, you can perform your first retrieval request. Retrieval is the process of searching for documents based on a query. Here’s how to get started:

rsp = client.documents.retrieve(
    index_name="cakes",
    query="which country is black forest from?",
    top_k=1,
    alpha=1,
    rerank=False,
)

import json
print(json.dumps(rsp.model_dump(), indent=4))

Parameters:

  • top_k: the number of top results to return.
  • alpha: the balance between keyword and semantic search in hybrid retrieval, from keyword only (alpha=0) to semantic only (alpha=1).
  • rerank: whether to apply a second-stage reranking step.
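To build intuition for alpha, hybrid search is often described as a weighted blend of a keyword score and a semantic score. The sketch below shows that common convention only. Ducky's actual scoring formula is not documented here, so the linear blend and the hybrid_score helper are assumptions for illustration.

```python
def hybrid_score(keyword_score: float, semantic_score: float, alpha: float) -> float:
    """Blend two relevance scores: alpha=0 is keyword only, alpha=1 is semantic only.

    Illustrative convention, not Ducky's documented formula.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * semantic_score + (1.0 - alpha) * keyword_score
```

Under this convention, values of alpha between 0 and 1 give partial weight to both signals.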

Output:

{
    "documents": [
        {
            "doc_id": "217bacd4-e5e2-4a46-a78f-a1967ed08118",
            "content_chunks": [
                "The Black Forest Cake is named after Kirschwasser, a cherry brandy from Germany's Black Forest, traditionally used to flavor its chocolate layers."
            ],
            "next_cursor": null,
            "metadata": null,
            "title": null,
            "source_url": null,
            "status": "INDEXED"
        }
    ],
    "chunks": [
        {
            "chunk_id": "0",
            "content": "The Black Forest Cake is named after Kirschwasser, a cherry brandy from Germany's Black Forest, traditionally used to flavor its chocolate layers.",
            "doc_id": "217bacd4-e5e2-4a46-a78f-a1967ed08118",
            "metadata": null
        }
    ]
}

Ducky returns the results in two formats: documents and chunks. Ducky automatically breaks larger content into smaller pieces, which we refer to as chunks.
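Because each chunk carries the doc_id of its parent document, you can regroup chunks per document on the client side. Here is a minimal sketch that works on a plain dictionary shaped like the sample output above (for example, the result of rsp.model_dump()); chunks_by_document is a hypothetical helper, and the sample content is shortened.

```python
from collections import defaultdict


def chunks_by_document(response: dict) -> dict:
    """Map each doc_id to the list of chunk texts retrieved for it."""
    grouped = defaultdict(list)
    for chunk in response.get("chunks", []):
        grouped[chunk["doc_id"]].append(chunk["content"])
    return dict(grouped)


# Shortened copy of the sample output, used as stand-in data.
sample = {
    "documents": [
        {"doc_id": "217bacd4-e5e2-4a46-a78f-a1967ed08118", "status": "INDEXED"}
    ],
    "chunks": [
        {
            "chunk_id": "0",
            "content": "The Black Forest Cake is named after Kirschwasser.",
            "doc_id": "217bacd4-e5e2-4a46-a78f-a1967ed08118",
            "metadata": None,
        }
    ],
}

for doc_id, texts in chunks_by_document(sample).items():
    print(doc_id, len(texts))
```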


Retrieval with filters

Filters allow you to narrow down results based on metadata. For example, if your documents include a category field, you can retrieve results specific to that category. Consider the following document:

client.documents.index(
    index_name="cakes",
    content="""
        Fish cake is a savory dish made from a mixture of fish and potato,
        coated in breadcrumbs, and fried until golden brown. It is popular
        in British and Asian cuisines.
    """,
    doc_id="fish cake",
    metadata={
        "type": "savory",
        "cook_time_mins": 30,
    },
)

You can retrieve only savory cakes by passing the metadata_filter argument to the retrieval call.

rsp = client.documents.retrieve(
    index_name="cakes",
    query="dinner ideas",
    top_k=1,
    metadata_filter={"type": {"$eq": "savory"}},
)

For a comprehensive list of the functions supported in metadata filters, check out the retrieval API reference.
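To make the semantics of the filter above concrete, the sketch below evaluates a filter of the same shape against a metadata dictionary locally. It is only an illustration of what {"type": {"$eq": "savory"}} expresses: matches_filter is a hypothetical helper, it covers $eq alone because that is the only operator shown on this page, and the real filtering happens on the Ducky side.

```python
def matches_filter(metadata, metadata_filter: dict) -> bool:
    """Return True when every filtered field passes its $eq condition."""
    metadata = metadata or {}  # documents may carry no metadata at all
    for field, condition in metadata_filter.items():
        for operator, expected in condition.items():
            if operator != "$eq":
                raise NotImplementedError(
                    f"operator {operator!r} is not covered by this sketch"
                )
            if field not in metadata or metadata[field] != expected:
                return False
    return True


fish_cake = {"type": "savory", "cook_time_mins": 30}
print(matches_filter(fish_cake, {"type": {"$eq": "savory"}}))
```

A document without metadata, such as the Black Forest Cake entry indexed earlier, would not pass this filter.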


🦆

Get in touch or see our roadmap if you need help


What’s Next

Check out our core concepts and use cases.