Getting started
Learn the basics of retrieval with Ducky
Overview
Ducky is a fully managed retrieval system designed to accelerate development for Retrieval-Augmented Generation (RAG) applications. It offers a seamless retrieval API that prioritizes high-quality results and robust deployment readiness, allowing developers to focus on building smarter applications with less overhead.
Key features
- Accurate retrievals: Ducky's retrieval engine is optimized for high precision, delivering relevant results that enhance the performance of your RAG applications.
- Developer-friendly: Built with developers in mind, Ducky provides an intuitive API and SDK, making setup and integration seamless and straightforward.
- Fully managed: With Ducky, you get a fully managed solution that’s ready for deployment, eliminating the need for complex infrastructure management and enabling faster scaling.
Getting started
Install the Python client
If you are already working in Python, try out the Python SDK:
python -m pip install duckyai
Sign up and create a Project and Index
Create an account at app.duckyai.dev, follow the getting started guide, and create an Index. An Index is simply a collection of data.
Check out our Core concepts page for information on Projects and Indexes.
Create an API key
The getting started page provides starter code that includes an API key. You can also manage your keys on the API keys page.
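Rather than pasting the key into source files, one common pattern is to read it from an environment variable. The sketch below assumes a variable named DUCKYAI_API_KEY, a hypothetical name chosen to match the placeholder used on this page; nothing here implies the SDK reads it automatically. The helper load_api_key is illustrative and not part of the SDK.

```python
import os


def load_api_key(var_name: str = "DUCKYAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is an assumption for illustration, not an SDK convention.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key


# Usage with the SDK, once it is installed and the variable is exported:
#   from duckyai import DuckyAI
#   client = DuckyAI(api_key=load_api_key())
```

Failing early with a clear message tends to be easier to debug than an authentication error returned later by the API.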
Upload data to the index
To start using Ducky’s retrieval, you need to upload your data to an index. A document is the unit of data stored in an index, and its content field holds the text to be retrieved.
from duckyai import DuckyAI
client = DuckyAI(api_key="<DUCKYAI_API_KEY>")
client.documents.index(
    index_name="cakes",
    content="""
The Black Forest Cake is named after Kirschwasser,
a cherry brandy from Germany's Black Forest,
traditionally used to flavor its chocolate layers.
""",
)
Make your first retrieval request
Now that your data is uploaded, you can perform your first retrieval request. Retrieval is the process of searching for documents based on a query. Here’s how to get started:
rsp = client.documents.retrieve(
    index_name="cakes",
    query="which country is black forest from?",
    top_k=1,
    alpha=1,
    rerank=False,
)
import json
print(json.dumps(rsp.model_dump(), indent=4))
Parameters:
- top_k: the number of top results to return.
- alpha: the balance between keyword (alpha=0) and semantic (alpha=1) hybrid search.
- rerank: whether a second-stage reranking is applied.
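To build intuition for alpha, hybrid search is often described as a weighted blend of a keyword score and a semantic score. The sketch below uses a plain linear interpolation. This is an assumption for illustration only: Ducky's actual scoring formula is not documented on this page, and the function name and example scores are hypothetical.

```python
def blend_scores(keyword_score: float, semantic_score: float, alpha: float) -> float:
    """Linearly interpolate: alpha=0 is keyword only, alpha=1 is semantic only."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return (1.0 - alpha) * keyword_score + alpha * semantic_score


# Hypothetical scores for one document, already normalized to [0, 1].
keyword, semantic = 0.2, 0.8
for alpha in (0.0, 0.5, 1.0):
    print(alpha, blend_scores(keyword, semantic, alpha))
```

Under this model, intermediate values of alpha let a document that matches the query's exact words and a document that matches its meaning both contribute to the ranking.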
Output:
{
    "documents": [
        {
            "doc_id": "217bacd4-e5e2-4a46-a78f-a1967ed08118",
            "content_chunks": [
                "The Black Forest Cake is named after Kirschwasser, a cherry brandy from Germany's Black Forest, traditionally used to flavor its chocolate layers."
            ],
            "next_cursor": null,
            "metadata": null,
            "title": null,
            "source_url": null,
            "status": "INDEXED"
        }
    ],
    "chunks": [
        {
            "chunk_id": "0",
            "content": "The Black Forest Cake is named after Kirschwasser, a cherry brandy from Germany's Black Forest, traditionally used to flavor its chocolate layers.",
            "doc_id": "217bacd4-e5e2-4a46-a78f-a1967ed08118",
            "metadata": null
        }
    ]
}
Ducky returns the results in two formats: documents and chunks. Ducky automatically breaks larger content into smaller pieces, which we refer to as chunks.
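Because each chunk carries the doc_id of its parent document, the two lists can be joined on the client side. The sketch below works on a plain dictionary shaped like the output above (what rsp.model_dump() returns in the earlier example). The helper chunks_by_document is hypothetical and not part of the SDK.

```python
from collections import defaultdict

# Trimmed copy of the response shown above; only the fields used here are kept.
response = {
    "documents": [
        {"doc_id": "217bacd4-e5e2-4a46-a78f-a1967ed08118", "status": "INDEXED"}
    ],
    "chunks": [
        {
            "chunk_id": "0",
            "content": "The Black Forest Cake is named after Kirschwasser, a cherry brandy from Germany's Black Forest, traditionally used to flavor its chocolate layers.",
            "doc_id": "217bacd4-e5e2-4a46-a78f-a1967ed08118",
            "metadata": None,
        }
    ],
}


def chunks_by_document(response: dict) -> dict:
    """Map each doc_id to the list of chunk texts that belong to it."""
    grouped = defaultdict(list)
    for chunk in response.get("chunks", []):
        grouped[chunk["doc_id"]].append(chunk["content"])
    return dict(grouped)


for doc_id, texts in chunks_by_document(response).items():
    print(doc_id, len(texts))
```

Grouping this way is handy when you want to pass only the matching passages to a language model while still citing the source document.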
Retrieval with filters
Filters allow you to narrow down results based on metadata. For example, if your documents include a category field, you can retrieve results specific to that category. Consider the following document:
client.documents.index(
    index_name="cakes",
    content="""
Fish cake is a savory dish made from a mixture of fish and potato,
coated in breadcrumbs, and fried until golden brown. It is popular
in British and Asian cuisines.
""",
    doc_id="fish cake",
    metadata={
        "type": "savory",
        "cook_time_mins": 30,
    },
)
You can retrieve only savory cakes by passing the metadata_filter argument to the retrieval call:
rsp = client.documents.retrieve(
    index_name="cakes",
    query="dinner ideas",
    top_k=1,
    metadata_filter={"type": {"$eq": "savory"}},
)
For a comprehensive list of the functions supported in metadata filters, check out the retrieval API reference.
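To make the semantics of the $eq filter concrete, the sketch below evaluates the same filter shape against a metadata dictionary locally. It is an illustration of what the filter expresses, not Ducky's implementation: filtering happens on the server, the helper matches_eq_filter is hypothetical, and only the $eq operator that appears on this page is handled.

```python
from typing import Any, Dict


def matches_eq_filter(
    metadata: Dict[str, Any],
    metadata_filter: Dict[str, Dict[str, Any]],
) -> bool:
    """Return True when every filtered field is present and equals the $eq value."""
    for field, condition in metadata_filter.items():
        if set(condition) != {"$eq"}:
            # Other operators are out of scope for this sketch.
            raise ValueError(f"Only $eq is handled here, got: {sorted(condition)}")
        if field not in metadata or metadata[field] != condition["$eq"]:
            return False
    return True


fish_cake = {"type": "savory", "cook_time_mins": 30}
print(matches_eq_filter(fish_cake, {"type": {"$eq": "savory"}}))  # True
print(matches_eq_filter(fish_cake, {"type": {"$eq": "sweet"}}))   # False
```

Note that the Black Forest document indexed earlier has no metadata, so under these semantics it would not match a filter on type.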
Get in touch or see our roadmap if you need help