22 commits
bcfe06c
feat: add GeminiFileSearch class for vector operations with Google's API
Nov 23, 2025
45757c7
Merge branch 'main' of https://github.com/ishrath99/agno into gemini-…
Nov 23, 2025
f373974
add unit tests for GeminiFileSearch functionality
Nov 23, 2025
b7b5ea1
format
Nov 23, 2025
27e3499
fix: streamline document deletion logic and enhance test verification
Nov 23, 2025
60d91d9
feat: add debug logging for document retrieval in GeminiFileSearch
Nov 23, 2025
31cc937
fix: update comments for clarity on mocking google.genai module
Nov 23, 2025
64e6178
feat: add Gemini File Search documentation and examples, including as…
Nov 23, 2025
08f1ea9
format
Nov 23, 2025
aee4b8f
feat: add Gemini dependency to pyproject.toml for Google GenAI integr…
Nov 23, 2025
4919d25
fix: add checks for document name existence in GeminiFileSearch methods
Nov 23, 2025
a02bb37
fix: correct variable usage in name_exists method to check document ID
Nov 23, 2025
31d56ed
Merge branch 'main' into gemini-file-search-vectordb
ishrath99 Dec 1, 2025
7e2a132
Merge branch 'main' of https://github.com/ishrath99/agno into gemini-…
Dec 3, 2025
049089f
Merge branch 'gemini-file-search-vectordb' of https://github.com/ishr…
Dec 3, 2025
d0df1f0
format
Dec 3, 2025
8fb91a9
format
Dec 3, 2025
cde93a6
Merge branch 'main' of https://github.com/ishrath99/agno into gemini-…
Dec 5, 2025
f44679e
Merge branch 'main' into gemini-file-search-vectordb
ishrath99 Dec 5, 2025
fe0dd75
Merge branch 'main' into gemini-file-search-vectordb
ishrath99 Dec 5, 2025
ddd16ed
Merge branch 'main' into gemini-file-search-vectordb
ishrath99 Dec 6, 2025
3db248d
Merge branch 'main' into gemini-file-search-vectordb
ishrath99 Dec 7, 2025
1 change: 1 addition & 0 deletions cookbook/knowledge/vector_db/README.md
@@ -34,6 +34,7 @@ agent = Agent(
- **[ChromaDB](./chroma_db/)** - Embedded vector database
- **[ClickHouse](./clickhouse_db/)** - Columnar database with vector functions
- **[Couchbase](./couchbase_db/)** - NoSQL database with vector search
- **[Gemini File Search](./gemini_file_search/)** - Google's managed vector database with Gemini integration
- **[LanceDB](./lance_db/)** - Fast columnar vector database
- **[LangChain](./langchain/)** - Use any LangChain vector store
- **[LightRAG](./lightrag/)** - Graph-based RAG system
109 changes: 109 additions & 0 deletions cookbook/knowledge/vector_db/gemini_file_search/README.md
@@ -0,0 +1,109 @@
# Gemini File Search

Gemini File Search provides a managed vector database integrated with Google's Gemini AI models. It allows you to upload documents and perform semantic search using Google's infrastructure.

## Features

- **Managed Service**: No infrastructure management required
- **Native Integration**: Seamlessly works with Gemini models
- **File Upload**: Directly upload documents to Google's File Search Store
- **Metadata Filtering**: Filter search results by custom metadata
- **Grounding Support**: Get responses with citation metadata

## Installation

```bash
pip install google-genai
```

## Configuration

Set your Google API key as an environment variable:

```bash
export GOOGLE_API_KEY="your-api-key-here"
```

If you do not have a key yet, you can get one from https://ai.google.dev/.

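As a minimal sketch, the key can be read from the environment and checked before any client is created. The helper name `require_api_key` is hypothetical and is not part of Agno or `google-genai`; only the `GOOGLE_API_KEY` variable name comes from this README.

```python
import os


def require_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Return the API key stored in env_var, or raise if it is unset or blank.

    Hypothetical helper for illustration; not part of Agno or google-genai.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"{env_var} is not set; export it before running the examples")
    return api_key
```

The returned value could then be passed as `api_key=` to `GeminiFileSearch`, so a missing key fails at startup rather than on the first upload.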
## Basic Usage

```python
from agno.agent import Agent
from agno.knowledge.knowledge import Knowledge
from agno.vectordb.gemini.gemini_file_search import GeminiFileSearch

# Create Gemini File Search vector database
vector_db = GeminiFileSearch(
    file_search_store_name="my-knowledge-store",
    model_name="gemini-2.5-flash-lite",
    api_key="your-api-key",
)

# Create knowledge base
knowledge = Knowledge(
    name="My Knowledge Base",
    vector_db=vector_db,
)

# Add documents
knowledge.add_content(
    name="MyDocument",
    url="https://example.com/document.pdf",
    metadata={"doc_type": "manual"},
)

# Create agent and query
agent = Agent(knowledge=knowledge, search_knowledge=True)
agent.print_response("What is covered in the document?")
```

## Examples

- **[gemini_file_search.py](./gemini_file_search.py)** - Basic usage with Thai recipe knowledge base
- **[async_gemini_file_search.py](./async_gemini_file_search.py)** - Async operations with Agno documentation
- **[gemini_file_search_with_filters.py](./gemini_file_search_with_filters.py)** - Using metadata filters for refined search

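The filters example prints each direct search result's `content` and `meta_data`. A small sketch of that formatting step is shown here as a reusable function. The name `format_results` is hypothetical, and the `SimpleNamespace` objects are stand-ins for real search results so the snippet runs without network access.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def format_results(results: Iterable[Any], preview_chars: int = 200) -> List[str]:
    """Build one printable line per result with a truncated content preview."""
    lines = []
    for index, result in enumerate(results, 1):
        content = result.content or ""
        preview = content[:preview_chars]
        if len(content) > preview_chars:
            preview += "..."
        line = f"Result {index}: {preview}"
        # The attribute name `meta_data` follows the filters example in this cookbook.
        if getattr(result, "meta_data", None):
            line += f" | metadata: {result.meta_data}"
        lines.append(line)
    return lines


# Stand-in objects (assumption): real results would come from vector_db.search(...).
sample = [
    SimpleNamespace(content="Massaman curry with chicken", meta_data={"cuisine": "Thai"}),
    SimpleNamespace(content="x" * 250, meta_data={}),
]
for line in format_results(sample):
    print(line)
```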
## Supported Operations

| Operation | Supported | Notes |
|-----------|-----------|-------|
| `create()` | ✅ | Creates or gets existing File Search Store |
| `insert()` | ✅ | Uploads documents to the store |
| `search()` | ✅ | Semantic search with optional metadata filters |
| `upsert()` | ✅ | Updates existing documents or inserts new ones |
| `delete_by_name()` | ✅ | Deletes documents by display name |
| `delete_by_id()` | ✅ | Deletes documents by ID |
| `delete_by_content_id()` | ✅ | Deletes documents by content ID |
| `delete_by_metadata()` | ❌ | Not supported by Gemini File Search |
| `update_metadata()` | ❌ | Not supported by Gemini File Search |

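Since `update_metadata()` is not supported, one possible workaround is to delete a document and upload it again with the new metadata. The sketch below shows only that ordering. `replace_metadata` is a hypothetical helper, and the `delete` and `upload` callables are assumptions: in practice they might wrap `vector_db.delete_by_name` and `knowledge.add_content`, but here an in-memory dict stands in for the store.

```python
from typing import Any, Callable, Dict


def replace_metadata(
    name: str,
    new_metadata: Dict[str, Any],
    delete: Callable[[str], Any],
    upload: Callable[[str, Dict[str, Any]], Any],
) -> None:
    """Emulate a metadata update by deleting a document and uploading it again."""
    delete(name)                # remove the copy that carries the old metadata
    upload(name, new_metadata)  # upload the same document with the new metadata


# In-memory stand-in for a store, used only to make the sketch runnable.
store: Dict[str, Dict[str, Any]] = {"Recipes": {"cuisine": "Thai"}}

replace_metadata(
    "Recipes",
    {"cuisine": "Thai", "difficulty": "medium"},
    delete=lambda doc_name: store.pop(doc_name, None),
    upload=lambda doc_name, metadata: store.__setitem__(doc_name, metadata),
)
print(store)
```

Note that this sequence is not atomic: if the upload step fails, the document is gone until it is uploaded again.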
## Important Notes

1. **File Search Store**: Documents are organized in "File Search Stores" - named containers for your documents
2. **Document Names**: Each document has both a system-generated `name` (ID) and a user-defined `display_name`
3. **Operation Polling**: Document uploads are asynchronous; the library polls until completion
4. **Metadata Limitations**:
   - Supports string and numeric (including float) metadata values
   - Metadata can be used for filtering during search
   - Metadata cannot be updated after upload; delete the document and re-upload it instead
5. **Cost**: Check Google AI pricing for File Search Store usage

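The operation polling mentioned in note 3 can be pictured as a loop that re-checks an operation until it reports completion. This is a generic sketch, not the library's actual polling code (which is not shown in this README). `poll_until_done`, `fake_fetch`, and the `{"done": ...}` state shape are all hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until_done(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    interval_seconds: float = 1.0,
    timeout_seconds: Optional[float] = 60.0,
) -> T:
    """Call fetch() repeatedly until is_done(state) is true or the timeout expires."""
    deadline = None if timeout_seconds is None else time.monotonic() + timeout_seconds
    while True:
        state = fetch()
        if is_done(state):
            return state
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("operation did not complete before the timeout")
        time.sleep(interval_seconds)


# Stand-in operation (assumption) that reports completion on the third check.
checks = {"count": 0}


def fake_fetch() -> dict:
    checks["count"] += 1
    return {"done": checks["count"] >= 3}


final_state = poll_until_done(fake_fetch, lambda op: op["done"], interval_seconds=0.01)
print(final_state, checks["count"])
```

A timeout keeps a stalled upload from blocking the caller indefinitely; the interval trades responsiveness against the number of status requests.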
## Model Options

Gemini File Search supports various Gemini models:

- `gemini-2.5-flash-lite` (default) - Fast and cost-effective
- `gemini-2.5-flash` - Balanced performance
- `gemini-2.0-flash` - High performance
- `gemini-2.0-flash-exp` - Experimental features

## API Reference

See the [GeminiFileSearch source](../../../../libs/agno/agno/vectordb/gemini/gemini_file_search.py) for detailed API information.

## Resources

- [Google AI Gemini Docs](https://ai.google.dev/gemini-api/docs)
- [File Search API](https://ai.google.dev/gemini-api/docs/file-search)
- [Agno Documentation](https://docs.agno.com)
@@ -0,0 +1 @@
"""Gemini File Search cookbook examples."""
@@ -0,0 +1,57 @@
"""
Async example of using Gemini File Search as a vector database.

Requirements:
- pip install google-genai
- Set GOOGLE_API_KEY environment variable
"""

import asyncio
from os import getenv

from agno.agent import Agent
from agno.knowledge.knowledge import Knowledge
from agno.vectordb.gemini.gemini_file_search import GeminiFileSearch

# Get API key from environment
api_key = getenv("GOOGLE_API_KEY")

# Initialize Gemini File Search
vector_db = GeminiFileSearch(
    file_search_store_name="agno-docs-store",
    model_name="gemini-2.5-flash-lite",
    api_key=api_key,
)

# Create knowledge base
knowledge = Knowledge(
    name="Agno Documentation",
    description="Knowledge base with Agno documentation using Gemini File Search",
    vector_db=vector_db,
)

# Create and use the agent
agent = Agent(knowledge=knowledge, search_knowledge=True)


async def main():
    """Main async function."""
    # Add content to the knowledge base
    # Comment out after first run to avoid re-uploading
    await knowledge.add_content_async(
        name="AgnoIntroduction",
        url="https://docs.agno.com/concepts/agents/introduction.md",
        metadata={"doc_type": "documentation", "topic": "agents"},
    )

    # Query the knowledge base using async
    await agent.aprint_response("What is the purpose of an Agno Agent?", markdown=True)

    # Additional query
    await agent.aprint_response(
        "How do I create an agent with knowledge?", markdown=True
    )


if __name__ == "__main__":
    asyncio.run(main())
@@ -0,0 +1,54 @@
"""
Basic example of using Gemini File Search as a vector database.

Requirements:
- pip install google-genai
- Set GOOGLE_API_KEY environment variable
"""

import asyncio
from os import getenv

from agno.agent import Agent
from agno.knowledge.knowledge import Knowledge
from agno.vectordb.gemini.gemini_file_search import GeminiFileSearch

# Get API key from environment
api_key = getenv("GOOGLE_API_KEY")

# Create Gemini File Search vector database
vector_db = GeminiFileSearch(
    file_search_store_name="thai-recipes-store",
    model_name="gemini-2.5-flash-lite",
    api_key=api_key,
)

# Create Knowledge Instance with Gemini File Search
knowledge = Knowledge(
    name="Thai Recipe Knowledge Base",
    description="Agno 2.0 Knowledge Implementation with Gemini File Search",
    vector_db=vector_db,
)

# Add content to the knowledge base
# Note: This uploads documents to Gemini File Search Store
asyncio.run(
    knowledge.add_content_async(
        name="Recipes",
        url="https://agno-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf",
        metadata={"doc_type": "recipe_book", "cuisine": "Thai"},
    )
)

# Create and use the agent
agent = Agent(knowledge=knowledge, search_knowledge=True)

# Query the knowledge base
agent.print_response("List down the ingredients to make Massaman Gai", markdown=True)

# Delete operations examples
# Delete by name
vector_db.delete_by_name("Recipes")

# Note: delete_by_metadata is not supported by Gemini File Search
# You can only delete documents by name or ID
@@ -0,0 +1,83 @@
"""
Example of using Gemini File Search with metadata filters.

Requirements:
- pip install google-genai
- Set GOOGLE_API_KEY environment variable
"""

import asyncio
from os import getenv

from agno.agent import Agent
from agno.knowledge.knowledge import Knowledge
from agno.vectordb.gemini.gemini_file_search import GeminiFileSearch

# Get API key from environment
api_key = getenv("GOOGLE_API_KEY")

# Create Gemini File Search vector database
vector_db = GeminiFileSearch(
    file_search_store_name="multi-cuisine-recipes",
    model_name="gemini-2.5-flash-lite",
    api_key=api_key,
)

# Create Knowledge Instance
knowledge = Knowledge(
    name="Multi-Cuisine Recipe Knowledge Base",
    description="Knowledge base with recipes from different cuisines",
    vector_db=vector_db,
)


async def add_recipes():
    """Add multiple recipe documents with different metadata."""
    # Add Thai recipes
    await knowledge.add_content_async(
        name="ThaiRecipes",
        url="https://agno-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf",
        metadata={"cuisine": "Thai", "difficulty": "medium"},
    )

    # You can add more recipes from other sources with different metadata
    # await knowledge.add_content_async(
    #     name="ItalianRecipes",
    #     content="Italian recipe content here...",
    #     metadata={"cuisine": "Italian", "difficulty": "easy"},
    # )


async def main():
    """Main function."""
    # Add content (comment out after first run)
    await add_recipes()

    # Create agent
    agent = Agent(knowledge=knowledge, search_knowledge=True)

    # Query through the agent; no explicit metadata filter is passed in this call
    print("\n=== Searching for Thai recipes ===\n")
    await agent.aprint_response(
        "What are some popular Thai dishes?",
        markdown=True,
    )

    # You can also search the vector database directly with filters
    print("\n=== Direct vector DB search with filters ===\n")
    results = vector_db.search(
        query="coconut curry recipes",
        limit=3,
        filters={"cuisine": "Thai"},  # Filter by cuisine
    )

    for i, result in enumerate(results, 1):
        print(f"\nResult {i}:")
        print(f"Content: {result.content[:200]}...")
        if result.meta_data:
            print(f"Metadata: {result.meta_data}")


if __name__ == "__main__":
    asyncio.run(main())
5 changes: 5 additions & 0 deletions libs/agno/agno/vectordb/gemini/__init__.py
@@ -0,0 +1,5 @@
from agno.vectordb.gemini.gemini_file_search import GeminiFileSearch

__all__ = [
    "GeminiFileSearch",
]