Gemini context caching #1716
Replies: 3 comments
I am just starting to explore this as well. Perhaps the approach outlined in #940 (reply in thread) may help us.
Gemini context caching with Instructor is powerful for reducing costs on long contexts.

Setup:

```python
import instructor
import google.generativeai as genai
from pydantic import BaseModel

# Configure Gemini
genai.configure(api_key="...")

# Create cached context
cache = genai.caching.CachedContent.create(
    model="gemini-1.5-pro-001",
    display_name="my-docs",
    contents=[{
        "role": "user",
        "parts": [large_document_content]
    }],
    ttl="3600s"  # 1 hour cache
)

# Use with Instructor
model = genai.GenerativeModel.from_cached_content(cache)
client = instructor.from_gemini(model)

class Analysis(BaseModel):
    summary: str
    key_points: list[str]

result = client.chat.completions.create(
    response_model=Analysis,
    messages=[{"role": "user", "content": "Analyze the document"}]
)
```
We've saved significant costs using Gemini caching at RevolutionAI for document analysis pipelines. What's your context size and query pattern?
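Context size matters because cached content has a documented minimum token count (32,768 tokens for gemini-1.5-pro at the time of writing; verify against the current docs). A quick local sanity check before creating a cache is a rough characters-per-token heuristic. This is only a sketch: the ~4 chars/token figure is an English-text approximation, and the authoritative count comes from `GenerativeModel.count_tokens()`.

```python
# Rough pre-check before creating a CachedContent (heuristic only).
# Assumes ~4 characters per token; use GenerativeModel.count_tokens()
# for the real number before relying on this.
MIN_CACHE_TOKENS = 32_768  # documented minimum for gemini-1.5-pro caching (verify in docs)

def estimated_tokens(text: str) -> int:
    """Crude token estimate: ~4 characters per token."""
    return len(text) // 4

def worth_caching(text: str) -> bool:
    """True if the text likely clears the minimum cacheable size."""
    return estimated_tokens(text) >= MIN_CACHE_TOKENS

small_doc = "short prompt"
large_doc = "x" * 200_000  # roughly 50k estimated tokens

print(worth_caching(small_doc))  # False
print(worth_caching(large_doc))  # True
```

If the estimate is close to the threshold, count for real with the model before paying for a cache that the API may reject.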
Gemini context caching is huge for cost savings. At RevolutionAI (https://revolutionai.io) we use this. Setup:

```python
from datetime import timedelta

import instructor
import google.generativeai as genai

# Create cached context
cache = genai.caching.CachedContent.create(
    model="gemini-1.5-pro",
    contents=[large_document],
    ttl=timedelta(hours=1)
)

# Use with Instructor
client = instructor.from_gemini(
    genai.GenerativeModel(
        model_name="gemini-1.5-pro",
        cached_content=cache
    )
)

# Queries reuse the cached context
result = client.chat.completions.create(
    response_model=MyModel,
    messages=[{"role": "user", "content": "Summarize section 3"}]
)
```
Perfect for RAG with large docs!
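To put rough numbers on the cost savings: cached input tokens are billed at a reduced rate compared with regular input tokens, plus a storage fee per cached token per hour. The prices below are illustrative placeholders, not current Gemini rates (check the pricing page before relying on them), but the break-even arithmetic is the same regardless of the exact figures:

```python
# Illustrative break-even calculation for context caching.
# All three prices are made-up placeholders; substitute current Gemini pricing.
INPUT_PRICE = 3.50 / 1_000_000             # $ per regular input token (placeholder)
CACHED_PRICE = 0.875 / 1_000_000           # $ per cached input token (placeholder)
STORAGE_PRICE_PER_HOUR = 1.00 / 1_000_000  # $ per cached token per hour (placeholder)

context_tokens = 100_000   # size of the shared document context
queries_per_hour = 20      # how often that context is re-sent

# Without caching, the full context is billed as input on every query.
without_cache = queries_per_hour * context_tokens * INPUT_PRICE

# With caching, queries pay the cached rate, plus one hour of storage.
with_cache = (queries_per_hour * context_tokens * CACHED_PRICE
              + context_tokens * STORAGE_PRICE_PER_HOUR)

print(f"without cache: ${without_cache:.2f}/hour")  # $7.00
print(f"with cache:    ${with_cache:.2f}/hour")     # $1.85
```

The pattern that pays off is exactly the RAG case above: a large, stable context queried many times within the cache's TTL. For one-off queries, the storage fee can outweigh the discount.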
Hi all,
I am trying to use Gemini's context caching (https://ai.google.dev/gemini-api/docs/caching?lang=python) with my calls to Gemini via Instructor, but I can't seem to make it work. Does Instructor support Gemini's context caching? If not, are there plans to support it in the future?
The following is a call to Gemini via the Gemini SDK that successfully uses context caching:
The following is my attempt to use context caching via Instructor:
The above doesn't work, however.