Litellm aws gov cloud mode support #25254
Greptile Summary
This PR adds AWS GovCloud Bedrock catalog entries for Claude Sonnet 4.5.
Changes: three new GovCloud catalog entries and two corrected token limits, mirrored in the backup file.
Issue found: the us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0 converse entry is missing search_context_cost_per_query.
Confidence Score: 4/5
Safe to merge after addressing the missing search_context_cost_per_query field in the GovCloud converse entry.
One P1 finding remains: the us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0 converse entry is missing search_context_cost_per_query, which is present in every other cross-region variant of this model. The previously flagged supports_native_structured_output omission has been resolved. The max_output_tokens corrections and new bedrock/ entries look correct and consistent with existing patterns.
Both model_prices_and_context_window.json and litellm/model_prices_and_context_window_backup.json need the search_context_cost_per_query field added to the us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0 entry.
| Filename | Overview |
|---|---|
| model_prices_and_context_window.json | Adds 3 new GovCloud catalog entries and updates 2 existing token limits; missing search_context_cost_per_query on the converse entry |
| litellm/model_prices_and_context_window_backup.json | Mirror of main file changes; same search_context_cost_per_query omission applies to the us-gov converse entry |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[AWS GovCloud Bedrock Request] --> B{Model ID Format}
B -->|bedrock/us-gov-east-1/anthropic.claude-sonnet-4-5-20250929-v1:0| C[bedrock provider\ninput: 3.3e-06\noutput: 1.65e-05\nmax_output: 8192]
B -->|bedrock/us-gov-west-1/anthropic.claude-sonnet-4-5-20250929-v1:0| D[bedrock provider\ninput: 3.3e-06\noutput: 1.65e-05\nmax_output: 8192]
B -->|us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0| E[bedrock_converse provider\ninput: 3.3e-06\noutput: 1.65e-05\nmax_output: 64000\nsupports_native_structured_output: true]
C --> F[Cost Tracking]
D --> F
E --> F
F -->|search_context_cost_per_query missing| G[⚠️ Web Search Costs Untracked]
style G fill:#ff9999
style E fill:#ffffcc
Reviews (2): Last reviewed commit: "greptile fix"
"us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0": {
    "cache_creation_input_token_cost": 4.125e-06,
    "cache_read_input_token_cost": 3.3e-07,
    "input_cost_per_token": 3.3e-06,
    "input_cost_per_token_above_200k_tokens": 6.6e-06,
    "output_cost_per_token_above_200k_tokens": 2.475e-05,
    "cache_creation_input_token_cost_above_200k_tokens": 8.25e-06,
    "cache_read_input_token_cost_above_200k_tokens": 6.6e-07,
    "litellm_provider": "bedrock_converse",
    "max_input_tokens": 200000,
    "max_output_tokens": 64000,
    "max_tokens": 64000,
    "mode": "chat",
    "output_cost_per_token": 1.65e-05,
    "supports_assistant_prefill": true,
    "supports_computer_use": true,
    "supports_function_calling": true,
    "supports_pdf_input": true,
    "supports_prompt_caching": true,
    "supports_reasoning": true,
    "supports_response_schema": true,
    "supports_tool_choice": true,
    "supports_vision": true,
    "tool_use_system_prompt_tokens": 346
},
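As a rough illustration of how the pricing fields in this entry combine, the sketch below estimates the cost of a single request. The `estimate_cost` helper and the `TIER_THRESHOLD` constant are hypothetical, and the tiering rule (bill the whole request at the `_above_200k_tokens` rates once the prompt exceeds 200,000 tokens) is an assumption made for the example, not a description of LiteLLM's actual cost calculator. Cache pricing is left out to keep it short.

```python
# Pricing fields copied from the us-gov converse entry in this diff.
PRICING = {
    "input_cost_per_token": 3.3e-06,
    "output_cost_per_token": 1.65e-05,
    "input_cost_per_token_above_200k_tokens": 6.6e-06,
    "output_cost_per_token_above_200k_tokens": 2.475e-05,
}

# Assumed cutoff, inferred only from the "_above_200k_tokens" suffix.
TIER_THRESHOLD = 200_000


def estimate_cost(prompt_tokens: int, completion_tokens: int, pricing: dict = PRICING) -> float:
    """Hypothetical helper: estimate the cost of one request in USD.

    Assumption: when the prompt exceeds the threshold, every token in the
    request is billed at the higher-tier rate.
    """
    suffix = "_above_200k_tokens" if prompt_tokens > TIER_THRESHOLD else ""
    input_rate = pricing["input_cost_per_token" + suffix]
    output_rate = pricing["output_cost_per_token" + suffix]
    return prompt_tokens * input_rate + completion_tokens * output_rate


if __name__ == "__main__":
    # 10,000 prompt tokens and 1,000 completion tokens at the base rates.
    print(f"{estimate_cost(10_000, 1_000):.6f}")  # prints 0.049500
```

Without a catalog row, none of these rates can be looked up for the GovCloud model string, which is the gap this PR closes.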
Missing supports_native_structured_output flag on the new GovCloud converse entry
The analogous cross-region commercial entry us.anthropic.claude-sonnet-4-5-20250929-v1:0 (line 28942) includes "supports_native_structured_output": true, but the new us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0 entry omits it. If GovCloud Bedrock Converse supports structured output for this model (same model weights, same Converse API surface), omitting this flag causes structured-output routing and validation to silently skip GovCloud users.
The same field is absent from the corresponding entry in litellm/model_prices_and_context_window_backup.json.
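Omissions like this one can be surfaced mechanically by comparing the keys of a new entry against an analogous existing entry. The sketch below is a minimal illustration: `missing_keys` is a hypothetical helper, and the two dictionaries are trimmed stand-ins with illustrative values rather than the full catalog entries.

```python
def missing_keys(reference: dict, candidate: dict) -> list[str]:
    """Return keys present in the reference entry but absent from the candidate."""
    return sorted(set(reference) - set(candidate))


# Trimmed stand-ins for the two catalog entries; the real entries have more fields.
commercial_entry = {
    "litellm_provider": "bedrock_converse",
    "supports_native_structured_output": True,
    "tool_use_system_prompt_tokens": 346,
}
govcloud_entry = {
    "litellm_provider": "bedrock_converse",
    "tool_use_system_prompt_tokens": 346,
}

print(missing_keys(commercial_entry, govcloud_entry))
# prints ['supports_native_structured_output']
```

A key-only comparison does not say whether the flag should be true for GovCloud; it only flags the difference for a human to confirm.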
"tool_use_system_prompt_tokens": 346,
+ "supports_native_structured_output": true
},
@greptile re-review
* add us gov models
* added max tokens
* greptile fix

Co-authored-by: mubashir1osmani <mubashir.osmani777@gmail.com>
- Add 8 content PRs that merged directly to the release branch outside the listed staging PRs: #23769 (Ramp callback), #25252 (JWT OAuth2 override), #25254 (AWS GovCloud mode), #25258 (batch-limit cleanup), #25334 (router custom_llm_provider), #25345 (Triton embeddings), #25347 (tag-based routing), #25358 (Baseten pricing attribution)
- Add @kedarthakkar to new contributors (first-ever PR via #23769)
- Update RELEASE_NOTES_GENERATION_INSTRUCTIONS: require walking git log range between release tags in addition to staging PRs, and verify new-contributor status per author rather than trusting the GH release body floor
Relevant issues
Register Claude Sonnet 4.5 for AWS GovCloud Bedrock in LiteLLM’s model catalog (model_prices_and_context_window.json and backup), with correct pricing, limits, and capability flags so routing, cost tracking, and validation behave like the commercial Bedrock entries.
Cause
US Gov workloads use different model IDs and regional prefixes (bedrock/us-gov-east-1/…, bedrock/us-gov-west-1/…, and us-gov.… for Converse) than standard commercial Bedrock. Without catalog rows, LiteLLM can’t resolve costs, context/output limits, or feature metadata for those model strings.
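The three identifier shapes described above can be told apart with plain string checks. The sketch below is illustrative only: `classify_govcloud_model` is a hypothetical function and the returned labels are made up for the example. It is not how LiteLLM resolves model strings internally.

```python
from typing import Optional


def classify_govcloud_model(model: str) -> Optional[str]:
    """Return a label for the GovCloud identifier shape, or None if it is not one."""
    # Converse-style form: us-gov.<model-id>
    if model.startswith("us-gov."):
        return "converse"
    # Region-scoped form: bedrock/<region>/<model-id>
    parts = model.split("/")
    if len(parts) == 3 and parts[0] == "bedrock" and parts[1].startswith("us-gov-"):
        return f"region:{parts[1]}"
    return None


for name in (
    "bedrock/us-gov-east-1/anthropic.claude-sonnet-4-5-20250929-v1:0",
    "bedrock/us-gov-west-1/anthropic.claude-sonnet-4-5-20250929-v1:0",
    "us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0",
    "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
):
    print(name, "->", classify_govcloud_model(name))
```

The last name is the commercial identifier quoted in the review comment; it returns `None` because it carries neither GovCloud prefix.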
Some existing US Gov Sonnet 4.5-style entries still had max_output_tokens / max_tokens set to 4096, which is below the model’s supported output cap. This PR raises them to 8192 where applicable and sets 64000 for the Converse-style us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0 entry, matching the commercial Converse pattern.
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- Added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement - see details
- PR passes all unit tests on make test-unit
- Requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review
Delays in PR merge?
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🐛 Bug Fix
Changes
Add catalog entries for:
bedrock/us-gov-east-1/anthropic.claude-sonnet-4-5-20250929-v1:0
bedrock/us-gov-west-1/anthropic.claude-sonnet-4-5-20250929-v1:0
us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0 (bedrock_converse), including above-200k token pricing fields where used elsewhere for this family.
Raise max_output_tokens / max_tokens from 4096 → 8192 for the corresponding existing US Gov Bedrock Sonnet 4.5 entries (both regions) so defaults match current limits.
Keep litellm/model_prices_and_context_window_backup.json in sync with the same additions and limit corrections.
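Keeping the two files in sync can be checked with a short script. The sketch below is one possible approach under stated assumptions: `load_catalog` and `out_of_sync_keys` are hypothetical helpers, and the file paths in the usage note are taken from this PR and assumed to be relative to the repository root.

```python
import json
from pathlib import Path

# Model keys added by this PR.
GOVCLOUD_KEYS = [
    "bedrock/us-gov-east-1/anthropic.claude-sonnet-4-5-20250929-v1:0",
    "bedrock/us-gov-west-1/anthropic.claude-sonnet-4-5-20250929-v1:0",
    "us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0",
]


def load_catalog(path: str) -> dict:
    """Read a catalog JSON file into a dict keyed by model name."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def out_of_sync_keys(main: dict, backup: dict, keys: list[str]) -> list[str]:
    """Return the keys whose entries are missing from, or differ between, the two catalogs."""
    return [k for k in keys if k not in main or main.get(k) != backup.get(k)]
```

From the repository root, a call such as `out_of_sync_keys(load_catalog("model_prices_and_context_window.json"), load_catalog("litellm/model_prices_and_context_window_backup.json"), GOVCLOUD_KEYS)` would return an empty list when the listed entries match in both files.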