add us gov models#24660
krrish-berri-2 merged 2 commits into BerriAI:litellm_oss_staging_04_02_2026_p1
Conversation
Greptile Summary

This PR adds AWS GovCloud (us-gov-east-1 / us-gov-west-1) support for Claude Sonnet 4.5.

Process note: the PR checklist items for tests and CI runs are unchecked. While a pure JSON config addition is low-risk, the project requires at least one new unit test per contribution.

Confidence Score: 4/5. Safe to merge for routing correctness; one cost-estimation field is still missing from the new bedrock_converse entry per a prior thread, and no tests were added. The routing infrastructure already supports new entries in model_prices_and_context_window.json; in particular, the us-gov. cross-region prefix is resolved via get_bedrock_cross_region_inference_regions.

Key changes:
| Filename | Overview |
|---|---|
| model_prices_and_context_window.json | Adds two new bedrock direct-API entries for us-gov-east-1 and us-gov-west-1 (with anthropic. prefix), one new bedrock_converse cross-region inference profile entry (us-gov.), and updates existing claude-sonnet-4-5-20250929-v1:0 GovCloud bedrock entries from 4096 → 8192 max output tokens; the new bedrock_converse entry is missing search_context_cost_per_query (see prior thread) |
| litellm/model_prices_and_context_window_backup.json | Mirror of all changes in the root JSON — identical set of additions and max_output_tokens bumps for GovCloud Sonnet 4.5 entries |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[LiteLLM request] --> B{Parse model prefix}
B -->|bedrock/us-gov-east-1/...| C[bedrock direct API\nus-gov-east-1]
B -->|bedrock/us-gov-west-1/...| D[bedrock direct API\nus-gov-west-1]
B -->|us-gov.*| E{get_bedrock_cross_region_inference_regions\ncontains 'us-gov'?}
E -->|Yes| F[bedrock_converse\ncross-region inference profile\nroute to us-gov-east-1 or us-gov-west-1]
E -->|No| G[Error: unrecognised prefix]
C --> H[anthropic.claude-sonnet-4-5-20250929-v1:0\nmax_output_tokens: 8192]
D --> I[anthropic.claude-sonnet-4-5-20250929-v1:0\nmax_output_tokens: 8192]
F --> J[anthropic.claude-sonnet-4-5-20250929-v1:0\nmax_output_tokens: 64000]
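The routing decision in the flowchart can be sketched in Python. This is an illustrative stand-in only: the lookup table below mirrors the three new pricing entries, and the `route` helper is a simplified hypothetical, not LiteLLM's actual routing code.

```python
# Illustrative sketch of the GovCloud routing decision shown in the
# flowchart above. The real lookup reads model_prices_and_context_window.json;
# this inline table mirrors just the three new entries.
MODEL_TABLE = {
    "bedrock/us-gov-east-1/anthropic.claude-sonnet-4-5-20250929-v1:0": {
        "litellm_provider": "bedrock", "max_output_tokens": 8192},
    "bedrock/us-gov-west-1/anthropic.claude-sonnet-4-5-20250929-v1:0": {
        "litellm_provider": "bedrock", "max_output_tokens": 8192},
    "us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0": {
        "litellm_provider": "bedrock_converse", "max_output_tokens": 64000},
}

def route(model: str) -> str:
    """Return a human-readable routing decision for a model string."""
    entry = MODEL_TABLE.get(model)
    if entry is None:
        raise ValueError(f"unrecognised prefix: {model}")
    if entry["litellm_provider"] == "bedrock_converse":
        # us-gov.* cross-region inference profile: Bedrock picks
        # us-gov-east-1 or us-gov-west-1 at request time.
        return "cross-region inference profile"
    # bedrock/<region>/... pins the request to a single region.
    region = model.split("/")[1]
    return f"direct API in {region}"

print(route("us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0"))
# → cross-region inference profile
```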
Reviews (2). Last reviewed commit: "added max tokens".
| "max_output_tokens": 8192, | ||
| "max_tokens": 8192, |
max_output_tokens inconsistency with pre-existing sibling entry
The new bedrock/us-gov-east-1/anthropic.claude-sonnet-4-5-20250929-v1:0 entry uses max_output_tokens: 8192 and max_tokens: 8192, but the pre-existing entry for the same model in the same region — bedrock/us-gov-east-1/claude-sonnet-4-5-20250929-v1:0 (lines 7669–7688) — uses max_output_tokens: 4096 and max_tokens: 4096.
These two keys map to the same underlying model (Bedrock model IDs with and without the anthropic. prefix are interchangeable). Having different max_output_tokens for the same model creates confusion and potential bugs: a user who looks up context window information will get different results depending on which key format they happen to use.
The same inconsistency exists for the west-1 variant (bedrock/us-gov-west-1/anthropic.claude-sonnet-4-5-20250929-v1:0 reports 8192 while bedrock/us-gov-west-1/claude-sonnet-4-5-20250929-v1:0 reports 4096).
For reference, other bedrock/ format Claude models in these us-gov regions consistently use max_output_tokens: 8192 (e.g. bedrock/us-gov-west-1/anthropic.claude-3-7-sonnet-20250219-v1:0 and bedrock/us-gov-west-1/anthropic.claude-3-5-sonnet-20240620-v1:0), suggesting 8192 is correct and the pre-existing 4096 entries should be updated to match.
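This class of drift is easy to catch mechanically. The following is a hypothetical standalone sanity check, not part of the repo's test suite: it compares each `anthropic.`-prefixed Bedrock key against its unprefixed sibling and reports differing `max_output_tokens`.

```python
def find_sibling_mismatches(table: dict) -> list:
    """Compare bedrock/<region>/anthropic.<id> entries against the
    unprefixed bedrock/<region>/<id> form of the same model."""
    mismatches = []
    for key, entry in table.items():
        marker = "/anthropic."
        if marker not in key:
            continue
        sibling = key.replace(marker, "/", 1)  # drop the anthropic. prefix
        other = table.get(sibling)
        if other and other.get("max_output_tokens") != entry.get("max_output_tokens"):
            mismatches.append((key, entry.get("max_output_tokens"),
                               sibling, other.get("max_output_tokens")))
    return mismatches

# The pair flagged in this review comment:
table = {
    "bedrock/us-gov-east-1/anthropic.claude-sonnet-4-5-20250929-v1:0":
        {"max_output_tokens": 8192},
    "bedrock/us-gov-east-1/claude-sonnet-4-5-20250929-v1:0":
        {"max_output_tokens": 4096},
}
print(find_sibling_mismatches(table))
```

Run against the full JSON, this would also surface the west-1 pair described above.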
| "us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0": { | ||
| "cache_creation_input_token_cost": 4.125e-06, | ||
| "cache_read_input_token_cost": 3.3e-07, | ||
| "input_cost_per_token": 3.3e-06, | ||
| "input_cost_per_token_above_200k_tokens": 6.6e-06, | ||
| "output_cost_per_token_above_200k_tokens": 2.475e-05, | ||
| "cache_creation_input_token_cost_above_200k_tokens": 8.25e-06, | ||
| "cache_read_input_token_cost_above_200k_tokens": 6.6e-07, | ||
| "litellm_provider": "bedrock_converse", | ||
| "max_input_tokens": 200000, | ||
| "max_output_tokens": 64000, | ||
| "max_tokens": 64000, | ||
| "mode": "chat", | ||
| "output_cost_per_token": 1.65e-05, | ||
| "supports_assistant_prefill": true, | ||
| "supports_computer_use": true, | ||
| "supports_function_calling": true, | ||
| "supports_pdf_input": true, | ||
| "supports_prompt_caching": true, | ||
| "supports_reasoning": true, | ||
| "supports_response_schema": true, | ||
| "supports_tool_choice": true, | ||
| "supports_vision": true, | ||
| "tool_use_system_prompt_tokens": 346 | ||
| }, |
Missing search_context_cost_per_query compared to equivalent cross-region entries
All other bedrock_converse cross-region Claude Sonnet 4.5 entries include a search_context_cost_per_query block:
- us.anthropic.claude-sonnet-4-5-20250929-v1:0 (has search_context_cost_per_query)
- au.anthropic.claude-sonnet-4-5-20250929-v1:0 (has search_context_cost_per_query)
The new us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0 entry omits this field entirely. If the web search tool is not available on AWS GovCloud, this is intentional and should be noted (a comment or doc reference would help). If it is available, the field should be added for cost-estimation correctness.
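A field-presence check along these lines would flag the omission automatically. This is a hypothetical sketch (the suffix match and helper name are illustrative, not LiteLLM internals):

```python
def missing_search_cost(table: dict) -> list:
    """Flag cross-region Claude Sonnet 4.5 entries that lack the
    search_context_cost_per_query block present on their peers."""
    suffix = ".anthropic.claude-sonnet-4-5-20250929-v1:0"
    return [key for key, entry in table.items()
            if key.endswith(suffix)
            and "search_context_cost_per_query" not in entry]

# Minimal table mirroring the three cross-region entries discussed above;
# the real check would load model_prices_and_context_window.json.
table = {
    "us.anthropic.claude-sonnet-4-5-20250929-v1:0":
        {"search_context_cost_per_query": {}},
    "au.anthropic.claude-sonnet-4-5-20250929-v1:0":
        {"search_context_cost_per_query": {}},
    "us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0": {},
}
print(missing_search_cost(table))
# → ['us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0']
```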
Merged commit c54aee3 into BerriAI:litellm_oss_staging_04_02_2026_p1
* add us gov models
* added max tokens
* fix(vertex_ai): support pluggable (executable) credential_source for WIF auth (#24700). The WIF credential dispatch in load_auth() only handled identity_pool and aws credential types; when credential_source.executable was present (used for Azure Managed Identity via Workload Identity Federation), it fell through to identity_pool.Credentials, which rejected it with MalformedError. Adds dispatch to google.auth.pluggable.Credentials for executable-type credential sources, following the same pattern as the existing identity_pool and aws helpers. Fixes authentication for Azure Container Apps → GCP Vertex AI via WIF with executable credential sources.
* feat(logging): add component and logger fields to JSON logs for 3rd p… (#24447)
  * feat(logging): add component and logger fields to JSON logs for 3rd party filtering
  * Let user-supplied extra fields win over auto-generated component/logger, tighten test assertions
* Feat: add organization into the metrics metadata for org_id & org_alias (#24440)
  * Add org_id and org_alias label names to Prometheus metric definitions
  * Add user_api_key_org_alias to StandardLoggingUserAPIKeyMetadata and populate it in pre-call metadata
  * Pass org_id and org_alias into per-request Prometheus metric labels, with tests
  * Address review: populate org_alias from DB view; gate org labels behind a feature flag in get_labels(); scope org label injection to an explicit metric allowlist; move _org_label_metrics to a class constant; reset custom_prometheus_metadata_labels after the duplicate-label assertion
  * fix: emit org labels by default (no opt-in flag) and write org_alias to metadata unconditionally in proxy_server.py
* fix: 429s from batch creation being converted to 500 (#24703)
* add us gov models (#24660): add us gov models; added max tokens
* Litellm dev 04 02 2026 p1 (#25052)
  * fix: replace hardcoded url
  * fix: Anthropic web search cost not tracked for Chat Completions. The ModelResponse branch in response_object_includes_web_search_call() only checked url_citation annotations and prompt_tokens_details, missing Anthropic's server_tool_use.web_search_requests field, so _handle_web_search_cost() never fired for Anthropic Claude models. Also routes vertex_ai/claude-* models to the Anthropic cost calculator instead of the Gemini one, since Claude on Vertex uses the same server_tool_use billing structure as the direct Anthropic API. (Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>)
* fix(anthropic): pass logging_obj to client.post for litellm_overhead_time_ms (#24071). When LITELLM_DETAILED_TIMING=true, litellm_overhead_time_ms was null for Anthropic because the handler did not pass logging_obj to client.post(), so track_llm_api_timing could not set llm_api_duration_ms. Passes logging_obj=logging_obj at all four post() call sites (make_call, make_sync_call, acompletion, completion) and adds a test that make_call passes logging_obj to client.post. (Made with Cursor)
* sap: add additional grounding parameters for the SAP provider; add filtering, masking, and translation SAP GEN AI Hub modules with tests and docs; support multiple-module config; align embedding request transformation with the current API; rework credential fetching and priority, allowing the service key to be supplied either way, with positive and negative tests; plus many lint/format passes and repeated bot-review fixups
* feat: support service_tier in gemini: map the OpenAI service_tier field to Gemini, use the x-gemini-service-tier response header, add default/standard mapping and tests, handle 'auto' case-insensitively, return service_tier on the final streamed chunk, and enable supports_service_tier for Gemini models
* Test and QA fixes: get_standard_logging_metadata tests, test_get_model_info_bedrock_models, remaining tests, mypy issues, merge conflicts, code QA, and Greptile review feedback

Co-authored-by: michelligabriele <gabriele.michelli@icloud.com>, Josh <36064836+J-Byron@users.noreply.github.com>, mubashir1osmani <mubashir.osmani777@gmail.com>, Claude Opus 4.6 <noreply@anthropic.com>, milan-berri <milan@berri.ai>, Alperen Kömürcü <alperen.koemuercue@sap.com>, Vasilisa Parshikova <vasilisa.parshikova@sap.com>, Lin Xu <lin.xu03@sap.com>, Mark McDonald <macd@google.com>, Sameer Kankute <sameer@berri.ai>
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- [ ] I have added a test in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement - see details)
- [ ] I have run make test-unit
- [ ] I have tagged @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes