
[Infra] Promote Internal Staging to main#25924

Merged
yuneng-berri merged 207 commits into main from litellm_internal_staging
Apr 17, 2026

Conversation

@yuneng-berri
Collaborator

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory; adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Screenshots / Proof of Fix

Type

🚄 Infrastructure

Changes

kothamah and others added 30 commits March 19, 2026 13:52
Replace static Modal.confirm with DeleteResourceModal so attachment delete reliably triggers the API call. Add a regression test covering the confirm->delete flow.

Made-with: Cursor
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Adopt a React Query mutation for policy attachment deletion and add a pending-state test on the policies index panel. This removes local delete-loading state and keeps modal loading tied to mutation status.

Made-with: Cursor
…nup loop

- proxy_server.py: disable allow_credentials when allow_origins=['*'] (wildcard
  + credentials is a browser security misconfiguration). Add LITELLM_CORS_ORIGINS
  env var to configure explicit allowed origins.
- create_views.py: narrow broad 'except Exception' to only catch genuine
  'view does not exist' errors; re-raise all other DB errors (auth, connection,
  etc.) that were previously silently swallowed.
- spend_log_cleanup.py: validate execute_raw() return type is int before using
  it as a deletion count; break loop safely on unexpected types to prevent
  infinite deletion loops.
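The CORS change above can be sketched as follows. This is a minimal illustration, not LiteLLM's actual code: the helper name and the comma-separated parsing of LITELLM_CORS_ORIGINS are assumptions; only the invariant (wildcard origins force credentials off) comes from the commit message.

```python
import os

def resolve_cors_settings() -> tuple[list[str], bool]:
    # Hypothetical helper mirroring the fix described above: when the
    # allowed-origins list is the wildcard, credentials must be disabled,
    # because browsers reject "Access-Control-Allow-Origin: *" combined
    # with credentialed requests.
    raw = os.environ.get("LITELLM_CORS_ORIGINS", "*")
    origins = [o.strip() for o in raw.split(",") if o.strip()]
    allow_credentials = "*" not in origins
    return origins, allow_credentials
```

Setting LITELLM_CORS_ORIGINS to an explicit list re-enables credentials, since no wildcard remains.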
…w feedback)

Add test_proxy_server_cors_invariant which directly imports and checks the
module-level origins and allow_cors_credentials variables in proxy_server.py.
This catches any future drift between the mirror helper and the real code.
- New troubleshoot page and blog post with step-by-step comparison workflow
- Screenshots under static/img/cost-discrepancy-debug
- Link from spend tracking; sidebar entry under Troubleshooting
- Flowchart SVG: Path B connectors below box; clarify LiteLLM schedules customer calls when stuck

Made-with: Cursor
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Strip thinking blocks from the request body and retry once when Anthropic returns an invalid thinking signature error (e.g. after credential or deployment change). Applies to all BaseAnthropicMessagesConfig providers (direct Anthropic, Bedrock, Vertex, Azure AI).

Made-with: Cursor
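The stripping step above can be sketched like this. The message and block shapes are hypothetical simplifications of the Anthropic message format, not LiteLLM's real types; the behavior (drop thinking blocks, omit messages left empty) is what the commits describe.

```python
def strip_thinking_blocks(messages: list[dict]) -> list[dict]:
    # Illustrative sketch: remove "thinking" content blocks before the
    # retry; a message whose list content becomes empty (a thinking-only
    # turn) is omitted entirely rather than sent with empty content.
    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            kept = [b for b in content if b.get("type") != "thinking"]
            if not kept:
                continue  # thinking-only turn: drop the whole message
            msg = {**msg, "content": kept}
        cleaned.append(msg)
    return cleaned
```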
fix: harden CORS credentials, create_views exception handling, and spend log cleanup loop
- Omit messages whose list content is empty after stripping thinking blocks
- Retry only on HTTP 400 plus invalid-signature body match
- Return response inline from retry loop; drop unreachable None guard
- Tests: thinking-only turn dropped, non-400 no retry

Made-with: Cursor
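The retry rules above (retry once, only on HTTP 400 plus an invalid-signature body match) can be captured in a small predicate. The function name and the substring match are assumptions for illustration; the conditions come from the commit message.

```python
def should_retry_invalid_signature(status_code: int, body_text: str, attempt: int) -> bool:
    # Hypothetical predicate: retry exactly once (attempt 0 only), and
    # only when the failure is an HTTP 400 whose body matches the
    # invalid thinking-signature error.
    if attempt > 0 or status_code != 400:
        return False
    lowered = body_text.lower()
    return "invalid" in lowered and "signature" in lowered
```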
Block cross-team key update/regenerate operations by raising when the caller is not a member of the target key's team, and add unit coverage for deny/allow team membership paths.

Made-with: Cursor
- Add optional instructions on MCPServer (config/DB/types) and Prisma migration.
- MCPClient: fetch_upstream_initialize_instructions() for one-shot initialize.
- Gateway merges per-request instructions: YAML/API overrides; otherwise fetch
  upstream initialize instructions (skip spec_path/OpenAPI-only servers).
- Pass auth headers into instruction merge; ContextVar for gateway Server.
- REST: wire instructions on connection-test MCPServer payloads.

Made-with: Cursor
Remove the gateway-specific initialize fetch path and reuse instructions captured during existing MCP calls (list_tools/health_check/call_tool), while keeping YAML/DB instructions as immediate overrides.

Made-with: Cursor
Extend existing test modules with coverage for the instructions merge
logic, upstream cache, ContextVar-based injection, and client-side
capture — following each file's established patterns.

Made-with: Cursor
Remove the trivial one-line wrapper and access the dict directly.

Made-with: Cursor
…-standard-logging-object

fix(azure/passthrough): populate standard_logging_object via logging hook
…ctions

feat(mcp): expose per-server InitializeResult.instructions from gateway
…he-key

fix(caching): add Responses API params to cache key allow-list
yuneng-berri and others added 19 commits April 16, 2026 09:48
[Infra] Bump llm_translation_testing resource class to xlarge
…orker restarts

Workers in llm_translation_testing have been crashing mid-run with
"Not properly terminated" (OOM), even after bumping resource_class to
xlarge. Reduce xdist workers from 8 to 4 to lower peak memory, and add
--max-worker-restart=5 so a crashed worker is replaced instead of
failing the whole run.
…ation_staging

[Infra] Reduce llm_translation_testing parallelism and tolerate worker restarts
…ssertion

Drop test_bedrock_invoke_messages_injects_thinking_for_clear_thinking_context_management.
Its assertion 'interleaved-thinking-2025-05-14' in betas cannot hold because
anthropic_beta_headers_config.json maps that header to null for the bedrock
provider, so filter_and_transform_beta_headers drops it from the auto-added
beta set before anthropic_beta is written to the request.

The adjacent test_bedrock_invoke_messages_skips_thinking_injection_when_already_enabled
already covers the inverse behavior for the same model, so no coverage is lost.
* Add announcement bar for Trivy compromise resolution notice

Add a Docusaurus announcement bar to the top of the docs site informing
users that the Trivy supply-chain compromise has been mitigated and
resolved. The banner:
- States all affected packages have been deleted and releases are safe
- Links to the Security Townhall blog post for details
- Links to the CI/CD v2 blog post for improvements made
- Uses a green background with closeable dismiss button

Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>

* Use :::note admonition instead of announcement bar

Replace the Docusaurus announcementBar with a :::note admonition on the
docs index page. The note appears below the hero image with the title
'Security Update' and links to the Security Townhall and CI/CD v2 blog
posts.

Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>

* Update security notice wording to 'contained'

Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>

* Move note above hero image and add to root page

- Move the security notice above the product screenshot on /docs
- Add the same notice to the root page (src/pages/index.md)

Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>

* Update security notice wording

Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>
…gBetaTest

[Test] Remove dead Bedrock clear_thinking interleaved-thinking-beta assertion
Three tests inherited by TestBedrockMoonshotInvoke from BaseLLMChatTest
make live AWS Bedrock completion calls: test_developer_role_translation,
test_message_with_name, and test_completion_cost. These have been
crashing llm_translation_testing CI workers (reported as "failed on
setup with worker 'gwN' crashed").

Replace each with a mocked override that intercepts the outgoing
request via HTTPHandler.post / AsyncHTTPHandler.post patching:

- test_developer_role_translation asserts the outgoing body maps the
  developer role to system (LiteLLM's translation for non-OpenAI
  providers).
- test_message_with_name asserts the outgoing body preserves the user
  message.
- test_completion_cost returns a canned moonshot-shaped response body
  with usage and asserts response_cost > 0 against the local model
  cost map.

Follows the existing HTTPHandler + patch.object(client, "post") pattern
used in test_bedrock_gpt_oss.py and test_bedrock_completion.py. No
network traffic; the three tests now complete in ~0.3s.
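The patch.object(client, "post") pattern described above looks roughly like this. A stand-in client and toy caller replace LiteLLM's HTTPHandler and completion path, so this is a generic illustration of the technique, not the repository's actual test code.

```python
from unittest.mock import MagicMock, patch

class FakeHTTPHandler:
    # Stand-in for the HTTPHandler client named above; the real tests
    # patch LiteLLM's handler the same way.
    def post(self, url, json=None, headers=None):
        raise RuntimeError("would make a live network call")

def completion(client, body):
    # Toy caller: send the request body and return the parsed response.
    resp = client.post("https://bedrock.example/invoke", json=body)
    return resp.json()

def run_mocked_call(body):
    # The pattern: intercept client.post, assert on the outgoing request
    # body, and feed back a canned response so no network traffic occurs.
    client = FakeHTTPHandler()
    canned = MagicMock()
    canned.json.return_value = {"content": [{"type": "text", "text": "ok"}]}
    with patch.object(client, "post", return_value=canned) as mock_post:
        out = completion(client, body)
        sent = mock_post.call_args.kwargs["json"]
        assert sent == body  # request-body assertion, as in the overrides above
    return out
```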
…itellm_/amazing-almeida

# Conflicts:
#	tests/test_litellm/llms/bedrock/messages/invoke_transformations/test_anthropic_claude3_transformation.py
TogetherAIConfig.get_supported_openai_params called get_model_info(),
whose first line calls litellm.get_supported_openai_params() — which for
together_ai routes straight back into this method. The recursion only
terminated when Python's recursion limit was hit or when
_get_model_info_helper raised "not mapped" at the deepest level. Either
way the try/except caught it, so the bug stayed silent — but the cycle
ran ~332 deep every time, emitting hundreds of DEBUG log lines per
call. Surfaced as "infinite loop" in CI when the success_handler thread
emitted that log spam against an already-closed stderr during test
teardown.

Replace the get_model_info() call with supports_function_calling(),
which uses _get_model_info_helper directly and does not call
get_supported_openai_params. Measured drop from 332 to 2
_get_model_info_helper calls per first uncached lookup.

Also swap the test model from Qwen/Qwen3.5-9B (not in model_cost map)
back to a mapped serverless model, Qwen/Qwen2.5-7B-Instruct-Turbo. The
mapping gap is what made the recursion's tail end raise up into the
success handler during teardown in the first place.
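The cycle described above can be reproduced in miniature. This is a heavily simplified, hypothetical sketch: two functions route into each other, and the loop only terminates when a depth limit (standing in for Python's recursion limit or the "not mapped" raise) is hit.

```python
def get_supported_openai_params(provider: str, _depth: int = 0):
    # Toy reproduction of the bug: the together_ai branch calls
    # get_model_info, whose first step routes straight back here.
    if provider == "together_ai":
        return get_model_info(provider, _depth + 1)
    return ["temperature", "max_tokens"]

def get_model_info(provider: str, _depth: int):
    if _depth > 50:  # stand-in for the recursion limit / "not mapped" raise
        raise RecursionError("provider param lookup cycled")
    return get_supported_openai_params(provider, _depth)
```

The fix in the commit breaks the cycle by consulting a helper that never re-enters the param lookup, rather than by capping depth.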
Extends the prior moonshot mocking to cover every inherited
BaseLLMChatTest test that still made a live AWS Bedrock call. Adds
request-body assertions for each override.

New overrides:

- test_content_list_handling: verifies the outgoing body round-trips
  user content in list-of-text form; asserts response.choices[0].
  message.content parses back from the canned response.
- test_pydantic_model_input: verifies a pydantic Message input does
  not raise and produces a parseable response.
- test_response_format_type_text_with_tool_calls_no_tool_choice:
  verifies tools are forwarded and response_format + drop_params do
  not break the call.
- test_streaming: verifies stream=True routes to the
  invoke-with-response-stream endpoint. Bedrock invoke streaming is
  intercepted at the make_sync_call import site rather than via the
  caller-supplied client, because CustomStreamWrapper.fetch_sync_stream
  invokes the stored make_call partial with
  client=litellm.module_level_client, overriding any client passed by
  the caller.

Extracts a shared _make_moonshot_response helper and a
_invoke_with_mocked_post harness so all the sync mocks share one
canned response body.

After this change TestBedrockMoonshotInvoke runs 23 passed, 29
skipped, 0 live-callers, all in under 1s locally.
[Test] Mock Bedrock Moonshot tests + [Fix] TogetherAIConfig recursion
bump: proxy extras version 0.4.65 → 0.4.66
bump: version 1.83.8 → 1.83.9
@vercel

vercel Bot commented Apr 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
litellm | Ready | Preview, Comment | Apr 17, 2026 1:14am


@greptile-apps
Contributor

greptile-apps Bot commented Apr 17, 2026

Too many files changed for review. (635 files found, 100 file limit)

@shin-berri shin-berri self-requested a review April 17, 2026 01:17
@codspeed-hq
Contributor

codspeed-hq Bot commented Apr 17, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_internal_staging (bf7b7f7) with main (c0fc4c4)

Open in CodSpeed

Contributor

@github-advanced-security github-advanced-security bot left a comment


CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@codecov

codecov Bot commented Apr 17, 2026

@gitguardian

gitguardian Bot commented Apr 17, 2026

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id: 29203053 | Status: Triggered | Secret: Generic Password | Commit: ef774a1 | Filename: .circleci/config.yml
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn the best practices here.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@yuneng-berri yuneng-berri merged commit 850fe59 into main Apr 17, 2026
105 of 113 checks passed