
[Infra] Promote Internal Staging to main#25924

Merged
yuneng-berri merged 207 commits into main from litellm_internal_staging
Apr 17, 2026

Conversation

@yuneng-berri
Collaborator

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory; adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Screenshots / Proof of Fix

Type

🚄 Infrastructure

Changes

kothamah and others added 30 commits March 19, 2026 13:52
Replace static Modal.confirm with DeleteResourceModal so attachment delete reliably triggers the API call. Add a regression test covering the confirm->delete flow.

Made-with: Cursor
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Adopt a React Query mutation for policy attachment deletion and add a pending-state test on the policies index panel. This removes local delete-loading state and keeps modal loading tied to mutation status.

Made-with: Cursor
…nup loop

- proxy_server.py: disable allow_credentials when allow_origins=['*'] (wildcard
  + credentials is a browser security misconfiguration). Add LITELLM_CORS_ORIGINS
  env var to configure explicit allowed origins.
- create_views.py: narrow broad 'except Exception' to only catch genuine
  'view does not exist' errors; re-raise all other DB errors (auth, connection,
  etc.) that were previously silently swallowed.
- spend_log_cleanup.py: validate execute_raw() return type is int before using
  it as a deletion count; break loop safely on unexpected types to prevent
  infinite deletion loops.
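The CORS change above can be sketched as follows. This is a minimal illustration, not LiteLLM's actual code: the helper name and the comma-separated parsing of LITELLM_CORS_ORIGINS are assumptions; only the invariant (wildcard origins force credentials off) comes from the commit message.

```python
import os

def resolve_cors_settings() -> tuple[list[str], bool]:
    # Hypothetical helper mirroring the fix described above: when the
    # allowed-origins list is the wildcard, credentials must be disabled,
    # because browsers reject "Access-Control-Allow-Origin: *" combined
    # with credentialed requests.
    raw = os.environ.get("LITELLM_CORS_ORIGINS", "*")
    origins = [o.strip() for o in raw.split(",") if o.strip()]
    allow_credentials = "*" not in origins
    return origins, allow_credentials
```

Setting LITELLM_CORS_ORIGINS to an explicit list re-enables credentials, since no wildcard remains.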
…w feedback)

Add test_proxy_server_cors_invariant which directly imports and checks the
module-level origins and allow_cors_credentials variables in proxy_server.py.
This catches any future drift between the mirror helper and the real code.
- New troubleshoot page and blog post with step-by-step comparison workflow
- Screenshots under static/img/cost-discrepancy-debug
- Link from spend tracking; sidebar entry under Troubleshooting
- Flowchart SVG: Path B connectors below box; clarify LiteLLM schedules customer calls when stuck

Made-with: Cursor
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Strip thinking blocks from the request body and retry once when Anthropic returns an invalid thinking signature error (e.g. after credential or deployment change). Applies to all BaseAnthropicMessagesConfig providers (direct Anthropic, Bedrock, Vertex, Azure AI).

Made-with: Cursor
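The stripping step above can be sketched like this. The message and block shapes are hypothetical simplifications of the Anthropic message format, not LiteLLM's real types; the behavior (drop thinking blocks, omit messages left empty) is what the commits describe.

```python
def strip_thinking_blocks(messages: list[dict]) -> list[dict]:
    # Illustrative sketch: remove "thinking" content blocks before the
    # retry; a message whose list content becomes empty (a thinking-only
    # turn) is omitted entirely rather than sent with empty content.
    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            kept = [b for b in content if b.get("type") != "thinking"]
            if not kept:
                continue  # thinking-only turn: drop the whole message
            msg = {**msg, "content": kept}
        cleaned.append(msg)
    return cleaned
```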
fix: harden CORS credentials, create_views exception handling, and spend log cleanup loop
- Omit messages whose list content is empty after stripping thinking blocks
- Retry only on HTTP 400 plus invalid-signature body match
- Return response inline from retry loop; drop unreachable None guard
- Tests: thinking-only turn dropped, non-400 no retry

Made-with: Cursor
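The retry rules above (retry once, only on HTTP 400 plus an invalid-signature body match) can be captured in a small predicate. The function name and the substring match are assumptions for illustration; the conditions come from the commit message.

```python
def should_retry_invalid_signature(status_code: int, body_text: str, attempt: int) -> bool:
    # Hypothetical predicate: retry exactly once (attempt 0 only), and
    # only when the failure is an HTTP 400 whose body matches the
    # invalid thinking-signature error.
    if attempt > 0 or status_code != 400:
        return False
    lowered = body_text.lower()
    return "invalid" in lowered and "signature" in lowered
```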
Block cross-team key update/regenerate operations by raising when the caller is not a member of the target key's team, and add unit coverage for deny/allow team membership paths.

Made-with: Cursor
- Add optional instructions on MCPServer (config/DB/types) and Prisma migration.
- MCPClient: fetch_upstream_initialize_instructions() for one-shot initialize.
- Gateway merges per-request instructions: YAML/API overrides; otherwise fetch
  upstream initialize instructions (skip spec_path/OpenAPI-only servers).
- Pass auth headers into instruction merge; ContextVar for gateway Server.
- REST: wire instructions on connection-test MCPServer payloads.

Made-with: Cursor
Remove the gateway-specific initialize fetch path and reuse instructions captured during existing MCP calls (list_tools/health_check/call_tool), while keeping YAML/DB instructions as immediate overrides.

Made-with: Cursor
Extend existing test modules with coverage for the instructions merge
logic, upstream cache, ContextVar-based injection, and client-side
capture — following each file's established patterns.

Made-with: Cursor
Remove the trivial one-line wrapper and access the dict directly.

Made-with: Cursor
…-standard-logging-object

fix(azure/passthrough): populate standard_logging_object via logging hook
…ctions

feat(mcp): expose per-server InitializeResult.instructions from gateway
…he-key

fix(caching): add Responses API params to cache key allow-list
yuneng-berri and others added 19 commits April 16, 2026 09:48
[Infra] Bump llm_translation_testing resource class to xlarge
…orker restarts

Workers in llm_translation_testing have been crashing mid-run with
"Not properly terminated" (OOM), even after bumping resource_class to
xlarge. Reduce xdist workers from 8 to 4 to lower peak memory, and add
--max-worker-restart=5 so a crashed worker is replaced instead of
failing the whole run.
…ation_staging

[Infra] Reduce llm_translation_testing parallelism and tolerate worker restarts
…ssertion

Drop test_bedrock_invoke_messages_injects_thinking_for_clear_thinking_context_management.
Its assertion 'interleaved-thinking-2025-05-14' in betas cannot hold because
anthropic_beta_headers_config.json maps that header to null for the bedrock
provider, so filter_and_transform_beta_headers drops it from the auto-added
beta set before anthropic_beta is written to the request.

The adjacent test_bedrock_invoke_messages_skips_thinking_injection_when_already_enabled
already covers the inverse behavior for the same model, so no coverage is lost.
* Add announcement bar for Trivy compromise resolution notice

Add a Docusaurus announcement bar to the top of the docs site informing
users that the Trivy supply-chain compromise has been mitigated and
resolved. The banner:
- States all affected packages have been deleted and releases are safe
- Links to the Security Townhall blog post for details
- Links to the CI/CD v2 blog post for improvements made
- Uses a green background with closeable dismiss button

Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>

* Use :::note admonition instead of announcement bar

Replace the Docusaurus announcementBar with a :::note admonition on the
docs index page. The note appears below the hero image with the title
'Security Update' and links to the Security Townhall and CI/CD v2 blog
posts.

Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>

* Update security notice wording to 'contained'

Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>

* Move note above hero image and add to root page

- Move the security notice above the product screenshot on /docs
- Add the same notice to the root page (src/pages/index.md)

Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>

* Update security notice wording

Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>
…gBetaTest

[Test] Remove dead Bedrock clear_thinking interleaved-thinking-beta assertion
Three tests inherited by TestBedrockMoonshotInvoke from BaseLLMChatTest
make live AWS Bedrock completion calls: test_developer_role_translation,
test_message_with_name, and test_completion_cost. These have been
crashing llm_translation_testing CI workers (reported as "failed on
setup with worker 'gwN' crashed").

Replace each with a mocked override that intercepts the outgoing
request via HTTPHandler.post / AsyncHTTPHandler.post patching:

- test_developer_role_translation asserts the outgoing body maps the
  developer role to system (LiteLLM's translation for non-OpenAI
  providers).
- test_message_with_name asserts the outgoing body preserves the user
  message.
- test_completion_cost returns a canned moonshot-shaped response body
  with usage and asserts response_cost > 0 against the local model
  cost map.

Follows the existing HTTPHandler + patch.object(client, "post") pattern
used in test_bedrock_gpt_oss.py and test_bedrock_completion.py. No
network traffic; the three tests now complete in ~0.3s.
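The patch.object(client, "post") pattern described above looks roughly like this. A stand-in client and toy caller replace LiteLLM's HTTPHandler and completion path, so this is a generic illustration of the technique, not the repository's actual test code.

```python
from unittest.mock import MagicMock, patch

class FakeHTTPHandler:
    # Stand-in for the HTTPHandler client named above; the real tests
    # patch LiteLLM's handler the same way.
    def post(self, url, json=None, headers=None):
        raise RuntimeError("would make a live network call")

def completion(client, body):
    # Toy caller: send the request body and return the parsed response.
    resp = client.post("https://bedrock.example/invoke", json=body)
    return resp.json()

def run_mocked_call(body):
    # The pattern: intercept client.post, assert on the outgoing request
    # body, and feed back a canned response so no network traffic occurs.
    client = FakeHTTPHandler()
    canned = MagicMock()
    canned.json.return_value = {"content": [{"type": "text", "text": "ok"}]}
    with patch.object(client, "post", return_value=canned) as mock_post:
        out = completion(client, body)
        sent = mock_post.call_args.kwargs["json"]
        assert sent == body  # request-body assertion, as in the overrides above
    return out
```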
…itellm_/amazing-almeida

# Conflicts:
#	tests/test_litellm/llms/bedrock/messages/invoke_transformations/test_anthropic_claude3_transformation.py
TogetherAIConfig.get_supported_openai_params called get_model_info(),
whose first line calls litellm.get_supported_openai_params() — which for
together_ai routes straight back into this method. The recursion only
terminated when Python's recursion limit was hit or when
_get_model_info_helper raised "not mapped" at the deepest level. Either
way the try/except caught it, so the bug stayed silent — but the cycle
ran ~332 deep every time, emitting hundreds of DEBUG log lines per
call. Surfaced as "infinite loop" in CI when the success_handler thread
emitted that log spam against an already-closed stderr during test
teardown.

Replace the get_model_info() call with supports_function_calling(),
which uses _get_model_info_helper directly and does not call
get_supported_openai_params. Measured drop from 332 to 2
_get_model_info_helper calls per first uncached lookup.

Also swap the test model from Qwen/Qwen3.5-9B (not in model_cost map)
back to a mapped serverless model, Qwen/Qwen2.5-7B-Instruct-Turbo. The
mapping gap is what made the recursion's tail end raise up into the
success handler during teardown in the first place.
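The cycle described above can be reproduced in miniature. This is a heavily simplified, hypothetical sketch: two functions route into each other, and the loop only terminates when a depth limit (standing in for Python's recursion limit or the "not mapped" raise) is hit.

```python
def get_supported_openai_params(provider: str, _depth: int = 0):
    # Toy reproduction of the bug: the together_ai branch calls
    # get_model_info, whose first step routes straight back here.
    if provider == "together_ai":
        return get_model_info(provider, _depth + 1)
    return ["temperature", "max_tokens"]

def get_model_info(provider: str, _depth: int):
    if _depth > 50:  # stand-in for the recursion limit / "not mapped" raise
        raise RecursionError("provider param lookup cycled")
    return get_supported_openai_params(provider, _depth)
```

The fix in the commit breaks the cycle by consulting a helper that never re-enters the param lookup, rather than by capping depth.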
Extends the prior moonshot mocking to cover every inherited
BaseLLMChatTest test that still made a live AWS Bedrock call. Adds
request-body assertions for each override.

New overrides:

- test_content_list_handling: verifies the outgoing body round-trips
  user content in list-of-text form; asserts response.choices[0].
  message.content parses back from the canned response.
- test_pydantic_model_input: verifies a pydantic Message input does
  not raise and produces a parseable response.
- test_response_format_type_text_with_tool_calls_no_tool_choice:
  verifies tools are forwarded and response_format + drop_params do
  not break the call.
- test_streaming: verifies stream=True routes to the
  invoke-with-response-stream endpoint. Bedrock invoke streaming is
  intercepted at the make_sync_call import site rather than via the
  caller-supplied client, because CustomStreamWrapper.fetch_sync_stream
  invokes the stored make_call partial with
  client=litellm.module_level_client, overriding any client passed by
  the caller.

Extracts a shared _make_moonshot_response helper and a
_invoke_with_mocked_post harness so all the sync mocks share one
canned response body.

After this change TestBedrockMoonshotInvoke runs 23 passed, 29
skipped, 0 live-callers, all in under 1s locally.
[Test] Mock Bedrock Moonshot tests + [Fix] TogetherAIConfig recursion
bump: proxy extras version 0.4.65 → 0.4.66
bump: version 1.83.8 → 1.83.9
@vercel

vercel Bot commented Apr 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
litellm | Ready | Preview, Comment | Apr 17, 2026 1:14am


@greptile-apps
Contributor

greptile-apps Bot commented Apr 17, 2026

Too many files changed for review. (635 files found, 100 file limit)

@shin-berri shin-berri self-requested a review April 17, 2026 01:17
@codspeed-hq
Contributor

codspeed-hq Bot commented Apr 17, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_internal_staging (bf7b7f7) with main (c0fc4c4)

Open in CodSpeed

Contributor

@github-advanced-security github-advanced-security bot left a comment


CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@codecov

codecov Bot commented Apr 17, 2026

@gitguardian

gitguardian Bot commented Apr 17, 2026

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id: 29203053 | Status: Triggered | Secret: Generic Password | Commit: ef774a1 | Filename: .circleci/config.yml
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn the best practices here.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@yuneng-berri yuneng-berri merged commit 850fe59 into main Apr 17, 2026
105 of 113 checks passed