
Litellm ishaan april14#25699

Merged
ishaan-berri merged 41 commits into litellm_internal_staging from litellm_ishaan_april14
Apr 16, 2026
Conversation

@ishaan-berri
Contributor

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement (see details)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Screenshots / Proof of Fix

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

Sameerlite and others added 13 commits April 8, 2026 19:43
Strip thinking blocks from the request body and retry once when Anthropic returns an invalid thinking signature error (e.g. after credential or deployment change). Applies to all BaseAnthropicMessagesConfig providers (direct Anthropic, Bedrock, Vertex, Azure AI).

Made-with: Cursor
- Omit messages whose list content is empty after stripping thinking blocks
- Retry only on HTTP 400 plus invalid-signature body match
- Return response inline from retry loop; drop unreachable None guard
- Tests: thinking-only turn dropped, non-400 no retry

Made-with: Cursor
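The strip-and-retry behavior described in these commits can be sketched as follows. The helper names here are hypothetical, not the PR's actual functions; the real logic lives in the BaseAnthropicMessagesConfig providers:

```python
def strip_thinking_blocks(messages: list) -> list:
    """Drop thinking blocks from list-style content; omit messages left empty."""
    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            content = [b for b in content if b.get("type") != "thinking"]
            if not content:
                continue  # a thinking-only turn is dropped entirely
            msg = {**msg, "content": content}
        cleaned.append(msg)
    return cleaned


def should_retry_invalid_signature(status_code: int, body: str) -> bool:
    # Retry only on HTTP 400 whose body matches the invalid-signature error
    return status_code == 400 and "invalid" in body and "signature" in body
```

On a matching 400, the stripped body is re-sent once and the retry response is returned inline.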
- Add optional instructions on MCPServer (config/DB/types) and Prisma migration.
- MCPClient: fetch_upstream_initialize_instructions() for one-shot initialize.
- Gateway merges per-request instructions: YAML/API overrides; otherwise fetch
  upstream initialize instructions (skip spec_path/OpenAPI-only servers).
- Pass auth headers into instruction merge; ContextVar for gateway Server.
- REST: wire instructions on connection-test MCPServer payloads.

Made-with: Cursor
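The precedence described here reduces to a small merge rule, sketched below with a hypothetical function name:

```python
from typing import Optional


def merge_instructions(
    configured: Optional[str],  # from YAML / DB / API: immediate override
    upstream: Optional[str],    # fetched from the upstream InitializeResult
) -> Optional[str]:
    """Configured instructions win; otherwise fall back to upstream, if any."""
    if configured:
        return configured
    return upstream
```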
Remove the gateway-specific initialize fetch path and reuse instructions captured during existing MCP calls (list_tools/health_check/call_tool), while keeping YAML/DB instructions as immediate overrides.

Made-with: Cursor
Extend existing test modules with coverage for the instructions merge
logic, upstream cache, ContextVar-based injection, and client-side
capture — following each file's established patterns.

Made-with: Cursor
Remove the trivial one-line wrapper and access the dict directly.

Made-with: Cursor
…-standard-logging-object

fix(azure/passthrough): populate standard_logging_object via logging hook
@vercel

vercel Bot commented Apr 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: litellm | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Apr 16, 2026 1:53am


@CLAassistant

CLAassistant commented Apr 14, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
3 out of 4 committers have signed the CLA.

✅ Sameerlite
✅ michelligabriele
✅ milan-berri
❌ ishaan-berri
You have signed the CLA already but the status is still pending? Let us recheck it.

@codspeed-hq
Contributor

codspeed-hq Bot commented Apr 14, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_ishaan_april14 (9977e63) with main (72a461b)

Open in CodSpeed

@greptile-apps
Contributor

greptile-apps Bot commented Apr 14, 2026

Greptile Summary

This PR bundles four features: (1) MCP gateway InitializeResult.instructions forwarding — instructions from upstream servers or YAML config are merged and injected into the gateway's initialize response via a new _mcp_gateway_initialize_instructions ContextVar; (2) an HTTP-error retry loop for Anthropic /v1/messages to strip invalid thinking signatures; (3) a BACKGROUND_HEALTH_CHECK_MAX_TOKENS env-var override for health checks; and (4) Azure passthrough standard-logging-object population.

  • The load_servers_from_config method in mcp_server_manager.py contains a duplicate alias-resolution block (lines 259–283 are an exact copy of 233–257). Because used_aliases is already populated after the first block, mcp_aliases-only servers lose their alias in the second block — MCPServer ends up with alias=None and the wrong name prefix.

Confidence Score: 4/5

Not safe to merge without fixing the duplicate alias-resolution block in load_servers_from_config.

One P1 logic bug: the duplicate block in load_servers_from_config silently strips mcp_aliases-derived aliases, causing wrong tool-name prefixes. The P2 finding (hardcoded model strings) doesn't block merge. All other changes are sound.

litellm/proxy/_experimental/mcp_server/mcp_server_manager.py — duplicate alias-resolution block.

Important Files Changed

Filename Overview
litellm/proxy/_experimental/mcp_server/mcp_server_manager.py Adds instructions field support and _remember_upstream_initialize_instructions; contains a duplicate alias-resolution block (lines 259–283) that silently breaks mcp_aliases-configured servers.
litellm/proxy/_experimental/mcp_server/server.py Adds _gateway_create_initialization_options to merge gateway instructions into InitializeResult, backed by _mcp_gateway_initialize_instructions ContextVar — logic looks correct.
litellm/proxy/_experimental/mcp_server/mcp_context.py New module introducing _mcp_gateway_initialize_instructions ContextVar to carry per-request merged instructions — clean, minimal, no issues.
litellm/llms/anthropic/common_utils.py Adds _is_claude_4_6_model helper with hardcoded version strings, violating the team's rule about model-specific flags; used by is_effort_used to skip a beta header for 4.6 models.
litellm/llms/custom_httpx/llm_http_handler.py Adds HTTP-error retry loop for Anthropic /v1/messages; optional_params_dict is initialised from litellm_params (noted in previous thread, not re-flagged).
litellm-proxy-extras/litellm_proxy_extras/migrations/20260414140000_add_mcp_server_instructions/migration.sql Simple ADD COLUMN IF NOT EXISTS "instructions" TEXT migration — safe and idempotent.
litellm/proxy/health_check.py Adds BACKGROUND_HEALTH_CHECK_MAX_TOKENS env-var override for health-check token limit; straightforward and non-breaking.
litellm/llms/azure/passthrough/transformation.py Adds logging_non_streaming_response to populate the standard logging object for Azure passthrough chat completions — inline imports present (per prior thread).
litellm/types/mcp_server/mcp_server_manager.py Adds optional instructions: Optional[str] field to MCPServer Pydantic model — straightforward, backward-compatible.
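The env-var override listed for litellm/proxy/health_check.py might look roughly like this; the default value and function name here are illustrative, not taken from the source:

```python
import os

# Illustrative default; the real default lives in litellm/proxy/health_check.py.
DEFAULT_HEALTH_CHECK_MAX_TOKENS = 1


def get_health_check_max_tokens() -> int:
    """Read the token cap for background health checks, env var winning."""
    return int(
        os.getenv("BACKGROUND_HEALTH_CHECK_MAX_TOKENS", DEFAULT_HEALTH_CHECK_MAX_TOKENS)
    )
```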

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant Gateway as LiteLLM MCP Gateway
    participant CtxVar as _mcp_gateway_initialize_instructions (ContextVar)
    participant Upstream as Upstream MCP Server

    Client->>Gateway: initialize request
    Gateway->>CtxVar: _gateway_initialize_instructions_request_scope()
    Gateway->>Upstream: list_tools (fetches instructions via MCPClient)
    Upstream-->>Gateway: tools + InitializeResult.instructions
    Gateway->>CtxVar: set(merged instructions)
    Gateway-->>Client: InitializeResult{instructions: merged}
    Note over Gateway,CtxVar: ContextVar reset after request scope

Comments Outside Diff (1)

  1. litellm/proxy/_experimental/mcp_server/mcp_server_manager.py, lines 259-283

    P1 Duplicate alias-resolution block clobbers mcp_aliases-derived aliases

    Lines 259–283 are an exact copy of lines 233–257. On the second pass, alias is reset to server_config.get("alias", None) and then the mcp_aliases lookup immediately fails because alias_name not in used_aliases is already False — the first block already added it to used_aliases. As a result, any server whose alias comes only from mcp_aliases (not from server_config["alias"]) ends up with alias=None and name=server_name instead of the alias-based prefix, silently breaking tool-name prefixing and MCPServer.alias for all such servers.

    The second block should be removed entirely; the first block already computes the correct alias and name_for_prefix.
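The failure mode can be reproduced with a stripped-down model of the alias-resolution pass (all names here are illustrative):

```python
# Minimal reproduction of the P1: running the same alias-resolution pass
# twice loses mcp_aliases-derived aliases, because the first pass has
# already recorded the alias in used_aliases.
mcp_aliases = {"server_a": "alias_a"}
used_aliases: set = set()


def resolve_alias(server_name: str, server_config: dict):
    alias = server_config.get("alias", None)
    if alias is None:
        candidate = mcp_aliases.get(server_name)
        if candidate is not None and candidate not in used_aliases:
            alias = candidate
            used_aliases.add(candidate)
    return alias


first = resolve_alias("server_a", {})   # first pass finds the alias
second = resolve_alias("server_a", {})  # second pass: candidate already used, alias lost
```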

Reviews (8): Last reviewed commit: "chore: merge litellm_internal_staging, r..."


def logging_non_streaming_response(
self,
model: str,
Contributor


P2 Unused custom_llm_provider parameter

custom_llm_provider is declared in the signature but never referenced in the method body. The Bedrock counterpart uses it to look up the provider config dynamically (LlmProviders(custom_llm_provider)); Azure hardcodes OpenAIGPTConfig instead. Consider either using the parameter or marking it with a leading _ to signal intentional non-use, so future readers aren't confused.

Suggested change
model: str,
custom_llm_provider: str, # noqa: ARG002 – not needed; Azure always uses OpenAIGPTConfig

Comment on lines +100 to +102
from litellm import encoding
from litellm.llms.openai.chat.gpt_transformation import OpenAIGPTConfig
from litellm.types.utils import ModelResponse
Contributor


P2 Inline imports conflict with CLAUDE.md style guidance

CLAUDE.md says "Avoid imports within methods — place all imports at the top of the file (module-level). The only exception is avoiding circular imports where absolutely necessary." The Bedrock implementation uses the same inline-import pattern so these are presumably needed to break circular imports, but a brief comment would make the intent explicit — e.g.:

# Inline to avoid circular import with litellm.__init__
from litellm import encoding
from litellm.llms.openai.chat.gpt_transformation import OpenAIGPTConfig
from litellm.types.utils import ModelResponse

Context Used: CLAUDE.md (source)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

…ctions

feat(mcp): expose per-server InitializeResult.instructions from gateway
…he-key

fix(caching): add Responses API params to cache key allow-list
Comment on lines +1836 to +1837
litellm_params_dict = dict(litellm_params)
optional_params_dict = dict(litellm_params)
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 optional_params_dict is a copy of litellm_params, not request optional params

Both litellm_params_dict and optional_params_dict are initialized identically from litellm_params. optional_params_dict is then passed as optional_params to sign_request on retry. In the original call path (before this PR), sign_request receives the request-level optional params (temperature, max_tokens, extended-thinking config, etc.) — not the litellm-level params (api_key, api_base, etc.). Passing litellm_params as optional_params could cause sign_request to generate an incorrect or incomplete request signature on the retry attempt, particularly for providers that use extended-thinking config in the signature calculation.

The function signature should accept optional_params: dict explicitly, mirroring what async_anthropic_messages_handler computes for the initial sign_request call, so the retry uses the same value.
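The suggested fix, sketched with hypothetical names (the real handler has a much larger signature):

```python
# Sketch of the suggested fix: the retry helper receives the request-level
# optional params explicitly instead of re-deriving them from litellm_params.
def anthropic_messages_with_retry(
    request_body: dict,
    litellm_params: dict,
    optional_params: dict,  # the same dict the initial sign_request call used
):
    litellm_params_dict = dict(litellm_params)
    optional_params_dict = dict(optional_params)  # not dict(litellm_params)
    # ... on retry: sign_request(..., optional_params=optional_params_dict) ...
    return litellm_params_dict, optional_params_dict
```

This way the retry's signature is computed over the same temperature / max_tokens / extended-thinking config as the original attempt.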

@ishaan-berri ishaan-berri enabled auto-merge April 16, 2026 01:39
@ishaan-berri ishaan-berri changed the base branch from main to litellm_internal_staging April 16, 2026 01:45
@ishaan-berri ishaan-berri temporarily deployed to integration-postgres April 16, 2026 01:52 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri temporarily deployed to integration-postgres April 16, 2026 01:52 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri temporarily deployed to integration-postgres April 16, 2026 01:52 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri temporarily deployed to integration-postgres April 16, 2026 01:52 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri merged commit 0b73352 into litellm_internal_staging Apr 16, 2026
91 of 99 checks passed
@ishaan-berri ishaan-berri deleted the litellm_ishaan_april14 branch April 16, 2026 02:01