Litellm ishaan april14 #25699
Conversation
Strip thinking blocks from the request body and retry once when Anthropic returns an invalid thinking signature error (e.g. after credential or deployment change). Applies to all BaseAnthropicMessagesConfig providers (direct Anthropic, Bedrock, Vertex, Azure AI). Made-with: Cursor
- Omit messages whose list content is empty after stripping thinking blocks
- Retry only on HTTP 400 plus invalid-signature body match
- Return response inline from retry loop; drop unreachable None guard
- Tests: thinking-only turn dropped, non-400 no retry

Made-with: Cursor
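The strip-and-retry behavior described above can be sketched as two small helpers: one that removes `thinking` blocks (dropping messages whose list content becomes empty), and one that gates the single retry on HTTP 400 plus an invalid-signature body match. These are hypothetical illustrations of the described behavior, not the actual LiteLLM implementation.

```python
def strip_thinking_blocks(messages):
    """Remove 'thinking' content blocks; omit messages left with empty list content."""
    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            blocks = [b for b in content if b.get("type") != "thinking"]
            if not blocks:
                continue  # thinking-only turn: drop the message entirely
            msg = {**msg, "content": blocks}
        cleaned.append(msg)
    return cleaned


def should_retry(status_code, body_text):
    # Retry only on HTTP 400 whose body mentions an invalid thinking signature.
    lowered = body_text.lower()
    return status_code == 400 and "invalid" in lowered and "signature" in lowered
```

String messages pass through untouched; only list-typed content is filtered, matching the "thinking-only turn dropped" test case above.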
- Add optional instructions on MCPServer (config/DB/types) and Prisma migration.
- MCPClient: fetch_upstream_initialize_instructions() for one-shot initialize.
- Gateway merges per-request instructions: YAML/API overrides; otherwise fetch upstream initialize instructions (skip spec_path/OpenAPI-only servers).
- Pass auth headers into instruction merge; ContextVar for gateway Server.
- REST: wire instructions on connection-test MCPServer payloads.

Made-with: Cursor
Remove the gateway-specific initialize fetch path and reuse instructions captured during existing MCP calls (list_tools/health_check/call_tool), while keeping YAML/DB instructions as immediate overrides. Made-with: Cursor
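The capture-and-reuse approach above amounts to an opportunistic cache: instead of a dedicated initialize fetch, remember `InitializeResult.instructions` whenever an existing call (list_tools/health_check/call_tool) already opened a session, with YAML/DB-configured instructions winning immediately. A minimal sketch, with hypothetical names rather than the actual LiteLLM internals:

```python
from typing import Dict, Optional

# Per-process cache of instructions captured from upstream MCP calls.
_upstream_instructions_cache: Dict[str, str] = {}


def remember_upstream_instructions(server_id: str, instructions: Optional[str]) -> None:
    """Record instructions seen during any existing MCP call (list_tools, etc.)."""
    if instructions:
        _upstream_instructions_cache[server_id] = instructions


def resolve_instructions(server_id: str, configured: Optional[str]) -> Optional[str]:
    # YAML/DB-configured instructions act as an immediate override; otherwise
    # fall back to whatever a previous upstream call captured.
    return configured or _upstream_instructions_cache.get(server_id)
```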
Extend existing test modules with coverage for the instructions merge logic, upstream cache, ContextVar-based injection, and client-side capture — following each file's established patterns. Made-with: Cursor
Remove the trivial one-line wrapper and access the dict directly. Made-with: Cursor
…-standard-logging-object fix(azure/passthrough): populate standard_logging_object via logging hook
Greptile Summary: This PR bundles four features: (1) MCP gateway
Confidence Score: 4/5. Not safe to merge without fixing the one P1 logic bug: the duplicate alias-resolution block in litellm/proxy/_experimental/mcp_server/mcp_server_manager.py.
| Filename | Overview |
|---|---|
| litellm/proxy/_experimental/mcp_server/mcp_server_manager.py | Adds instructions field support and _remember_upstream_initialize_instructions; contains a duplicate alias-resolution block (lines 259–283) that silently breaks mcp_aliases-configured servers. |
| litellm/proxy/_experimental/mcp_server/server.py | Adds _gateway_create_initialization_options to merge gateway instructions into InitializeResult, backed by _mcp_gateway_initialize_instructions ContextVar — logic looks correct. |
| litellm/proxy/_experimental/mcp_server/mcp_context.py | New module introducing _mcp_gateway_initialize_instructions ContextVar to carry per-request merged instructions — clean, minimal, no issues. |
| litellm/llms/anthropic/common_utils.py | Adds _is_claude_4_6_model helper with hardcoded version strings, violating the team's rule about model-specific flags; used by is_effort_used to skip a beta header for 4.6 models. |
| litellm/llms/custom_httpx/llm_http_handler.py | Adds HTTP-error retry loop for Anthropic /v1/messages; optional_params_dict is initialised from litellm_params (noted in previous thread, not re-flagged). |
| litellm-proxy-extras/litellm_proxy_extras/migrations/20260414140000_add_mcp_server_instructions/migration.sql | Simple ADD COLUMN IF NOT EXISTS "instructions" TEXT migration — safe and idempotent. |
| litellm/proxy/health_check.py | Adds BACKGROUND_HEALTH_CHECK_MAX_TOKENS env-var override for health-check token limit; straightforward and non-breaking. |
| litellm/llms/azure/passthrough/transformation.py | Adds logging_non_streaming_response to populate the standard logging object for Azure passthrough chat completions — inline imports present (per prior thread). |
| litellm/types/mcp_server/mcp_server_manager.py | Adds optional instructions: Optional[str] field to MCPServer Pydantic model — straightforward, backward-compatible. |
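The `BACKGROUND_HEALTH_CHECK_MAX_TOKENS` env-var override noted in the table can be sketched as a small getter with a fallback default; the default value and function name here are assumptions for illustration, not the actual litellm/proxy/health_check.py code.

```python
import os


def get_health_check_max_tokens(default: int = 1) -> int:
    """Return the health-check token limit, honoring an env-var override."""
    raw = os.environ.get("BACKGROUND_HEALTH_CHECK_MAX_TOKENS")
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        # Malformed override: fall back to the built-in default.
        return default
```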
Sequence Diagram
```mermaid
sequenceDiagram
    participant Client as MCP Client
    participant Gateway as LiteLLM MCP Gateway
    participant CtxVar as _mcp_gateway_initialize_instructions (ContextVar)
    participant Upstream as Upstream MCP Server
    Client->>Gateway: initialize request
    Gateway->>CtxVar: _gateway_initialize_instructions_request_scope()
    Gateway->>Upstream: list_tools (fetches instructions via MCPClient)
    Upstream-->>Gateway: tools + InitializeResult.instructions
    Gateway->>CtxVar: set(merged instructions)
    Gateway-->>Client: InitializeResult{instructions: merged}
    Note over Gateway,CtxVar: ContextVar reset after request scope
```
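The per-request ContextVar flow in the diagram follows the standard `contextvars` set/reset pattern: a fresh slot is opened for the request, the merged instructions are stored while handling initialize, and the slot is reset when the scope exits. A minimal sketch of that pattern (names are illustrative, not the actual mcp_context.py module):

```python
import contextvars
from contextlib import contextmanager
from typing import Iterator, Optional

_instructions_var: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
    "gateway_initialize_instructions", default=None
)


@contextmanager
def instructions_request_scope() -> Iterator[None]:
    token = _instructions_var.set(None)  # fresh slot for this request
    try:
        yield
    finally:
        _instructions_var.reset(token)  # reset after the request scope


def set_merged_instructions(value: str) -> None:
    _instructions_var.set(value)


def current_instructions() -> Optional[str]:
    return _instructions_var.get()
```

Because ContextVars are task-local, concurrent requests handled by separate asyncio tasks each see their own slot.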
Comments Outside Diff (1)
- litellm/proxy/_experimental/mcp_server/mcp_server_manager.py, lines 259-283 (link): Duplicate alias-resolution block clobbers mcp_aliases-derived aliases. Lines 259-283 are an exact copy of lines 233-257. On the second pass, `alias` is reset to `server_config.get("alias", None)` and then the `mcp_aliases` lookup immediately fails because `alias_name not in used_aliases` is already `False` (the first block already added it to `used_aliases`). As a result, any server whose alias comes only from `mcp_aliases` (not from `server_config["alias"]`) ends up with `alias=None` and `name=server_name` instead of the alias-based prefix, silently breaking tool-name prefixing and `MCPServer.alias` for all such servers. The second block should be removed entirely; the first block already computes the correct `alias` and `name_for_prefix`.
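The clobbering described in the comment above can be reproduced with a toy version of the resolution loop, where `passes=2` simulates the duplicated block. The function below is illustrative only, not the actual manager code.

```python
def resolve_alias(server_name, server_config, mcp_aliases, used_aliases, passes=1):
    """Toy alias resolution; passes=2 simulates the duplicated block."""
    alias = None
    for _ in range(passes):
        # Each pass resets alias from config, exactly as the duplicated block does.
        alias = server_config.get("alias", None)
        if alias is None:
            for alias_name, target in mcp_aliases.items():
                # Second pass: alias_name is already in used_aliases, so this
                # lookup fails and alias stays None.
                if target == server_name and alias_name not in used_aliases:
                    alias = alias_name
                    used_aliases.add(alias_name)
                    break
    return alias
```

With one pass the mcp_aliases-derived alias is found; with two passes it is silently lost, which is exactly the reported regression.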
Reviews (8): Last reviewed commit: "chore: merge litellm_internal_staging, r..." | Re-trigger Greptile
```python
def logging_non_streaming_response(
    self,
    model: str,
```
Unused `custom_llm_provider` parameter

`custom_llm_provider` is declared in the signature but never referenced in the method body. The Bedrock counterpart uses it to look up the provider config dynamically (`LlmProviders(custom_llm_provider)`); Azure hardcodes `OpenAIGPTConfig` instead. Consider either using the parameter or marking it with a leading `_` to signal intentional non-use, so future readers aren't confused.
```python
    model: str,
    custom_llm_provider: str,  # noqa: ARG002 – not needed; Azure always uses OpenAIGPTConfig
```
```python
from litellm import encoding
from litellm.llms.openai.chat.gpt_transformation import OpenAIGPTConfig
from litellm.types.utils import ModelResponse
```
Inline imports conflict with CLAUDE.md style guidance
CLAUDE.md says "Avoid imports within methods — place all imports at the top of the file (module-level). The only exception is avoiding circular imports where absolutely necessary." The Bedrock implementation uses the same inline-import pattern so these are presumably needed to break circular imports, but a brief comment would make the intent explicit — e.g.:
```python
# Inline to avoid circular import with litellm.__init__
from litellm import encoding
from litellm.llms.openai.chat.gpt_transformation import OpenAIGPTConfig
from litellm.types.utils import ModelResponse
```

Context Used: CLAUDE.md (source)
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
```python
import httpx
from httpx import Response

from litellm.litellm_core_utils.litellm_logging import Logging
```
```python
if TYPE_CHECKING:
    from httpx import URL

from litellm.types.utils import CostResponseTypes
```
```python
    endpoint: str,
) -> Optional["CostResponseTypes"]:
    from litellm import encoding
    from litellm.llms.openai.chat.gpt_transformation import OpenAIGPTConfig
    from litellm.types.utils import ModelResponse
```
…ctions feat(mcp): expose per-server InitializeResult.instructions from gateway
…he-key fix(caching): add Responses API params to cache key allow-list
```python
litellm_params_dict = dict(litellm_params)
optional_params_dict = dict(litellm_params)
```
`optional_params_dict` is a copy of `litellm_params`, not request optional params

Both `litellm_params_dict` and `optional_params_dict` are initialized identically from `litellm_params`. `optional_params_dict` is then passed as `optional_params` to `sign_request` on retry. In the original call path (before this PR), `sign_request` receives the request-level optional params (temperature, max_tokens, extended-thinking config, etc.), not the litellm-level params (api_key, api_base, etc.). Passing `litellm_params` as `optional_params` could cause `sign_request` to generate an incorrect or incomplete request signature on the retry attempt, particularly for providers that use extended-thinking config in the signature calculation.

The function signature should accept `optional_params: dict` explicitly, mirroring what `async_anthropic_messages_handler` computes for the initial `sign_request` call, so the retry uses the same value.
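The reviewer's suggested fix can be sketched as follows; the helper name and signature are hypothetical, not the actual llm_http_handler.py code. The point is that the retry path receives the request-level optional params explicitly rather than re-deriving them from `litellm_params`:

```python
def prepare_retry_params(litellm_params: dict, optional_params: dict):
    """Build the dicts the retry path passes to sign_request.

    optional_params is the same request-level dict (temperature, max_tokens,
    thinking config, ...) computed for the initial sign_request call.
    """
    litellm_params_dict = dict(litellm_params)    # api_key, api_base, ...
    optional_params_dict = dict(optional_params)  # NOT dict(litellm_params)
    return litellm_params_dict, optional_params_dict
```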
Merged 0b73352 into litellm_internal_staging
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- I have added at least one test in the `tests/test_litellm/` directory (adding at least 1 test is a hard requirement - see details)
- My PR passes `make test-unit`
- My PR has been reviewed by `@greptileai` and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Screenshots / Proof of Fix
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes