Litellm ishaan march23 - MCP Toolsets + GCP Caching fix #25146
ishaan-berri merged 13 commits into `litellm_ishaan_march23_2`
Conversation
…ervers (#24335)

* feat(mcp): add LiteLLM_MCPToolsetTable and mcp_toolsets to ObjectPermissionTable
* feat(mcp): add prisma migration for MCPToolset table
* feat(mcp): add MCPToolset Python types
* feat(mcp): add toolset_db.py with CRUD helpers for MCPToolset
* feat(mcp): add toolset CRUD endpoints to mcp_management_endpoints
* fix(mcp): skip allow_all_keys servers when explicit mcp_servers permission is set (toolset scope fix)
* feat(mcp): add _apply_toolset_scope and toolset route handling in server.py
* fix(mcp): resolve toolset names in responses API before fetching tools
* feat(mcp): add mcp_toolsets field to LiteLLM_ObjectPermissionTable type
* feat(mcp): register LiteLLM_MCPToolsetTable in prisma client initialization
* feat(mcp): validate mcp_toolsets in key-vs-team permission check
* feat(mcp): register toolset routes in proxy_server.py
* feat(mcp): add MCPToolset and MCPToolsetTool TypeScript types
* feat(mcp): add fetchMCPToolsets, createMCPToolset, updateMCPToolset, deleteMCPToolset API functions
* feat(mcp): add useMCPToolsets React Query hook
* feat(mcp): add toolsets (purple) as third option type in MCPServerSelector
* feat(mcp): extract toolsets from combined MCP field in key form
* feat(mcp): extract toolsets from combined MCP field in team form
* feat(mcp): show toolsets section in MCPServerPermissions read view
* feat(mcp): pass mcp_toolsets through object_permissions_view
* feat(mcp): add MCPToolsetsTab component for creating and managing toolsets
* feat(mcp): add Toolsets tab to mcp_servers.tsx
* feat(mcp): pass mcpToolsets to playground chat and responses API calls
* feat(mcp): generate correct server_url for toolsets in playground API calls
* docs(mcp): add MCP Toolsets documentation
* docs(mcp): add mcp_toolsets to sidebar
* fix(mcp): replace x-mcp-toolset-id header with ContextVar to prevent client forgery
* fix(mcp): use ContextVar + StreamingResponse for toolset MCP routes (fixes SSE streaming)
* fix(mcp): cache toolset permission lookups to avoid per-request DB calls
* test(mcp): add tests for toolset scope enforcement, ContextVar isolation, and access control
* fix(mcp): cache toolset name lookups in MCPServerManager to avoid per-request DB calls
* fix(mcp): prevent body_iter deadlock + use cached toolset lookup in responses API
  - _stream_mcp_asgi_response: add done callback to handler_task that puts the EOF sentinel on body_queue when the task exits, preventing body_iter from hanging forever if the handler raises after headers are sent.
  - litellm_proxy_mcp_handler: replace raw get_mcp_toolset_by_name() DB call with global_mcp_server_manager.get_toolset_by_name_cached() so toolset resolution uses the 60s TTL cache added for this purpose instead of hitting the DB on every responses-API request.
* fix(mcp): toolset access control, asyncio fix, and real unit tests
  - server.py: _apply_toolset_scope now enforces that non-admin keys must have the requested toolset_id in their mcp_toolsets grant list; admin keys always bypass the check.
  - mcp_management_endpoints.py: three access-control fixes:
    * fetch_mcp_toolsets: non-admin keys with mcp_toolsets=None now return [] instead of all toolsets (only admins get 'all' when the field is absent)
    * fetch_mcp_toolset: non-admin keys that haven't been granted the requested toolset_id now get 403 instead of the full result
    * add_mcp_toolset: duplicate toolset_name now returns 409 Conflict instead of an opaque 500
  - proxy_server.py: use asyncio.get_running_loop() instead of get_event_loop() inside an already-running coroutine (Python 3.10+).
  - test_mcp_toolset_scope.py: replace four hollow tests that only asserted local variable properties with real tests that call the production fetch_mcp_toolsets() and handle_streamable_http_mcp() functions with mocked dependencies.
* fix(mcp): add mcp_toolsets to ObjectPermissionBase, fix multi-toolset overwrite, fix delete 404, allow standalone key toolsets
* fix(mcp): add auth check on toolset resolution in responses API; union mcp_servers in _merge_toolset_permissions
* fix(mcp): handle RecordNotFoundError in update_mcp_toolset; union direct servers with toolset servers
* fix(mcp): use _user_has_admin_view; deny None mcp_toolsets for non-admin; use direct RecordNotFoundError import; fix docstring
* fix(mcp): add @default(now()) to MCPToolsetTable.updated_at; fix test for non-admin toolset access
* fix: use UniqueViolationError import; guard _ensure_eof for error/cancel only
* fix(mcp): preserve mcp_access_groups in toolset scope, use shared Redis cache for toolset perms
  - Remove mcp_access_groups=[] from _apply_toolset_scope (server.py) and the responses API toolset path (litellm_proxy_mcp_handler.py). A key's access-group grants remain valid even when the request is scoped to a single toolset; clearing them silently revoked legitimate entitlements.
  - Switch resolve_toolset_tool_permissions and get_toolset_by_name_cached to use user_api_key_cache (Redis-backed DualCache in production) instead of per-instance in-memory dicts. Cache entries are now shared across workers, eliminating the per-worker stale-toolset-permission window flagged as a P1 by Greptile.
  - Use union merge (set union of tool names per server) when applying toolset permissions in the responses API path so direct-server tool restrictions are not overwritten by toolset permissions.
* fix(mcp): return 404 when edit_mcp_toolset target does not exist
* fix(mcp): align mcp_toolsets default to None in LiteLLM_ObjectPermissionTable
* fix(mcp): admin toolset visibility, in-place tool name mutation, test helper coercion
* fix(mcp): treat None/[] team mcp_toolsets as no restriction in key validation
* fix(mcp): allow_all_keys backward compat, blocked_tools API write-path, efficient startup query
* fix(mcp): use _mcp_active_toolset_id ContextVar to detect toolset scope, avoiding DB-default false-positive
* fix(mcp): remove dead toolset cache stubs, log invalidation failures, align schema updated_at defaults
* fix(mcp): deserialise MCPToolset from Redis cache hit, replace fastapi import in test
* fix(mcp): evict name-cache on toolset mutation, 409 on rename conflict, warning-level list errors
…#24426)

* fix(redis): regenerate GCP IAM token per connection for async cluster clients
  Async RedisCluster was generating the IAM token once at startup and storing it as a static password. After the 1-hour GCP token TTL, any new connection (including to newly-discovered cluster nodes) would fail to authenticate.
  Fix: introduce GCPIAMCredentialProvider that implements redis-py's CredentialProvider protocol. It calls _generate_gcp_iam_access_token() on every new connection, matching what the sync redis_connect_func already does. async_redis.RedisCluster accepts a credential_provider kwarg which is invoked per-connection.
* refactor(redis): move GCPIAMCredentialProvider to its own file
  Extract GCPIAMCredentialProvider and _generate_gcp_iam_access_token into litellm/_redis_credential_provider.py. _redis.py imports them from there, keeping the public API unchanged.
* fix: address Greptile review issues
  - GCPIAMCredentialProvider now inherits from redis.credentials.CredentialProvider so redis-py's async path calls get_credentials_async() properly
  - move _redis_credential_provider import to top of _redis.py (PEP 8)
  - remove dead else-branch that silently no-oped (gcp_service_account from redis_kwargs.get() was always None since it's popped by _get_redis_client_logic)
  - remove mid-function 'from litellm import get_secret_str' inline import
  - remove unused 'call' import from test_redis.py
* chore: retrigger CI/review
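The Redis fix above hinges on redis-py's credential-provider hook: instead of baking one token into the client as a static password, a `CredentialProvider` subclass is asked for credentials on every new connection, so the 1-hour GCP token never goes stale. A minimal sketch of that pattern; `fetch_gcp_iam_token` stands in for LiteLLM's `_generate_gcp_iam_access_token()`, and the class name here is illustrative, not the actual `GCPIAMCredentialProvider`:

```python
# Hedged sketch of a per-connection credential provider for redis-py.
from typing import Tuple, Union

try:
    from redis.credentials import CredentialProvider
except ImportError:
    # Fallback so the sketch runs without redis-py installed.
    class CredentialProvider:  # type: ignore[no-redef]
        def get_credentials(self):
            raise NotImplementedError


def fetch_gcp_iam_token() -> str:
    # Placeholder: in production this calls the GCP IAM credentials API
    # and returns a short-lived access token.
    return "fresh-token-123"


class PerConnectionIAMProvider(CredentialProvider):
    """Regenerates the IAM token every time redis-py opens a connection,
    instead of caching one token at startup (which expires after ~1 hour)."""

    def __init__(self, username: str = "default"):
        self.username = username

    def get_credentials(self) -> Union[Tuple[str], Tuple[str, str]]:
        # Invoked per new connection, so the token is always fresh.
        return (self.username, fetch_gcp_iam_token())
```

The provider would then be passed as the `credential_provider` kwarg when constructing the async `RedisCluster` client, as the commit message describes.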
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
|---|---|---|---|---|
| 29203053 | Triggered | Generic Password | 59ea60f | .circleci/config.yml |
| 29375658 | Triggered | JSON Web Token | 59ea60f | tests/test_litellm/proxy/auth/test_handle_jwt.py |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely, following best practices.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following these best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection on pre-commit hooks to catch secrets before they leave your machine and ease remediation
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
Greptile Summary

This PR introduces MCP Toolsets, named curated subsets of tools drawn from one or more MCP servers:
GCP IAM fix:
New findings:
Confidence Score: 4/5

PR is safe to merge with a couple of P2 issues to address first. The MCP Toolsets feature is well-architected with solid security controls (ContextVar injection prevention, header stripping, explicit grant checks), the GCP caching fix is correct, and test coverage for the new access-control paths is good. Two new P2 findings remain: (1) _ensure_eof can silently drop the EOF sentinel when the bounded queue is full; (2) dynamic_mcp_route falls back to a DB lookup for every unrecognised server name. Prior review threads flagged json.dumps serialization into the JSONB column and inline imports, which are also still present. Files to watch: litellm/proxy/proxy_server.py (_ensure_eof callback and dynamic_mcp_route toolset fallback) and litellm/proxy/_experimental/mcp_server/toolset_db.py (json.dumps into JSONB, from prior review).
| Filename | Overview |
|---|---|
| litellm/proxy/_experimental/mcp_server/toolset_db.py | New DB helper layer for toolset CRUD. Manually calls json.dumps() on the tools Json field before passing to Prisma (and json.loads() on read), storing a JSON string literal in the JSONB column instead of a native JSON array — already flagged in prior review. |
| litellm/proxy/_experimental/mcp_server/mcp_context.py | New module introducing _mcp_active_toolset_id ContextVar to avoid circular imports between server.py and mcp_server_manager.py; clean and minimal. |
| litellm/proxy/_experimental/mcp_server/mcp_server_manager.py | Adds resolve_toolset_tool_permissions, invalidate_toolset_cache, and get_toolset_by_name_cached with Redis-backed caching; direct Prisma query in reload_servers_from_database bypasses get_all_mcp_servers helper (previously flagged). |
| litellm/proxy/_experimental/mcp_server/server.py | Adds _apply_toolset_scope (access-control + permission override) and _merge_toolset_permissions (union toolset grants into key permissions); strips client-supplied x-mcp-toolset-id header to prevent forgery. |
| litellm/proxy/proxy_server.py | Adds _stream_mcp_asgi_response helper with bounded queue (maxsize=1024) and toolset_mcp_route; _ensure_eof done-callback can silently fail if queue is full when handler task dies. |
| litellm/proxy/management_endpoints/mcp_management_endpoints.py | Adds CRUD endpoints for toolsets with correct admin-only write guards and toolset-filtered read for non-admins; cache invalidation on all mutating paths. |
| litellm/_redis_credential_provider.py | New GCPIAMCredentialProvider correctly refreshes IAM token on every new Redis connection, fixing the 1-hour expiry bug in async cluster clients. |
| litellm/_redis.py | Replaces one-time IAM token generation with GCPIAMCredentialProvider for async cluster; removes dead code and debug log statements. |
| litellm/proxy/management_helpers/object_permission_utils.py | Adds _extract_requested_mcp_toolsets and extends validate_key_mcp_servers_against_team to enforce that a key's requested toolsets are a subset of its team's allowed toolsets. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_toolset_scope.py | New test file with good unit coverage for _apply_toolset_scope, fetch_mcp_toolsets, and ContextVar injection prevention; all tests use mocks (no real network calls). |
| litellm-proxy-extras/litellm_proxy_extras/migrations/20260321000000_add_mcp_toolsets/migration.sql | Idempotent CREATE TABLE IF NOT EXISTS and ALTER TABLE ADD COLUMN IF NOT EXISTS; unique index on toolset_name; uses JSONB default consistent with schema. |
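The file table above credits `server.py` with `_merge_toolset_permissions`, which unions toolset grants into a key's existing permissions so a toolset grant never clobbers direct-server grants. A minimal sketch of that union-merge behaviour under my reading of the description; the function name and dict-of-lists shape are illustrative, not LiteLLM's actual signature:

```python
# Hedged sketch: union per-server tool grants from direct permissions
# and toolset permissions, per the "union merge" described above.
from typing import Dict, List


def merge_tool_permissions(
    direct: Dict[str, List[str]], toolset: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    merged: Dict[str, List[str]] = {}
    for server_id in set(direct) | set(toolset):
        # Set union of tool names per server; sorted for determinism.
        tools = set(direct.get(server_id, [])) | set(toolset.get(server_id, []))
        merged[server_id] = sorted(tools)
    return merged
```

With this shape, a key with direct access to tool `a` on server `s1` keeps that access even when a toolset only grants `b` on the same server, which is exactly the overwrite bug the commit list says was fixed.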
Sequence Diagram
```mermaid
sequenceDiagram
    participant Client
    participant ProxyServer as proxy_server.py
    participant ContextVar as _mcp_active_toolset_id
    participant Server as server.py
    participant Manager as MCPServerManager
    participant Cache as user_api_key_cache
    participant DB as Prisma DB

    Client->>ProxyServer: GET /toolset/{name}/mcp
    ProxyServer->>Manager: get_toolset_by_name_cached(name)
    Manager->>Cache: async_get_cache(toolset_name:{name})
    alt Cache hit
        Cache-->>Manager: MCPToolset
    else Cache miss
        Manager->>DB: find_first(toolset_name=name)
        DB-->>Manager: row
        Manager->>Cache: async_set_cache(...)
    end
    Manager-->>ProxyServer: toolset (id, name, tools)
    ProxyServer->>ContextVar: set(toolset.toolset_id)
    ProxyServer->>Server: handle_streamable_http_mcp(scope, receive, bridging_send)
    Server->>Server: strip x-mcp-toolset-id header from scope
    Server->>Server: _apply_toolset_scope(auth, toolset_id)
    Note over Server: Non-admin: check mcp_toolsets grant list
    Server->>Manager: resolve_toolset_tool_permissions([toolset_id])
    Manager->>Cache: async_get_cache(toolset_perms:{id})
    alt Cache hit
        Cache-->>Manager: server_id to tools map
    else Cache miss
        Manager->>DB: list_mcp_toolsets(toolset_ids=[id])
        DB-->>Manager: toolsets
        Manager->>Cache: async_set_cache(...)
    end
    Manager-->>Server: tool_permissions dict
    Server->>Server: Update object_permission (mcp_servers, mcp_tool_permissions)
    Server-->>Client: filtered tool list / MCP session
    ProxyServer->>ContextVar: reset(token)
```
Reviews (3): Last reviewed commit: "fix(tests): update reload_servers_from_d..."
```python
async def get_mcp_toolset(
    prisma_client: PrismaClient,
```
Non-idiomatic JSON serialisation for Prisma `Json` field

`tools` is defined as `Json` in the Prisma schema (`tools Json @default("[]")`). Prisma's Python client expects a Python dict/list for `Json` fields and handles serialisation itself. By calling `json.dumps(...)` first, you are passing a Python string to Prisma, which stores a JSON string literal in the JSONB column (e.g., `"[{\"server_id\":\"x\"}]"`) instead of a JSON array (`[{"server_id":"x"}]`).
The read path in _toolset_from_row compensates for this with an isinstance(tools, str) branch, so round-trips work. However, the JSONB column stores string values instead of arrays, which means:
- Direct SQL queries (`WHERE tools @> '[{"server_id":"x"}]'::jsonb`) will silently return no results.
- Future Prisma middleware or Prisma validators that type-check `Json` fields may reject the string.
Pass the Python list directly and let Prisma serialise it:
```python
data_dict["tools"] = data_dict.get("tools", [])
```
And similarly for the update path in update_mcp_toolset (line 83):
```python
# Before
data_dict["tools"] = json.dumps(data_dict["tools"])

# After
data_dict["tools"] = data_dict["tools"]  # Prisma serialises Json fields
```

```python
raw_rows = await prisma_client.db.litellm_mcpservertable.find_many(
    where={
        "OR": [
            {"approval_status": None},
            {"approval_status": {"in": ["active", "approved"]}},
        ]
    }
)
db_mcp_servers = [LiteLLM_MCPServerTable(**r.model_dump()) for r in raw_rows]
```
Direct Prisma call bypasses `get_all_mcp_servers` helper
The previous code used get_all_mcp_servers(prisma_client, approval_status="active"). The new code performs an equivalent query directly via prisma_client.db.litellm_mcpservertable.find_many(), bypassing any logic the helper encapsulates.
Per the CLAUDE.md guideline and rule d7156c05, DB access for proxy operations should go through the established helper functions to avoid pattern drift and make future schema changes easier to track. If get_all_mcp_servers doesn't support the approval_status=None case needed here, consider extending it with an include_no_approval parameter rather than duplicating the query inline.
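The suggestion above is to extend the shared helper with a flag rather than duplicate the query inline. A minimal sketch of what such an extension could look like; `_db_rows` stands in for the real Prisma `find_many` call, and the function name and signature here are illustrative assumptions, not LiteLLM's actual `get_all_mcp_servers` API:

```python
# Hedged sketch: extend a shared query helper with include_no_approval
# instead of duplicating the Prisma filter inline.
import asyncio
from typing import Any, Dict, List, Optional


async def _db_rows() -> List[Dict[str, Any]]:
    # Placeholder for prisma_client.db.litellm_mcpservertable.find_many().
    return [
        {"server_id": "s1", "approval_status": "active"},
        {"server_id": "s2", "approval_status": None},
        {"server_id": "s3", "approval_status": "pending"},
    ]


async def get_all_mcp_servers_sketch(
    approval_statuses: Optional[List[str]] = None,
    include_no_approval: bool = False,
) -> List[Dict[str, Any]]:
    rows = await _db_rows()
    if approval_statuses is None and not include_no_approval:
        return rows  # no filter requested: return everything
    allowed = set(approval_statuses or [])
    return [
        r
        for r in rows
        if r["approval_status"] in allowed
        or (include_no_approval and r["approval_status"] is None)
    ]
```

Keeping the `OR`-with-`None` case inside the helper means future schema changes to `approval_status` touch one function instead of every inline copy of the query.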
```python
if not headers_ready.done():
    headers_ready.set_result(
        (message.get("status", 200), message.get("headers", []))
```
Unbounded `asyncio.Queue` can grow without limit for large streaming responses
body_queue: asyncio.Queue = asyncio.Queue() has no capacity bound. If the ASGI handler produces body chunks faster than the HTTP client consumes them (e.g., slow client, large tool-list response), body_queue will accumulate all chunks in memory before they are yielded.
For an MCP server that streams a large JSON payload, this could exhaust worker memory. Consider using a bounded queue to apply backpressure:
```python
body_queue: asyncio.Queue = asyncio.Queue(maxsize=64)
```

In `bridging_send`, `await body_queue.put(chunk)` already respects the bound: `put` awaits until there is space, which is exactly the backpressure wanted here. The `_ensure_eof` callback uses `put_nowait`, which should stay as-is to avoid deadlock in the error path.
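The bounded-queue plus EOF-sentinel scheme discussed above can be sketched end to end. This is a self-contained illustration of the pattern, not the proxy's actual `_stream_mcp_asgi_response`; names like `produce`/`consume`/`_EOF` are mine:

```python
# Hedged sketch: bounded queue for backpressure, sentinel object to end
# the stream so the consumer never hangs waiting for more chunks.
import asyncio

_EOF = object()


async def produce(body_queue: asyncio.Queue) -> None:
    for chunk in (b"hello ", b"world"):
        await body_queue.put(chunk)  # awaits when full: backpressure
    await body_queue.put(_EOF)       # sentinel so the consumer can stop


async def consume(body_queue: asyncio.Queue) -> bytes:
    out = b""
    while True:
        chunk = await body_queue.get()
        if chunk is _EOF:            # sentinel ends the stream cleanly
            break
        out += chunk
    return out


async def stream_demo() -> bytes:
    q: asyncio.Queue = asyncio.Queue(maxsize=64)  # bounded to cap memory
    producer = asyncio.create_task(produce(q))
    body = await consume(q)
    await producer
    return body
```

The P2 finding in the summary follows directly from this shape: if the producer dies while the queue is full, a `put_nowait(_EOF)` in a done-callback can raise `QueueFull` and the sentinel is lost, so the consumer loop never exits.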
```python
from litellm.constants import DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL
from litellm.proxy._experimental.mcp_server.toolset_db import list_mcp_toolsets
from litellm.proxy.proxy_server import prisma_client, user_api_key_cache
```
Inline imports inside methods (CLAUDE.md / style guide rule)
resolve_toolset_tool_permissions, invalidate_toolset_cache, and get_toolset_by_name_cached all contain inline from … import … statements. Per CLAUDE.md: "Avoid imports within methods — place all imports at the top of the file (module-level). The only exception is avoiding circular imports where absolutely necessary."
from litellm.constants import DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL and from litellm.types.mcp_server.mcp_toolset import MCPToolset are not involved in any circular import and can be safely moved to the module-level imports at the top of mcp_server_manager.py. The from litellm.proxy.proxy_server import prisma_client, user_api_key_cache imports are a known circular-import pattern used elsewhere in this file and are acceptable as inline — but the non-circular ones should be hoisted.
The same pattern also appears in _apply_toolset_scope in server.py (lines 2411–2412): from litellm.proxy._types import LiteLLM_ObjectPermissionTable can be moved to the top of the file.
```typescript
import React, { useState, useCallback } from "react";
import { Button, Text, Title } from "@tremor/react";
import { Modal, Form, Input, message, Spin, Card, Typography, Space } from "antd";
```
Check notice (Code scanning / CodeQL): Unused variable, import, function or class.

Copilot Autofix:
To fix unused-import issues, remove the specific unused bindings from the import statement while keeping the ones that are actually used. This preserves existing functionality and simply cleans up dead code.
In this file, the antd import currently brings in Modal, Form, Input, message, Spin, Card, Typography, Space. Based on the CodeQL report and the visible usage (Typography aliasing to AntdText), Card and Space are not used. The best fix is to edit ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx so that the destructuring import from "antd" no longer includes Card and Space, leaving the rest unchanged. No additional methods, imports, or definitions are needed.
```diff
@@ -1,6 +1,6 @@
 import React, { useState, useCallback } from "react";
 import { Button, Text, Title } from "@tremor/react";
-import { Modal, Form, Input, message, Spin, Card, Typography, Space } from "antd";
+import { Modal, Form, Input, message, Spin, Typography } from "antd";
 import { PlusIcon, PencilIcon, TrashIcon } from "@heroicons/react/outline";
 import { ColumnDef } from "@tanstack/react-table";
 import { useMCPToolsets } from "@/app/(dashboard)/hooks/mcpServers/useMCPToolsets";
```
```typescript
} from "../networking";
import { MCPToolset, MCPToolsetTool } from "./types";

const { Text: AntdText } = Typography;
```
Check notice (Code scanning / CodeQL): Unused variable, import, function or class.

Copilot Autofix:
To fix the problem, remove the unused variable so that the code no longer declares AntdText if it is not used anywhere. This removes dead code and satisfies the static analysis rule.
The best minimal fix without changing functionality is to delete the line const { Text: AntdText } = Typography; in ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx. Keeping the Typography import itself is appropriate, because we cannot be sure from the snippet whether other members of Typography are used further down; however, destructuring Text into AntdText clearly serves no purpose if it’s unused.
No new methods, imports, or definitions are required to implement this change; it is purely a removal of an unused variable declaration.
```diff
@@ -16,7 +16,6 @@
 } from "../networking";
 import { MCPToolset, MCPToolsetTool } from "./types";

-const { Text: AntdText } = Typography;

 interface MCPToolsetsTabProps {
   accessToken: string | null;
```
```typescript
  return;
}
// Resolve the real server ID (toolsets use toolset: prefix)
const mcpServerId = rawSelected.startsWith("toolset:") ? rawSelected : rawSelected;
```
Check notice (Code scanning / CodeQL): Unused variable, import, function or class.

Copilot Autofix:
In general, to fix an unused variable warning, either remove the variable (and any associated dead computation) if it truly isn’t needed, or start using it where it was intended to be used. Here, we see that rawSelected already holds the same value as the “resolved” server ID: const mcpServerId = rawSelected.startsWith("toolset:") ? rawSelected : rawSelected; simplifies to just rawSelected. Since no later code uses mcpServerId, the best non‑intrusive fix is to delete this declaration entirely.
Concretely, in ui/litellm-dashboard/src/components/playground/chat_ui/ChatUI.tsx, within the MCP direct mode block starting around line 592, remove the line that declares mcpServerId. Do not add new behavior or change existing variables; just delete the unused variable and keep the rest of the logic intact.
```diff
@@ -600,7 +600,6 @@
   return;
 }
 // Resolve the real server ID (toolsets use toolset: prefix)
-const mcpServerId = rawSelected.startsWith("toolset:") ? rawSelected : rawSelected;
 if (!selectedMCPDirectTool) {
   NotificationsManager.fromBackend("Please select an MCP tool to call");
   return;
```
…ount below PLR0915 limit
2ce85e6 into `litellm_ishaan_march23_2`
) * Litellm ishaan march23 - MCP Toolsets + GCP Caching fix (#25146) * feat(mcp): MCP Toolsets — curated tool subsets from one or more MCP servers (#24335) * feat(mcp): add LiteLLM_MCPToolsetTable and mcp_toolsets to ObjectPermissionTable * feat(mcp): add prisma migration for MCPToolset table * feat(mcp): add MCPToolset Python types * feat(mcp): add toolset_db.py with CRUD helpers for MCPToolset * feat(mcp): add toolset CRUD endpoints to mcp_management_endpoints * fix(mcp): skip allow_all_keys servers when explicit mcp_servers permission is set (toolset scope fix) * feat(mcp): add _apply_toolset_scope and toolset route handling in server.py * fix(mcp): resolve toolset names in responses API before fetching tools * feat(mcp): add mcp_toolsets field to LiteLLM_ObjectPermissionTable type * feat(mcp): register LiteLLM_MCPToolsetTable in prisma client initialization * feat(mcp): validate mcp_toolsets in key-vs-team permission check * feat(mcp): register toolset routes in proxy_server.py * feat(mcp): add MCPToolset and MCPToolsetTool TypeScript types * feat(mcp): add fetchMCPToolsets, createMCPToolset, updateMCPToolset, deleteMCPToolset API functions * feat(mcp): add useMCPToolsets React Query hook * feat(mcp): add toolsets (purple) as third option type in MCPServerSelector * feat(mcp): extract toolsets from combined MCP field in key form * feat(mcp): extract toolsets from combined MCP field in team form * feat(mcp): show toolsets section in MCPServerPermissions read view * feat(mcp): pass mcp_toolsets through object_permissions_view * feat(mcp): add MCPToolsetsTab component for creating and managing toolsets * feat(mcp): add Toolsets tab to mcp_servers.tsx * feat(mcp): pass mcpToolsets to playground chat and responses API calls * feat(mcp): generate correct server_url for toolsets in playground API calls * docs(mcp): add MCP Toolsets documentation * docs(mcp): add mcp_toolsets to sidebar * fix(mcp): replace x-mcp-toolset-id header with ContextVar to prevent client 
forgery * fix(mcp): use ContextVar + StreamingResponse for toolset MCP routes (fixes SSE streaming) * fix(mcp): cache toolset permission lookups to avoid per-request DB calls * test(mcp): add tests for toolset scope enforcement, ContextVar isolation, and access control * fix(mcp): cache toolset name lookups in MCPServerManager to avoid per-request DB calls * fix(mcp): prevent body_iter deadlock + use cached toolset lookup in responses API - _stream_mcp_asgi_response: add done callback to handler_task that puts the EOF sentinel on body_queue when the task exits, preventing body_iter from hanging forever if the handler raises after headers are sent. - litellm_proxy_mcp_handler: replace raw get_mcp_toolset_by_name() DB call with global_mcp_server_manager.get_toolset_by_name_cached() so toolset resolution uses the 60s TTL cache added for this purpose instead of hitting the DB on every responses-API request. * fix(mcp): toolset access control, asyncio fix, and real unit tests - server.py: _apply_toolset_scope now enforces that non-admin keys must have the requested toolset_id in their mcp_toolsets grant list; admin keys always bypass the check. - mcp_management_endpoints.py: three access-control fixes: * fetch_mcp_toolsets: non-admin keys with mcp_toolsets=None now return [] instead of all toolsets (only admins get 'all' when the field is absent) * fetch_mcp_toolset: non-admin keys that haven't been granted the requested toolset_id now get 403 instead of the full result * add_mcp_toolset: duplicate toolset_name now returns 409 Conflict instead of an opaque 500 - proxy_server.py: use asyncio.get_running_loop() instead of get_event_loop() inside an already-running coroutine (Python 3.10+). - test_mcp_toolset_scope.py: replace four hollow tests that only asserted local variable properties with real tests that call the production fetch_mcp_toolsets() and handle_streamable_http_mcp() functions with mocked dependencies. 
* fix(mcp): add mcp_toolsets to ObjectPermissionBase, fix multi-toolset overwrite, fix delete 404, allow standalone key toolsets * fix(mcp): add auth check on toolset resolution in responses API; union mcp_servers in _merge_toolset_permissions * fix(mcp): handle RecordNotFoundError in update_mcp_toolset; union direct servers with toolset servers * fix(mcp): use _user_has_admin_view; deny None mcp_toolsets for non-admin; use direct RecordNotFoundError import; fix docstring * fix(mcp): add @default(now()) to MCPToolsetTable.updated_at; fix test for non-admin toolset access * fix: use UniqueViolationError import; guard _ensure_eof for error/cancel only * fix(mcp): preserve mcp_access_groups in toolset scope, use shared Redis cache for toolset perms - Remove mcp_access_groups=[] from _apply_toolset_scope (server.py) and the responses API toolset path (litellm_proxy_mcp_handler.py). A key's access-group grants remain valid even when the request is scoped to a single toolset; clearing them silently revoked legitimate entitlements. - Switch resolve_toolset_tool_permissions and get_toolset_by_name_cached to use user_api_key_cache (Redis-backed DualCache in production) instead of per-instance in-memory dicts. Cache entries are now shared across workers, eliminating the per-worker stale-toolset-permission window flagged as a P1 by Greptile. - Use union merge (set union of tool names per server) when applying toolset permissions in the responses API path so direct-server tool restrictions are not overwritten by toolset permissions. 
* fix(mcp): return 404 when edit_mcp_toolset target does not exist * fix(mcp): align mcp_toolsets default to None in LiteLLM_ObjectPermissionTable * fix(mcp): admin toolset visibility, in-place tool name mutation, test helper coercion * fix(mcp): treat None/[] team mcp_toolsets as no restriction in key validation * fix(mcp): allow_all_keys backward compat, blocked_tools API write-path, efficient startup query * fix(mcp): use _mcp_active_toolset_id ContextVar to detect toolset scope, avoiding DB-default false-positive * fix(mcp): remove dead toolset cache stubs, log invalidation failures, align schema updated_at defaults * fix(mcp): deserialise MCPToolset from Redis cache hit, replace fastapi import in test * fix(mcp): evict name-cache on toolset mutation, 409 on rename conflict, warning-level list errors * fix(redis): regenerate GCP IAM token per connection for async cluster (#24426) * fix(redis): regenerate GCP IAM token per connection for async cluster clients Async RedisCluster was generating the IAM token once at startup and storing it as a static password. After the 1-hour GCP token TTL, any new connection (including to newly-discovered cluster nodes) would fail to authenticate. Fix: introduce GCPIAMCredentialProvider that implements redis-py's CredentialProvider protocol. It calls _generate_gcp_iam_access_token() on every new connection, matching what the sync redis_connect_func already does. async_redis.RedisCluster accepts a credential_provider kwarg which is invoked per-connection. * refactor(redis): move GCPIAMCredentialProvider to its own file Extract GCPIAMCredentialProvider and _generate_gcp_iam_access_token into litellm/_redis_credential_provider.py. _redis.py imports them from there, keeping the public API unchanged. 
* fix: address Greptile review issues - GCPIAMCredentialProvider now inherits from redis.credentials.CredentialProvider so redis-py's async path calls get_credentials_async() properly - move _redis_credential_provider import to top of _redis.py (PEP 8) - remove dead else-branch that silently no-oped (gcp_service_account from redis_kwargs.get() was always None since it's popped by _get_redis_client_logic) - remove mid-function 'from litellm import get_secret_str' inline import - remove unused 'call' import from test_redis.py * chore: retrigger CI/review * chore: sync schema.prisma copies from root * chore: sync schema.prisma copies from root * fix(proxy_server): use bounded asyncio.Queue with maxsize to prevent unbounded growth * fix(a2a/pydantic_ai): make api_base Optional to match base class signature * fix(a2a/pydantic_ai): make api_base Optional in handler and guard against None * fix(mcp): remove unused get_all_mcp_servers import * fix(mcp): remove unused MCPToolset import * refactor(mcp): extract toolset permission logic to reduce statement count below PLR0915 limit * fix(tests): update reload_servers_from_database tests to mock prisma directly --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * fix(toolset_db): lazy-import prisma to avoid ImportError when prisma not installed * fix(tests): update UI tests for toolset tab and updated empty state text * fix(tests): add get_mcp_server_by_name to fake_manager stub --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
… (BerriAI#25155)

* Litellm ishaan march23 - MCP Toolsets + GCP Caching fix (BerriAI#25146)
* feat(mcp): MCP Toolsets — curated tool subsets from one or more MCP servers (BerriAI#24335)
* feat(mcp): add LiteLLM_MCPToolsetTable and mcp_toolsets to ObjectPermissionTable
* feat(mcp): add prisma migration for MCPToolset table
* feat(mcp): add MCPToolset Python types
* feat(mcp): add toolset_db.py with CRUD helpers for MCPToolset
* feat(mcp): add toolset CRUD endpoints to mcp_management_endpoints
* fix(mcp): skip allow_all_keys servers when explicit mcp_servers permission is set (toolset scope fix)
* feat(mcp): add _apply_toolset_scope and toolset route handling in server.py
* fix(mcp): resolve toolset names in responses API before fetching tools
* feat(mcp): add mcp_toolsets field to LiteLLM_ObjectPermissionTable type
* feat(mcp): register LiteLLM_MCPToolsetTable in prisma client initialization
* feat(mcp): validate mcp_toolsets in key-vs-team permission check
* feat(mcp): register toolset routes in proxy_server.py
* feat(mcp): add MCPToolset and MCPToolsetTool TypeScript types
* feat(mcp): add fetchMCPToolsets, createMCPToolset, updateMCPToolset, deleteMCPToolset API functions
* feat(mcp): add useMCPToolsets React Query hook
* feat(mcp): add toolsets (purple) as third option type in MCPServerSelector
* feat(mcp): extract toolsets from combined MCP field in key form
* feat(mcp): extract toolsets from combined MCP field in team form
* feat(mcp): show toolsets section in MCPServerPermissions read view
* feat(mcp): pass mcp_toolsets through object_permissions_view
* feat(mcp): add MCPToolsetsTab component for creating and managing toolsets
* feat(mcp): add Toolsets tab to mcp_servers.tsx
* feat(mcp): pass mcpToolsets to playground chat and responses API calls
* feat(mcp): generate correct server_url for toolsets in playground API calls
* docs(mcp): add MCP Toolsets documentation
* docs(mcp): add mcp_toolsets to sidebar
* fix(mcp): replace x-mcp-toolset-id header with ContextVar to prevent client forgery
* fix(mcp): use ContextVar + StreamingResponse for toolset MCP routes (fixes SSE streaming)
* fix(mcp): cache toolset permission lookups to avoid per-request DB calls
* test(mcp): add tests for toolset scope enforcement, ContextVar isolation, and access control
* fix(mcp): cache toolset name lookups in MCPServerManager to avoid per-request DB calls
* fix(mcp): prevent body_iter deadlock + use cached toolset lookup in responses API
  - _stream_mcp_asgi_response: add done callback to handler_task that puts the EOF sentinel on body_queue when the task exits, preventing body_iter from hanging forever if the handler raises after headers are sent.
  - litellm_proxy_mcp_handler: replace raw get_mcp_toolset_by_name() DB call with global_mcp_server_manager.get_toolset_by_name_cached() so toolset resolution uses the 60s TTL cache added for this purpose instead of hitting the DB on every responses-API request.
* fix(mcp): toolset access control, asyncio fix, and real unit tests
  - server.py: _apply_toolset_scope now enforces that non-admin keys must have the requested toolset_id in their mcp_toolsets grant list; admin keys always bypass the check.
  - mcp_management_endpoints.py: three access-control fixes:
    * fetch_mcp_toolsets: non-admin keys with mcp_toolsets=None now return [] instead of all toolsets (only admins get 'all' when the field is absent)
    * fetch_mcp_toolset: non-admin keys that haven't been granted the requested toolset_id now get 403 instead of the full result
    * add_mcp_toolset: duplicate toolset_name now returns 409 Conflict instead of an opaque 500
  - proxy_server.py: use asyncio.get_running_loop() instead of get_event_loop() inside an already-running coroutine (Python 3.10+).
  - test_mcp_toolset_scope.py: replace four hollow tests that only asserted local variable properties with real tests that call the production fetch_mcp_toolsets() and handle_streamable_http_mcp() functions with mocked dependencies.
* fix(mcp): add mcp_toolsets to ObjectPermissionBase, fix multi-toolset overwrite, fix delete 404, allow standalone key toolsets
* fix(mcp): add auth check on toolset resolution in responses API; union mcp_servers in _merge_toolset_permissions
* fix(mcp): handle RecordNotFoundError in update_mcp_toolset; union direct servers with toolset servers
* fix(mcp): use _user_has_admin_view; deny None mcp_toolsets for non-admin; use direct RecordNotFoundError import; fix docstring
* fix(mcp): add @default(now()) to MCPToolsetTable.updated_at; fix test for non-admin toolset access
* fix: use UniqueViolationError import; guard _ensure_eof for error/cancel only
* fix(mcp): preserve mcp_access_groups in toolset scope, use shared Redis cache for toolset perms
  - Remove mcp_access_groups=[] from _apply_toolset_scope (server.py) and the responses API toolset path (litellm_proxy_mcp_handler.py). A key's access-group grants remain valid even when the request is scoped to a single toolset; clearing them silently revoked legitimate entitlements.
  - Switch resolve_toolset_tool_permissions and get_toolset_by_name_cached to use user_api_key_cache (Redis-backed DualCache in production) instead of per-instance in-memory dicts. Cache entries are now shared across workers, eliminating the per-worker stale-toolset-permission window flagged as a P1 by Greptile.
  - Use union merge (set union of tool names per server) when applying toolset permissions in the responses API path so direct-server tool restrictions are not overwritten by toolset permissions.
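The "bounded asyncio.Queue with maxsize" fix in the commit list above relies on asyncio's built-in backpressure: `await queue.put()` suspends the producer once `maxsize` items are pending, so a slow consumer can no longer cause unbounded memory growth. A minimal self-contained sketch of the pattern (the `_EOF` sentinel and function names are illustrative, not LiteLLM's actual identifiers):

```python
import asyncio

_EOF = object()  # illustrative end-of-stream sentinel


async def produce(queue: asyncio.Queue) -> None:
    for chunk in (b"a", b"b", b"c"):
        # Suspends whenever the queue already holds `maxsize` items,
        # applying backpressure instead of buffering without limit.
        await queue.put(chunk)
    await queue.put(_EOF)  # signal end-of-stream to the consumer


async def consume(queue: asyncio.Queue) -> list:
    out = []
    while True:
        item = await queue.get()
        if item is _EOF:
            break
        out.append(item)
    return out


async def main() -> list:
    # maxsize=2 bounds memory; a default Queue() is unbounded and
    # grows indefinitely if the consumer falls behind the producer.
    queue: asyncio.Queue = asyncio.Queue(maxsize=2)
    producer = asyncio.create_task(produce(queue))
    chunks = await consume(queue)
    await producer
    return chunks
```

Running `asyncio.run(main())` yields `[b"a", b"b", b"c"]`: the producer blocks after the first two puts and only resumes as the consumer drains the queue.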
Litellm ishaan march23 - MCP Toolsets + GCP Caching fix
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- Add testing in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement - see details)
- My PR passes all unit tests on make test-unit
- I have tagged @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes