
Litellm ishaan march23 - MCP Toolsets + GCP Caching fix #25146

Merged
ishaan-berri merged 13 commits into litellm_ishaan_march23_2 from litellm_ishaan_march23
Apr 4, 2026

Conversation

@ishaan-berri
Contributor

Litellm ishaan march23 - MCP Toolsets + GCP Caching fix

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR.

  • I have added testing in the tests/test_litellm/ directory. Adding at least one test is a hard requirement (see details).
  • My PR passes all unit tests when run with make test-unit.
  • My PR's scope is as isolated as possible: it solves only one specific problem.
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review.

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable, but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

* feat(mcp): MCP Toolsets — curated tool subsets from one or more MCP servers (#24335)

* feat(mcp): add LiteLLM_MCPToolsetTable and mcp_toolsets to ObjectPermissionTable

* feat(mcp): add prisma migration for MCPToolset table

* feat(mcp): add MCPToolset Python types

* feat(mcp): add toolset_db.py with CRUD helpers for MCPToolset

* feat(mcp): add toolset CRUD endpoints to mcp_management_endpoints

* fix(mcp): skip allow_all_keys servers when explicit mcp_servers permission is set (toolset scope fix)

* feat(mcp): add _apply_toolset_scope and toolset route handling in server.py

* fix(mcp): resolve toolset names in responses API before fetching tools

* feat(mcp): add mcp_toolsets field to LiteLLM_ObjectPermissionTable type

* feat(mcp): register LiteLLM_MCPToolsetTable in prisma client initialization

* feat(mcp): validate mcp_toolsets in key-vs-team permission check

* feat(mcp): register toolset routes in proxy_server.py

* feat(mcp): add MCPToolset and MCPToolsetTool TypeScript types

* feat(mcp): add fetchMCPToolsets, createMCPToolset, updateMCPToolset, deleteMCPToolset API functions

* feat(mcp): add useMCPToolsets React Query hook

* feat(mcp): add toolsets (purple) as third option type in MCPServerSelector

* feat(mcp): extract toolsets from combined MCP field in key form

* feat(mcp): extract toolsets from combined MCP field in team form

* feat(mcp): show toolsets section in MCPServerPermissions read view

* feat(mcp): pass mcp_toolsets through object_permissions_view

* feat(mcp): add MCPToolsetsTab component for creating and managing toolsets

* feat(mcp): add Toolsets tab to mcp_servers.tsx

* feat(mcp): pass mcpToolsets to playground chat and responses API calls

* feat(mcp): generate correct server_url for toolsets in playground API calls

* docs(mcp): add MCP Toolsets documentation

* docs(mcp): add mcp_toolsets to sidebar

* fix(mcp): replace x-mcp-toolset-id header with ContextVar to prevent client forgery

* fix(mcp): use ContextVar + StreamingResponse for toolset MCP routes (fixes SSE streaming)

* fix(mcp): cache toolset permission lookups to avoid per-request DB calls

* test(mcp): add tests for toolset scope enforcement, ContextVar isolation, and access control

* fix(mcp): cache toolset name lookups in MCPServerManager to avoid per-request DB calls

* fix(mcp): prevent body_iter deadlock + use cached toolset lookup in responses API

- _stream_mcp_asgi_response: add done callback to handler_task that puts
  the EOF sentinel on body_queue when the task exits, preventing body_iter
  from hanging forever if the handler raises after headers are sent.
- litellm_proxy_mcp_handler: replace raw get_mcp_toolset_by_name() DB call
  with global_mcp_server_manager.get_toolset_by_name_cached() so toolset
  resolution uses the 60s TTL cache added for this purpose instead of
  hitting the DB on every responses-API request.
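A minimal sketch of the done-callback pattern described above (the EOF sentinel, run_handler, and body_iter names are illustrative, not the actual LiteLLM code):

```python
import asyncio

EOF = object()  # sentinel marking end-of-body (name is illustrative)

async def run_handler(handler, body_queue: asyncio.Queue) -> asyncio.Task:
    """Start the handler task and guarantee body_iter() eventually sees EOF,
    even if the handler raises after headers were already sent."""
    task = asyncio.ensure_future(handler())

    def _ensure_eof(_task: asyncio.Task) -> None:
        try:
            body_queue.put_nowait(EOF)
        except asyncio.QueueFull:
            # a full bounded queue would silently drop the sentinel here
            pass

    task.add_done_callback(_ensure_eof)
    return task

async def body_iter(body_queue: asyncio.Queue):
    """Yield body chunks until the EOF sentinel arrives."""
    while True:
        chunk = await body_queue.get()
        if chunk is EOF:
            return
        yield chunk
```

Without the callback, a handler that dies mid-stream leaves body_iter awaiting a chunk that never arrives.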

* fix(mcp): toolset access control, asyncio fix, and real unit tests

- server.py: _apply_toolset_scope now enforces that non-admin keys must
  have the requested toolset_id in their mcp_toolsets grant list;
  admin keys always bypass the check.
- mcp_management_endpoints.py: three access-control fixes:
  * fetch_mcp_toolsets: non-admin keys with mcp_toolsets=None now
    return [] instead of all toolsets (only admins get 'all' when
    the field is absent)
  * fetch_mcp_toolset: non-admin keys that haven't been granted the
    requested toolset_id now get 403 instead of the full result
  * add_mcp_toolset: duplicate toolset_name now returns 409 Conflict
    instead of an opaque 500
- proxy_server.py: use asyncio.get_running_loop() instead of
  get_event_loop() inside an already-running coroutine (Python 3.10+).
- test_mcp_toolset_scope.py: replace four hollow tests that only
  asserted local variable properties with real tests that call the
  production fetch_mcp_toolsets() and handle_streamable_http_mcp()
  functions with mocked dependencies.
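The asyncio fix above can be illustrated in isolation (a sketch, not the proxy_server.py code):

```python
import asyncio

async def run_blocking(fn):
    # Inside an already-running coroutine, get_running_loop() is the
    # correct accessor: it raises RuntimeError when no loop is running,
    # instead of lazily creating or fetching one the way get_event_loop()
    # did before its deprecation in Python 3.10+.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, fn)
```

From sync code, asyncio.run(run_blocking(some_cpu_bound_fn)) drives it end to end.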

* fix(mcp): add mcp_toolsets to ObjectPermissionBase, fix multi-toolset overwrite, fix delete 404, allow standalone key toolsets

* fix(mcp): add auth check on toolset resolution in responses API; union mcp_servers in _merge_toolset_permissions

* fix(mcp): handle RecordNotFoundError in update_mcp_toolset; union direct servers with toolset servers

* fix(mcp): use _user_has_admin_view; deny None mcp_toolsets for non-admin; use direct RecordNotFoundError import; fix docstring

* fix(mcp): add @default(now()) to MCPToolsetTable.updated_at; fix test for non-admin toolset access

* fix: use UniqueViolationError import; guard _ensure_eof for error/cancel only

* fix(mcp): preserve mcp_access_groups in toolset scope, use shared Redis cache for toolset perms

- Remove mcp_access_groups=[] from _apply_toolset_scope (server.py) and the
  responses API toolset path (litellm_proxy_mcp_handler.py). A key's access-group
  grants remain valid even when the request is scoped to a single toolset; clearing
  them silently revoked legitimate entitlements.

- Switch resolve_toolset_tool_permissions and get_toolset_by_name_cached to use
  user_api_key_cache (Redis-backed DualCache in production) instead of per-instance
  in-memory dicts. Cache entries are now shared across workers, eliminating the
  per-worker stale-toolset-permission window flagged as a P1 by Greptile.

- Use union merge (set union of tool names per server) when applying toolset
  permissions in the responses API path so direct-server tool restrictions are not
  overwritten by toolset permissions.
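The union merge can be sketched as follows; the {server_id: [tool_name, ...]} shape is assumed for illustration:

```python
def merge_tool_permissions(direct: dict, toolset: dict) -> dict:
    """Union-merge per-server tool grants so that neither the direct-server
    restrictions nor the toolset permissions overwrite the other."""
    merged = {sid: set(tools) for sid, tools in direct.items()}
    for sid, tools in toolset.items():
        # set union of tool names per server
        merged.setdefault(sid, set()).update(tools)
    return {sid: sorted(tools) for sid, tools in merged.items()}
```

A plain dict update would instead replace the per-server tool list, which is exactly the overwrite the commit fixes.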

* fix(mcp): return 404 when edit_mcp_toolset target does not exist

* fix(mcp): align mcp_toolsets default to None in LiteLLM_ObjectPermissionTable

* fix(mcp): admin toolset visibility, in-place tool name mutation, test helper coercion

* fix(mcp): treat None/[] team mcp_toolsets as no restriction in key validation

* fix(mcp): allow_all_keys backward compat, blocked_tools API write-path, efficient startup query

* fix(mcp): use _mcp_active_toolset_id ContextVar to detect toolset scope, avoiding DB-default false-positive

* fix(mcp): remove dead toolset cache stubs, log invalidation failures, align schema updated_at defaults

* fix(mcp): deserialise MCPToolset from Redis cache hit, replace fastapi import in test

* fix(mcp): evict name-cache on toolset mutation, 409 on rename conflict, warning-level list errors
…#24426)

* fix(redis): regenerate GCP IAM token per connection for async cluster clients

Async RedisCluster was generating the IAM token once at startup and
storing it as a static password. After the 1-hour GCP token TTL, any
new connection (including to newly-discovered cluster nodes) would fail
to authenticate.

Fix: introduce GCPIAMCredentialProvider that implements redis-py's
CredentialProvider protocol. It calls _generate_gcp_iam_access_token()
on every new connection, matching what the sync redis_connect_func
already does. async_redis.RedisCluster accepts a credential_provider
kwarg which is invoked per-connection.
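The per-connection pattern looks roughly like this (a dependency-free sketch: the real GCPIAMCredentialProvider subclasses redis.credentials.CredentialProvider and also implements get_credentials_async(); here the token generator is injected so the example is self-contained):

```python
from typing import Callable, Tuple

class PerConnectionCredentialProvider:
    """Sketch of a credential provider whose get_credentials() is invoked
    for every new Redis connection, so a fresh token is minted each time."""

    def __init__(self, username: str, token_fn: Callable[[], str]):
        self._username = username
        self._token_fn = token_fn  # e.g. a GCP IAM access-token generator

    def get_credentials(self) -> Tuple[str, str]:
        # Called per new connection: the 1-hour GCP token TTL can never
        # strand a newly discovered cluster node, because each connection
        # authenticates with a token generated at connect time.
        return (self._username, self._token_fn())
```

Passing a static password instead freezes the token at startup, which is the bug described above.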

* refactor(redis): move GCPIAMCredentialProvider to its own file

Extract GCPIAMCredentialProvider and _generate_gcp_iam_access_token
into litellm/_redis_credential_provider.py. _redis.py imports them
from there, keeping the public API unchanged.

* fix: address Greptile review issues

- GCPIAMCredentialProvider now inherits from redis.credentials.CredentialProvider
  so redis-py's async path calls get_credentials_async() properly
- move _redis_credential_provider import to top of _redis.py (PEP 8)
- remove dead else-branch that silently no-oped (gcp_service_account from
  redis_kwargs.get() was always None since it's popped by _get_redis_client_logic)
- remove mid-function 'from litellm import get_secret_str' inline import
- remove unused 'call' import from test_redis.py

* chore: retrigger CI/review

@vercel

vercel bot commented Apr 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
litellm | Ready | Preview, Comment | Apr 4, 2026 9:34pm


@CLAassistant

CLAassistant commented Apr 4, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 3 committers have signed the CLA.

✅ ishaan-jaff
❌ github-actions[bot]
❌ ishaan-berri
You have signed the CLA already but the status is still pending? Let us recheck it.

@ishaan-berri ishaan-berri temporarily deployed to integration-postgres April 4, 2026 20:56 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri temporarily deployed to integration-redis-postgres April 4, 2026 20:56 — with GitHub Actions Inactive
@gitguardian

gitguardian bot commented Apr 4, 2026

⚠️ GitGuardian has uncovered 2 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
29203053 Triggered Generic Password 59ea60f .circleci/config.yml View secret
29375658 Triggered JSON Web Token 59ea60f tests/test_litellm/proxy/auth/test_handle_jwt.py View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secrets safely. Learn here the best practices.
  3. Revoke and rotate these secrets.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@codspeed-hq
Contributor

codspeed-hq bot commented Apr 4, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_ishaan_march23 (9b116e2) with main (08df864)

Open in CodSpeed

@greptile-apps
Contributor

greptile-apps bot commented Apr 4, 2026

Greptile Summary

This PR introduces MCP Toolsets — a named collection of {server_id, tool_name} pairs that can be granted to API keys/teams — alongside a GCP IAM Redis caching fix for async cluster clients.

MCP Toolsets:

  • New LiteLLM_MCPToolsetTable Prisma model and migration
  • mcp_toolsets: String[] column added to LiteLLM_ObjectPermissionTable for per-key/team grant lists
  • New CRUD management endpoints: POST/GET/PUT/DELETE /v1/mcp/toolset
  • Toolsets accessible at /toolset/{name}/mcp (explicit prefix) and as a fallback in /{name}/mcp when no server name matches
  • A shared ContextVar (_mcp_active_toolset_id) in mcp_context.py carries the active toolset ID server-side; any client-supplied x-mcp-toolset-id header is stripped from the ASGI scope before processing
  • resolve_toolset_tool_permissions and get_toolset_by_name_cached use user_api_key_cache (Redis-backed DualCache) to avoid per-request DB hits
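The cache-aside pattern behind those two helpers can be sketched like this (the PR uses a 60s TTL on a Redis-backed DualCache; the class and key names here are illustrative):

```python
import time
from typing import Any, Awaitable, Callable, Dict, Tuple

class TTLCacheAside:
    """Check the cache first, fall back to the DB loader on a miss,
    and store the result with a TTL so stale entries expire."""

    def __init__(self, ttl_seconds: float = 60.0):
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, Any]] = {}

    async def get_or_load(
        self, key: str, loader: Callable[[], Awaitable[Any]]
    ) -> Any:
        hit = self._store.get(key)
        if hit is not None and time.monotonic() - hit[0] < self._ttl:
            return hit[1]  # cache hit: no DB round-trip
        value = await loader()  # DB round-trip only on miss or expiry
        self._store[key] = (time.monotonic(), value)
        return value
```

Backing the store with Redis (as the PR does via user_api_key_cache) makes entries visible across workers, which is what closed the per-worker stale-permission window.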

GCP IAM fix:

  • Moves _generate_gcp_iam_access_token and the new GCPIAMCredentialProvider to litellm/_redis_credential_provider.py
  • Replaces the one-time token-at-startup approach with a CredentialProvider that regenerates the IAM token on every new connection, fixing the 1-hour token expiry regression

New findings:

  • _ensure_eof done-callback can silently drop the EOF sentinel when asyncio.Queue(maxsize=1024) is full, leaving body_iter() blocked
  • dynamic_mcp_route incurs a DB round-trip for every unrecognised server name (first-call cache miss path)
  • Inline global_mcp_server_manager imports repeated inside three management handlers instead of using the module-level import

Confidence Score: 4/5

PR is safe to merge with a couple of P2 issues to address first.

The MCP Toolsets feature is well-architected with solid security controls (ContextVar injection prevention, header stripping, explicit grant checks), the GCP caching fix is correct, and test coverage for the new access-control paths is good. Two new P2 findings remain: (1) _ensure_eof can silently drop the EOF sentinel when the bounded queue is full; (2) dynamic_mcp_route falls back to a DB lookup for every unrecognised server name. Prior review threads flagged json.dumps serialization into the JSONB column and inline imports which are also still present.

litellm/proxy/proxy_server.py (_ensure_eof callback and dynamic_mcp_route toolset fallback), litellm/proxy/_experimental/mcp_server/toolset_db.py (json.dumps into JSONB — from prior review)

Important Files Changed

Filename Overview
litellm/proxy/_experimental/mcp_server/toolset_db.py New DB helper layer for toolset CRUD. Manually calls json.dumps() on the tools Json field before passing to Prisma (and json.loads() on read), storing a JSON string literal in the JSONB column instead of a native JSON array — already flagged in prior review.
litellm/proxy/_experimental/mcp_server/mcp_context.py New module introducing _mcp_active_toolset_id ContextVar to avoid circular imports between server.py and mcp_server_manager.py; clean and minimal.
litellm/proxy/_experimental/mcp_server/mcp_server_manager.py Adds resolve_toolset_tool_permissions, invalidate_toolset_cache, and get_toolset_by_name_cached with Redis-backed caching; direct Prisma query in reload_servers_from_database bypasses get_all_mcp_servers helper (previously flagged).
litellm/proxy/_experimental/mcp_server/server.py Adds _apply_toolset_scope (access-control + permission override) and _merge_toolset_permissions (union toolset grants into key permissions); strips client-supplied x-mcp-toolset-id header to prevent forgery.
litellm/proxy/proxy_server.py Adds _stream_mcp_asgi_response helper with bounded queue (maxsize=1024) and toolset_mcp_route; _ensure_eof done-callback can silently fail if queue is full when handler task dies.
litellm/proxy/management_endpoints/mcp_management_endpoints.py Adds CRUD endpoints for toolsets with correct admin-only write guards and toolset-filtered read for non-admins; cache invalidation on all mutating paths.
litellm/_redis_credential_provider.py New GCPIAMCredentialProvider correctly refreshes IAM token on every new Redis connection, fixing the 1-hour expiry bug in async cluster clients.
litellm/_redis.py Replaces one-time IAM token generation with GCPIAMCredentialProvider for async cluster; removes dead code and debug log statements.
litellm/proxy/management_helpers/object_permission_utils.py Adds _extract_requested_mcp_toolsets and extends validate_key_mcp_servers_against_team to enforce that a key's requested toolsets are a subset of its team's allowed toolsets.
tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_toolset_scope.py New test file with good unit coverage for _apply_toolset_scope, fetch_mcp_toolsets, and ContextVar injection prevention; all tests use mocks (no real network calls).
litellm-proxy-extras/litellm_proxy_extras/migrations/20260321000000_add_mcp_toolsets/migration.sql Idempotent CREATE TABLE IF NOT EXISTS and ALTER TABLE ADD COLUMN IF NOT EXISTS; unique index on toolset_name; uses JSONB default consistent with schema.

Sequence Diagram

sequenceDiagram
    participant Client
    participant ProxyServer as proxy_server.py
    participant ContextVar as _mcp_active_toolset_id
    participant Server as server.py
    participant Manager as MCPServerManager
    participant Cache as user_api_key_cache
    participant DB as Prisma DB

    Client->>ProxyServer: GET /toolset/{name}/mcp
    ProxyServer->>Manager: get_toolset_by_name_cached(name)
    Manager->>Cache: async_get_cache(toolset_name:{name})
    alt Cache hit
        Cache-->>Manager: MCPToolset
    else Cache miss
        Manager->>DB: find_first(toolset_name=name)
        DB-->>Manager: row
        Manager->>Cache: async_set_cache(...)
    end
    Manager-->>ProxyServer: toolset (id, name, tools)
    ProxyServer->>ContextVar: set(toolset.toolset_id)
    ProxyServer->>Server: handle_streamable_http_mcp(scope, receive, bridging_send)
    Server->>Server: strip x-mcp-toolset-id header from scope
    Server->>Server: _apply_toolset_scope(auth, toolset_id)
    Note over Server: Non-admin: check mcp_toolsets grant list
    Server->>Manager: resolve_toolset_tool_permissions([toolset_id])
    Manager->>Cache: async_get_cache(toolset_perms:{id})
    alt Cache hit
        Cache-->>Manager: server_id to tools map
    else Cache miss
        Manager->>DB: list_mcp_toolsets(toolset_ids=[id])
        DB-->>Manager: toolsets
        Manager->>Cache: async_set_cache(...)
    end
    Manager-->>Server: tool_permissions dict
    Server->>Server: Update object_permission (mcp_servers, mcp_tool_permissions)
    Server-->>Client: filtered tool list / MCP session
    ProxyServer->>ContextVar: reset(token)
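The set/reset discipline shown in the diagram can be sketched with Python's contextvars (variable and function names here are illustrative, mirroring _mcp_active_toolset_id):

```python
import contextvars
from typing import Optional

# Server-side carrier for the active toolset ID: set before dispatch,
# always reset afterwards, so nothing leaks across concurrent requests.
active_toolset_id: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
    "active_toolset_id", default=None
)

def handle_scoped_request(toolset_id: str) -> Optional[str]:
    token = active_toolset_id.set(toolset_id)
    try:
        # Downstream code reads the ContextVar rather than trusting a
        # client-supplied header (which the server strips from the scope).
        return active_toolset_id.get()
    finally:
        active_toolset_id.reset(token)
```

Because ContextVar values are per-task, concurrent requests each see only their own toolset ID, which is what makes the header-forgery vector moot.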

Reviews (3): Last reviewed commit: "fix(tests): update reload_servers_from_d..."

Comment on lines +38 to +40

async def get_mcp_toolset(
prisma_client: PrismaClient,
Contributor


P1 Non-idiomatic JSON serialisation for Prisma Json field

tools is defined as Json in the Prisma schema (tools Json @default("[]")). Prisma's Python client expects a Python dict/list for Json fields and handles serialisation itself. By calling json.dumps(...) first, you are passing a Python string to Prisma, which stores a JSON string literal in the JSONB column (e.g., "[{\"server_id\":\"x\"}]") instead of a JSON array ([{"server_id":"x"}]).

The read path in _toolset_from_row compensates for this with an isinstance(tools, str) branch, so round-trips work. However, the JSONB column stores string values instead of arrays, which means:

  1. Direct SQL queries (WHERE tools @> '[{"server_id":"x"}]'::jsonb) will silently return no results.
  2. Future Prisma middleware or Prisma validators that type-check Json fields may reject the string.

Pass the Python list directly and let Prisma serialise it:

Suggested change

data_dict["tools"] = data_dict.get("tools", [])

And similarly for the update path in update_mcp_toolset (line 83):

# Before
data_dict["tools"] = json.dumps(data_dict["tools"])
# After
data_dict["tools"] = data_dict["tools"]  # Prisma serialises Json fields
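The double-encoding failure mode is easy to demonstrate in plain Python (illustrative; here the outer json.dumps stands in for the serialisation the driver performs itself):

```python
import json

tools = [{"server_id": "x", "tool_name": "search"}]

# Pre-serialising before handing off to a layer that serialises Json
# fields itself stores a JSON *string literal*, not a JSON array:
stored = json.dumps(json.dumps(tools))  # what the JSONB column ends up holding

decoded_once = json.loads(stored)        # still a str, not a list
decoded_twice = json.loads(decoded_once)  # only a second decode recovers the array
```

This is why JSONB containment queries against the column return nothing: the stored value is a string, so '@>' never matches an array pattern.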

Comment on lines +2583 to +2591
raw_rows = await prisma_client.db.litellm_mcpservertable.find_many(
where={
"OR": [
{"approval_status": None},
{"approval_status": {"in": ["active", "approved"]}},
]
}
)
db_mcp_servers = [LiteLLM_MCPServerTable(**r.model_dump()) for r in raw_rows]
Contributor


P2 Direct Prisma call bypasses get_all_mcp_servers helper

The previous code used get_all_mcp_servers(prisma_client, approval_status="active"). The new code performs an equivalent query directly via prisma_client.db.litellm_mcpservertable.find_many(), bypassing any logic the helper encapsulates.

Per the CLAUDE.md guideline and rule d7156c05, DB access for proxy operations should go through the established helper functions to avoid pattern drift and make future schema changes easier to track. If get_all_mcp_servers doesn't support the approval_status=None case needed here, consider extending it with an include_no_approval parameter rather than duplicating the query inline.

Rule Used: What: In critical path of request, there should be... (source)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +13858 to +13860
if not headers_ready.done():
headers_ready.set_result(
(message.get("status", 200), message.get("headers", []))
Contributor


P2 Unbounded asyncio.Queue can grow without limit for large streaming responses

body_queue: asyncio.Queue = asyncio.Queue() has no capacity bound. If the ASGI handler produces body chunks faster than the HTTP client consumes them (e.g., slow client, large tool-list response), body_queue will accumulate all chunks in memory before they are yielded.

For an MCP server that streams a large JSON payload, this could exhaust worker memory. Consider using a bounded queue to apply backpressure:

body_queue: asyncio.Queue = asyncio.Queue(maxsize=64)

In bridging_send, await body_queue.put(chunk) already respects the bound (put awaits until there is space), so no change is needed there. The _ensure_eof callback uses put_nowait, which should stay as-is to avoid deadlock in the error path.
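A self-contained sketch of that backpressure behaviour (the 5-chunk stream, maxsize, and None sentinel are illustrative):

```python
import asyncio

async def demo_backpressure():
    """A bounded queue makes the producer await whenever the consumer
    lags, capping memory; the happy-path EOF uses an awaiting put, while
    put_nowait is reserved for error-path cleanup."""
    q: asyncio.Queue = asyncio.Queue(maxsize=2)

    async def producer():
        for i in range(5):
            await q.put(i)   # blocks once 2 chunks are pending
        await q.put(None)    # EOF sentinel on the happy path

    async def consumer():
        out = []
        while (chunk := await q.get()) is not None:
            out.append(chunk)
            await asyncio.sleep(0)  # simulate a slow-ish client
        return out

    _, out = await asyncio.gather(producer(), consumer())
    return out
```

With an unbounded queue the producer would never block, and every chunk would sit in worker memory until the client drained it.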

Comment on lines +823 to +825
from litellm.constants import DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL
from litellm.proxy._experimental.mcp_server.toolset_db import list_mcp_toolsets
from litellm.proxy.proxy_server import prisma_client, user_api_key_cache
Contributor


P2 Inline imports inside methods (CLAUDE.md / style guide rule)

resolve_toolset_tool_permissions, invalidate_toolset_cache, and get_toolset_by_name_cached all contain inline from … import … statements. Per CLAUDE.md: "Avoid imports within methods — place all imports at the top of the file (module-level). The only exception is avoiding circular imports where absolutely necessary."

from litellm.constants import DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL and from litellm.types.mcp_server.mcp_toolset import MCPToolset are not involved in any circular import and can be safely moved to the module-level imports at the top of mcp_server_manager.py. The from litellm.proxy.proxy_server import prisma_client, user_api_key_cache imports are a known circular-import pattern used elsewhere in this file and are acceptable as inline — but the non-circular ones should be hoisted.

The same pattern also appears in _apply_toolset_scope in server.py (lines 2411–2412): from litellm.proxy._types import LiteLLM_ObjectPermissionTable can be moved to the top of the file.

Context Used: CLAUDE.md (source)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@@ -0,0 +1,524 @@
import React, { useState, useCallback } from "react";
import { Button, Text, Title } from "@tremor/react";
import { Modal, Form, Input, message, Spin, Card, Typography, Space } from "antd";

Check notice

Code scanning / CodeQL

Unused variable, import, function or class Note

Unused imports Card, Space.

Copilot Autofix

AI 12 days ago

To fix unused-import issues, remove the specific unused bindings from the import statement while keeping the ones that are actually used. This preserves existing functionality and simply cleans up dead code.

In this file, the antd import currently brings in Modal, Form, Input, message, Spin, Card, Typography, Space. Based on the CodeQL report and the visible usage (Typography aliasing to AntdText), Card and Space are not used. The best fix is to edit ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx so that the destructuring import from "antd" no longer includes Card and Space, leaving the rest unchanged. No additional methods, imports, or definitions are needed.

Suggested changeset 1
ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx b/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx
--- a/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx
+++ b/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx
@@ -1,6 +1,6 @@
 import React, { useState, useCallback } from "react";
 import { Button, Text, Title } from "@tremor/react";
-import { Modal, Form, Input, message, Spin, Card, Typography, Space } from "antd";
+import { Modal, Form, Input, message, Spin, Typography } from "antd";
 import { PlusIcon, PencilIcon, TrashIcon } from "@heroicons/react/outline";
 import { ColumnDef } from "@tanstack/react-table";
 import { useMCPToolsets } from "@/app/(dashboard)/hooks/mcpServers/useMCPToolsets";
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
} from "../networking";
import { MCPToolset, MCPToolsetTool } from "./types";

const { Text: AntdText } = Typography;

Check notice

Code scanning / CodeQL

Unused variable, import, function or class Note

Unused variable AntdText.

Copilot Autofix

AI 12 days ago

To fix the problem, remove the unused variable so that the code no longer declares AntdText if it is not used anywhere. This removes dead code and satisfies the static analysis rule.

The best minimal fix without changing functionality is to delete the line const { Text: AntdText } = Typography; in ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx. Keeping the Typography import itself is appropriate, because we cannot be sure from the snippet whether other members of Typography are used further down; however, destructuring Text into AntdText clearly serves no purpose if it’s unused.

No new methods, imports, or definitions are required to implement this change; it is purely a removal of an unused variable declaration.

Suggested changeset 1
ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx b/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx
--- a/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx
+++ b/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx
@@ -16,7 +16,6 @@
 } from "../networking";
 import { MCPToolset, MCPToolsetTool } from "./types";
 
-const { Text: AntdText } = Typography;
 
 interface MCPToolsetsTabProps {
   accessToken: string | null;
EOF
return;
}
// Resolve the real server ID (toolsets use toolset: prefix)
const mcpServerId = rawSelected.startsWith("toolset:") ? rawSelected : rawSelected;

Check notice

Code scanning / CodeQL

Unused variable, import, function or class Note

Unused variable mcpServerId.

Copilot Autofix

AI 12 days ago

In general, to fix an unused variable warning, either remove the variable (and any associated dead computation) if it truly isn’t needed, or start using it where it was intended to be used. Here, we see that rawSelected already holds the same value as the “resolved” server ID: const mcpServerId = rawSelected.startsWith("toolset:") ? rawSelected : rawSelected; simplifies to just rawSelected. Since no later code uses mcpServerId, the best non‑intrusive fix is to delete this declaration entirely.

Concretely, in ui/litellm-dashboard/src/components/playground/chat_ui/ChatUI.tsx, within the MCP direct mode block starting around line 592, remove the line that declares mcpServerId. Do not add new behavior or change existing variables; just delete the unused variable and keep the rest of the logic intact.

Suggested changeset 1
ui/litellm-dashboard/src/components/playground/chat_ui/ChatUI.tsx

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/ui/litellm-dashboard/src/components/playground/chat_ui/ChatUI.tsx b/ui/litellm-dashboard/src/components/playground/chat_ui/ChatUI.tsx
--- a/ui/litellm-dashboard/src/components/playground/chat_ui/ChatUI.tsx
+++ b/ui/litellm-dashboard/src/components/playground/chat_ui/ChatUI.tsx
@@ -600,7 +600,6 @@
         return;
       }
       // Resolve the real server ID (toolsets use toolset: prefix)
-      const mcpServerId = rawSelected.startsWith("toolset:") ? rawSelected : rawSelected;
       if (!selectedMCPDirectTool) {
         NotificationsManager.fromBackend("Please select an MCP tool to call");
         return;
EOF
@ishaan-berri ishaan-berri temporarily deployed to integration-redis-postgres April 4, 2026 21:32 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri temporarily deployed to integration-postgres April 4, 2026 21:32 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri changed the base branch from main to litellm_ishaan_march23_2 April 4, 2026 22:16
@ishaan-berri ishaan-berri merged commit 2ce85e6 into litellm_ishaan_march23_2 Apr 4, 2026
106 of 117 checks passed
@ishaan-berri ishaan-berri deleted the litellm_ishaan_march23 branch April 4, 2026 22:16
ishaan-berri added a commit that referenced this pull request Apr 4, 2026
)

* Litellm ishaan march23 - MCP Toolsets + GCP Caching fix  (#25146)

* feat(mcp): MCP Toolsets — curated tool subsets from one or more MCP servers (#24335)

* feat(mcp): add LiteLLM_MCPToolsetTable and mcp_toolsets to ObjectPermissionTable

* feat(mcp): add prisma migration for MCPToolset table

* feat(mcp): add MCPToolset Python types

* feat(mcp): add toolset_db.py with CRUD helpers for MCPToolset

* feat(mcp): add toolset CRUD endpoints to mcp_management_endpoints

* fix(mcp): skip allow_all_keys servers when explicit mcp_servers permission is set (toolset scope fix)

* feat(mcp): add _apply_toolset_scope and toolset route handling in server.py

* fix(mcp): resolve toolset names in responses API before fetching tools

* feat(mcp): add mcp_toolsets field to LiteLLM_ObjectPermissionTable type

* feat(mcp): register LiteLLM_MCPToolsetTable in prisma client initialization

* feat(mcp): validate mcp_toolsets in key-vs-team permission check

* feat(mcp): register toolset routes in proxy_server.py

* feat(mcp): add MCPToolset and MCPToolsetTool TypeScript types

* feat(mcp): add fetchMCPToolsets, createMCPToolset, updateMCPToolset, deleteMCPToolset API functions

* feat(mcp): add useMCPToolsets React Query hook

* feat(mcp): add toolsets (purple) as third option type in MCPServerSelector

* feat(mcp): extract toolsets from combined MCP field in key form

* feat(mcp): extract toolsets from combined MCP field in team form

* feat(mcp): show toolsets section in MCPServerPermissions read view

* feat(mcp): pass mcp_toolsets through object_permissions_view

* feat(mcp): add MCPToolsetsTab component for creating and managing toolsets

* feat(mcp): add Toolsets tab to mcp_servers.tsx

* feat(mcp): pass mcpToolsets to playground chat and responses API calls

* feat(mcp): generate correct server_url for toolsets in playground API calls

* docs(mcp): add MCP Toolsets documentation

* docs(mcp): add mcp_toolsets to sidebar

* fix(mcp): replace x-mcp-toolset-id header with ContextVar to prevent client forgery

* fix(mcp): use ContextVar + StreamingResponse for toolset MCP routes (fixes SSE streaming)

* fix(mcp): cache toolset permission lookups to avoid per-request DB calls

* test(mcp): add tests for toolset scope enforcement, ContextVar isolation, and access control

* fix(mcp): cache toolset name lookups in MCPServerManager to avoid per-request DB calls
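
The per-request-DB-call fixes above boil down to a TTL lookup cache. A minimal sketch of the pattern (illustrative only; the actual code uses LiteLLM's Redis-backed DualCache with a 60s TTL, not this class):

```python
import time
from typing import Any, Callable, Dict, Tuple

class TTLCache:
    """Minimal TTL cache sketch; names and structure are illustrative."""

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = time.monotonic()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]          # fresh entry: skip the DB round-trip
        value = fetch()            # miss or stale: hit the DB once
        self._store[key] = (now, value)
        return value

calls = 0
def fake_db_lookup():
    global calls
    calls += 1
    return {"toolset_name": "analytics"}

cache = TTLCache(ttl_seconds=60)
cache.get_or_fetch("toolset:analytics", fake_db_lookup)
cache.get_or_fetch("toolset:analytics", fake_db_lookup)
print(calls)  # 1: second request is served from cache
```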

* fix(mcp): prevent body_iter deadlock + use cached toolset lookup in responses API

- _stream_mcp_asgi_response: add done callback to handler_task that puts
  the EOF sentinel on body_queue when the task exits, preventing body_iter
  from hanging forever if the handler raises after headers are sent.
- litellm_proxy_mcp_handler: replace raw get_mcp_toolset_by_name() DB call
  with global_mcp_server_manager.get_toolset_by_name_cached() so toolset
  resolution uses the 60s TTL cache added for this purpose instead of
  hitting the DB on every responses-API request.
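
The done-callback fix can be sketched as follows (a simplified stand-in for _stream_mcp_asgi_response; EOF and the function names are illustrative, not the actual code):

```python
import asyncio

EOF = object()  # end-of-body sentinel (name illustrative)

async def stream_response(handler):
    """Sketch of the _ensure_eof fix: a done callback guarantees the body
    iterator terminates even if the handler crashes after headers are sent."""
    body_queue: asyncio.Queue = asyncio.Queue()
    handler_task = asyncio.create_task(handler(body_queue))

    def _ensure_eof(task: asyncio.Task) -> None:
        if not task.cancelled():
            task.exception()  # observe any error so asyncio doesn't warn
        body_queue.put_nowait(EOF)  # always unblock the consumer

    handler_task.add_done_callback(_ensure_eof)

    chunks = []
    while True:  # this loop is the body iterator that previously could hang
        chunk = await body_queue.get()
        if chunk is EOF:
            break
        chunks.append(chunk)
    return b"".join(chunks)

async def crashing_handler(q: asyncio.Queue) -> None:
    await q.put(b"partial body")
    raise RuntimeError("handler failed mid-stream")

print(asyncio.run(stream_response(crashing_handler)))  # b'partial body'
```

Without the callback, the consumer would block forever on queue.get() once the handler raised; with it, the partial body is drained and the stream ends cleanly.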

* fix(mcp): toolset access control, asyncio fix, and real unit tests

- server.py: _apply_toolset_scope now enforces that non-admin keys must
  have the requested toolset_id in their mcp_toolsets grant list;
  admin keys always bypass the check.
- mcp_management_endpoints.py: three access-control fixes:
  * fetch_mcp_toolsets: non-admin keys with mcp_toolsets=None now
    return [] instead of all toolsets (only admins get 'all' when
    the field is absent)
  * fetch_mcp_toolset: non-admin keys that haven't been granted the
    requested toolset_id now get 403 instead of the full result
  * add_mcp_toolset: duplicate toolset_name now returns 409 Conflict
    instead of an opaque 500
- proxy_server.py: use asyncio.get_running_loop() instead of
  get_event_loop() inside an already-running coroutine (Python 3.10+).
- test_mcp_toolset_scope.py: replace four hollow tests that only
  asserted local variable properties with real tests that call the
  production fetch_mcp_toolsets() and handle_streamable_http_mcp()
  functions with mocked dependencies.
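
The fetch_mcp_toolsets visibility rule described above can be sketched like this (function and variable names are assumed for illustration, not the production signatures):

```python
from typing import List, Optional

def visible_toolsets(
    is_admin: bool, granted: Optional[List[str]], all_ids: List[str]
) -> List[str]:
    """Admins see everything; non-admin keys see only explicit grants, and
    an absent (None) grant field means no access rather than 'all'."""
    if is_admin:
        return all_ids
    if granted is None:
        return []  # the pre-fix behavior returned all_ids here
    return [t for t in all_ids if t in granted]

all_ids = ["ts-analytics", "ts-support"]
print(visible_toolsets(True, None, all_ids))             # ['ts-analytics', 'ts-support']
print(visible_toolsets(False, None, all_ids))            # []
print(visible_toolsets(False, ["ts-support"], all_ids))  # ['ts-support']
```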

* fix(mcp): add mcp_toolsets to ObjectPermissionBase, fix multi-toolset overwrite, fix delete 404, allow standalone key toolsets

* fix(mcp): add auth check on toolset resolution in responses API; union mcp_servers in _merge_toolset_permissions

* fix(mcp): handle RecordNotFoundError in update_mcp_toolset; union direct servers with toolset servers

* fix(mcp): use _user_has_admin_view; deny None mcp_toolsets for non-admin; use direct RecordNotFoundError import; fix docstring

* fix(mcp): add @default(now()) to MCPToolsetTable.updated_at; fix test for non-admin toolset access

* fix: use UniqueViolationError import; guard _ensure_eof for error/cancel only

* fix(mcp): preserve mcp_access_groups in toolset scope, use shared Redis cache for toolset perms

- Remove mcp_access_groups=[] from _apply_toolset_scope (server.py) and the
  responses API toolset path (litellm_proxy_mcp_handler.py). A key's access-group
  grants remain valid even when the request is scoped to a single toolset; clearing
  them silently revoked legitimate entitlements.

- Switch resolve_toolset_tool_permissions and get_toolset_by_name_cached to use
  user_api_key_cache (Redis-backed DualCache in production) instead of per-instance
  in-memory dicts. Cache entries are now shared across workers, eliminating the
  per-worker stale-toolset-permission window flagged as a P1 by Greptile.

- Use union merge (set union of tool names per server) when applying toolset
  permissions in the responses API path so direct-server tool restrictions are not
  overwritten by toolset permissions.
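
The union-merge behavior can be sketched as follows (a simplified model of the responses-API path; names and the input shape are assumptions, not the actual code):

```python
from typing import Dict, List

def merge_tool_permissions(
    direct: Dict[str, List[str]], toolset: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Union-merge per-server tool grants: toolset permissions widen the
    allowed set instead of overwriting direct-server restrictions."""
    merged: Dict[str, set] = {}
    for perms in (direct, toolset):
        for server_id, tools in perms.items():
            merged.setdefault(server_id, set()).update(tools)
    return {server_id: sorted(tools) for server_id, tools in merged.items()}

direct = {"github_mcp": ["create_issue"]}
toolset = {"github_mcp": ["list_issues"], "slack_mcp": ["post_message"]}
print(merge_tool_permissions(direct, toolset))
# {'github_mcp': ['create_issue', 'list_issues'], 'slack_mcp': ['post_message']}
```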

* fix(mcp): return 404 when edit_mcp_toolset target does not exist

* fix(mcp): align mcp_toolsets default to None in LiteLLM_ObjectPermissionTable

* fix(mcp): admin toolset visibility, in-place tool name mutation, test helper coercion

* fix(mcp): treat None/[] team mcp_toolsets as no restriction in key validation
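
A sketch of that key-vs-team validation rule (illustrative names, not the production helper):

```python
from typing import List, Optional

def key_toolsets_valid(
    key_toolsets: Optional[List[str]], team_toolsets: Optional[List[str]]
) -> bool:
    """A team grant of None or [] imposes no restriction; otherwise every
    toolset on the key must also appear in the team's grant list."""
    if not team_toolsets:  # None and [] both mean "unrestricted"
        return True
    return all(t in team_toolsets for t in (key_toolsets or []))

print(key_toolsets_valid(["ts-a"], None))      # True: no team restriction
print(key_toolsets_valid(["ts-a"], []))        # True
print(key_toolsets_valid(["ts-a"], ["ts-a"]))  # True
print(key_toolsets_valid(["ts-b"], ["ts-a"]))  # False
```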

* fix(mcp): allow_all_keys backward compat, blocked_tools API write-path, efficient startup query

* fix(mcp): use _mcp_active_toolset_id ContextVar to detect toolset scope, avoiding DB-default false-positive

* fix(mcp): remove dead toolset cache stubs, log invalidation failures, align schema updated_at defaults

* fix(mcp): deserialise MCPToolset from Redis cache hit, replace fastapi import in test

* fix(mcp): evict name-cache on toolset mutation, 409 on rename conflict, warning-level list errors

* fix(redis): regenerate GCP IAM token per connection for async cluster (#24426)

* fix(redis): regenerate GCP IAM token per connection for async cluster clients

Async RedisCluster was generating the IAM token once at startup and
storing it as a static password. After the 1-hour GCP token TTL, any
new connection (including to newly-discovered cluster nodes) would fail
to authenticate.

Fix: introduce GCPIAMCredentialProvider that implements redis-py's
CredentialProvider protocol. It calls _generate_gcp_iam_access_token()
on every new connection, matching what the sync redis_connect_func
already does. async_redis.RedisCluster accepts a credential_provider
kwarg which is invoked per-connection.
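
The shape of that credential provider, sketched without the real GCP call (redis-py's CredentialProvider protocol expects a get_credentials() returning a (username, password) tuple; the token fetch below is a stand-in for _generate_gcp_iam_access_token, not the actual implementation):

```python
import time

class IAMCredentialProvider:
    """Sketch: mint a fresh short-lived token per connection instead of
    capturing one static password at client startup."""

    def get_credentials(self):
        # Invoked by the client on every new connection, so connections
        # opened after the 1-hour token TTL still authenticate.
        token = self._fetch_short_lived_token()
        return ("default", token)

    def _fetch_short_lived_token(self) -> str:
        # Placeholder for the IAM token call; real code talks to GCP here.
        return f"token-{int(time.time())}"

provider = IAMCredentialProvider()
user, token = provider.get_credentials()  # re-invoked per connection
print(user)  # 'default'
```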

* refactor(redis): move GCPIAMCredentialProvider to its own file

Extract GCPIAMCredentialProvider and _generate_gcp_iam_access_token
into litellm/_redis_credential_provider.py. _redis.py imports them
from there, keeping the public API unchanged.

* fix: address Greptile review issues

- GCPIAMCredentialProvider now inherits from redis.credentials.CredentialProvider
  so redis-py's async path calls get_credentials_async() properly
- move _redis_credential_provider import to top of _redis.py (PEP 8)
- remove dead else-branch that silently no-oped (gcp_service_account from
  redis_kwargs.get() was always None since it's popped by _get_redis_client_logic)
- remove mid-function 'from litellm import get_secret_str' inline import
- remove unused 'call' import from test_redis.py

* chore: retrigger CI/review

* chore: sync schema.prisma copies from root

* chore: sync schema.prisma copies from root

* fix(proxy_server): use bounded asyncio.Queue with maxsize to prevent unbounded growth
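
The effect of a bounded queue, in miniature (illustrative, not the proxy_server code):

```python
import asyncio

async def demo_backpressure() -> int:
    # With maxsize set, put() suspends the producer once the queue is full,
    # so a slow consumer can no longer cause unbounded memory growth.
    q: asyncio.Queue = asyncio.Queue(maxsize=2)
    await q.put("a")
    await q.put("b")
    # await q.put("c") would now block until the consumer drains an item;
    # put_nowait raises instead of letting the queue grow.
    try:
        q.put_nowait("c")
    except asyncio.QueueFull:
        print("queue full; producer must wait")
    return q.qsize()

print(asyncio.run(demo_backpressure()))  # 2
```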

* fix(a2a/pydantic_ai): make api_base Optional to match base class signature

* fix(a2a/pydantic_ai): make api_base Optional in handler and guard against None

* fix(mcp): remove unused get_all_mcp_servers import

* fix(mcp): remove unused MCPToolset import

* refactor(mcp): extract toolset permission logic to reduce statement count below PLR0915 limit

* fix(tests): update reload_servers_from_database tests to mock prisma directly

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix(toolset_db): lazy-import prisma to avoid ImportError when prisma not installed

* fix(tests): update UI tests for toolset tab and updated empty state text

* fix(tests): add get_mcp_server_by_name to fake_manager stub

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
fede-kamel pushed a commit to fede-kamel/litellm that referenced this pull request Apr 5, 2026
… (BerriAI#25155)

harish876 pushed a commit to harish876/litellm that referenced this pull request Apr 8, 2026
… (BerriAI#25155)


4 participants