
Litellm ishaan march23 - MCP Toolsets + GCP Caching fix (#25146) #25155

Merged
ishaan-berri merged 4 commits into main from litellm_ishaan_march23_2
Apr 4, 2026

Conversation

@ishaan-berri
Contributor

  • feat(mcp): MCP Toolsets — curated tool subsets from one or more MCP servers (#24335)

  • feat(mcp): add LiteLLM_MCPToolsetTable and mcp_toolsets to ObjectPermissionTable

  • feat(mcp): add prisma migration for MCPToolset table

  • feat(mcp): add MCPToolset Python types

  • feat(mcp): add toolset_db.py with CRUD helpers for MCPToolset

  • feat(mcp): add toolset CRUD endpoints to mcp_management_endpoints

  • fix(mcp): skip allow_all_keys servers when explicit mcp_servers permission is set (toolset scope fix)

  • feat(mcp): add _apply_toolset_scope and toolset route handling in server.py

  • fix(mcp): resolve toolset names in responses API before fetching tools

  • feat(mcp): add mcp_toolsets field to LiteLLM_ObjectPermissionTable type

  • feat(mcp): register LiteLLM_MCPToolsetTable in prisma client initialization

  • feat(mcp): validate mcp_toolsets in key-vs-team permission check

  • feat(mcp): register toolset routes in proxy_server.py

  • feat(mcp): add MCPToolset and MCPToolsetTool TypeScript types

  • feat(mcp): add fetchMCPToolsets, createMCPToolset, updateMCPToolset, deleteMCPToolset API functions

  • feat(mcp): add useMCPToolsets React Query hook

  • feat(mcp): add toolsets (purple) as third option type in MCPServerSelector

  • feat(mcp): extract toolsets from combined MCP field in key form

  • feat(mcp): extract toolsets from combined MCP field in team form

  • feat(mcp): show toolsets section in MCPServerPermissions read view

  • feat(mcp): pass mcp_toolsets through object_permissions_view

  • feat(mcp): add MCPToolsetsTab component for creating and managing toolsets

  • feat(mcp): add Toolsets tab to mcp_servers.tsx

  • feat(mcp): pass mcpToolsets to playground chat and responses API calls

  • feat(mcp): generate correct server_url for toolsets in playground API calls

  • docs(mcp): add MCP Toolsets documentation

  • docs(mcp): add mcp_toolsets to sidebar

  • fix(mcp): replace x-mcp-toolset-id header with ContextVar to prevent client forgery

  • fix(mcp): use ContextVar + StreamingResponse for toolset MCP routes (fixes SSE streaming)

  • fix(mcp): cache toolset permission lookups to avoid per-request DB calls

  • test(mcp): add tests for toolset scope enforcement, ContextVar isolation, and access control

  • fix(mcp): cache toolset name lookups in MCPServerManager to avoid per-request DB calls

  • fix(mcp): prevent body_iter deadlock + use cached toolset lookup in responses API

    • _stream_mcp_asgi_response: add done callback to handler_task that puts the EOF sentinel on body_queue when the task exits, preventing body_iter from hanging forever if the handler raises after headers are sent.
    • litellm_proxy_mcp_handler: replace raw get_mcp_toolset_by_name() DB call with global_mcp_server_manager.get_toolset_by_name_cached() so toolset resolution uses the 60s TTL cache added for this purpose instead of hitting the DB on every responses-API request.

  • fix(mcp): toolset access control, asyncio fix, and real unit tests

    • server.py: _apply_toolset_scope now enforces that non-admin keys must have the requested toolset_id in their mcp_toolsets grant list; admin keys always bypass the check.
    • mcp_management_endpoints.py: three access-control fixes:
      • fetch_mcp_toolsets: non-admin keys with mcp_toolsets=None now return [] instead of all toolsets (only admins get 'all' when the field is absent)
      • fetch_mcp_toolset: non-admin keys that haven't been granted the requested toolset_id now get 403 instead of the full result
      • add_mcp_toolset: duplicate toolset_name now returns 409 Conflict instead of an opaque 500
    • proxy_server.py: use asyncio.get_running_loop() instead of get_event_loop() inside an already-running coroutine (Python 3.10+).
    • test_mcp_toolset_scope.py: replace four hollow tests that only asserted local variable properties with real tests that call the production fetch_mcp_toolsets() and handle_streamable_http_mcp() functions with mocked dependencies.

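
The access rules described for _apply_toolset_scope and fetch_mcp_toolsets can be sketched roughly as below. This is a simplified illustration, not the code in server.py: `KeyAuth`, `ToolsetAccessDenied`, `check_toolset_access`, and `visible_toolsets` are hypothetical names, and the real proxy works on its own auth objects and raises an HTTP 403 rather than a plain exception.

```python
from dataclasses import dataclass
from typing import List, Optional


class ToolsetAccessDenied(Exception):
    """Hypothetical stand-in for the HTTP 403 returned by the proxy."""

    status_code = 403


@dataclass
class KeyAuth:
    """Hypothetical, simplified view of an API key's toolset grants."""

    is_admin: bool = False
    mcp_toolsets: Optional[List[str]] = None  # None means the field is absent


def check_toolset_access(key: KeyAuth, toolset_id: str) -> None:
    """Raise unless the key may use the requested toolset."""
    if key.is_admin:
        return  # admin keys always bypass the grant check
    granted = key.mcp_toolsets or []  # absent grants mean "nothing" for non-admins
    if toolset_id not in granted:
        raise ToolsetAccessDenied(f"key is not granted toolset {toolset_id!r}")


def visible_toolsets(key: KeyAuth, all_toolset_ids: List[str]) -> List[str]:
    """Admins see every toolset; other keys see only what they were granted."""
    if key.is_admin:
        return list(all_toolset_ids)
    granted = set(key.mcp_toolsets or [])
    return [toolset_id for toolset_id in all_toolset_ids if toolset_id in granted]
```

The point of the sketch is the asymmetry: an absent mcp_toolsets field means "everything" only for admins, and "nothing" for every other key.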
  • fix(mcp): add mcp_toolsets to ObjectPermissionBase, fix multi-toolset overwrite, fix delete 404, allow standalone key toolsets

  • fix(mcp): add auth check on toolset resolution in responses API; union mcp_servers in _merge_toolset_permissions

  • fix(mcp): handle RecordNotFoundError in update_mcp_toolset; union direct servers with toolset servers

  • fix(mcp): use _user_has_admin_view; deny None mcp_toolsets for non-admin; use direct RecordNotFoundError import; fix docstring

  • fix(mcp): add @default(now()) to MCPToolsetTable.updated_at; fix test for non-admin toolset access

  • fix: use UniqueViolationError import; guard _ensure_eof for error/cancel only

  • fix(mcp): preserve mcp_access_groups in toolset scope, use shared Redis cache for toolset perms

  • Remove mcp_access_groups=[] from _apply_toolset_scope (server.py) and the responses API toolset path (litellm_proxy_mcp_handler.py). A key's access-group grants remain valid even when the request is scoped to a single toolset; clearing them silently revoked legitimate entitlements.

  • Switch resolve_toolset_tool_permissions and get_toolset_by_name_cached to use user_api_key_cache (Redis-backed DualCache in production) instead of per-instance in-memory dicts. Cache entries are now shared across workers, eliminating the per-worker stale-toolset-permission window flagged as a P1 by Greptile.

  • Use union merge (set union of tool names per server) when applying toolset permissions in the responses API path so direct-server tool restrictions are not overwritten by toolset permissions.
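
A minimal sketch of the union merge described in the last bullet, assuming permissions are represented as a mapping from server ID to a list of tool names. The function name `merge_tool_permissions` and the sorted output are choices made for this illustration only.

```python
from typing import Dict, List


def merge_tool_permissions(
    direct: Dict[str, List[str]],
    from_toolsets: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Union tool names per server so neither source overwrites the other."""
    merged = {server: set(tools) for server, tools in direct.items()}
    for server, tools in from_toolsets.items():
        # setdefault keeps existing direct-server entries and adds new servers.
        merged.setdefault(server, set()).update(tools)
    # Sorting is only for deterministic output in this sketch.
    return {server: sorted(tools) for server, tools in merged.items()}
```

With a plain dict update, a toolset entry for a server would replace the direct-server entry for the same server; taking the set union per server avoids that.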

  • fix(mcp): return 404 when edit_mcp_toolset target does not exist

  • fix(mcp): align mcp_toolsets default to None in LiteLLM_ObjectPermissionTable

  • fix(mcp): admin toolset visibility, in-place tool name mutation, test helper coercion

  • fix(mcp): treat None/[] team mcp_toolsets as no restriction in key validation

  • fix(mcp): allow_all_keys backward compat, blocked_tools API write-path, efficient startup query

  • fix(mcp): use _mcp_active_toolset_id ContextVar to detect toolset scope, avoiding DB-default false-positive

  • fix(mcp): remove dead toolset cache stubs, log invalidation failures, align schema updated_at defaults

  • fix(mcp): deserialise MCPToolset from Redis cache hit, replace fastapi import in test

  • fix(mcp): evict name-cache on toolset mutation, 409 on rename conflict, warning-level list errors

  • fix(redis): regenerate GCP IAM token per connection for async cluster (#24426)

  • fix(redis): regenerate GCP IAM token per connection for async cluster clients

Async RedisCluster was generating the IAM token once at startup and storing it as a static password. After the 1-hour GCP token TTL, any new connection (including to newly-discovered cluster nodes) would fail to authenticate.

Fix: introduce GCPIAMCredentialProvider that implements redis-py's CredentialProvider protocol. It calls _generate_gcp_iam_access_token() on every new connection, matching what the sync redis_connect_func already does. async_redis.RedisCluster accepts a credential_provider kwarg which is invoked per-connection.
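
The per-connection idea can be sketched as follows. This is not the code in litellm/_redis_credential_provider.py: `PerConnectionTokenProvider` and `token_factory` are hypothetical names, the token generator is injected so the example runs without GCP access, and returning a password-only tuple is an assumption. The guarded import only keeps the sketch runnable when redis-py is not installed.

```python
from typing import Callable, Tuple

try:
    from redis.credentials import CredentialProvider
except ImportError:  # stand-in so the sketch runs without redis-py

    class CredentialProvider:  # type: ignore[no-redef]
        def get_credentials(self):
            raise NotImplementedError


class PerConnectionTokenProvider(CredentialProvider):
    """Asks for a fresh token each time a connection requests credentials."""

    def __init__(self, token_factory: Callable[[], str]) -> None:
        # token_factory is hypothetical; in the PR this role is played by
        # _generate_gcp_iam_access_token().
        self._token_factory = token_factory

    def get_credentials(self) -> Tuple[str]:
        # A one-element tuple is treated by redis-py as "password only".
        return (self._token_factory(),)

    async def get_credentials_async(self) -> Tuple[str]:
        # The async client calls this variant; the sketch reuses the sync path.
        return self.get_credentials()
```

An instance would then be passed as the credential_provider kwarg mentioned above instead of a static password, so a connection opened after the 1-hour TTL authenticates with a token generated at that moment.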

  • refactor(redis): move GCPIAMCredentialProvider to its own file

Extract GCPIAMCredentialProvider and _generate_gcp_iam_access_token into litellm/_redis_credential_provider.py. _redis.py imports them from there, keeping the public API unchanged.

  • fix: address Greptile review issues
    • GCPIAMCredentialProvider now inherits from redis.credentials.CredentialProvider so redis-py's async path calls get_credentials_async() properly
    • move _redis_credential_provider import to top of _redis.py (PEP 8)
    • remove dead else-branch that silently no-oped (gcp_service_account from redis_kwargs.get() was always None since it's popped by _get_redis_client_logic)
    • remove mid-function 'from litellm import get_secret_str' inline import
    • remove unused 'call' import from test_redis.py

  • chore: retrigger CI/review

  • chore: sync schema.prisma copies from root

  • chore: sync schema.prisma copies from root

  • fix(proxy_server): use bounded asyncio.Queue with maxsize to prevent unbounded growth

  • fix(a2a/pydantic_ai): make api_base Optional to match base class signature

  • fix(a2a/pydantic_ai): make api_base Optional in handler and guard against None

  • fix(mcp): remove unused get_all_mcp_servers import

  • fix(mcp): remove unused MCPToolset import

  • refactor(mcp): extract toolset permission logic to reduce statement count below PLR0915 limit

  • fix(tests): update reload_servers_from_database tests to mock prisma directly


Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes


Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@ishaan-berri ishaan-berri temporarily deployed to integration-postgres April 4, 2026 22:17 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri temporarily deployed to integration-redis-postgres April 4, 2026 22:17 — with GitHub Actions Inactive
@vercel

vercel bot commented Apr 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Apr 4, 2026 11:16pm |

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@codspeed-hq
Contributor

codspeed-hq bot commented Apr 4, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_ishaan_march23_2 (a5d9236) with main (5187629)


@@ -0,0 +1,524 @@
import React, { useState, useCallback } from "react";
import { Button, Text, Title } from "@tremor/react";
import { Modal, Form, Input, message, Spin, Card, Typography, Space } from "antd";

Check notice

Code scanning / CodeQL

Unused variable, import, function or class Note

Unused imports Card, Space.

Copilot Autofix

AI 13 days ago

In general, to fix unused import issues, remove the unused symbols from the import statement (or remove the entire import if nothing from it is used). This eliminates dead code, reduces bundle size slightly, and removes confusion.

Here, the best fix without changing functionality is to edit the "antd" import on line 3 and simply delete Card and Space from the destructuring list, leaving all other imported components intact. No other code changes or new imports are required, since we’re only removing unused items. All changes occur in ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx at the top import section.

Suggested changeset 1
ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx b/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx
--- a/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx
+++ b/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx
@@ -1,6 +1,6 @@
 import React, { useState, useCallback } from "react";
 import { Button, Text, Title } from "@tremor/react";
-import { Modal, Form, Input, message, Spin, Card, Typography, Space } from "antd";
+import { Modal, Form, Input, message, Spin, Typography } from "antd";
 import { PlusIcon, PencilIcon, TrashIcon } from "@heroicons/react/outline";
 import { ColumnDef } from "@tanstack/react-table";
 import { useMCPToolsets } from "@/app/(dashboard)/hooks/mcpServers/useMCPToolsets";
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
} from "../networking";
import { MCPToolset, MCPToolsetTool } from "./types";

const { Text: AntdText } = Typography;

Check notice

Code scanning / CodeQL

Unused variable, import, function or class Note

Unused variable AntdText.

Copilot Autofix

AI 13 days ago

In general, unused variables should be removed to improve readability and avoid confusion, as they provide no functional benefit and may indicate incomplete refactoring. Here, the best fix is to delete the unused AntdText alias while leaving the Typography import untouched, since other members of Typography might be used elsewhere.

Specifically, in ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx, delete the line that defines AntdText:

const { Text: AntdText } = Typography;

No additional imports, methods, or definitions are needed, and this change will not affect existing functionality since AntdText is not used.

Suggested changeset 1
ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx b/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx
--- a/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx
+++ b/ui/litellm-dashboard/src/components/mcp_tools/MCPToolsetsTab.tsx
@@ -16,8 +16,6 @@
 } from "../networking";
 import { MCPToolset, MCPToolsetTool } from "./types";
 
-const { Text: AntdText } = Typography;
-
 interface MCPToolsetsTabProps {
   accessToken: string | null;
   userRole: string | null;
EOF
return;
}
// Resolve the real server ID (toolsets use toolset: prefix)
const mcpServerId = rawSelected.startsWith("toolset:") ? rawSelected : rawSelected;

Check notice

Code scanning / CodeQL

Unused variable, import, function or class Note

Unused variable mcpServerId.

Copilot Autofix

AI 13 days ago

In general, unused variables should be removed to improve readability and avoid confusion about intent. If the value they compute is actually needed, then the code should be updated to use it; if not, the computation and variable should both be removed.

Here, mcpServerId is assigned as:

const mcpServerId = rawSelected.startsWith("toolset:") ? rawSelected : rawSelected;

which simplifies to just rawSelected and is never read. There is already logic below that uses rawSelected and toolsetForSelected, so nothing depends on mcpServerId. The minimal, behavior-preserving fix is to delete this line entirely and leave the rest of the MCP handling code intact. No additional methods, imports, or definitions are needed.

Concretely, in ui/litellm-dashboard/src/components/playground/chat_ui/ChatUI.tsx, inside the if (endpointType === EndpointType.MCP) block around lines 592–636, delete the line that declares mcpServerId. All other lines remain unchanged.

Suggested changeset 1
ui/litellm-dashboard/src/components/playground/chat_ui/ChatUI.tsx

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/ui/litellm-dashboard/src/components/playground/chat_ui/ChatUI.tsx b/ui/litellm-dashboard/src/components/playground/chat_ui/ChatUI.tsx
--- a/ui/litellm-dashboard/src/components/playground/chat_ui/ChatUI.tsx
+++ b/ui/litellm-dashboard/src/components/playground/chat_ui/ChatUI.tsx
@@ -600,7 +600,6 @@
         return;
       }
       // Resolve the real server ID (toolsets use toolset: prefix)
-      const mcpServerId = rawSelected.startsWith("toolset:") ? rawSelected : rawSelected;
       if (!selectedMCPDirectTool) {
         NotificationsManager.fromBackend("Please select an MCP tool to call");
         return;
EOF
@greptile-apps
Contributor

greptile-apps bot commented Apr 4, 2026

Greptile Summary

This PR introduces MCP Toolsets — named, curated subsets of tools from one or more MCP servers — accessible at /toolset/{name}/mcp and /{name}/mcp. It also fixes GCP IAM token expiry for async Redis cluster clients by introducing a GCPIAMCredentialProvider that regenerates the 1-hour token on every new connection.

Key changes:

  • New LiteLLM_MCPToolsetTable Prisma model with migration; mcp_toolsets column added to LiteLLM_ObjectPermissionTable
  • _mcp_active_toolset_id ContextVar set server-side in route handlers only — client-supplied x-mcp-toolset-id headers are explicitly stripped, preventing forgery
  • _apply_toolset_scope restricts a key's object_permission to the toolset's server/tool pairs; non-admin keys must have the toolset ID in their mcp_toolsets grant list or receive HTTP 403
  • Toolset permission lookups cached in user_api_key_cache (Redis-backed DualCache) and shared across workers; in-memory layer is eagerly evicted after create/update/delete mutations
  • _stream_mcp_asgi_response SSE bridging helper with a bounded asyncio.Queue(maxsize=1024) and a done-callback EOF sentinel
  • Full toolset CRUD endpoints (admin-only writes, filtered reads, 409 on duplicate name, 404 on missing)
  • GCPIAMCredentialProvider inheriting redis.credentials.CredentialProvider with get_credentials_async() for per-connection token refresh
  • Toolset name resolution and union-merged permissions in the Responses API path
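
The bounded queue with a done-callback EOF sentinel can be illustrated with a small, self-contained sketch. It is not the body of _stream_mcp_asgi_response: `stream_body`, `handler`, and `_EOF` are hypothetical names, and the handler here writes chunks straight onto the queue rather than going through ASGI send/receive. Early consumer disconnects are not handled.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable

_EOF = object()  # sentinel marking the end of the body


async def stream_body(
    handler: Callable[[asyncio.Queue], Awaitable[None]],
    maxsize: int = 1024,
) -> AsyncIterator[bytes]:
    """Yield chunks produced by `handler`, ending even if the handler raises."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)  # bounded: producer waits when full
    task = asyncio.create_task(handler(queue))

    def _ensure_eof(done: asyncio.Task) -> None:
        # Done callbacks are synchronous. Scheduling an awaited put avoids
        # put_nowait(), which would raise QueueFull on a full bounded queue.
        done.get_loop().create_task(queue.put(_EOF))

    task.add_done_callback(_ensure_eof)

    while True:
        chunk = await queue.get()
        if chunk is _EOF:
            break
        yield chunk

    # The task is finished once EOF arrives; retrieving the exception here
    # avoids an "exception was never retrieved" warning. A real server would log it.
    if not task.cancelled():
        task.exception()
```

Because the sentinel is enqueued whenever the task exits, for any reason, the consumer loop terminates after a handler error instead of waiting on an empty queue indefinitely.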

Minor issues found:

  • toolset_db.py imports RecordNotFoundError inline inside except blocks rather than at module level — the identical pattern is handled correctly with try/except ImportError at the top of mcp_management_endpoints.py
  • invalidate_toolset_cache silently skips eviction without any log output when the in-memory cache's cache_dict attribute is absent
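
One way the second finding could be addressed is to log when eviction is skipped. The sketch below only assumes, as the finding states, that the in-memory cache may or may not expose a cache_dict attribute; `evict_in_memory_entry` and the logger name are hypothetical and this is not the code in mcp_server_manager.py.

```python
import logging
from typing import Any

logger = logging.getLogger("toolset_cache_sketch")


def evict_in_memory_entry(in_memory_cache: Any, key: str) -> bool:
    """Drop `key` from the in-memory layer; return False if eviction was skipped."""
    cache_dict = getattr(in_memory_cache, "cache_dict", None)
    if cache_dict is None:
        # Surface the skip instead of returning silently.
        logger.warning("in-memory cache has no cache_dict; skipped eviction of %s", key)
        return False
    cache_dict.pop(key, None)  # a missing key is not an error
    return True
```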

Confidence Score: 5/5

Safe to merge — both remaining findings are minor style and observability issues with no correctness or security impact

The previously flagged P1 issues (stale per-worker toolset-permission caches and the _ensure_eof QueueFull hang) have been resolved. The ContextVar-based toolset scope isolation is correct and well-tested. Access-control logic is properly enforced at multiple layers. The two new P2 findings (the inline import style in toolset_db.py and the unlogged eviction skip in invalidate_toolset_cache) do not affect runtime correctness or security.
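
The ContextVar isolation mentioned here can be demonstrated with the standard library alone. In this sketch `active_toolset_id`, `toolset_route`, `plain_route`, and `handler` are hypothetical stand-ins for _mcp_active_toolset_id and the proxy's route handlers; the header name is the one from the PR.

```python
import asyncio
from contextvars import ContextVar
from typing import List, Optional, Tuple

Headers = List[Tuple[bytes, bytes]]

# Hypothetical stand-in for _mcp_active_toolset_id.
active_toolset_id: ContextVar[Optional[str]] = ContextVar(
    "active_toolset_id", default=None
)


def strip_toolset_header(headers: Headers) -> Headers:
    """Drop any client-supplied x-mcp-toolset-id header."""
    return [(k, v) for k, v in headers if k.lower() != b"x-mcp-toolset-id"]


async def handler(headers: Headers) -> Optional[str]:
    headers = strip_toolset_header(headers)
    # The scope is read only from the ContextVar, never from request headers.
    return active_toolset_id.get()


async def toolset_route(toolset_id: str, headers: Headers) -> Optional[str]:
    active_toolset_id.set(toolset_id)  # set server-side by the route
    # create_task copies the current context, so the handler sees this value.
    return await asyncio.create_task(handler(headers))


async def plain_route(headers: Headers) -> Optional[str]:
    return await asyncio.create_task(handler(headers))
```

Each asyncio task runs in its own copy of the context, so a value set while serving one request is not visible to a concurrent request, and a forged header has no effect on either.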

litellm/proxy/_experimental/mcp_server/toolset_db.py (inline RecordNotFoundError imports), litellm/proxy/_experimental/mcp_server/mcp_server_manager.py (invalidate_toolset_cache silent skip)

Important Files Changed

Filename Overview
litellm/_redis_credential_provider.py New GCPIAMCredentialProvider inheriting CredentialProvider; generates a fresh IAM token per connection via get_credentials/get_credentials_async, fixing 1-hour token expiry for async Redis cluster
litellm/_redis.py Refactored to import GCPIAMCredentialProvider from new module; passes credential_provider to async RedisCluster; removes dead else-branch and mid-function inline import
litellm/proxy/_experimental/mcp_server/mcp_context.py New module exposing _mcp_active_toolset_id ContextVar; set server-side only by proxy route handlers, never from client headers, preventing forgery
litellm/proxy/_experimental/mcp_server/toolset_db.py New Prisma CRUD helpers for MCPToolset; inline RecordNotFoundError imports inside except blocks should be module-level per project style
litellm/proxy/_experimental/mcp_server/mcp_server_manager.py Adds resolve_toolset_tool_permissions and get_toolset_by_name_cached (Redis-backed DualCache shared across workers), invalidate_toolset_cache, and ContextVar-based toolset scope detection in get_allowed_mcp_servers
litellm/proxy/_experimental/mcp_server/server.py Adds _apply_toolset_scope (enforce grant list, resolve permissions) and _merge_toolset_permissions; strips client x-mcp-toolset-id headers; reads ContextVar to detect toolset scope
litellm/proxy/management_endpoints/mcp_management_endpoints.py Adds CRUD REST endpoints for toolsets: admin-only writes (409 on duplicate name, 404 on missing), filtered reads for non-admin keys, cache invalidation after mutations
litellm/proxy/proxy_server.py Adds _stream_mcp_asgi_response SSE bridging helper with bounded queue, toolset_mcp_route at /toolset/{name}/mcp, and toolset fallback in dynamic_mcp_route
litellm/responses/mcp/litellm_proxy_mcp_handler.py Resolves toolset names in responses API path, union-merges permissions across multiple toolsets, enforces non-admin access control before applying permissions
litellm/proxy/_types.py Adds mcp_toolsets and blocked_tools to LiteLLM_ObjectPermissionBase and LiteLLM_ObjectPermissionTable; majority of changes are Black formatter reformatting
litellm-proxy-extras/litellm_proxy_extras/migrations/20260321000000_add_mcp_toolsets/migration.sql Creates LiteLLM_MCPToolsetTable with unique toolset_name index; adds mcp_toolsets TEXT[] DEFAULT ARRAY[]::TEXT[] to LiteLLM_ObjectPermissionTable
tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_toolset_scope.py New tests for toolset scope enforcement: _apply_toolset_scope admin bypass, non-admin 403, fetch_mcp_toolsets filtering, and ContextVar isolation with header stripping
tests/test_litellm/test_redis.py Adds GCPIAMCredentialProvider unit tests verifying per-call token regeneration and async cluster credential_provider wiring; formatting cleanup
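
The per-connection token regeneration described for GCPIAMCredentialProvider can be sketched as follows. This is a minimal stand-in, not the code from this PR: it does not import redis-py (the real class is described as inheriting from redis.credentials.CredentialProvider), and the class name `PerConnectionTokenProvider`, the `token_factory` parameter, and the `"default"` username are hypothetical names used only for illustration.

```python
import asyncio
import itertools
from typing import Callable, Tuple


class PerConnectionTokenProvider:
    """Stand-in with the same method shape as a redis-py credential provider.

    Hypothetical class: the real provider would subclass
    redis.credentials.CredentialProvider and call a GCP token generator.
    """

    def __init__(self, username: str, token_factory: Callable[[], str]) -> None:
        self._username = username
        self._token_factory = token_factory

    def get_credentials(self) -> Tuple[str, str]:
        # Called once per new connection, so the token is never reused
        # past its expiry the way a static password would be.
        return (self._username, self._token_factory())

    async def get_credentials_async(self) -> Tuple[str, str]:
        # Async clients call this variant; delegate to the sync path.
        return self.get_credentials()


# Fake token source so the sketch runs without GCP access.
_counter = itertools.count(1)


def fake_token() -> str:
    return f"token-{next(_counter)}"


provider = PerConnectionTokenProvider("default", fake_token)
first = provider.get_credentials()
second = asyncio.run(provider.get_credentials_async())
```

Each call produces a fresh token, which is the property the unit tests in test_redis.py are described as verifying.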

Sequence Diagram

sequenceDiagram
    participant Client
    participant Proxy as proxy_server.py
    participant Stream as _stream_mcp_asgi_response
    participant Handler as handle_streamable_http_mcp
    participant Apply as _apply_toolset_scope
    participant Mgr as MCPServerManager
    participant Cache as DualCache (Redis)
    participant DB as Prisma DB

    Client->>Proxy: POST /toolset/{name}/mcp
    Proxy->>Proxy: get_toolset_by_name_cached(name)
    Proxy->>Proxy: _mcp_active_toolset_id.set(toolset_id)
    Proxy->>Stream: await _stream_mcp_asgi_response(handle_fn, scope, receive)
    Stream->>Handler: asyncio.create_task (copies ContextVar snapshot)
    Handler->>Handler: Strip x-mcp-toolset-id from scope headers
    Handler->>Handler: active_toolset_id = _mcp_active_toolset_id.get()
    Handler->>Apply: _apply_toolset_scope(user_auth, toolset_id)
    Apply->>Apply: Check mcp_toolsets grant list (HTTP 403 if absent)
    Apply->>Mgr: resolve_toolset_tool_permissions([toolset_id])
    Mgr->>Cache: async_get_cache("toolset_perms:id")
    alt cache miss
        Cache-->>Mgr: None
        Mgr->>DB: list_mcp_toolsets(toolset_ids)
        DB-->>Mgr: MCPToolset rows
        Mgr->>Cache: async_set_cache(key, permissions, ttl)
    else cache hit
        Cache-->>Mgr: {server_id: [tool_names]}
    end
    Mgr-->>Apply: {server_id: [tool_names]}
    Apply-->>Handler: UserAPIKeyAuth (mcp_servers + mcp_tool_permissions restricted)
    Handler->>Handler: set_auth_context(restricted_auth)
    Handler-->>Stream: SSE chunks via bridging_send
    Stream-->>Proxy: StreamingResponse
    Proxy-->>Client: Streaming MCP session (toolset scope)
    Proxy->>Proxy: finally: _mcp_active_toolset_id.reset(token)
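
The ContextVar steps in the diagram (set in the route, snapshot copied by asyncio.create_task, reset in finally) can be sketched in isolation. This is a simplified illustration under assumed names: `_active_toolset_id`, `handler`, and `route` are hypothetical stand-ins, not the proxy's actual functions.

```python
import asyncio
from contextvars import ContextVar
from typing import Optional, Tuple

# Hypothetical stand-in for the server-side toolset ContextVar.
_active_toolset_id: ContextVar[Optional[str]] = ContextVar(
    "active_toolset_id", default=None
)


async def handler(started: asyncio.Event, release: asyncio.Event) -> Optional[str]:
    started.set()
    # Wait until the route has already reset its own copy of the variable.
    await release.wait()
    return _active_toolset_id.get()


async def route(toolset_id: str) -> Tuple[Optional[str], Optional[str]]:
    token = _active_toolset_id.set(toolset_id)
    try:
        started, release = asyncio.Event(), asyncio.Event()
        # create_task copies the current context, including the value set above.
        task = asyncio.create_task(handler(started, release))
        await started.wait()
    finally:
        # Resetting here only affects the route's context, not the task's copy.
        _active_toolset_id.reset(token)
    after_reset = _active_toolset_id.get()
    release.set()
    seen_by_task = await task
    return seen_by_task, after_reset


seen_by_task, after_reset = asyncio.run(route("toolset-123"))
```

The handler task still reads `"toolset-123"` after the route has reset the variable, because the task runs in its own copy of the context taken at creation time.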

Reviews (4): Last reviewed commit: "fix(tests): add get_mcp_server_by_name t..."

Comment on lines +13863 to +13865
def _ensure_eof(task: asyncio.Task) -> None:
    if task.cancelled() or task.exception() is not None:
        body_queue.put_nowait(None)

P2 _ensure_eof may silently drop EOF sentinel when queue is full

If the bounded body_queue (maxsize=1024) happens to be full at the exact moment the handler task fails or is cancelled, body_queue.put_nowait(None) raises asyncio.QueueFull. asyncio does not propagate exceptions raised inside done callbacks (they are only reported to the event loop's exception handler), so the EOF sentinel is never placed and body_iter() will hang indefinitely waiting for it.

A simple guard prevents this:

Suggested change
- def _ensure_eof(task: asyncio.Task) -> None:
-     if task.cancelled() or task.exception() is not None:
-         body_queue.put_nowait(None)
+ def _ensure_eof(task: asyncio.Task) -> None:
+     if task.cancelled() or task.exception() is not None:
+         try:
+             body_queue.put_nowait(None)
+         except asyncio.QueueFull:
+             pass  # body_iter's finally block will cancel the task anyway
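
The failure mode can be reproduced in a small standalone script. This is a sketch under assumptions, not the proxy's bridging code: `run_demo`, its tiny `maxsize`, and the `dropped` flag are hypothetical and exist only to show that `put_nowait` on a full bounded queue raises `asyncio.QueueFull` inside the done callback, and that the guard keeps the callback from raising.

```python
import asyncio
from typing import List, Tuple


async def run_demo(maxsize: int = 2) -> Tuple[bool, int]:
    body_queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
    dropped: List[bool] = []

    async def handler() -> None:
        # Fill the queue completely, then fail, as a handler might mid-stream.
        for _ in range(maxsize):
            await body_queue.put(b"chunk")
        raise RuntimeError("handler failed")

    def _ensure_eof(task: asyncio.Task) -> None:
        if task.cancelled() or task.exception() is not None:
            try:
                body_queue.put_nowait(None)
            except asyncio.QueueFull:
                # Without this guard the exception would escape the callback.
                dropped.append(True)

    task = asyncio.create_task(handler())
    task.add_done_callback(_ensure_eof)
    await asyncio.wait([task])
    await asyncio.sleep(0)  # give any pending callbacks a chance to run
    return bool(dropped), body_queue.qsize()


sentinel_dropped, queue_size = asyncio.run(run_demo())
```

The sketch only shows that the callback no longer raises when the queue is full. Whether the consumer still terminates in that case depends on the rest of the bridging code, which is not shown here.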

Comment on lines 119 to 171

        return mcp_tools_with_litellm_proxy, other_tools

    @staticmethod
    async def _apply_toolset_permissions(
        resolved_toolset_ids: List[str],
        resolved_mcp_servers: List[str],
        user_api_key_auth: Any,
    ) -> Any:
        """Apply resolved toolset permissions to user_api_key_auth and return updated auth."""
        from litellm.proxy._types import LiteLLM_ObjectPermissionTable

        try:
            from litellm.proxy._experimental.mcp_server.mcp_server_manager import (
                global_mcp_server_manager,
            )

            tool_permissions = (
                await global_mcp_server_manager.resolve_toolset_tool_permissions(
                    toolset_ids=resolved_toolset_ids
                )
            )
            all_server_ids = list(
                set(tool_permissions.keys()) | set(resolved_mcp_servers)
            )
            existing_op = user_api_key_auth.object_permission
            if existing_op is not None:
                merged_tool_perms = dict(existing_op.mcp_tool_permissions or {})
                for server_id, tool_names in tool_permissions.items():
                    existing_tools = merged_tool_perms.get(server_id, [])
                    merged_tool_perms[server_id] = list(
                        set(existing_tools) | set(tool_names)
                    )
                updated_op = existing_op.model_copy(
                    update={
                        "mcp_servers": all_server_ids,
                        "mcp_tool_permissions": merged_tool_perms,
                        "mcp_toolsets": [],
                    }
                )
            else:
                updated_op = LiteLLM_ObjectPermissionTable(
                    object_permission_id="toolset-scope",
                    mcp_servers=all_server_ids,
                    mcp_tool_permissions=tool_permissions,
                )
            return user_api_key_auth.model_copy(update={"object_permission": updated_op})
        except Exception as _e:
            verbose_logger.debug(f"Could not apply toolset permissions: {_e}")
            return user_api_key_auth

    @staticmethod
    async def _get_mcp_tools_from_manager(

P2 Silent fallback on toolset permission resolution failure broadens effective permissions

If resolve_toolset_tool_permissions raises (e.g. DB error), the broad except Exception in _apply_toolset_permissions returns the original user_api_key_auth unchanged. The caller has already passed the access-grant check (the key is permitted to use the toolset), but without the resolved mcp_tool_permissions, the key's object_permission.mcp_tool_permissions remains empty — meaning the MCP server grants access to all tools on the servers it has access to rather than just the toolset-defined subset.

Consider raising here (or returning a specific error response) so the failure is visible rather than silently expanding permissions:

except Exception as _e:
    verbose_logger.warning(f"Could not apply toolset permissions: {_e}")
    raise  # surface the failure rather than silently broadening access
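
A fail-closed variant of the same idea can be sketched without any LiteLLM imports. Everything here is a simplified assumption: the `Auth` dataclass, `ToolsetScopeError`, and the injected `resolve` callable are hypothetical stand-ins for the real auth object and manager call, chosen so the sketch runs on its own.

```python
import asyncio
from dataclasses import dataclass, field, replace
from typing import Awaitable, Callable, Dict, List

Resolver = Callable[[List[str]], Awaitable[Dict[str, List[str]]]]


@dataclass(frozen=True)
class Auth:
    """Hypothetical, trimmed-down stand-in for the key's permission object."""

    mcp_servers: List[str] = field(default_factory=list)
    mcp_tool_permissions: Dict[str, List[str]] = field(default_factory=dict)


class ToolsetScopeError(RuntimeError):
    """Raised when toolset permissions cannot be resolved."""


async def apply_toolset_permissions_fail_closed(
    auth: Auth, toolset_ids: List[str], resolve: Resolver
) -> Auth:
    try:
        tool_permissions = await resolve(toolset_ids)
    except Exception as exc:
        # Fail closed: never hand back the unrestricted auth object.
        raise ToolsetScopeError("could not resolve toolset permissions") from exc

    # Union merge per server, mirroring the merge in the reviewed code.
    merged = dict(auth.mcp_tool_permissions)
    for server_id, tool_names in tool_permissions.items():
        merged[server_id] = sorted(set(merged.get(server_id, [])) | set(tool_names))
    servers = sorted(set(auth.mcp_servers) | set(tool_permissions))
    return replace(auth, mcp_servers=servers, mcp_tool_permissions=merged)


async def ok_resolver(toolset_ids: List[str]) -> Dict[str, List[str]]:
    return {"server-a": ["search"]}


async def failing_resolver(toolset_ids: List[str]) -> Dict[str, List[str]]:
    raise ConnectionError("db unavailable")


base = Auth(mcp_servers=["server-a"], mcp_tool_permissions={"server-a": ["fetch"]})
scoped = asyncio.run(apply_toolset_permissions_fail_closed(base, ["ts-1"], ok_resolver))
```

With `failing_resolver`, the call raises `ToolsetScopeError` instead of returning `base`, so a resolution failure surfaces to the caller rather than leaving the request with broader access than the toolset defines.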

@ishaan-berri ishaan-berri temporarily deployed to integration-redis-postgres April 4, 2026 22:52 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri temporarily deployed to integration-postgres April 4, 2026 22:52 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri enabled auto-merge (squash) April 4, 2026 23:09
@yuneng-berri yuneng-berri self-requested a review April 4, 2026 23:13
@ishaan-berri ishaan-berri temporarily deployed to integration-postgres April 4, 2026 23:14 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri temporarily deployed to integration-redis-postgres April 4, 2026 23:14 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri disabled auto-merge April 4, 2026 23:23
@ishaan-berri ishaan-berri merged commit 693ad49 into main Apr 4, 2026
97 of 116 checks passed
@ishaan-berri ishaan-berri deleted the litellm_ishaan_march23_2 branch April 4, 2026 23:23
fede-kamel pushed a commit to fede-kamel/litellm that referenced this pull request Apr 5, 2026
… (BerriAI#25155)

* Litellm ishaan march23 - MCP Toolsets + GCP Caching fix  (BerriAI#25146)

* feat(mcp): MCP Toolsets — curated tool subsets from one or more MCP servers (BerriAI#24335)

* feat(mcp): add LiteLLM_MCPToolsetTable and mcp_toolsets to ObjectPermissionTable

* feat(mcp): add prisma migration for MCPToolset table

* feat(mcp): add MCPToolset Python types

* feat(mcp): add toolset_db.py with CRUD helpers for MCPToolset

* feat(mcp): add toolset CRUD endpoints to mcp_management_endpoints

* fix(mcp): skip allow_all_keys servers when explicit mcp_servers permission is set (toolset scope fix)

* feat(mcp): add _apply_toolset_scope and toolset route handling in server.py

* fix(mcp): resolve toolset names in responses API before fetching tools

* feat(mcp): add mcp_toolsets field to LiteLLM_ObjectPermissionTable type

* feat(mcp): register LiteLLM_MCPToolsetTable in prisma client initialization

* feat(mcp): validate mcp_toolsets in key-vs-team permission check

* feat(mcp): register toolset routes in proxy_server.py

* feat(mcp): add MCPToolset and MCPToolsetTool TypeScript types

* feat(mcp): add fetchMCPToolsets, createMCPToolset, updateMCPToolset, deleteMCPToolset API functions

* feat(mcp): add useMCPToolsets React Query hook

* feat(mcp): add toolsets (purple) as third option type in MCPServerSelector

* feat(mcp): extract toolsets from combined MCP field in key form

* feat(mcp): extract toolsets from combined MCP field in team form

* feat(mcp): show toolsets section in MCPServerPermissions read view

* feat(mcp): pass mcp_toolsets through object_permissions_view

* feat(mcp): add MCPToolsetsTab component for creating and managing toolsets

* feat(mcp): add Toolsets tab to mcp_servers.tsx

* feat(mcp): pass mcpToolsets to playground chat and responses API calls

* feat(mcp): generate correct server_url for toolsets in playground API calls

* docs(mcp): add MCP Toolsets documentation

* docs(mcp): add mcp_toolsets to sidebar

* fix(mcp): replace x-mcp-toolset-id header with ContextVar to prevent client forgery

* fix(mcp): use ContextVar + StreamingResponse for toolset MCP routes (fixes SSE streaming)

* fix(mcp): cache toolset permission lookups to avoid per-request DB calls

* test(mcp): add tests for toolset scope enforcement, ContextVar isolation, and access control

* fix(mcp): cache toolset name lookups in MCPServerManager to avoid per-request DB calls

* fix(mcp): prevent body_iter deadlock + use cached toolset lookup in responses API

- _stream_mcp_asgi_response: add done callback to handler_task that puts
  the EOF sentinel on body_queue when the task exits, preventing body_iter
  from hanging forever if the handler raises after headers are sent.
- litellm_proxy_mcp_handler: replace raw get_mcp_toolset_by_name() DB call
  with global_mcp_server_manager.get_toolset_by_name_cached() so toolset
  resolution uses the 60s TTL cache added for this purpose instead of
  hitting the DB on every responses-API request.

* fix(mcp): toolset access control, asyncio fix, and real unit tests

- server.py: _apply_toolset_scope now enforces that non-admin keys must
  have the requested toolset_id in their mcp_toolsets grant list;
  admin keys always bypass the check.
- mcp_management_endpoints.py: three access-control fixes:
  * fetch_mcp_toolsets: non-admin keys with mcp_toolsets=None now
    return [] instead of all toolsets (only admins get 'all' when
    the field is absent)
  * fetch_mcp_toolset: non-admin keys that haven't been granted the
    requested toolset_id now get 403 instead of the full result
  * add_mcp_toolset: duplicate toolset_name now returns 409 Conflict
    instead of an opaque 500
- proxy_server.py: use asyncio.get_running_loop() instead of
  get_event_loop() inside an already-running coroutine (Python 3.10+).
- test_mcp_toolset_scope.py: replace four hollow tests that only
  asserted local variable properties with real tests that call the
  production fetch_mcp_toolsets() and handle_streamable_http_mcp()
  functions with mocked dependencies.

* fix(mcp): add mcp_toolsets to ObjectPermissionBase, fix multi-toolset overwrite, fix delete 404, allow standalone key toolsets

* fix(mcp): add auth check on toolset resolution in responses API; union mcp_servers in _merge_toolset_permissions

* fix(mcp): handle RecordNotFoundError in update_mcp_toolset; union direct servers with toolset servers

* fix(mcp): use _user_has_admin_view; deny None mcp_toolsets for non-admin; use direct RecordNotFoundError import; fix docstring

* fix(mcp): add @default(now()) to MCPToolsetTable.updated_at; fix test for non-admin toolset access

* fix: use UniqueViolationError import; guard _ensure_eof for error/cancel only

* fix(mcp): preserve mcp_access_groups in toolset scope, use shared Redis cache for toolset perms

- Remove mcp_access_groups=[] from _apply_toolset_scope (server.py) and the
  responses API toolset path (litellm_proxy_mcp_handler.py). A key's access-group
  grants remain valid even when the request is scoped to a single toolset; clearing
  them silently revoked legitimate entitlements.

- Switch resolve_toolset_tool_permissions and get_toolset_by_name_cached to use
  user_api_key_cache (Redis-backed DualCache in production) instead of per-instance
  in-memory dicts. Cache entries are now shared across workers, eliminating the
  per-worker stale-toolset-permission window flagged as a P1 by Greptile.

- Use union merge (set union of tool names per server) when applying toolset
  permissions in the responses API path so direct-server tool restrictions are not
  overwritten by toolset permissions.

* fix(mcp): return 404 when edit_mcp_toolset target does not exist

* fix(mcp): align mcp_toolsets default to None in LiteLLM_ObjectPermissionTable

* fix(mcp): admin toolset visibility, in-place tool name mutation, test helper coercion

* fix(mcp): treat None/[] team mcp_toolsets as no restriction in key validation

* fix(mcp): allow_all_keys backward compat, blocked_tools API write-path, efficient startup query

* fix(mcp): use _mcp_active_toolset_id ContextVar to detect toolset scope, avoiding DB-default false-positive

* fix(mcp): remove dead toolset cache stubs, log invalidation failures, align schema updated_at defaults

* fix(mcp): deserialise MCPToolset from Redis cache hit, replace fastapi import in test

* fix(mcp): evict name-cache on toolset mutation, 409 on rename conflict, warning-level list errors

* fix(redis): regenerate GCP IAM token per connection for async cluster (BerriAI#24426)

* fix(redis): regenerate GCP IAM token per connection for async cluster clients

Async RedisCluster was generating the IAM token once at startup and
storing it as a static password. After the 1-hour GCP token TTL, any
new connection (including to newly-discovered cluster nodes) would fail
to authenticate.

Fix: introduce GCPIAMCredentialProvider that implements redis-py's
CredentialProvider protocol. It calls _generate_gcp_iam_access_token()
on every new connection, matching what the sync redis_connect_func
already does. async_redis.RedisCluster accepts a credential_provider
kwarg which is invoked per-connection.

* refactor(redis): move GCPIAMCredentialProvider to its own file

Extract GCPIAMCredentialProvider and _generate_gcp_iam_access_token
into litellm/_redis_credential_provider.py. _redis.py imports them
from there, keeping the public API unchanged.

* fix: address Greptile review issues

- GCPIAMCredentialProvider now inherits from redis.credentials.CredentialProvider
  so redis-py's async path calls get_credentials_async() properly
- move _redis_credential_provider import to top of _redis.py (PEP 8)
- remove dead else-branch that silently no-oped (gcp_service_account from
  redis_kwargs.get() was always None since it's popped by _get_redis_client_logic)
- remove mid-function 'from litellm import get_secret_str' inline import
- remove unused 'call' import from test_redis.py

* chore: retrigger CI/review

* chore: sync schema.prisma copies from root

* chore: sync schema.prisma copies from root

* fix(proxy_server): use bounded asyncio.Queue with maxsize to prevent unbounded growth

* fix(a2a/pydantic_ai): make api_base Optional to match base class signature

* fix(a2a/pydantic_ai): make api_base Optional in handler and guard against None

* fix(mcp): remove unused get_all_mcp_servers import

* fix(mcp): remove unused MCPToolset import

* refactor(mcp): extract toolset permission logic to reduce statement count below PLR0915 limit

* fix(tests): update reload_servers_from_database tests to mock prisma directly

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix(toolset_db): lazy-import prisma to avoid ImportError when prisma not installed

* fix(tests): update UI tests for toolset tab and updated empty state text

* fix(tests): add get_mcp_server_by_name to fake_manager stub

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
harish876 pushed a commit to harish876/litellm that referenced this pull request Apr 8, 2026
… (BerriAI#25155)
