Skip to content

fix(mcp): is_tool_name_prefixed validates against known server prefixes#25085

Merged
krrish-berri-2 merged 1 commit intoBerriAI:litellm_oss_staging_04_04_2026from
voidborne-d:fix/is-tool-name-prefixed-false-positive
Apr 5, 2026
Merged

fix(mcp): is_tool_name_prefixed validates against known server prefixes#25085
krrish-berri-2 merged 1 commit intoBerriAI:litellm_oss_staging_04_04_2026from
voidborne-d:fix/is-tool-name-prefixed-false-positive

Conversation

@voidborne-d
Copy link
Copy Markdown
Contributor

Summary

Fixes #25081.

is_tool_name_prefixed() checked for the presence of MCP_TOOL_PREFIX_SEPARATOR (default "-") anywhere in the tool name. Any non-MCP tool whose name contains a hyphen (e.g. text-to-speech, code-review) was silently misclassified as an MCP-prefixed tool.

When mcp_semantic_tool_filter is enabled, these tools get routed through the semantic filter and potentially dropped — a silent data loss bug.

Root Cause

def is_tool_name_prefixed(tool_name: str) -> bool:
    return MCP_TOOL_PREFIX_SEPARATOR in tool_name  # "-" in "text-to-speech" → True!

Fix

Added an optional known_server_prefixes parameter. When supplied, the function extracts the candidate prefix (text before the first separator) and checks it against the normalised set of registered server prefixes. Only a genuine match returns True.

is_tool_name_prefixed("text-to-speech", known_server_prefixes={"myserver"})  # → False ✅
is_tool_name_prefixed("myserver-get_weather", known_server_prefixes={"myserver"})  # → True ✅

Without the set, legacy behaviour is preserved for backward compatibility.

Updated _get_mcp_server_from_tool_name() to build the prefix set from the live registry and pass it through.

Changes

  • litellm/proxy/_experimental/mcp_server/utils.pyis_tool_name_prefixed() now accepts known_server_prefixes: Optional[set]
  • litellm/proxy/_experimental/mcp_server/mcp_server_manager.py_get_mcp_server_from_tool_name() builds the prefix set from the registry
  • 9 new tests covering both legacy and new behaviour

Testing

tests/test_litellm/proxy/_experimental/mcp_server/test_is_tool_name_prefixed.py — 9 passed

Fixes BerriAI#25081.

is_tool_name_prefixed() checked for the presence of MCP_TOOL_PREFIX_SEPARATOR
(default '-') anywhere in the tool name.  Any non-MCP tool whose name
contains a hyphen (e.g. 'text-to-speech', 'code-review') was silently
misclassified as an MCP-prefixed tool.  When the semantic tool filter is
enabled, these tools would be routed through semantic matching and
potentially dropped.

Fix: accept an optional known_server_prefixes set.  When supplied, the
function extracts the candidate prefix (text before the first separator)
and checks it against the normalised set of registered server prefixes.
Only a genuine match returns True.  Without the set, legacy behaviour is
preserved for backward compatibility.

Updated _get_mcp_server_from_tool_name() to build the prefix set from
the live registry and pass it through.

9 new tests.
@vercel
Copy link
Copy Markdown

vercel bot commented Apr 3, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
litellm Ready Ready Preview, Comment Apr 3, 2026 3:12pm

Request Review

@codspeed-hq
Copy link
Copy Markdown
Contributor

codspeed-hq bot commented Apr 3, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing voidborne-d:fix/is-tool-name-prefixed-false-positive (8ed3ceb) with main (d4a3a5e)

Open in CodSpeed

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 3, 2026

Greptile Summary

This PR fixes a silent false-positive bug in is_tool_name_prefixed() where any tool name containing a hyphen (e.g. text-to-speech, code-review) was incorrectly classified as an MCP-prefixed tool, causing non-MCP tools to be routed through the semantic filter and potentially dropped.

Key changes:

  • is_tool_name_prefixed() in utils.py now accepts an optional known_server_prefixes set; when provided it extracts the candidate prefix (text before the first separator) and checks it against the set of normalised registered server prefixes, returning False for non-MCP hyphenated tool names.
  • _get_mcp_server_from_tool_name() in mcp_server_manager.py builds the prefix set from the live registry (alias → server_name → server_id) before calling the updated function.
  • Legacy behaviour (heuristic separator check) is preserved when known_server_prefixes is not supplied, maintaining backward compatibility for any call sites that cannot easily access the registry.
  • 9 new unit tests cover both the legacy and new paths, confirming the fix for the reported issue.

Confidence Score: 5/5

Safe to merge — focused, backward-compatible bug fix with full test coverage and no regressions.

All remaining findings are P2 style suggestions (bare set annotation, double get_server_prefix call per server in comprehension). No correctness, data-integrity, or security issues were found.

No files require special attention.

Important Files Changed

Filename Overview
litellm/proxy/_experimental/mcp_server/utils.py Adds optional known_server_prefixes param to is_tool_name_prefixed; logic is correct and backward-compatible; minor: bare set type annotation instead of Set[str].
litellm/proxy/_experimental/mcp_server/mcp_server_manager.py Builds registry-derived prefix set before calling is_tool_name_prefixed, correctly scoping the check to known servers; minor: get_server_prefix invoked twice per server in the set comprehension.
tests/test_litellm/proxy/_experimental/mcp_server/test_is_tool_name_prefixed.py 9 new unit tests covering legacy heuristic, known-prefix matching, empty-prefix-set, normalisation, and non-MCP hyphenated names; no real network calls; covers all key branches.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["is_tool_name_prefixed(tool_name, known_server_prefixes)"] --> B{MCP_TOOL_PREFIX_SEPARATOR\nin tool_name?}
    B -- No --> C["return False"]
    B -- Yes --> D{known_server_prefixes\nprovided?}
    D -- No --> E["Legacy: return True\n(heuristic — any hyphen)"]
    D -- Yes --> F["candidate_prefix = tool_name.split(sep, 1)[0]"]
    F --> G["normalize_server_name(candidate_prefix)\nin known_server_prefixes?"]
    G -- Yes --> H["return True ✅\n(genuine MCP prefix)"]
    G -- No --> I["return False ✅\n(e.g. 'text-to-speech')"]
Loading

Reviews (1): Last reviewed commit: "fix(mcp): is_tool_name_prefixed validate..." | Re-trigger Greptile

def is_tool_name_prefixed(tool_name: str) -> bool:
def is_tool_name_prefixed(
tool_name: str,
known_server_prefixes: Optional[set] = None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Unparameterized set type annotation

known_server_prefixes: Optional[set] uses a bare, unparameterized set. The codebase uses Optional[Set[str]] elsewhere (e.g. with from typing import Set). Bare set loses the element-type information for mypy and IDEs.

Suggested change
known_server_prefixes: Optional[set] = None,
known_server_prefixes: Optional[Set[str]] = None,

You'll also need to add Set to the existing from typing import line.

Comment on lines +2392 to +2396
known_prefixes = {
normalize_server_name(get_server_prefix(s))
for s in self.get_registry().values()
if get_server_prefix(s)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 get_server_prefix called twice per server in set comprehension

get_server_prefix(s) is invoked once in the if guard and once to produce the value, so it's called twice for every server in the registry. For a small registry this is negligible, but it's cleaner to call it once:

Suggested change
known_prefixes = {
normalize_server_name(get_server_prefix(s))
for s in self.get_registry().values()
if get_server_prefix(s)
}
known_prefixes = {
normalized
for s in self.get_registry().values()
for prefix in (get_server_prefix(s),)
if prefix
for normalized in (normalize_server_name(prefix),)
}

Alternatively a simple loop:

known_prefixes: Set[str] = set()
for s in self.get_registry().values():
    prefix = get_server_prefix(s)
    if prefix:
        known_prefixes.add(normalize_server_name(prefix))

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on the double call. However, get_server_prefix is a pure O(1) string split — no I/O, no allocation beyond the prefix string itself. The set comprehension runs once at call time (not in a hot loop), and the number of MCP servers is typically single-digit. The readability win of the guard + value pattern outweighs caching here IMO, but happy to add a walrus operator (if (p := get_server_prefix(s)) is not None) if the team prefers.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@krrish-berri-2 krrish-berri-2 changed the base branch from main to litellm_oss_staging_04_04_2026 April 5, 2026 01:23
@krrish-berri-2 krrish-berri-2 merged commit cf94f4d into BerriAI:litellm_oss_staging_04_04_2026 Apr 5, 2026
59 of 62 checks passed
@klhq
Copy link
Copy Markdown

klhq commented Apr 13, 2026

The merge commit cf94f4d8 landed in litellm_oss_staging_04_04_2026 on 2026-04-05, but that staging branch ended up diverging from main (234 behind, 7 ahead) and the fix never reached main. Current utils.py still has the original 1-line is_tool_name_prefixed() without known_server_prefixes.

Happy to open a replacement PR that reapplies this change (with Co-authored-by: @voidborne-d) — or let me know if a cherry-pick to the next staging branch is already planned.

cc @krrish-berri-2 (original merger), @voidborne-d (original author)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: is_tool_name_prefixed can misclassify non-MCP tools with hyphens

3 participants