Skip to content

fix(mcp): preserve non-MCP tools in semantic filter#24986

Open
klhq wants to merge 9 commits intoBerriAI:mainfrom
klhq:fix/semantic-filter-zero-match-fallback
Open

fix(mcp): preserve non-MCP tools in semantic filter#24986
klhq wants to merge 9 commits intoBerriAI:mainfrom
klhq:fix/semantic-filter-zero-match-fallback

Conversation

@klhq
Copy link
Copy Markdown

@klhq klhq commented Apr 2, 2026

Relevant issues

Fixes #24984

Builds on #25085 (landed in main via #25192), which added the
optional known_server_prefixes parameter to is_tool_name_prefixed().

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

Type

🐛 Bug Fix

Changes

The semantic filter hook treated all tools equally — it didn't distinguish between MCP tools (which it should filter) and non-MCP built-in tools (which should pass through untouched). This caused non-MCP tools to be dropped on match, and all tools to be returned on zero match (exceeding provider tool limits).

Hook behavior

  • Move MCP/non-MCP separation into the hook (hook.py) via a single-pass partition — only MCP tools go through the semantic router; non-MCP tools pass through untouched
  • filter_tools() returns an empty list on zero matches instead of all tools; the hook adds non-MCP tools back afterwards
  • Top-k fallback when the request has only MCP tools and zero semantic matches, to avoid an empty tool list

Correct MCP classification

The hook's MCP/non-MCP split needs a reliable classifier. The legacy is_tool_name_prefixed() did a naive "-" in tool_name check, which misclassifies hyphenated non-MCP tools (e.g. text-to-speech, code-review) as MCP. That bug was fixed in utils.py by #25085 (merged to main via #25192 on 2026-04-14), which added an optional known_server_prefixes parameter for a strict check against the live registry.

  • Pass known_server_prefixes built from global_mcp_server_manager.get_registry() into the hook's is_tool_name_prefixed() calls, so hyphenated non-MCP tools are no longer misclassified

Tests

  • Add an autouse fixture in test_semantic_tool_filter.py that registers a fake "server" MCP prefix, so existing hook tests using server-tool_{i} names continue to classify as MCP under the strict check
  • Add test_hook_passes_through_hyphenated_non_mcp_tools — regression test proving text-to-speech and code-review survive the filter end-to-end
  • Update test_semantic_tool_filter_e2e.py: rename tool names to hyphen-prefixed MCP form and monkeypatch 10 fake servers into the registry so the test still exercises the filter with the new strict classifier

When the semantic tool filter finds no matches for a user query, it
previously returned all available tools. With many MCP servers this
can exceed provider tool limits. Now it drops MCP tools (prefixed)
and returns only non-MCP tools on zero matches.
@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
litellm Ready Ready Preview, Comment Apr 15, 2026 9:08am

Request Review

@klhq
Copy link
Copy Markdown
Author

klhq commented Apr 2, 2026

@greptileai

@codspeed-hq
Copy link
Copy Markdown
Contributor

codspeed-hq Bot commented Apr 2, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing klhq:fix/semantic-filter-zero-match-fallback (32c3204) with main (72a461b)

Open in CodSpeed

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 2, 2026

Greptile Summary

This PR fixes a bug where the MCP semantic filter incorrectly dropped non-MCP tools (e.g. text-to-speech, code-review) because the legacy is_tool_name_prefixed() used a naive hyphen-presence check. The fix partitions tools into MCP and non-MCP at the hook level using the live server registry for accurate prefix matching, sends only MCP tools through the semantic router, and adds a top_k fallback when the filter produces zero matches with no non-MCP tools available.

Confidence Score: 5/5

Safe to merge — logic is correct, well-tested, and no P0/P1 issues found.

All prior review concerns have been addressed. The only remaining finding is a P2 style suggestion (walrus operator for a double function call). The contract change to filter_tools is intentional and safe since the only caller is the updated hook. Tests cover the regression, the top-k fallback, the all-non-MCP pass-through, and the hyphenated non-MCP tool preservation end-to-end.

No files require special attention.

Important Files Changed

Filename Overview
litellm/proxy/hooks/mcp_semantic_filter/hook.py Core fix: partitions tools into MCP/non-MCP using registry-aware is_tool_name_prefixed, filters only MCP tools semantically, and adds top-k fallback for all-MCP zero-match case. Logic is correct; minor style issue with double get_server_prefix call in the set comprehension.
litellm/proxy/_experimental/mcp_server/semantic_tool_filter.py One-line contract change: filter_tools now returns [] on zero semantic matches instead of available_tools. This intentional change is safe because the only caller (hook.py) is updated to handle the new behaviour and adds non-MCP tools back explicitly.
tests/test_litellm/proxy/_experimental/mcp_server/test_semantic_tool_filter.py Good additions: autouse fixture registers a fake "server" prefix so existing tool names continue to classify as MCP under the strict check; new tests cover non-MCP pass-through, hyphenated non-MCP regression, all-MCP zero-match fallback, and the new empty-list return from filter_tools.
tests/mcp_tests/test_semantic_tool_filter_e2e.py Updated tool names to hyphen-prefixed MCP form and monkeypatched 10 fake servers into the registry; formatting-only changes otherwise. Test still skips without OPENAI_API_KEY, so no new real-network-call concern for unit CI.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[async_pre_call_hook] --> B{tools in request?}
    B -- No --> Z[return None]
    B -- Yes --> C{MCP references?}
    C -- Yes --> D[_expand_mcp_tools]
    D --> E
    C -- No --> E[build known_prefixes\nfrom registry]
    E --> F[partition tools:\nmcp_tools / non_mcp_tools]
    F --> G{mcp_tools empty?}
    G -- Yes --> Z
    G -- No --> H[filter.filter_tools\nmcp_tools only]
    H --> I{filtered_mcp_tools\nempty?}
    I -- No --> K[filtered = filtered_mcp + non_mcp]
    I -- Yes --> J{non_mcp_tools\nempty?}
    J -- No --> K
    J -- Yes --> L[fallback: mcp_tools\nslice top_k + warn]
    L --> K
    K --> M[data tools = filtered\nreturn data]
Loading

Reviews (6): Last reviewed commit: "test(mcp): mock MCP registry in semantic..." | Re-trigger Greptile

Comment thread litellm/proxy/_experimental/mcp_server/semantic_tool_filter.py Outdated
Comment thread litellm/proxy/_experimental/mcp_server/semantic_tool_filter.py Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

The semantic filter treated all tools equally — it didn't distinguish
between MCP tools (which it should filter) and non-MCP built-in tools
(which should pass through untouched). This caused two failures:

- On match: non-MCP tools were dropped, losing built-in functionality
- On zero match: all tools returned unfiltered, exceeding provider limits

Move MCP/non-MCP separation into the hook. The hook now passes only
MCP tools (identified by server-name prefix) to filter_tools() and
recombines with non-MCP tools after. filter_tools() returns empty
list on zero matches instead of all tools.
@klhq klhq changed the title fix(mcp): return only non-MCP tools on zero semantic matches fix(mcp): preserve non-MCP tools in semantic filter Apr 2, 2026
@klhq
Copy link
Copy Markdown
Author

klhq commented Apr 2, 2026

@greptileai

…CP tools

When all tools are MCP and zero semantic matches are found, return the
first top_k MCP tools instead of an empty list. Avoids sending an empty
tool list to the LLM.
@ShashankFC
Copy link
Copy Markdown

Agentic Code Review

🚨 Needs Changes — 🚨 2 critical


litellm/proxy/_experimental/mcp_server/semantic_tool_filter.py

  • 🚨 [CRITICAL] Lines 194–196: return [] on zero semantic matches drops ALL tools including non-MCP tools, contradicting the PR's stated goal of "preserve non-MCP tools in semantic filter". The hook (hook.py) sets data["tools"] = filtered_tools without any step to re-add non-MCP tools after filter_tools() returns, so a zero-match scenario silently removes every built-in tool from the LLM request.

    confidence=7 file_read of hook.py lines 226-232 confirms filtered_tools = await self.filter.filter_tools(...) followed immediately by data["tools"] = filtered_tools with no non-MCP recombination. The PR description states the hook 'always recombines with non-MCP tools after', but no such code exists in hook.py.

tests/test_litellm/proxy/_experimental/mcp_server/test_semantic_tool_filter.py

  • 🚨 [CRITICAL] Lines 308–342: The tool names tool_{i} contain no - separator, so is_tool_name_prefixed() returns False for all of them. With the new hook logic that only filters MCP-prefixed tools, all 10 tools are treated as non-MCP and the hook returns None instead of filtered data — causing assert result is not None on line 340 to fail. The PR description shows the intent was to rename these to server-tool_{i}, but the committed file still has the old name.

    confidence=7 file_read of line 308 shows MCPTool(name=f"tool_{i}", ...) (no prefix). is_tool_name_prefixed checks for MCP_TOOL_PREFIX_SEPARATOR ('-') per utils.py line 113. hook.py separates MCP vs non-MCP tools before calling filter_tools(); with 0 MCP tools, hook skips and returns None.


Generated by Agentic Reviewer · model: us.anthropic.claude-sonnet-4-6

@ShashankFC
Copy link
Copy Markdown

Agentic Code Review

🚨 Needs Changes — 🚨 2 critical ⚠️ 1 major


litellm/proxy/_experimental/mcp_server/semantic_tool_filter.py

  • ⚠️ [MAJOR] Lines 194–199: The zero-match path uses is_tool_name_prefixed() to distinguish non-MCP tools, but that function returns True for ANY tool name containing the separator character (default "-"). A non-MCP built-in tool named web-search, code-interpreter, or any hyphenated name will be treated as an MCP tool and silently dropped when there are zero semantic matches, even though it should always pass through.

    confidence=7 file_read of utils.py line 13 shows MCP_TOOL_PREFIX_SEPARATOR = os.environ.get("MCP_TOOL_PREFIX_SEPARATOR", "-"), and line 113 shows return MCP_TOOL_PREFIX_SEPARATOR in tool_name — a simple substring check on "-" with no server-name registry lookup. The test at line 426-427 only uses non-hyphenated names (web_search, code_interpreter) so the false-positive case is untested.

tests/test_litellm/proxy/_experimental/mcp_server/test_semantic_tool_filter.py

  • 🚨 [CRITICAL] Lines 396–455: The test test_semantic_filter_hook_skips_non_mcp_only_tools asserts result is None when all tools are non-MCP (no prefix), but the hook has no early-return path for 'all tools are non-MCP'. The hook calls filter_tools, which (with no _build_router called and tool_router=None) returns available_tools unchanged, and the hook returns data (not None). The test will fail because it asserts a skip behavior that doesn't exist in the hook.

    confidence=7 file_read of hook.py lines 159-254 shows no check for 'are all tools non-MCP before proceeding'; file_read of semantic_tool_filter.py lines 181-185 shows that when tool_router is None, filter_tools returns available_tools unchanged, and hook returns data.

  • 🚨 [CRITICAL] Lines 463–502: The test test_hook_falls_back_to_top_k_when_only_mcp_and_zero_matches asserts len(result['tools']) == 3 (a 'fallback to top_k' behavior), but neither filter_tools nor the hook implement any such fallback. When all tools are MCP-prefixed and zero semantic matches are found, filter_tools returns [] (no non-MCP tools to preserve), and the hook sets data['tools'] = [] and returns data. This test will fail, and it documents behavior the code doesn't implement.

    confidence=7 file_read of semantic_tool_filter.py lines 193-199 shows zero-match path returns [t for t in available_tools if not is_tool_name_prefixed(...)] (non-MCP only, no top_k fallback). file_read of hook.py line 232 shows data['tools'] = filtered_tools with no top_k fallback added by the hook either.


Generated by Agentic Reviewer · model: us.anthropic.claude-sonnet-4-6

@ShashankFC
Copy link
Copy Markdown

Agentic Code Review

🚨 Needs Changes — 🚨 1 critical ⚠️ 3 major


litellm/proxy/_experimental/mcp_server/semantic_tool_filter.py

  • ⚠️ [MAJOR] Lines 196–199: The zero-match fallback uses is_tool_name_prefixed() which checks MCP_TOOL_PREFIX_SEPARATOR in tool_name (defaults to -). Any legitimate non-MCP tool whose name contains a hyphen (e.g., web-search, get-weather) will be falsely classified as an MCP tool and silently dropped, defeating the goal of preserving non-MCP tools.

    confidence=7 file_read of utils.py line 113 confirms return MCP_TOOL_PREFIX_SEPARATOR in tool_name where MCP_TOOL_PREFIX_SEPARATOR defaults to - (os.environ.get). The new test (test_semantic_filter_zero_matches_returns_only_non_mcp_tools) only uses underscore-named non-MCP tools (web_search, code_interpreter) — it does not cover hyphenated non-MCP tool names and would not catch this false-positive.

litellm/proxy/hooks/mcp_semantic_filter/__init__.py

  • ⚠️ [MAJOR] Lines 7–9: The hook.py imported here passes all tools (MCP + non-MCP) to filter_tools() and sets data["tools"] = filtered_tools directly without recombining non-MCP tools afterward. After the PR, filter_tools()._get_tools_by_names() only returns tools whose names appear in semantic matches — non-MCP tools (not prefixed) will be silently dropped from the request whenever the semantic filter produces any matches.

    confidence=7 file_read of hook.py lines 226-232 confirms: filter_tools() result is assigned to data["tools"] directly with no non-MCP recombination. file_read of semantic_tool_filter.py lines 222-235 confirms _get_tools_by_names only returns tools in matched_tool_names, with no pass-through for non-prefixed tools.

tests/test_litellm/proxy/_experimental/mcp_server/test_semantic_tool_filter.py

  • 🚨 [CRITICAL] Lines 307–344: The diff claims tool names in test_semantic_filter_hook_triggers_on_completion were changed to the prefixed format server-tool_{i}, but the actual file at line 308 still has tool_{i} (no hyphen prefix). Because is_tool_name_prefixed checks for - in the name, all 10 tools will be treated as non-MCP, the hook will return None (no MCP tools to filter), and the assertion assert result is not None will fail.

    confidence=7 file_read of test file lines 307-310 shows MCPTool(name=f"tool_{i}", ...) — the hyphen-prefix rename that the diff shows (server-tool_{i}) did not land. The hook's new logic skips requests with no MCP tools, making the test's result is not None assertion incorrect.

  • ⚠️ [MAJOR] Lines 395–602: Four tests described as added in this PR (test_semantic_filter_hook_skips_non_mcp_only_tools, test_hook_falls_back_to_top_k_when_only_mcp_and_zero_matches, test_semantic_filter_zero_matches_returns_empty, test_hook_preserves_non_mcp_tools) are absent from the actual file — the file ends at line 444 containing only test_semantic_filter_zero_matches_returns_only_non_mcp_tools. The hook's new MCP/non-MCP separation logic and the zero-match empty-list behavior are not exercised by any test.

    confidence=7 bash wc -l shows file is 444 lines; grep -n 'def test_' finds only 8 tests, none matching the four test names added in the diff. The PR description states these 4 tests were added but they did not land.


Generated by Agentic Reviewer · model: us.anthropic.claude-sonnet-4-6

@ShashankFC
Copy link
Copy Markdown

Agentic Code Review

🚨 Needs Changes — 🚨 1 critical ⚠️ 2 major


litellm/proxy/hooks/mcp_semantic_filter/hook.py

  • ⚠️ [MAJOR] Lines 236–238: is_tool_name_prefixed() uses MCP_TOOL_PREFIX_SEPARATOR (default "-") to detect MCP tools via a simple in check, so any non-MCP tool whose name contains a hyphen (e.g. "web-search", "code-interpreter") is misclassified as an MCP tool and subjected to semantic filtering instead of being passed through unconditionally. This defeats the stated goal of "always pass non-MCP tools through untouched" for hyphenated tool names.

    confidence=7 file_read of litellm/proxy/_experimental/mcp_server/utils.py lines 103-113 confirmed: is_tool_name_prefixed returns MCP_TOOL_PREFIX_SEPARATOR in tool_name where MCP_TOOL_PREFIX_SEPARATOR = os.environ.get('MCP_TOOL_PREFIX_SEPARATOR', '-'). A tool named web-search passes this check and lands in mcp_tools, not non_mcp_tools.

tests/mcp_tests/test_semantic_tool_filter_e2e.py

  • ⚠️ [MAJOR] Lines 56–90: The E2E test uses tool names without the MCP dash-prefix convention (e.g. gmail_send, calendar_create) so is_tool_name_prefixed() returns False for every tool. With the PR's new zero-match path in filter_tools(), if the real embedding call finds no matches above the threshold, the function returns all 10 tools (none are filtered by the not is_tool_name_prefixed() guard). The test assertion len(result['tools']) < len(tools) then fails, making the test flaky against real OpenAI embeddings. Tool names should use the dash-prefixed MCP format (e.g. gmail-send_email) consistent with how the new unit tests in test_semantic_tool_filter.py were updated.

    confidence=7 file_read of e2e test lines 57-66 shows tool names gmail_send, calendar_create, etc. (no -). fetch_definition of is_tool_name_prefixed in utils.py shows it checks MCP_TOOL_PREFIX_SEPARATOR ("-") is present. semantic_tool_filter.py lines 193-199: zero-match path returns [t for t in available_tools if not is_tool_name_prefixed(...)] — all 10 tools pass this filter since none contain -. The PR description explicitly states the unit test was updated to use prefixed names but the e2e test was not.

tests/test_litellm/proxy/_experimental/mcp_server/test_semantic_tool_filter.py

  • 🚨 [CRITICAL] Lines 460–492: The test asserts len(result['tools']) == 3 (top_k fallback) on zero semantic matches with MCP-only tools, but neither filter_tools nor async_pre_call_hook in the source implements a top-k fallback — on zero matches filter_tools returns [t for t in available_tools if not is_tool_name_prefixed(...)], which for all-MCP-prefixed tools produces an empty list. The hook then sets data['tools'] = [] and returns data, so len(result['tools']) will be 0, not 3, causing this test to fail.

    confidence=7 file_read of hook.py lines 226-248 shows data['tools'] = filtered_tools with no top-k fallback; file_read of semantic_tool_filter.py lines 193-199 shows zero-match path returns non-MCP tools only (empty for all-MCP input)


Generated by Agentic Reviewer · model: us.anthropic.claude-sonnet-4-6

klhq added 4 commits April 15, 2026 16:46
Rename the 10 test tool names to hyphen-prefixed form (e.g.
gmail-send_email, calendar-create_event) so the hook correctly
classifies them as MCP tools. Without the prefix, is_tool_name_prefixed
returns False for all tools, the hook early-returns None, and the
assertion `result and len(result["tools"]) < len(tools)` fails.

Addresses Agentic Reviewer finding on BerriAI#24986.
…lter hook

Build the known-prefix set from global_mcp_server_manager.get_registry()
and pass it to is_tool_name_prefixed() when classifying tools as MCP vs
non-MCP in the semantic filter hook.

Without this, the hook inherited the hyphen false-positive bug from the
1-line is_tool_name_prefixed() utility (fixed in the previous commit),
and would still misclassify non-MCP tools like "text-to-speech" as MCP.

Also switches the partition to a single-pass loop instead of two list
comprehensions.

Adds test_hook_passes_through_hyphenated_non_mcp_tools regression test
and an autouse fixture that registers a fake "server" MCP prefix so
existing tests using `server-tool_{i}` names continue to classify as MCP
tools with the strict check.

Addresses Agentic Reviewer P1 finding on BerriAI#24986.
The hook now validates MCP tool prefixes against the live server
registry (via `known_server_prefixes`). Without a mock, `get_registry()`
returns an empty dict, `known_prefixes` is empty, and all tools are
classified as non-MCP — the hook early-returns None and the e2e
assertion fails.

Register 10 fake MCP servers matching the tool prefixes (`gmail`,
`calendar`, `files`, etc.) via monkeypatch so the hyphen-prefixed tool
names pass the registry check and actually exercise the semantic filter.
@gitguardian
Copy link
Copy Markdown

gitguardian Bot commented Apr 15, 2026

⚠️ GitGuardian has uncovered 3 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard.
Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secrets in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
29203054 Triggered Generic High Entropy Secret 04b772e tests/test_litellm/test_secret_redaction.py View secret
29203053 Triggered Generic Password 04b772e .circleci/config.yml View secret
29203056 Triggered Generic High Entropy Secret 04b772e tests/test_litellm/test_secret_redaction.py View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secrets safely. Learn here the best practices.
  3. Revoke and rotate these secrets.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@klhq
Copy link
Copy Markdown
Author

klhq commented Apr 15, 2026

@ShashankFC — replies to your 2026-04-10 Agentic Review findings:

Finding 1 — hyphen false-positive (MAJOR)

Now fixed via #25085 (landed in main via #25192 on 2026-04-14). This PR's hook update uses known_server_prefixes in commit 508867e, plus a new regression test test_hook_passes_through_hyphenated_non_mcp_tools that exercises text-to-speech and code-review end-to-end.

Finding 2 — e2e test (MAJOR)

Fixed in commits 2833734 (rename tool names to hyphen-prefixed MCP form) and 32c3204 (monkeypatch the MCP registry so the hyphen-prefixed names actually pass the new strict classifier).

Finding 3 — top-k fallback (CRITICAL)

False positive — the fallback is at hook.py:248-256. The bot appears to have read hook.py:226-248 and stopped before the fallback block. The top-k test (test_hook_falls_back_to_top_k_when_only_mcp_and_zero_matches) still passes locally (21/21 unit tests green).

@klhq
Copy link
Copy Markdown
Author

klhq commented Apr 15, 2026

Re: GitGuardian alert — false positive, artifact of the Merge branch 'main'
commit (04b772e9c6) I made to pull in #25085.
None of this PR's own commits touch the flagged files:

  • tests/test_litellm/test_secret_redaction.py — fake test fixtures for the
    secret-redaction unit tests, already in main since Litellm ishaan march30 (#24887) #25151
  • .circleci/config.yml — CI service password, already in main via the
    recent supply-chain hardening commits

Happy to drop the merge commit and rebase onto main if preferred, but that
would force-push the branch and invalidate the existing Greptile review
anchors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: mcp_semantic_tool_filter drops non-MCP tools and returns all tools on zero matches

2 participants