fix(mcp): preserve non-MCP tools in semantic filter#24986
fix(mcp): preserve non-MCP tools in semantic filter#24986klhq wants to merge 9 commits intoBerriAI:mainfrom
Conversation
When the semantic tool filter finds no matches for a user query, it previously returned all available tools. With many MCP servers this can exceed provider tool limits. Now it drops MCP tools (prefixed) and returns only non-MCP tools on zero matches.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Greptile SummaryThis PR fixes a bug where the MCP semantic filter incorrectly dropped non-MCP tools (e.g. Confidence Score: 5/5Safe to merge — logic is correct, well-tested, and no P0/P1 issues found. All prior review concerns have been addressed. The only remaining finding is a P2 style suggestion (walrus operator for a double function call). The contract change to No files require special attention.
|
| Filename | Overview |
|---|---|
| litellm/proxy/hooks/mcp_semantic_filter/hook.py | Core fix: partitions tools into MCP/non-MCP using registry-aware is_tool_name_prefixed, filters only MCP tools semantically, and adds top-k fallback for all-MCP zero-match case. Logic is correct; minor style issue with double get_server_prefix call in the set comprehension. |
| litellm/proxy/_experimental/mcp_server/semantic_tool_filter.py | One-line contract change: filter_tools now returns [] on zero semantic matches instead of available_tools. This intentional change is safe because the only caller (hook.py) is updated to handle the new behaviour and adds non-MCP tools back explicitly. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_semantic_tool_filter.py | Good additions: autouse fixture registers a fake "server" prefix so existing tool names continue to classify as MCP under the strict check; new tests cover non-MCP pass-through, hyphenated non-MCP regression, all-MCP zero-match fallback, and the new empty-list return from filter_tools. |
| tests/mcp_tests/test_semantic_tool_filter_e2e.py | Updated tool names to hyphen-prefixed MCP form and monkeypatched 10 fake servers into the registry; formatting-only changes otherwise. Test still skips without OPENAI_API_KEY, so no new real-network-call concern for unit CI. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[async_pre_call_hook] --> B{tools in request?}
B -- No --> Z[return None]
B -- Yes --> C{MCP references?}
C -- Yes --> D[_expand_mcp_tools]
D --> E
C -- No --> E[build known_prefixes\nfrom registry]
E --> F[partition tools:\nmcp_tools / non_mcp_tools]
F --> G{mcp_tools empty?}
G -- Yes --> Z
G -- No --> H[filter.filter_tools\nmcp_tools only]
H --> I{filtered_mcp_tools\nempty?}
I -- No --> K[filtered = filtered_mcp + non_mcp]
I -- Yes --> J{non_mcp_tools\nempty?}
J -- No --> K
J -- Yes --> L[fallback: mcp_tools\nslice top_k + warn]
L --> K
K --> M[data tools = filtered\nreturn data]
Reviews (6): Last reviewed commit: "test(mcp): mock MCP registry in semantic..." | Re-trigger Greptile
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
The semantic filter treated all tools equally — it didn't distinguish between MCP tools (which it should filter) and non-MCP built-in tools (which should pass through untouched). This caused two failures: - On match: non-MCP tools were dropped, losing built-in functionality - On zero match: all tools returned unfiltered, exceeding provider limits Move MCP/non-MCP separation into the hook. The hook now passes only MCP tools (identified by server-name prefix) to filter_tools() and recombines with non-MCP tools after. filter_tools() returns empty list on zero matches instead of all tools.
…CP tools When all tools are MCP and zero semantic matches are found, return the first top_k MCP tools instead of an empty list. Avoids sending an empty tool list to the LLM.
Agentic Code Review🚨 Needs Changes — 🚨 2 critical
|
Agentic Code Review🚨 Needs Changes — 🚨 2 critical
|
Agentic Code Review🚨 Needs Changes — 🚨 1 critical
|
Agentic Code Review🚨 Needs Changes — 🚨 1 critical
|
Rename the 10 test tool names to hyphen-prefixed form (e.g. gmail-send_email, calendar-create_event) so the hook correctly classifies them as MCP tools. Without the prefix, is_tool_name_prefixed returns False for all tools, the hook early-returns None, and the assertion `result and len(result["tools"]) < len(tools)` fails. Addresses Agentic Reviewer finding on BerriAI#24986.
…lter hook
Build the known-prefix set from global_mcp_server_manager.get_registry()
and pass it to is_tool_name_prefixed() when classifying tools as MCP vs
non-MCP in the semantic filter hook.
Without this, the hook inherited the hyphen false-positive bug from the
1-line is_tool_name_prefixed() utility (fixed in the previous commit),
and would still misclassify non-MCP tools like "text-to-speech" as MCP.
Also switches the partition to a single-pass loop instead of two list
comprehensions.
Adds test_hook_passes_through_hyphenated_non_mcp_tools regression test
and an autouse fixture that registers a fake "server" MCP prefix so
existing tests using `server-tool_{i}` names continue to classify as MCP
tools with the strict check.
Addresses Agentic Reviewer P1 finding on BerriAI#24986.
The hook now validates MCP tool prefixes against the live server registry (via `known_server_prefixes`). Without a mock, `get_registry()` returns an empty dict, `known_prefixes` is empty, and all tools are classified as non-MCP — the hook early-returns None and the e2e assertion fails. Register 10 fake MCP servers matching the tool prefixes (`gmail`, `calendar`, `files`, etc.) via monkeypatch so the hyphen-prefixed tool names pass the registry check and actually exercise the semantic filter.
|
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 29203054 | Triggered | Generic High Entropy Secret | 04b772e | tests/test_litellm/test_secret_redaction.py | View secret |
| 29203053 | Triggered | Generic Password | 04b772e | .circleci/config.yml | View secret |
| 29203056 | Triggered | Generic High Entropy Secret | 04b772e | tests/test_litellm/test_secret_redaction.py | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn here the best practices.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
|
@ShashankFC — replies to your 2026-04-10 Agentic Review findings: Finding 1 — hyphen false-positive (MAJOR) Now fixed via #25085 (landed in main via #25192 on 2026-04-14). This PR's hook update uses Finding 2 — e2e test (MAJOR) Fixed in commits Finding 3 — top-k fallback (CRITICAL) False positive — the fallback is at hook.py:248-256. The bot appears to have read |
|
Re: GitGuardian alert — false positive, artifact of the
Happy to drop the merge commit and rebase onto |
Relevant issues
Fixes #24984
Builds on #25085 (landed in
mainvia #25192), which added theoptional
known_server_prefixesparameter tois_tool_name_prefixed().Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
tests/test_litellm/directory, Adding at least 1 test is a hard requirement - see detailsmake test-unit@greptileaiand received a Confidence Score of at least 4/5 before requesting a maintainer reviewDelays in PR merge?
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
Type
🐛 Bug Fix
Changes
The semantic filter hook treated all tools equally — it didn't distinguish between MCP tools (which it should filter) and non-MCP built-in tools (which should pass through untouched). This caused non-MCP tools to be dropped on match, and all tools to be returned on zero match (exceeding provider tool limits).
Hook behavior
hook.py) via a single-pass partition — only MCP tools go through the semantic router; non-MCP tools pass through untouchedfilter_tools()returns an empty list on zero matches instead of all tools; the hook adds non-MCP tools back afterwardsCorrect MCP classification
The hook's MCP/non-MCP split needs a reliable classifier. The legacy
is_tool_name_prefixed()did a naive"-" in tool_namecheck, which misclassifies hyphenated non-MCP tools (e.g.text-to-speech,code-review) as MCP. That bug was fixed inutils.pyby #25085 (merged tomainvia #25192 on 2026-04-14), which added an optionalknown_server_prefixesparameter for a strict check against the live registry.known_server_prefixesbuilt fromglobal_mcp_server_manager.get_registry()into the hook'sis_tool_name_prefixed()calls, so hyphenated non-MCP tools are no longer misclassifiedTests
test_semantic_tool_filter.pythat registers a fake"server"MCP prefix, so existing hook tests usingserver-tool_{i}names continue to classify as MCP under the strict checktest_hook_passes_through_hyphenated_non_mcp_tools— regression test provingtext-to-speechandcode-reviewsurvive the filter end-to-endtest_semantic_tool_filter_e2e.py: rename tool names to hyphen-prefixed MCP form andmonkeypatch10 fake servers into the registry so the test still exercises the filter with the new strict classifier