feat: Search transforms for tool discovery by jlowin · Pull Request #3154 · PrefectHQ/fastmcp

jlowin · 2026-02-11T18:51:31Z

This may hold for 3.1.

When a server exposes hundreds of tools, sending the full catalog to an LLM wastes tokens and hurts selection accuracy. Search transforms solve this by replacing list_tools() with a search interface — the LLM discovers tools on demand instead of seeing everything upfront.

Two strategies, both zero-dependency: RegexSearchTransform for pattern matching and BM25SearchTransform for natural-language relevance ranking. Adding one collapses the entire catalog into two synthetic tools:

from fastmcp import FastMCP
from fastmcp.server.transforms.search import RegexSearchTransform

mcp = FastMCP("Server")

@mcp.tool
def search_database(query: str) -> str: ...

@mcp.tool  
def send_email(to: str, subject: str, body: str) -> str: ...

# Clients now see only search_tools + call_tool
mcp.add_transform(RegexSearchTransform())

Search results respect the full auth pipeline — middleware, visibility transforms, session-level disable_components, and component auth checks all filter what's discoverable. The search tool queries list_tools() through the complete pipeline at search time using a contextvar bypass that only skips its own hiding behavior.
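To make the bypass idea concrete, here is a minimal stdlib-only sketch of the pattern, not the code from this PR. The names `_bypass`, `REAL_TOOLS`, and `SYNTHETIC_TOOLS` are hypothetical, and the substring match stands in for whatever search strategy is configured.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical flag for illustration; the PR's actual variable may differ.
_bypass: ContextVar[bool] = ContextVar("_bypass", default=False)

REAL_TOOLS = ["search_database", "send_email"]
SYNTHETIC_TOOLS = ["search_tools", "call_tool"]


async def list_tools() -> list[str]:
    """Return the synthetic tools unless the bypass flag is set."""
    if _bypass.get():
        return list(REAL_TOOLS)
    return list(SYNTHETIC_TOOLS)


async def search_tools(query: str) -> list[str]:
    """Read the real catalog by temporarily disabling the hiding behavior."""
    token = _bypass.set(True)
    try:
        tools = await list_tools()
    finally:
        _bypass.reset(token)  # always restore, even if list_tools() raises
    return [name for name in tools if query in name]


async def demo() -> tuple[list[str], list[str], list[str]]:
    before = await list_tools()
    found = await search_tools("email")
    after = await list_tools()
    return before, found, after


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the flag is reset in a `finally` block, ordinary `list_tools()` calls made before and after the search still see only the synthetic tools.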

coderabbitai · 2026-02-11T18:54:04Z

Walkthrough

This pull request introduces a new Tool Search feature for FastMCP. The implementation adds a search transform system that replaces large tool catalogs with on-demand search via two concrete implementations: BM25SearchTransform (relevance-based ranking with in-memory indexing) and RegexSearchTransform (pattern matching with zero overhead). The search transforms cause list_tools to return two synthetic tools—search_tools and call_tool—enabling LLMs to discover and interact with tools dynamically. The change also removes four Protocol exports (GetPromptNext, GetResourceNext, GetResourceTemplateNext, GetToolNext) from the transforms module's public API. Comprehensive documentation is added covering usage patterns, customization options, and interaction with authentication and visibility middleware.
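As background on the relevance-ranking strategy, the following is a small sketch of Okapi BM25 scoring over tool text. It is an assumption-laden illustration, not the PR's implementation: the tokenizer, the `k1` and `b` defaults, and the function names are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumeric characters (underscores split too)."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def bm25_rank(docs: dict[str, str], query: str, k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Return document keys with a positive BM25 score, best match first."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    if n_docs == 0:
        return []
    avg_len = sum(len(toks) for toks in tokens.values()) / n_docs
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for toks in tokens.values() for term in set(toks))

    scores: dict[str, float] = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[name] = score
    return sorted(scores, key=lambda name: scores[name], reverse=True)


catalog = {
    "search_database": "search_database Run a query against the database",
    "send_email": "send_email Send an email message to a recipient",
}
```

Calling `bm25_rank(catalog, "send an email")` ranks only the tools that share at least one term with the query, which is the behavior a natural-language search tool needs.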

Possibly related PRs

PR #2917: The BaseSearchTransform's _get_visible_tools method interacts with Context-filtered tool visibility, which directly relates to session-scoped visibility changes in this PR.
PR #2836: Introduces the transform system foundations (Transform, ToolTransform, Visibility, provider transform chain) that this PR extends with new search transform classes.
PR #2942: Modifies server transforms surface exports and list/get transform behavior patterns that overlap with this PR's changes to transforms/__init__.py and transform control flow.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 63.64% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: introducing search transforms as a new feature for tool discovery.
Description check	✅ Passed	The PR description clearly explains the problem (token waste with large catalogs), solution (search transforms), implementation details (two strategies, synthetic tools, auth pipeline integration), and provides working code example.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cb88a50213

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-02-11T18:57:30Z

src/fastmcp/server/transforms/search/base.py

+
+            Use this to execute tools discovered via search_tools.
+            """
+            return await ctx.fastmcp.call_tool(name, arguments)


Block call_tool from invoking itself

The proxy forwards any requested tool name directly to ctx.fastmcp.call_tool, so a request like call_tool(name="call_tool") resolves the same synthetic tool again and recurses until timeout/recursion failure. This is an easy request-level DoS path (and can be triggered by LLM mis-selection), so the proxy should explicitly reject self-references to its own synthetic name before dispatching.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-02-11T18:57:30Z

src/fastmcp/server/transforms/search/bm25.py

+        current_hash = _catalog_hash(tools)
+        if current_hash != self._last_hash:


Rebuild BM25 index when tool metadata changes

The rebuild gate is keyed only by _catalog_hash(tools), and that hash is based on names, so catalogs with unchanged names but updated descriptions/parameter schemas are treated as unchanged. In dynamic providers or list-tools middleware that mutates tool metadata per request, BM25 will keep ranking against stale documents and return stale Tool objects from the previous snapshot; the staleness key should include searchable metadata, not just names.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

docs/docs.json (1)

572-583: ⚠️ Potential issue | 🟠 Major

Add search transform SDK pages to docs.json navigation.

The four new search transform documentation files exist in docs/python-sdk/ but are not referenced in docs.json under the transforms group (lines 574–582):

fastmcp-server-transforms-search-__init__.mdx

fastmcp-server-transforms-search-base.mdx

fastmcp-server-transforms-search-bm25.mdx

fastmcp-server-transforms-search-regex.mdx

Per the documentation guidelines, these files must be included in docs.json to be published. Add the missing entries to the transforms navigation group, or confirm whether the documentation bot auto-updates this file.

🧹 Nitpick comments (6)

src/fastmcp/server/transforms/search/base.py (1)

162-176: Bypass + filter logic is sound, but consider the multi-transform stacking scenario.

If two BaseSearchTransform subclasses are stacked, _search_bypass is shared (module-level ContextVar). When the inner transform's search tool calls _get_visible_tools, the bypass causes both transforms to pass through, which is correct: the inner search sees the full catalog. Worth a brief note in the docstring if this stacking pattern is expected to be supported.

src/fastmcp/server/transforms/search/bm25.py (1)

93-97: Consider explicit keyword arguments instead of **kwargs.

BM25SearchTransform.__init__ accepts **kwargs and forwards them to super().__init__(). This obscures the accepted parameters from type checkers and IDE autocompletion. Explicitly declaring max_results, always_visible, search_tool_name, and call_tool_name would improve discoverability.

Proposed fix
-    def __init__(self, **kwargs: Any) -> None:
-        super().__init__(**kwargs)
+    def __init__(
+        self,
+        *,
+        max_results: int = 5,
+        always_visible: list[str] | None = None,
+        search_tool_name: str = "search_tools",
+        call_tool_name: str = "call_tool",
+    ) -> None:
+        super().__init__(
+            max_results=max_results,
+            always_visible=always_visible,
+            search_tool_name=search_tool_name,
+            call_tool_name=call_tool_name,
+        )

src/fastmcp/server/transforms/search/regex.py (1)

43-55: Consider ReDoS mitigation for untrusted regex patterns.

The query string is provided by the LLM/client and compiled directly as a regex. Malicious or pathological patterns (e.g., (a+)+$) can cause catastrophic backtracking in Python's re engine. While the searchable text is short (server-defined tool metadata), this is still a potential denial-of-service vector if the server is exposed to untrusted clients.

A lightweight mitigation would be to apply a timeout or use re2 (if available), or simply set a maximum pattern length. Even a length cap goes a long way:

🛡️ Optional: add a pattern length cap
     async def _search(self, tools: Sequence[Tool], query: str) -> Sequence[Tool]:
+        if len(query) > 200:
+            return []
         try:
             compiled = re.compile(query, re.IGNORECASE)
         except re.error:
             return []
Otherwise, the search logic is clean: the re.error catch for invalid patterns is good, and the early break on _max_results avoids unnecessary iteration.
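For reference, a self-contained sketch of the search loop with the suggested cap applied. It operates on plain strings instead of Tool objects, and `MAX_PATTERN_LENGTH` and `regex_search` are hypothetical names. A length cap only bounds pattern size; short patterns such as `(a+)+$` can still backtrack badly, so this is a partial mitigation at best.

```python
import re
from typing import Sequence

MAX_PATTERN_LENGTH = 200  # assumed cap, taken from the suggestion above


def regex_search(texts: Sequence[str], pattern: str, max_results: int = 5) -> list[str]:
    """Return up to max_results texts matching pattern, case-insensitively."""
    if len(pattern) > MAX_PATTERN_LENGTH:
        return []
    try:
        compiled = re.compile(pattern, re.IGNORECASE)
    except re.error:
        return []  # invalid patterns yield no results instead of an error
    matches: list[str] = []
    for text in texts:
        if compiled.search(text):
            matches.append(text)
            if len(matches) >= max_results:
                break  # stop early once enough results are collected
    return matches
```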
docs/servers/transforms/tool-search.mdx (3)

36-59: Code example is clear but uses ... for function bodies.

The guideline calls for "complete, runnable code examples that users can copy and execute." The ... ellipsis bodies are a reasonable shorthand for tools whose implementation doesn't matter to the example, but worth noting that a user copy-pasting this won't get a working demo. Consider adding minimal return values (e.g., return [], return True) so the snippet is directly executable.

63-73: Client-side example lacks context for client.

This snippet uses await client.call_tool(...) without showing how client is created or that it runs inside an async function. A reader unfamiliar with the client setup may not be able to run this. Consider adding a brief note (e.g., "Assuming an existing Client session") or linking to client documentation. As per coding guidelines, code blocks should be "fully runnable with all necessary imports."

115-136: Add a closing section with next steps or related information.

The page ends at line 136 without a conclusion. Per the MDX documentation guidelines, sections should end with next steps or related information. Consider adding a brief "Next Steps" or "Related Topics" section that links to the transforms overview, authorization/visibility documentation, or the Python SDK reference for BaseSearchTransform.

coderabbitai · 2026-02-11T18:58:41Z

src/fastmcp/server/transforms/search/bm25.py

+def _catalog_hash(tools: Sequence[Tool]) -> str:
+    """SHA256 hash of sorted tool names for staleness detection."""
+    key = "|".join(sorted(t.name for t in tools))
+    return hashlib.sha256(key.encode()).hexdigest()


⚠️ Potential issue | 🟡 Minor

Catalog hash uses only tool names — description/parameter changes won't trigger reindex.

_catalog_hash hashes sorted tool names, so if a tool's description or parameters change (without adding/removing tools), the BM25 index will serve stale results. This is documented, but worth calling out since tool descriptions can be dynamically generated.

If this is intentional to keep the check cheap, consider adding a one-line comment in the hash function body noting the tradeoff.

whatevertogo · 2026-02-12T23:09:05Z

Sorry, wrong link earlier — please ignore 😅
We’ve been waiting for this feature for a long time.

jlowin · 2026-02-25T23:47:23Z

The review flagged removing GetToolNext, GetResourceNext, GetResourceTemplateNext, and GetPromptNext from __all__ as a breaking change. That removal is intentional and correct — these are Protocol types for the transform system's internal call_next plumbing, not part of the public surface. Authors subclassing Transform don't need to import them explicitly; the base class method signatures carry the types already. If someone does want to import them directly, they're still importable from fastmcp.server.transforms — they're just no longer advertised as public API via __all__.
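The distinction between "exported via `__all__`" and "importable" can be checked with a throwaway module. This sketch uses a hypothetical `demo_transforms` module built in memory; it says nothing about FastMCP's own module layout beyond what the comment above states.

```python
import sys
import types

# Build a throwaway module so the example is self-contained.
demo = types.ModuleType("demo_transforms")
exec(
    "__all__ = ['Transform']\n"
    "class Transform: ...\n"
    "class GetToolNext: ...\n",
    demo.__dict__,
)
sys.modules["demo_transforms"] = demo

# A star import only picks up names listed in __all__.
star_namespace: dict[str, object] = {}
exec("from demo_transforms import *", star_namespace)

# An explicit import still works for names left out of __all__.
explicit_namespace: dict[str, object] = {}
exec("from demo_transforms import GetToolNext", explicit_namespace)
```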

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a0c239ee6a

chatgpt-codex-connector · 2026-02-25T23:51:36Z

src/fastmcp/server/transforms/search/base.py

+            tools = await ctx.fastmcp.list_tools()
+        finally:
+            _search_bypass.reset(token)
+        return [t for t in tools if t.name not in self._always_visible]


Deduplicate visible tools before building search results

_get_visible_tools forwards ctx.fastmcp.list_tools() directly into search/serialization, but FastMCP.list_tools() returns all versions while MCP tools/list deduplicates by name in the wire handler. With versioned tools, search can return multiple entries for one name (including older schemas), but the call_tool proxy only accepts a name and will execute the highest version, so clients can select arguments from a stale schema and hit validation failures.
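One way to picture the suggested deduplication is the sketch below, which keeps the highest version per name. `ToolEntry` is a hypothetical stand-in, and representing versions as comparable integer tuples is an assumption; FastMCP's real version handling may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolEntry:
    name: str
    version: tuple[int, ...]  # assumed comparable; real version handling may differ


def dedupe_by_name(tools: list[ToolEntry]) -> list[ToolEntry]:
    """Keep only the highest version of each tool name, in first-seen order."""
    best: dict[str, ToolEntry] = {}
    for tool in tools:
        current = best.get(tool.name)
        if current is None or tool.version > current.version:
            best[tool.name] = tool
    return list(best.values())
```

Search results built from the deduplicated list would then describe the same schema that a name-only `call_tool` dispatch ends up executing.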

chatgpt-codex-connector · 2026-02-25T23:51:36Z

src/fastmcp/server/transforms/search/base.py

+        if name == self._search_tool_name:
+            return self._make_search_tool()
+        if name == self._call_tool_name:
+            return self._make_call_tool()


Reject synthetic-name collisions with real tools

get_tool always intercepts search_tool_name and call_tool_name, so any existing real tool with either of those names (or with colliding custom names) becomes unreachable once this transform is enabled. Because the proxy also blocks calls to synthetic names, the collision silently removes valid tools from use instead of failing fast, which is a breaking behavior for affected servers.
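A fail-fast check along the lines suggested here could look like the following sketch. `check_synthetic_names` is a hypothetical helper, and where such a check would run (construction time or first listing) is left open.

```python
def check_synthetic_names(real_tool_names: set[str], synthetic_names: set[str]) -> None:
    """Raise if any synthetic tool name would shadow a real tool."""
    collisions = sorted(real_tool_names & synthetic_names)
    if collisions:
        raise ValueError(
            f"Synthetic tool names collide with real tools: {', '.join(collisions)}"
        )
```

Raising early turns a silent loss of a real tool into a configuration error that the server author sees immediately.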

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f82a1fd43e

chatgpt-codex-connector · 2026-02-27T01:30:47Z

src/fastmcp/server/transforms/search/base.py

+
+            Use this to execute tools discovered via search_tools.
+            """
+            if name in {transform._call_tool_name, transform._search_tool_name}:


Block transformed aliases from re-entering call proxy

The self-call guard only rejects transform._call_tool_name and transform._search_tool_name, so it misses aliases introduced by downstream name-rewriting transforms. For example, when Namespace is added after this transform, clients see ns_call_tool; calling ns_call_tool with name="ns_call_tool" bypasses this check and ctx.fastmcp.call_tool(...) resolves the same proxy again, causing unbounded recursion/timeouts. Fresh evidence: Namespace.list_tools() prefixes every tool name, including synthetic ones, so transformed aliases are reachable in normal configuration.

chatgpt-codex-connector · 2026-02-27T01:30:47Z

src/fastmcp/server/transforms/search/bm25.py

+
+def _catalog_hash(tools: Sequence[Tool]) -> str:
+    """SHA256 hash of sorted tool searchable text for staleness detection."""
+    key = "|".join(sorted(_extract_searchable_text(t) for t in tools))


Include full schema in BM25 staleness key

The BM25 rebuild key hashes only _extract_searchable_text, which includes names/descriptions and parameter descriptions but omits other schema fields (for example parameter types, required fields, defaults, and output schema). When a dynamic provider updates those non-text fields without changing searchable text, _search() reuses stale _indexed_tools objects and returns outdated tool schemas, so clients can generate arguments that no longer validate against the current tool. Fresh evidence: _catalog_hash now uses searchable text (not just names), but still does not fingerprint the full tool schema.
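A sketch of a staleness key that fingerprints the whole tool definition rather than only its searchable text. It assumes each tool can be serialized to a JSON-compatible dict; `catalog_fingerprint` is a hypothetical name and the dict shape is illustrative.

```python
import hashlib
import json
from typing import Any


def catalog_fingerprint(tools: list[dict[str, Any]]) -> str:
    """SHA-256 over a canonical JSON dump of every tool, independent of list order."""
    canonical = sorted(
        json.dumps(tool, sort_keys=True, separators=(",", ":")) for tool in tools
    )
    return hashlib.sha256("\n".join(canonical).encode()).hexdigest()
```

The tradeoff is cost: serializing every schema on each search is more expensive than hashing names or descriptions, which may be why a cheaper key was chosen.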

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fb6236872f

chatgpt-codex-connector · 2026-02-27T01:45:26Z

src/fastmcp/server/transforms/search/base.py

+                raise ValueError(
+                    f"'{name}' is a synthetic search tool and cannot be called via the call_tool proxy"
+                )
+            return await ctx.fastmcp.call_tool(name, arguments)


Preserve task context when forwarding through call_tool

The proxy always forwards discovered tools with ctx.fastmcp.call_tool(name, arguments) and never propagates task execution metadata, so any discovered tool configured with task mode required will fail at runtime (the task router rejects synchronous calls without task metadata). This only appears when servers use search transform with task-augmented tools, but in that setup those tools become effectively unusable through the documented search_tools → call_tool flow.

Transforms that replace list_tools() with synthetic components (like search) need to read the real catalog at call time without triggering their own replacement logic. CatalogTransform handles the re-entrant bypass via per-instance ContextVar, exposing transform_tools() as the subclass hook and get_tool_catalog() for catalog access.
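The effect of a per-instance ContextVar can be sketched as follows. The method names `transform_tools` and `get_tool_catalog` come from the commit message above, but their signatures here, the `HidingTransform` class, and the `pipeline` callable are assumptions made for illustration.

```python
from contextvars import ContextVar
from typing import Callable


class HidingTransform:
    """Hides the catalog behind one synthetic name unless its own bypass is active."""

    def __init__(self, synthetic_name: str) -> None:
        self.synthetic_name = synthetic_name
        # One ContextVar per instance, so bypassing one transform leaves others active.
        self._bypass: ContextVar[bool] = ContextVar(f"bypass_{id(self)}", default=False)

    def transform_tools(self, tools: list[str]) -> list[str]:
        return tools if self._bypass.get() else [self.synthetic_name]

    def get_tool_catalog(self, pipeline: Callable[[], list[str]]) -> list[str]:
        """Run the listing pipeline with only this instance's hiding disabled."""
        token = self._bypass.set(True)
        try:
            return pipeline()
        finally:
            self._bypass.reset(token)


a = HidingTransform("search_a")
b = HidingTransform("search_b")
```

While `a` reads its catalog, `b.transform_tools(...)` still hides its tools, which is the difference from a single module-level flag shared by every transform.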

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 727e430660

src/fastmcp/server/transforms/search/base.py

jlowin added the "feature" label (Major new functionality. Reserved for 2-4 significant PRs per release. Not for issues.) Feb 11, 2026

marvin-context-protocol bot added the "server" label (Related to FastMCP server implementation or server-side functionality.) Feb 11, 2026

mintlify bot deployed to staging - docs February 11, 2026 18:52

jlowin added the "DON'T MERGE" label (PR is not ready for merging. Used by authors to prevent premature merging.) Feb 11, 2026

jlowin added this to the 3.1 milestone Feb 11, 2026

mintlify bot deployed to staging - docs February 11, 2026 18:56

chatgpt-codex-connector bot reviewed Feb 11, 2026

coderabbitai bot reviewed Feb 11, 2026

whatevertogo mentioned this pull request Feb 12, 2026

Tool search tool CoplayDev/unity-mcp#560

Closed

chatgpt-codex-connector bot reviewed Feb 25, 2026

chatgpt-codex-connector bot reviewed Feb 27, 2026

jlowin removed the "DON'T MERGE" label (PR is not ready for merging. Used by authors to prevent premature merging.) Feb 27, 2026

chatgpt-codex-connector bot reviewed Feb 27, 2026

jlowin and others added 10 commits February 26, 2026 22:29

feat: Add search transforms for tool discovery

bb9596b

RegexSearchTransform and BM25SearchTransform collapse large tool catalogs into a search interface so LLMs discover tools on demand instead of receiving the full listing.

chore: Update SDK documentation

de30961

fix: call_tool recursion guard, atomic BM25 rebuild, hash includes descriptions

454568e

Add search transform examples for regex and BM25

deee792

Add README for search transform examples

3ec063c

Polish search example clients with rich output

8f873a2

Remove hardcoded tool counts from search example subtitles

bbccc21

Clarify that review bot feedback should be evaluated on its merits

daa3767

Expand search transform docs with proper hierarchy

727e430

jlowin force-pushed the feat/tool-search-transforms branch from 7beb80e to 727e430 Compare February 27, 2026 03:32

chatgpt-codex-connector bot reviewed Feb 27, 2026

src/fastmcp/server/transforms/search/base.py (resolved)

jlowin merged commit c96c040 into main Feb 27, 2026
8 of 9 checks passed

jlowin deleted the feat/tool-search-transforms branch February 27, 2026 03:42

jlowin mentioned this pull request Feb 27, 2026

Add experimental CodeMode transform #3297

Merged
