[tool] fix: add timeout for MCP tool calls#5361
[tool] fix: add timeout for MCP tool calls#5361denismegerle wants to merge 1 commit intoverl-project:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
The pull request introduces a timeout for MCP tool calls using asyncio.wait_for, which is a necessary improvement to prevent indefinite hangs. However, the current implementation only wraps the tool execution itself. To fully address the risk of hangs, the timeout should ideally encompass the connection phase and the rate-limiting wait, as these stages can also block indefinitely in networked or congested environments.
| client = self.get_client_with_tool_name(tool_name) | ||
| async with client: | ||
| return await client.call_tool_mcp(tool_name, parameters) | ||
| return await asyncio.wait_for(client.call_tool_mcp(tool_name, parameters), timeout=timeout) |
There was a problem hiding this comment.
The timeout is currently applied only to the client.call_tool_mcp execution. It does not cover the async with client block, which handles the connection to the MCP server. If the connection itself hangs (e.g., due to network issues or server unresponsiveness during the handshake), the tool call will still block indefinitely, failing the PR's objective. Additionally, the rate-limiting wait (lines 60-61) is also outside the timeout. Consider wrapping the entire operation in a way that the timeout covers both connection and execution.
What does this PR do?
Checklist Before Starting
[{modules}] {type}: {description}(This will be checked by the CI){modules}includefsdp,megatron,veomni,sglang,vllm,rollout,trainer,ci,training_utils,recipe,hardware,deployment,ray,worker,single_controller,misc,perf,model,algo,env,tool,ckpt,doc,data,cfg,reward,like[megatron, fsdp, doc]{type}is infeat,fix,refactor,chore,test[BREAKING]to the beginning of the title.[BREAKING][fsdp, megatron] feat: dynamic batchingTest
API and Usage Example
Design & Code Changes
client.call_tool_mcp(...)inasyncio.wait_for(..., timeout=timeout).Checklist Before Submitting
Important
Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.
pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=alwaysci-requestchannel in theverlSlack workspace. (If not accessible, please try the Feishu group (飞书群).)recipesubmodule, please also update the reference to the submodule commit viagit submodule update --remoteorcd recipe && git pull origin main.