Note that this was fully written by OpenAI Codex
Summary
When using claude-agent-sdk with SDK MCP servers (mcp_servers={...,"type":"sdk"}), tool calls can become unavailable or fail in scenarios where subagents keep running “in background” after the parent response completes. In the transcript this often appears as the model emitting a plain-text <function_calls><invoke ...></invoke></function_calls> block instead of a real tool_use / tool_result pair, i.e. it behaves as if the MCP tool is missing.
This seems correlated with the SDK’s internal message buffering/backpressure: if the application stops consuming messages after the parent ResultMessage (e.g. uses receive_response() and returns), later streaming output from background subagents can fill the SDK’s internal queue and block the transport reader, which then blocks the control protocol needed for SDK MCP bridging (mcp_message control requests).
Environment
- claude-agent-sdk: 0.1.17
- Claude Code CLI: 2.0.70 (from stream-json transcript)
- Python: 3.12.3
- mcp: 1.21.1
- anyio: 4.11.0
- include_partial_messages: True (in our usage; increases message volume)
What we see in practice
Working (foreground): real tool call:
- The assistant emits a tool_use block: mcp__action_manager__persist_character_design
- Then a tool_result is delivered back.
Failing (background): tool call “hallucinated” as plain text:
- The assistant message is just a text block that contains <function_calls><invoke name="mcp__...">...
- No tool_use / tool_result blocks appear, but the assistant text claims success.
This matches the behavior when the model does not actually have the tool schema available or cannot complete the tool call.
Reproduction sketch (minimal)
I don’t have a single deterministic prompt-only repro yet, but the pattern is:
- Configure ClaudeSDKClient with an SDK MCP server: mcp_servers={"action_manager": create_sdk_mcp_server(...)}
- Ensure the model uses a subagent/background mechanism that can produce output after the parent ResultMessage (e.g., Task/subagent jobs that continue running while the parent returns).
- In application code, send client.query(...) and then consume messages only until the first ResultMessage (e.g., async for m in client.receive_response(): ..., which terminates at ResultMessage).
- Don’t keep draining client.receive_messages() afterwards.
- If the CLI keeps producing additional events/messages (especially with include_partial_messages=True), the SDK’s internal queue can fill, causing backpressure and breaking the ability to service later control requests (including SDK MCP).
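The pattern above, expressed as a code sketch (assuming the current ClaudeSDKClient API; the tool definition and prompt are illustrative, and the tool name is taken from the failing transcript):

```python
import asyncio
from claude_agent_sdk import (
    ClaudeAgentOptions, ClaudeSDKClient, create_sdk_mcp_server, tool,
)

# Illustrative in-process tool; any SDK MCP tool should reproduce the shape.
@tool("persist_character_design", "Persist a character design", {"name": str})
async def persist_character_design(args):
    return {"content": [{"type": "text", "text": f"saved {args['name']}"}]}

options = ClaudeAgentOptions(
    mcp_servers={
        "action_manager": create_sdk_mcp_server(
            name="action_manager", tools=[persist_character_design]
        )
    },
    allowed_tools=["mcp__action_manager__persist_character_design"],
    include_partial_messages=True,  # increases message volume
)

async def main():
    async with ClaudeSDKClient(options=options) as client:
        # A prompt that spawns Task/subagent work continuing in background.
        await client.query("...prompt that spawns background subagent work...")
        async for message in client.receive_response():
            pass  # stops at the first ResultMessage
        # BUG TRIGGER: we return here without draining receive_messages(),
        # so background subagent output can fill the SDK's internal queue.

asyncio.run(main())
```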
Expected behavior
Even if the application uses receive_response() (and therefore stops consuming after the parent ResultMessage), SDK MCP tool availability and tool execution should remain reliable for any background/subagent work that is still ongoing within the same session.
At minimum, the SDK should not deadlock/control-protocol-starve when the app temporarily isn’t consuming transcript messages.
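In the meantime, the pattern that avoids the starvation is to keep a background task draining the stream after the parent response completes. A minimal sketch of that pump, with a plain async generator standing in for client.receive_messages():

```python
import asyncio

async def fake_receive_messages(drained):
    # Stand-in for client.receive_messages(): background subagent output
    # that keeps arriving after the parent ResultMessage.
    for msg in ["subagent-1", "subagent-2", "subagent-3"]:
        await asyncio.sleep(0)
        drained.append(msg)
        yield msg

async def main():
    drained = []

    # After consuming the parent response, hand the rest of the stream to a
    # background pump so the SDK's internal queue keeps emptying.
    async def pump():
        async for _ in fake_receive_messages(drained):
            pass  # discard (or log) transcript messages we no longer need

    pump_task = asyncio.create_task(pump())
    # ... application continues with other work here ...
    await pump_task  # in real code, cancel this when the session closes
    return drained

drained = asyncio.run(main())
print(drained)  # → ['subagent-1', 'subagent-2', 'subagent-3']
```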
Actual behavior
After the parent response completes, background/subagent work that tries to call SDK MCP tools may:
- not see tool schemas / not be able to call tools
- emit a plain-text “function_calls/invoke” block (hallucinated tool call)
- or otherwise fail to get tool results
Suspected root cause (SDK-side)
In claude_agent_sdk/_internal/query.py:
- The SDK uses an internal memory stream with a small buffer: anyio.create_memory_object_stream(max_buffer_size=100)
- _read_messages() forwards all non-control messages into this buffer via await self._message_send.send(message).
- If the application stops consuming messages (e.g., stops after ResultMessage) while the CLI continues emitting them (common with partial streaming and/or subagent background output), then:
  - _message_send.send(...) blocks once the buffer reaches capacity.
  - _read_messages() stops draining stdout from the Claude Code CLI process.
  - Control protocol messages that arrive later on stdout (including the control_request subtype mcp_message used for SDK MCP bridging) are not read/handled promptly.
  - SDK MCP becomes unreliable, which manifests as missing tool schemas or missing tool results.
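This chain can be simulated with stdlib asyncio (a sketch only: asyncio.Queue stands in for the anyio memory stream, and read_messages mimics _read_messages with a buffer of 2 instead of 100):

```python
import asyncio

async def main():
    # Stand-in for the SDK's internal buffer
    # (anyio.create_memory_object_stream(max_buffer_size=100), shrunk to 2).
    queue = asyncio.Queue(maxsize=2)
    control_handled = []

    async def read_messages(stdout_lines):
        # Mirrors _read_messages(): control messages are handled inline,
        # everything else is forwarded to the user-facing queue.
        for msg in stdout_lines:
            if msg.startswith("control:"):
                control_handled.append(msg)
            else:
                await queue.put(msg)  # blocks once the buffer is full

    # The CLI keeps emitting transcript messages, then a control request.
    stdout = ["msg1", "msg2", "msg3", "msg4", "control:mcp_message"]

    # The application has stopped consuming, so nothing drains `queue`.
    reader = asyncio.create_task(read_messages(stdout))
    done, pending = await asyncio.wait({reader}, timeout=0.2)

    # The reader never reaches the control request: it is stuck on put().
    reader.cancel()
    return control_handled, reader in pending

control_handled, reader_stuck = asyncio.run(main())
print(control_handled, reader_stuck)  # → [] True
```

The mcp_message control request is never handled, even though it is already sitting in stdout: the reader is parked on a full user-facing queue.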
This is particularly surprising because receive_response() is presented as a convenience API; users may reasonably expect it to be safe in sessions that use SDK MCP servers.
Proposed fixes / improvements
One or more of:
- Never block _read_messages() on delivery to the user queue:
  - Use send_nowait() / move_on_after(0) for non-control messages.
  - If the queue is full, drop messages (or drop only low-value messages like partial StreamEvent).
  - The priority should be “keep draining CLI stdout + keep servicing the control protocol”.
- Make the internal message buffer size configurable:
  - e.g., ClaudeAgentOptions.max_message_queue (separate from max_buffer_size, which currently guards JSON line buffering).
- Add an SDK-managed background drain/pump:
  - If SDK MCP servers or hooks are configured, keep draining messages after receive_response() returns so the control channel remains healthy.
  - Or provide a documented helper/pattern for this.
- Documentation:
  - Explicitly warn that if you use SDK MCP servers (or expect background/subagent output), you must continue consuming receive_messages(), or you may starve the control channel.
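The first option (never block) can be sketched with a drop-on-full policy. asyncio.Queue stands in for anyio's memory stream here (put_nowait maps to MemoryObjectSendStream.send_nowait), and forward_nonblocking is an illustrative name, not SDK code:

```python
import asyncio

def forward_nonblocking(queue: asyncio.Queue, message, dropped: list):
    """Forward a non-control message without ever blocking the reader."""
    try:
        queue.put_nowait(message)
    except asyncio.QueueFull:
        # Keep draining CLI stdout: drop rather than block. A refinement
        # would drop only low-value messages (partial StreamEvents) first.
        dropped.append(message)

queue = asyncio.Queue(maxsize=2)
dropped = []
for i in range(4):
    forward_nonblocking(queue, f"msg{i}", dropped)

print(queue.qsize(), dropped)  # → 2 ['msg2', 'msg3']
```

The trade-off is message loss under pressure, which is why restricting drops to partial StreamEvents (or making the policy configurable) seems preferable to silent wholesale dropping.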
Why this matters
SDK MCP servers are a key feature for “in-process tools”. Background subagents (e.g., Task tool patterns) are also a core workflow. If receive_response() usage can cause hidden backpressure that breaks tool execution, it’s very easy for users to end up with brittle systems and hard-to-debug “hallucinated tool results”.