control_cancel_request messages are silently ignored, causing hook desync and AbortError noise #739

@hashdaddyd

SDK version: claude-agent-sdk 0.1.48 (Python), Bundled CLI 2.1.78

Description

When the CLI sends a control_cancel_request message to cancel an in-flight hook callback, the Python SDK ignores it entirely. The handler at _internal/query.py L205-208 is a no-op:

elif msg_type == "control_cancel_request":
    # Handle cancel requests
    # TODO: Implement cancellation support
    continue

This causes three problems:

  1. CLI-side AbortError noise -- The CLI fires its abort signal, rejects the pending hook request with AbortError, and logs Error in hook callback hook_N: ... AbortError to stderr on every cancelled hook. This is noisy but non-fatal.

  2. Python runs cancelled callbacks -- Since the SDK never acts on the cancel, it continues executing hook callbacks that the CLI has already abandoned. The eventual response write either gets dropped silently or hits a closed transport.

  3. Shutdown desync -- During close(), in-flight hooks that should have been cancelled are still running. Combined with the ExceptionGroup issue from #578 ("CLIConnectionError: ProcessTransport is not ready for writing when using SDK MCP servers with string prompts") and PR #492 ("fix: handle transport close race condition in control request responses"), this creates cascading failures.
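
Problem 2 can be illustrated without the SDK. The following is a minimal stdlib-only sketch, not the SDK's code: `slow_hook`, `read_messages`, and `run_demo` are hypothetical names, a plain list stands in for transport writes, and the message dicts carry only the `type` and `request_id` keys used elsewhere in this report. The cancel branch is a no-op, as in the current handler, so the callback still finishes and writes a late response.

```python
import asyncio


async def slow_hook(request_id: str, responses: list[str]) -> None:
    """Hypothetical hook callback that outlives the peer's interest in it."""
    await asyncio.sleep(0.05)
    # Late "transport write" for a request the peer has already abandoned.
    responses.append(request_id)


async def read_messages(messages: list[dict], responses: list[str]) -> None:
    tasks: set[asyncio.Task] = set()
    for msg in messages:
        if msg["type"] == "control_request":
            # Fire and forget: nothing maps request_id to this task.
            tasks.add(asyncio.create_task(slow_hook(msg["request_id"], responses)))
        elif msg["type"] == "control_cancel_request":
            # No-op, mirroring the TODO in the current handler.
            continue
    await asyncio.gather(*tasks)


def run_demo() -> list[str]:
    responses: list[str] = []
    messages = [
        {"type": "control_request", "request_id": "req-1"},
        {"type": "control_cancel_request", "request_id": "req-1"},
    ]
    asyncio.run(read_messages(messages, responses))
    return responses


if __name__ == "__main__":
    # The cancelled request still produces a response.
    print(run_demo())
```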

Root Cause

_read_messages() receives control_cancel_request but takes no action. Meanwhile, _handle_control_request() tasks spawned via self._tg.start_soon() (L202) are not tracked by request ID, so there is no mechanism to cancel a specific in-flight request even if the handler were implemented.

The CLI sends control_cancel_request when:

  • A subagent completes while a parent-level hook is still pending
  • The overall query's abort controller fires
  • A hook callback exceeds the CLI's internal timeout

Steps to Reproduce

  1. Create a multi-agent pipeline with SDK hooks (PreToolUse/PostToolUse)
  2. Dispatch multiple subagents via the Agent tool
  3. When subagents complete or during shutdown, observe stderr:
Error in hook callback hook_2: ...
AbortError:
      at A (/$bunfs/root/src/entrypoints/cli.js:13318:473)
      at K (/$bunfs/root/src/entrypoints/cli.js:6723:7433)

Minimal reproduction:

import asyncio
from claude_agent_sdk import (
    query, ClaudeAgentOptions, HookMatcher,
    AgentDefinition,
)

async def slow_hook(input_data, tool_use_id, context):
    """Hook that takes time -- will be cancelled by CLI."""
    await asyncio.sleep(2)
    return {}

async def main():
    options = ClaudeAgentOptions(
        permission_mode="bypassPermissions",
        hooks={
            "PostToolUse": [
                HookMatcher(matcher=".*", hooks=[slow_hook]),
            ],
        },
        agents={
            "researcher": AgentDefinition(
                description="Quick research task",
                prompt="Search the web for 'Python asyncio' and summarize in one sentence.",
            ),
        },
    )

    async for msg in query(
        prompt="Use the researcher agent to look up Python asyncio, then summarize.",
        options=options,
    ):
        print(type(msg).__name__)

asyncio.run(main())

Expected Behavior

When the CLI sends control_cancel_request:

  1. The SDK cancels the matching in-flight _handle_control_request task
  2. The cancelled task does not attempt to write a response
  3. No AbortError noise in stderr
  4. Clean shutdown without orphaned hook callbacks
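
The first two expectations can be sketched in isolation. This is an illustrative stand-in rather than the SDK's implementation: `ControlDispatcher`, `on_message`, and `demo` are hypothetical names, and it uses `asyncio.Task` handles instead of the `anyio.CancelScope` approach proposed under Suggested Fix, purely to stay stdlib-only. The idea is the same: index in-flight work by request ID so a cancel message can target one request.

```python
import asyncio
from typing import Awaitable, Callable, Optional

Hook = Callable[[], Awaitable[None]]


class ControlDispatcher:
    """Hypothetical stand-in for the SDK's message loop."""

    def __init__(self) -> None:
        self.inflight: dict[str, asyncio.Task] = {}
        self.written: list[str] = []  # stands in for transport writes

    async def _handle(self, request_id: str, hook: Hook) -> None:
        await hook()
        # Only reached if the hook was not cancelled.
        self.written.append(request_id)

    def on_message(self, msg: dict, hook: Optional[Hook] = None) -> None:
        if msg["type"] == "control_request" and hook is not None:
            rid = msg["request_id"]
            task = asyncio.create_task(self._handle(rid, hook))
            self.inflight[rid] = task
            # Drop the entry however the task ends (done or cancelled).
            task.add_done_callback(lambda _t, rid=rid: self.inflight.pop(rid, None))
        elif msg["type"] == "control_cancel_request":
            task = self.inflight.get(msg.get("request_id"))
            if task is not None:
                task.cancel()


async def demo() -> tuple[list[str], list[str]]:
    dispatcher = ControlDispatcher()

    async def slow() -> None:
        await asyncio.sleep(1)

    async def fast() -> None:
        await asyncio.sleep(0)

    dispatcher.on_message({"type": "control_request", "request_id": "req-1"}, slow)
    dispatcher.on_message({"type": "control_request", "request_id": "req-2"}, fast)
    await asyncio.sleep(0.01)  # let both handlers start
    dispatcher.on_message({"type": "control_cancel_request", "request_id": "req-1"})
    await asyncio.sleep(0.01)  # let the cancellation propagate
    return dispatcher.written, sorted(dispatcher.inflight)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the cancelled request never reaches the write, and the registry is empty afterwards, so nothing is left running at shutdown.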

Actual Behavior

The SDK ignores control_cancel_request. The hook callback keeps running after the CLI has abandoned it, the CLI logs an AbortError to stderr for each cancelled hook, and the late response is either dropped silently or written to a closed transport.

Suggested Fix

Track in-flight _handle_control_request tasks by request ID using anyio.CancelScope, and cancel the matching scope when control_cancel_request arrives:

# In __init__:
self._inflight_tasks: dict[str, anyio.CancelScope] = {}

# In _read_messages, replace the TODO:
elif msg_type == "control_cancel_request":
    cancel_id = message.get("request_id")
    if cancel_id:
        scope = self._inflight_tasks.get(cancel_id)
        if scope:
            scope.cancel()
    continue

# In _handle_control_request, wrap body in CancelScope:
async def _handle_control_request(self, request):
    if self._closed:
        return
    request_id = request["request_id"]
    with anyio.CancelScope() as scope:
        self._inflight_tasks[request_id] = scope
        try:
            # ... existing dispatch logic ...
            if scope.cancel_called or self._closed:
                return
            await self.transport.write(json.dumps(success_response) + "\n")
        except Exception as e:
            if self._closed or scope.cancel_called:
                return
            # ... error response ...
        finally:
            self._inflight_tasks.pop(request_id, None)
