fix: resolve cross-task cancel scope RuntimeError on async generator cleanup (#454) by qing-ant · Pull Request #746 · anthropics/claude-agent-sdk-python

qing-ant · 2026-03-26T17:54:59Z

Problem

When users break out of the async for loop over query(), Python may finalize the async generator in a different task than the one that created the task group. This causes close() to call TaskGroup.__aexit__() from a different task than start() called __aenter__(), triggering:

RuntimeError: Attempted to exit cancel scope in a different task than it was entered in

Fixes #454.

Root cause

The Query class was using anyio's TaskGroup with manual __aenter__/__aexit__ calls. anyio's cancel scopes have task affinity — they must be exited by the same async task that entered them. During async generator finalization, Python can schedule the generator's cleanup in a different task, violating this invariant.
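
The finalization behaviour described above can be observed with the standard library alone. The following is a minimal sketch, not code from this PR: `numbers`, `consume`, and `seen` are hypothetical names, and the result assumes CPython with the default asyncio event loop, which schedules `aclose()` for an abandoned async generator as a new task.

```python
import asyncio
import gc


async def numbers(seen):
    # Remember the task that first drives the generator.
    seen["entered"] = asyncio.current_task()
    try:
        for i in range(10):
            yield i
    finally:
        # Runs when the generator is closed; record which task runs the cleanup.
        seen["exited"] = asyncio.current_task()


async def consume(seen):
    async for n in numbers(seen):
        if n == 1:
            break  # leaves the generator suspended at its yield, not closed


async def main():
    seen = {}
    await consume(seen)
    gc.collect()              # nudge finalization on non-refcounting runtimes
    await asyncio.sleep(0.1)  # let the loop run the scheduled aclose()
    return seen.get("exited") is not None and seen["exited"] is not seen["entered"]


if __name__ == "__main__":
    print("cleanup ran in a different task:", asyncio.run(main()))
```

Any resource with task affinity that is entered before the `yield` and exited in the `finally` block would therefore be exited by a task other than the one that entered it.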

Why PR #364 doesn't fix it

PR #364 introduces an "owner task pattern" that wraps the inner task group in a dedicated owner task. However, it still creates an outer task group (_outer_tg) using the same manual __aenter__/__aexit__ pattern, so the cross-task error just moves one level up. The tests in that PR call start() and close() from the same task, so they don't reproduce the actual failure scenario.

Solution

Replace anyio TaskGroup with asyncio.create_task() for background task management. asyncio.create_task() has no cancel scope, so close() can cancel tasks from any task context without triggering the RuntimeError.

Changes:

query.py: Replace _tg (anyio TaskGroup) with _read_task (asyncio Task) and _child_tasks (set of asyncio Tasks). Add spawn_task() method as the replacement for _tg.start_soon().
client.py / _internal/client.py: Update callers to use spawn_task() instead of _tg.start_soon().
test_query.py: Add tests that reproduce the cross-task cleanup scenario.
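
The bookkeeping described in the changes above might look roughly like the sketch below. It is an illustration under assumptions, not the SDK's actual `Query` implementation: the `BackgroundTasks` class, its `start()` signature, and `demo()` are hypothetical, and only the names `_read_task`, `_child_tasks`, and `spawn_task()` come from the description.

```python
import asyncio


class BackgroundTasks:
    """Hypothetical stand-in for the task tracking described above."""

    def __init__(self):
        self._read_task = None
        self._child_tasks = set()

    def start(self, reader_coro):
        # Plain asyncio task: no cancel scope is entered here.
        self._read_task = asyncio.create_task(reader_coro)
        return self._read_task

    def spawn_task(self, coro):
        # Track the child so close() can cancel it; drop it once it finishes.
        task = asyncio.create_task(coro)
        self._child_tasks.add(task)
        task.add_done_callback(self._child_tasks.discard)
        return task

    async def close(self):
        tasks = list(self._child_tasks)
        if self._read_task is not None:
            tasks.append(self._read_task)
        for task in tasks:
            task.cancel()
        # Wait for cancellation to finish without re-raising CancelledError.
        await asyncio.gather(*tasks, return_exceptions=True)
        self._read_task = None


async def demo():
    bg = BackgroundTasks()
    read_task = bg.start(asyncio.sleep(3600))
    child = bg.spawn_task(asyncio.sleep(3600))
    # Call close() from a different task than the one that called start().
    await asyncio.create_task(bg.close())
    return read_task.cancelled(), child.cancelled(), len(bg._child_tasks)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because `Task.cancel()` can be called from any task, `close()` here does not depend on which task started the background work.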

Test plan

All 356 existing tests pass
New test test_close_from_different_task_does_not_raise verifies cross-task cleanup works
New test test_close_from_same_task_still_works verifies normal cleanup still works
Linting (ruff) and type checking (mypy) pass

…ncel scope error (#454)

Replace manual anyio TaskGroup.__aenter__/__aexit__ calls with asyncio.create_task() for background task management in Query.

The anyio TaskGroup pattern required cancel scopes to be entered and exited in the same async task. When users break from the async generator returned by query(), Python may finalize the generator in a different task, causing close() to call __aexit__ from a different task than start() called __aenter__. This produced a RuntimeError: 'Attempted to exit cancel scope in a different task than it was entered in'

The fix uses asyncio.create_task() which has no cancel scope affinity, allowing close() to cancel the read task from any task context. A new spawn_task() method replaces _tg.start_soon() for child tasks.

:house: Remote-Dev: homespace

src/claude_agent_sdk/_internal/query.py

qing-ant · 2026-03-26T18:32:45Z

E2E Test Results

Test Script

"""E2E test for PR #746 / Issue #454: cross-task cancel scope RuntimeError on async generator cleanup.

When breaking out of `async for` over query(), Python may finalize the async
generator in a different task than the one that entered the anyio cancel scope,
causing:
  RuntimeError: Attempted to exit cancel scope in a different task than it was entered in

This test breaks early from the async generator and checks stderr for the error.
"""

import asyncio
import sys
import io
import logging
import warnings


async def run_test():
    from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage

    # Capture warnings and stderr to detect the RuntimeError
    stderr_capture = io.StringIO()
    handler = logging.StreamHandler(stderr_capture)
    handler.setLevel(logging.DEBUG)
    logging.getLogger().addHandler(handler)

    old_stderr = sys.stderr
    sys.stderr = io.TextIOWrapper(io.BytesIO(), write_through=True)

    error_detected = False

    try:
        messages = []
        async for msg in query(
            prompt="Say hello in exactly 3 words",
            options=ClaudeAgentOptions(model="claude-sonnet-4-20250514"),
        ):
            messages.append(msg)
            if isinstance(msg, ResultMessage):
                result_preview = (msg.result or "")[:80]
                print(f"Got ResultMessage, breaking early. Result: {result_preview}...")
                break

        # Give the event loop a chance to process any pending callbacks/finalizers
        # that would trigger the cross-task cancel scope error
        await asyncio.sleep(0.5)

        # Force garbage collection to trigger async generator finalization
        import gc
        gc.collect()
        await asyncio.sleep(0.5)

    except Exception as e:
        print(f"Exception during query: {type(e).__name__}: {e}")
        error_detected = True

    finally:
        # Restore stderr, detach the log handler, and collect captured output
        captured_stderr_bytes = sys.stderr.buffer.getvalue() if hasattr(sys.stderr, 'buffer') else b""
        sys.stderr = old_stderr
        logging.getLogger().removeHandler(handler)
        captured_stderr = captured_stderr_bytes.decode("utf-8", errors="replace")
        captured_logs = stderr_capture.getvalue()

    # Check for the specific RuntimeError in captured output
    all_output = captured_stderr + captured_logs
    if "cancel scope" in all_output.lower() or "RuntimeError" in all_output:
        error_detected = True
        print("\n--- CAPTURED STDERR/LOGS ---")
        print(all_output.strip())
        print("--- END CAPTURED ---")

    return error_detected, messages


def main():
    # Also install a custom exception handler to catch "Task exception was never retrieved"
    exceptions_found = []

    def custom_exception_handler(loop, context):
        msg = context.get("message", "")
        exc = context.get("exception", None)
        detail = f"{msg}: {exc}" if exc else msg
        exceptions_found.append(detail)
        print(f"[Exception handler] {detail}", file=sys.__stderr__)

    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    loop.set_exception_handler(custom_exception_handler)

    try:
        error_detected, messages = loop.run_until_complete(run_test())
        # Give time for any deferred task exceptions
        loop.run_until_complete(asyncio.sleep(1.0))
    finally:
        # Run pending callbacks
        loop.run_until_complete(asyncio.sleep(0.2))
        loop.close()

    # Check for cross-task cancel scope errors in the exception handler output
    cancel_scope_errors = [e for e in exceptions_found if "cancel scope" in e.lower() or "RuntimeError" in e]
    any_task_exceptions = len(exceptions_found) > 0

    print(f"\nMessages received before break: {len(messages)}")
    print(f"Exception handler caught {len(exceptions_found)} exception(s)")
    for e in exceptions_found:
        print(f"  - {e}")

    if error_detected or cancel_scope_errors:
        print("\n>>> FAIL: RuntimeError about cancel scope detected <<<")
        sys.exit(1)
    elif any_task_exceptions:
        print("\n>>> FAIL: Task exceptions detected (not cancel scope, but still errors) <<<")
        sys.exit(1)
    else:
        print("\n>>> PASS: No cross-task cancel scope errors <<<")
        sys.exit(0)


if __name__ == "__main__":
    main()

Regression Test (main branch) - FAIL as expected

SDK installed from origin/main (76cb292). Breaking early from the async generator triggers the cross-task cancel scope RuntimeError:

[Exception handler] Task exception was never retrieved: Attempted to exit cancel scope in a different task than it was entered in
Got ResultMessage, breaking early. Result: Hello there friend!...
Traceback (most recent call last):
  File "/tmp/e2e-746-test.py", line 120, in <module>
    main()
  File "/tmp/e2e-746-test.py", line 91, in main
    error_detected, messages = loop.run_until_complete(run_test())
  File "asyncio/base_events.py", line 725, in run_until_complete
    return future.result()
  File "/tmp/e2e-746-test.py", line 46, in run_test
    await asyncio.sleep(0.5)
  File "asyncio/tasks.py", line 718, in sleep
    return await future
asyncio.exceptions.CancelledError: Cancelled by cancel scope 7f9c466835c0

Exit code: 1 (FAIL)

Fixed Branch Test (`qing/fix-454-cross-task-cancel-scope`) - PASS

SDK installed from the fix branch (b7dddce). Breaking early from the async generator works cleanly with no errors:

Got ResultMessage, breaking early. Result: Hello there friend!...

Messages received before break: 5
Exception handler caught 0 exception(s)

>>> PASS: No cross-task cancel scope errors <<<

Exit code: 0 (PASS)

Verdict

PASS - The fix correctly resolves the cross-task cancel scope RuntimeError. Replacing the anyio TaskGroup with direct asyncio.Task management eliminates the cancel scope that caused the cross-task finalization error when Python's async generator cleanup runs in a different task.

bogini

stamped 🐎

## Summary

Implements `control_cancel_request` handling in the Python SDK. Previously, these messages from the CLI were silently ignored via a TODO placeholder at `_internal/query.py:210-213`.

Fixes #739

## Problem

When the CLI sends `control_cancel_request` to cancel an in-flight hook callback (e.g., when a subagent completes while a parent-level hook is still pending, or during query shutdown), the SDK takes no action. This causes:

1. **CLI-side AbortError noise**: The CLI fires its abort signal, rejects the pending hook request, and logs `Error in hook callback hook_N: ... AbortError` to stderr on every cancelled hook.
2. **Python runs cancelled callbacks**: Hook callbacks continue executing after the CLI has abandoned them. The eventual response write either gets dropped silently or hits a closed transport.
3. **Shutdown desync**: During `close()`, in-flight hooks that should have been cancelled are still running.

## Fix

- **`__init__`**: Add `self._inflight_requests: dict[str, asyncio.Task]` to track control request handlers by `request_id`
- **`_read_messages`**: When spawning `_handle_control_request` tasks, register them in `_inflight_requests` with a done-callback that removes them on completion. When `control_cancel_request` arrives, look up the task by `request_id` and cancel it.
- **`_handle_control_request`**: Catch and re-raise `asyncio.CancelledError` before the generic `Exception` handler, so cancelled tasks don't attempt to write error responses for requests the CLI has already abandoned.

The issue's suggested fix used `anyio.CancelScope`, but PR #746 replaced the anyio TaskGroup with plain `asyncio.Task` tracking, so this fix uses the simpler `asyncio.Task.cancel()` approach that matches the current architecture.

## Verification

**Unit tests (3 new):**

- `test_cancel_request_cancels_inflight_hook`: slow hook gets cancelled, `CancelledError` raised, no response written
- `test_cancel_request_for_unknown_id_is_noop`: unknown `request_id` doesn't raise
- `test_completed_request_is_removed_from_inflight`: completed handlers are cleaned up from tracking dict

**End-to-end with live SDK instance:**

```
=== Structural check ===
PASS: control_cancel_request handler implemented, TODO removed
PASS: _handle_control_request re-raises CancelledError without writing
PASS: _inflight_requests dict initialized
=== Live E2E: hooks still work after fix ===
ResultMessage: is_error=False, turns=1
Hook called 2 times: ['Agent', 'Write']
PASS: Hooks work correctly after fix
```

**Test suite:**

- 407 tests pass (2 pre-existing trio backend failures on main)
- `ruff check` + `ruff format` clean
- `mypy src/` clean
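
The in-flight tracking described under "Fix" could be sketched as follows. This is a simplified illustration under assumptions, not the SDK's code: `InflightRegistry`, `on_control_request`, `on_control_cancel_request`, and the `responses` list (standing in for writes to the transport) are hypothetical names; only `_inflight_requests` and `_handle_control_request` appear in the description.

```python
import asyncio


class InflightRegistry:
    """Hypothetical model of request tracking keyed by request_id."""

    def __init__(self):
        self._inflight_requests = {}  # request_id -> asyncio.Task
        self.responses = []           # stand-in for responses written to the CLI

    async def _handle_control_request(self, request_id, callback):
        try:
            result = await callback()
            self.responses.append((request_id, "success", result))
        except asyncio.CancelledError:
            # The CLI abandoned this request: write nothing, just propagate.
            raise
        except Exception as exc:
            self.responses.append((request_id, "error", str(exc)))

    def on_control_request(self, request_id, callback):
        task = asyncio.create_task(self._handle_control_request(request_id, callback))
        self._inflight_requests[request_id] = task
        # Remove the entry when the handler finishes, however it finishes.
        task.add_done_callback(lambda _t: self._inflight_requests.pop(request_id, None))
        return task

    def on_control_cancel_request(self, request_id):
        task = self._inflight_requests.get(request_id)
        if task is not None:  # unknown ids are a no-op
            task.cancel()


async def demo():
    reg = InflightRegistry()

    async def slow_hook():
        await asyncio.sleep(3600)
        return "done"

    task = reg.on_control_request("req_1", slow_hook)
    await asyncio.sleep(0)                     # let the handler start
    reg.on_control_cancel_request("req_1")
    reg.on_control_cancel_request("unknown")   # must not raise
    await asyncio.gather(task, return_exceptions=True)
    return task.cancelled(), reg.responses, dict(reg._inflight_requests)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Since Python 3.8 `asyncio.CancelledError` derives from `BaseException`, so a bare `except Exception` would not swallow it anyway; the explicit re-raise mainly documents the intent that cancelled handlers write no response.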

tjni · 2026-04-02T16:40:43Z

Hi @qing-ant, is it possible to resolve this in a way that continues to use anyio in order to support trio?

claude bot reviewed Mar 26, 2026

qing-ant enabled auto-merge (squash) March 26, 2026 23:12

bogini approved these changes Mar 26, 2026

qing-ant merged commit f39ebeb into main Mar 26, 2026
10 checks passed

qing-ant deleted the qing/fix-454-cross-task-cancel-scope branch March 26, 2026 23:26

This was referenced Mar 27, 2026

Query.close() can hang indefinitely causing 100% CPU usage due to missing timeout on task group cleanup #378

Closed

fix: implement control_cancel_request handling #751

Merged

qing-ant mentioned this pull request Mar 30, 2026

RuntimeError: Attempted to exit cancel scope in a different task on query() cancellation #776

Closed

davidcyze mentioned this pull request Apr 12, 2026

SubprocessCLITransport stderr task group leaks cancel scope on query() completion — same bug #454 / #776, fix #746 incomplete #810

Open
