
Subagent results silently overflow context, causing unrecoverable session crash #23463

@treygoff24

## Bug Description

When multiple Task tool subagents complete and return large results to the parent agent, the combined results can overflow the parent's context window. The parent then enters a terminal "Prompt is too long" loop: it cannot process any incoming messages, cannot summarize or act on the results, and the session stays unresponsive until the user force-quits.

From the user's perspective, the session simply freezes and then crashes, with no indication of what happened and no way to recover the work. The user sees no error message, no partial results, and no graceful degradation.

## Reproduction Steps

1. Start a Claude Code session on a large codebase (200+ files).
2. Use the Task tool to spawn 7 parallel subagents (e.g., `subagent_type=general-purpose`), each tasked with auditing a different module against a spec.
3. Each agent reads many files and produces a detailed report (15K–37K chars each).
4. All 7 agents complete and return their results as task-notification messages to the parent.
5. The parent's context now contains the full conversation history PLUS all 7 agent results (~150K chars of agent output alone).
6. The parent agent responds with "Prompt is too long" to every subsequent message.
7. The session is permanently stuck, with no way to recover.

## Observed Behavior

From the JSONL session log (`4a5b2a9a-4416-41b5-ad5e-573db03dba2b.jsonl`):

- 209 total messages in the session before the crash
- Session log size: 6.2 MB
- 7 agent reports returned, totaling 150,511 characters of output
- 8 consecutive "Prompt is too long" errors: one for each incoming agent notification, plus one extra
- The parent agent could not execute a single tool call or produce any meaningful response after the agents started returning
- The session became completely unresponsive

Timeline from the log:

```
[Line 194] assistant: "Prompt is too long"   ← First failure
[Line 195] user: <task-notification> Agent "Audit auth + infrastructure" completed (19,606 chars)
[Line 196] assistant: "Prompt is too long"
[Line 197] user: <task-notification> Agent "Audit tasks + projects + checklist" completed (21,383 chars)
[Line 198] assistant: "Prompt is too long"
[Line 199] user: <task-notification> Agent "Audit notes + tags module" completed (15,031 chars)
[Line 200] assistant: "Prompt is too long"
[Line 201] user: <task-notification> Agent "Audit medical module" completed (18,664 chars)
[Line 202] assistant: "Prompt is too long"
[Line 204] user: <task-notification> Agent "Audit hobbies + UI shell + today" completed (23,267 chars)
[Line 205] assistant: "Prompt is too long"
[Line 207] user: <task-notification> Agent "Audit finance module" completed (36,786 chars)
[Line 208] assistant: "Prompt is too long"   ← Session permanently dead
```

Each incoming agent result made the problem worse, but the system kept delivering them with no backpressure.

## Expected Behavior

Several things should happen instead:

1. Prevent the overflow in the first place: The Task tool / agent coordinator should track the parent's remaining context budget and either truncate or summarize agent results before injecting them into the parent's context. A 37K-char agent result does not need to be delivered verbatim; a summary with a pointer to the full output file would suffice.

2. Graceful degradation when context is near-full: When the parent is approaching context limits, it should be able to trigger compaction/summarization of older messages BEFORE the context is completely full, not after it's too late.

3. Stop delivering messages to a dead session: Once the parent hits "Prompt is too long", the system should not keep delivering more agent results that make the situation worse. There should be backpressure or circuit-breaking.

4. Surface the error to the user: The user saw nothing; the session just froze and crashed. There should be a clear error message like: "Session context limit reached. Agent results have been saved to [path]. Please start a new session."

5. Auto-save results on crash: Agent results that caused the overflow should be automatically written to disk (they already exist in the JSONL log, but are not surfaced). I was able to recover the data by manually parsing the JSONL, but most users would assume it's all lost.

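To make item 1 concrete, here is a minimal sketch of budget-aware delivery: the full result is always written to disk, and the parent only receives what fits, plus a pointer to the file. Everything here is hypothetical and for illustration only (the `deliver_result` name, the character-based budget, the `max_inline_chars` default, the file layout). It does not describe how Claude Code actually handles results.

```python
from pathlib import Path


def deliver_result(name: str, result: str, remaining_chars: int,
                   out_dir: Path, max_inline_chars: int = 4000) -> tuple[str, int]:
    """Return (text to inject into the parent, updated remaining budget).

    Assumption: the budget is counted in characters. A real coordinator
    would more likely count tokens.
    """
    # Always persist the full report so nothing is lost if the session dies.
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{name}.md"
    path.write_text(result, encoding="utf-8")

    # Inline at most max_inline_chars, and never more than the budget left.
    limit = min(max_inline_chars, max(remaining_chars, 0))
    if len(result) <= limit:
        injected = result
    else:
        pointer = f"\n[truncated; full output ({len(result)} chars) saved to {path}]"
        # Keep as much of the head as fits; the pointer is always included,
        # so a nearly exhausted budget can still be slightly overshot.
        head = result[: max(limit - len(pointer), 0)]
        injected = head + pointer
    return injected, remaining_chars - len(injected)
```

With a 36,786-char report and 10,000 chars of budget left, the parent would receive a 4,000-char excerpt ending in the file pointer instead of the whole report. Replacing the head excerpt with a real summary is the obvious next step.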
## Recovery (Manual)

The agent results were recoverable from the JSONL session log at:

```
~/.claude/projects/-Users-treygoff-Code-goff-family-dashboard/4a5b2a9a-4416-41b5-ad5e-573db03dba2b.jsonl
```

I recovered them by parsing the `<task-notification>` / `<result>` tags out of the user-type messages. This required writing custom Python, because there is no built-in recovery mechanism.

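For anyone in the same situation, the extraction can be sketched roughly as follows. This is not the exact script I used, and the log schema is an assumption: it presumes each JSONL record is an object with a top-level `"type"` field equal to `"user"` for user messages, and that the notification text sits somewhere inside the record as a string. The helper names are made up for this example.

```python
import json
import re
from typing import Iterable, Iterator

# Non-greedy match so several <result> blocks in one message stay separate.
RESULT_RE = re.compile(r"<result>(.*?)</result>", re.DOTALL)


def iter_strings(node) -> Iterator[str]:
    """Yield every string nested anywhere inside a decoded JSON value."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_strings(value)


def extract_results(lines: Iterable[str]) -> list[str]:
    """Collect <result> bodies from user-type task notifications."""
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or corrupt lines
        # ASSUMPTION: user messages carry a top-level "type": "user".
        if not isinstance(record, dict) or record.get("type") != "user":
            continue
        for text in iter_strings(record):
            if "<task-notification>" in text:
                results.extend(m.strip() for m in RESULT_RE.findall(text))
    return results
```

To use it, open the session log with `open(path, encoding="utf-8")` and pass the file object to `extract_results`. Walking every string in the record avoids depending on the exact nesting of the message content, which I have not verified across versions.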
## Environment

- Claude Code version: 2.1.32
- OS: macOS Darwin 25.2.0
- Model: claude-opus-4-6
- Session ID: 4a5b2a9a-4416-41b5-ad5e-573db03dba2b

## Suggested Fixes (Priority Order)

1. Agent result size limits / summarization: Cap the size of results injected into the parent context. Write full results to a file and give the parent a summary + file path.
2. Context budget tracking: Before delivering agent results, check if they'll fit. If not, summarize or queue them.
3. Compaction trigger: When context usage exceeds ~80%, proactively compact older messages to make room.
4. Circuit breaker: After the first "Prompt is too long" error, stop delivering additional agent results and surface the error to the user with recovery instructions.
5. Crash recovery UX: On session restart, detect the previous crash and offer to show recovered agent outputs.

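Fixes 3 and 4 could interact roughly like the toy model below. It is only a sketch under loud assumptions: the `Coordinator` class is hypothetical, usage is counted in characters, and `compact()` is a placeholder that simply halves usage where a real implementation would summarize older messages.

```python
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    context_limit: int
    used: int = 0
    compact_at: float = 0.8      # the ~80% threshold from fix 3
    tripped: bool = False        # circuit-breaker state from fix 4
    held: list[str] = field(default_factory=list)
    delivered: list[str] = field(default_factory=list)

    def compact(self) -> None:
        # PLACEHOLDER: pretend compaction frees half of the used context.
        self.used //= 2

    def deliver(self, result: str) -> str:
        # Once tripped, hold results instead of feeding a dead session.
        if self.tripped:
            self.held.append(result)
            return "held"
        # Compact proactively if this result would cross the threshold.
        if self.used + len(result) > self.compact_at * self.context_limit:
            self.compact()
        # Still does not fit: trip the breaker and keep the result aside.
        if self.used + len(result) > self.context_limit:
            self.tripped = True
            self.held.append(result)
            return "tripped"
        self.used += len(result)
        self.delivered.append(result)
        return "delivered"
```

When `deliver` returns `"tripped"`, the caller would surface the error to the user and write everything in `held` to disk, rather than continuing to inject results as in the timeline above.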
## Impact

This is a high-severity UX bug for power users. The subagent pattern is one of Claude Code's most powerful features, and it's the recommended approach for large codebase audits. Having it silently crash with no recovery path when agents produce thorough results is extremely frustrating: it penalizes users for getting good results from their agents.
