Session becomes unresumable after JSONL writer drops assistant entry during parallel tool calls #31328

@ppiankov

Summary

A session becomes permanently stuck with API error 400 when the JSONL writer drops an assistant message during parallel tool call execution. The dropped entry creates an orphaned tool_result that breaks the parent chain, making the session unresumable from both CLI (claude --resume) and Claude for Mac.

Reproduction

  1. Run a session with multiple concurrent subagents (3+ parallel tool calls)
  2. One assistant message containing a tool_use block is never written to the JSONL
  3. The corresponding tool_result (user entry) IS written, referencing a tool_use_id with no matching tool_use
  4. Close and resume the session
  5. Every subsequent API call fails with:
     400 {"type":"error","error":{"type":"invalid_request_error",
     "message":"messages.0.content.0: unexpected tool_use_id found in tool_result blocks: toolu_XXXX.
     Each tool_result block must have a corresponding tool_use block in the previous message."}}
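The mismatch described in steps 2 and 3 can be checked offline. The following is a minimal sketch, not part of the original report: it assumes each JSONL line is an object whose `message.content` is a list of blocks, with `tool_use` blocks carrying an `id` and `tool_result` blocks carrying a `tool_use_id` (the block shape implied by the API error above). The function name `find_orphaned_tool_results` is hypothetical.

```python
import json
from typing import Iterable


def _blocks(entry: dict, block_type: str) -> list[dict]:
    """Return the content blocks of one type from a transcript entry."""
    message = entry.get("message")
    content = message.get("content") if isinstance(message, dict) else None
    if not isinstance(content, list):  # plain-string content has no blocks
        return []
    return [b for b in content if isinstance(b, dict) and b.get("type") == block_type]


def find_orphaned_tool_results(lines: Iterable[str]) -> list[str]:
    """Return tool_use_ids that a tool_result references but no tool_use declares."""
    declared: set[str] = set()
    referenced: list[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        declared.update(b["id"] for b in _blocks(entry, "tool_use") if "id" in b)
        referenced.extend(
            b["tool_use_id"] for b in _blocks(entry, "tool_result") if "tool_use_id" in b
        )
    # Compare only after the whole file is read, so entry order does not matter.
    return [tool_id for tool_id in referenced if tool_id not in declared]
```

To use it, pass an open file handle for the session's JSONL file, for example `find_orphaned_tool_results(open(path, encoding="utf-8"))`. Any ID it returns is a candidate for the 400 error shown above.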

Root Cause

The JSONL writer is not atomic with respect to multi-tool-use responses when concurrent agent instances write to the same session file. The assistant message (containing tool_use) is lost, but the subsequent user message (containing tool_result) is persisted.
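One way to serialise writers is an advisory file lock around each append, with the assistant entry and its tool_result written in a single call. The following is a minimal sketch under stated assumptions, not Claude Code's actual writer: it is Unix-only (it uses `fcntl.flock`), it only helps if every process writing to the file takes the same lock, and the function name `append_entries` is hypothetical.

```python
import fcntl
import json
import os


def append_entries(path: str, entries: list[dict]) -> None:
    """Append a group of entries as one locked write, so the group is all-or-nothing
    with respect to other cooperating writers."""
    payload = "".join(
        json.dumps(entry, separators=(",", ":")) + "\n" for entry in entries
    ).encode("utf-8")
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until other lock holders release
        try:
            view = memoryview(payload)
            while view:  # os.write may write fewer bytes than requested
                written = os.write(fd, view)
                view = view[written:]
            os.fsync(fd)  # flush to disk before releasing the lock
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

Passing the assistant entry and its tool_result together, as in `append_entries(path, [assistant_entry, tool_result_entry])`, means a reader never sees the second without the first. This does not cover an entry that is never handed to the writer at all.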

Evidence from affected session

  • Session: 45MB, 5064 entries, 15 compactions
  • 11 orphaned tool_result entries found (parent assistant messages missing)
  • 10 were in dead sidechains (no user impact)
  • 1 landed in the active parent chain → session permanently broken
  • Debug log shows 3 concurrent agent instances (cc_version hashes: .7e0, .3ae, .1a9) writing simultaneously at the time of corruption
  • The dropped message was a git commit tool_use — the Bash hook fired and completed, but the assistant entry was never written
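The split between orphans in dead sidechains and the one in the active parent chain can be reproduced by walking parent links back from the newest entry. The following is a minimal sketch: it assumes each entry has a `uuid` and a `parentUuid` field, and that the caller knows which entry is the active leaf. The report does not say how the client picks that leaf, so `leaf_uuid` is a hypothetical parameter, as are both function names.

```python
def active_chain_ids(entries: list[dict], leaf_uuid: str) -> set[str]:
    """Collect the uuids reachable from the leaf by following parentUuid links."""
    parent = {e["uuid"]: e.get("parentUuid") for e in entries if "uuid" in e}
    chain: set[str] = set()
    node = leaf_uuid
    while node in parent and node not in chain:  # the second test guards against cycles
        chain.add(node)
        node = parent[node]
    return chain


def split_orphans(entries: list[dict], leaf_uuid: str) -> tuple[list, list]:
    """Return (orphans on the active chain, orphans elsewhere) as tool_use_id lists."""

    def blocks(entry: dict) -> list[dict]:
        message = entry.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            return []
        return [b for b in content if isinstance(b, dict)]

    declared = {
        b.get("id") for e in entries for b in blocks(e) if b.get("type") == "tool_use"
    }
    active = active_chain_ids(entries, leaf_uuid)
    on_chain, off_chain = [], []
    for entry in entries:
        for block in blocks(entry):
            if block.get("type") != "tool_result":
                continue
            tool_id = block.get("tool_use_id")
            if tool_id not in declared:
                target = on_chain if entry.get("uuid") in active else off_chain
                target.append(tool_id)
    return on_chain, off_chain
```

Only the first list matters for resumability: an orphan off the active chain is never sent to the API.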

Additional issue: resume hangs on large sessions

After manually repairing the parent chain (removing the orphaned entry, re-parenting descendants), both claude --resume <id> and Claude for Mac fail to load the session:

  • CLI hangs indefinitely (no debug log output, no API call made)
  • Claude for Mac removes the session from the sidebar entirely
  • The JSONL parses correctly in <0.2s with Python, no cycles, no duplicate UUIDs
  • Session size: 45MB / 5060 entries — large but not unreasonable for a multi-day session
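The structural checks mentioned above (the file parses, no cycles, no duplicate UUIDs) can be sketched as follows. This is an illustration rather than the script used for the report. It assumes `uuid` and `parentUuid` fields on each entry, and the function name `check_chain` and its result keys are hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def check_chain(lines: Iterable[str]) -> dict[str, list[str]]:
    """Report duplicate uuids, uuids on a parent cycle, and missing parents."""
    entries = [json.loads(line) for line in lines if line.strip()]
    uuids = [e["uuid"] for e in entries if "uuid" in e]
    duplicates = sorted(u for u, count in Counter(uuids).items() if count > 1)

    parent = {e["uuid"]: e.get("parentUuid") for e in entries if "uuid" in e}
    dangling = sorted({p for p in parent.values() if p is not None and p not in parent})

    in_cycle: set[str] = set()
    cleared: set[str] = set()  # nodes already walked; no need to revisit them
    for start in parent:
        path: list[str] = []
        on_path: set[str] = set()
        node = start
        while node is not None and node in parent and node not in cleared:
            if node in on_path:  # came back to a node on this walk: a cycle
                in_cycle.update(path[path.index(node):])
                break
            on_path.add(node)
            path.append(node)
            node = parent[node]
        cleared.update(path)

    return {
        "duplicate_uuids": duplicates,
        "cycle_uuids": sorted(in_cycle),
        "dangling_parents": dangling,
    }
```

A file that passes has three empty lists. A dangling parent is not necessarily corruption, since the report does not say whether compaction leaves such references behind.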

Expected behavior

  1. Writer atomicity: Assistant messages and their tool_result responses should be written atomically, or tool_results should validate that their parent tool_use exists before persisting
  2. Graceful resume: Large or corrupted sessions should show an error message, not hang silently
  3. Self-healing: On resume, detect orphaned tool_results in the active chain and skip them (they contain no user-authored content)
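The self-healing step in item 3 could look roughly like the manual repair described earlier: remove the orphaned tool_result and re-parent its descendants. The following is a minimal sketch, not a proposed patch. It assumes `uuid` and `parentUuid` fields and block-list `message.content`, and the function name `drop_orphaned_tool_results` is hypothetical. It removes only the orphaned blocks, and drops a whole entry only when nothing else remains in it.

```python
from typing import Optional


def _content(entry: dict) -> list:
    message = entry.get("message")
    content = message.get("content") if isinstance(message, dict) else None
    return content if isinstance(content, list) else []


def drop_orphaned_tool_results(entries: list[dict]) -> list[dict]:
    """Return a repaired copy of entries; the input list is not modified."""
    declared = {
        b.get("id")
        for e in entries
        for b in _content(e)
        if isinstance(b, dict) and b.get("type") == "tool_use"
    }

    def is_orphan(block) -> bool:
        return (
            isinstance(block, dict)
            and block.get("type") == "tool_result"
            and block.get("tool_use_id") not in declared
        )

    kept: list[dict] = []
    removed_parent: dict[str, Optional[str]] = {}  # removed uuid -> its parent
    for entry in entries:
        blocks = _content(entry)
        survivors = [b for b in blocks if not is_orphan(b)]
        if len(survivors) == len(blocks):
            kept.append(entry)  # nothing orphaned in this entry
        elif survivors:
            kept.append({**entry, "message": {**entry["message"], "content": survivors}})
        elif "uuid" in entry:
            removed_parent[entry["uuid"]] = entry.get("parentUuid")

    def resolve(parent):
        """Skip over removed entries to the nearest surviving ancestor."""
        seen = set()
        while parent in removed_parent and parent not in seen:
            seen.add(parent)
            parent = removed_parent[parent]
        return parent

    return [
        {**e, "parentUuid": resolve(e["parentUuid"])} if "parentUuid" in e else e
        for e in kept
    ]
```

Working on a copy lets the client keep the original file untouched and tell the user what was skipped.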

Environment

  • Claude Code CLI: 2.1.69
  • Claude for Mac (claude-desktop): 2.1.51
  • macOS 15.3.1 (Darwin 25.2.0)

Metadata

Labels

  • area:core
  • bug: Something isn't working
  • has repro: Has detailed reproduction steps
  • platform:macos: Issue specifically occurs on macOS
  • stale: Issue is inactive
