fix: avoid ACP apply_patch deadlock#214
Closed
osolmaz wants to merge 1 commit into
Closed
Conversation
Member
|
Error case should be bypassed in current main. Will do a release shortly |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Clanker generated report, let me know if it's not clear enough:
Summary
apply_patchverification over ACP-backed filesystem readsReliable repro
This PR now includes a deterministic repro in
src/local_spawner.rs:apply_patch_verification_does_not_deadlock_over_acp_fsapply_patch_verification_childWhat the repro does:
AcpFsrather than the normal local filesystem.codex_apply_patch::maybe_parse_apply_patch_verified(...)on a patch that updates an existing file.Why this is the right repro:
acpxcodex-acpapply_patchuses in real sessionsBehavior:
apply_patch ACP fs roundtrip deadlockedUser-visible symptom
When
codex-acpis connected to an ACP host that serves file contents over ACP, anapply_patchcall that needs to verify an existing file can hang forever.From the host side, the symptom looks like this:
codex-acprequestsfs/read_text_file.codex-acpnever finishes theapply_patchturn.That is the behavior I first saw in the wild. The host-specific details are not important to the bug; the important part is that ACP FS reads were working, but
codex-acpstopped making progress after the response.Root cause
ApplyPatchHandlerusescodex_apply_patch::maybe_parse_apply_patch_verified(...), which can synchronously call back intoAcpFs::read_to_string()while verifying the patch against current file contents.Before this change, the agent-side ACP I/O loop was running on the same local executor as tool execution. That means:
In practice, that deadlocks
apply_patchverification at the ACP filesystem boundary.Fix
Run the ACP I/O loop on its own dedicated OS thread with its own Tokio runtime instead of awaiting it directly on the main
LocalSet.That keeps ACP request/response handling alive even while a tool handler is inside a blocking ACP FS roundtrip.
Validation
cargo test apply_patch_verification_does_not_deadlock_over_acp_fs -- --nocapturecargo testcargo fmt --check