Update Codex ACP to support most of Codex's latest features and improve overall UX#206
Draft
sam-ulrich1 wants to merge 10 commits intozed-industries:mainfrom
Draft
Update Codex ACP to support most of Codex's latest features and improve overall UX#206sam-ulrich1 wants to merge 10 commits intozed-industries:mainfrom
sam-ulrich1 wants to merge 10 commits intozed-industries:mainfrom
Conversation
P0 reliability fixes: - Add warning logs for orphaned tool calls in exec_command_output_delta, exec_command_end, and terminal_interaction when call_id is not found - Support multiple concurrent web searches by converting active_web_search from Option<String> to HashMap<String, ()> - Replace all drop(tx.send(...)) with logged error handling throughout thread.rs (25+ call sites) - Replace panicking .expect()/.unwrap() on mutexes with .unwrap_or_else(|e| e.into_inner()) in thread.rs, codex_agent.rs, and local_spawner.rs P2 event surfacing (12 previously-ignored event types now handled): - HookStarted/HookCompleted: surfaced as agent text with hook name and completion status - ImageGenerationBegin/End: surfaced as tool call lifecycle - AgentReasoningRawContent: forwarded as agent thought chunk - ThreadRolledBack: surfaced as agent text with turn count - BackgroundEvent: forwarded as agent text - DeprecationNotice: forwarded as bold warning text with details - PlanDelta: forwarded as agent thought chunk for incremental updates - CollabAgentSpawnBegin/End: surfaced as tool call lifecycle - CollabAgentInteractionBegin/End: surfaced as tool call lifecycle - CollabWaitingBegin/End: surfaced as agent text status messages P3 slash command additions (9 new commands): - /diff: show git diff including untracked files - /status: show session configuration and token usage - /stop: Op::CleanBackgroundTerminals to stop background terminals - /rename: rename current thread - /mention: file mention as @path - /feedback: collect and send diagnostic logs - /mcp: list configured MCP tools - /skills: list available skills - /debug-config: show config layers for debugging Tests: 16 new tests added (37 total, all passing): - 5 slash command tests (stop, diff, status, rename, mention) - 9 event surfacing tests (hooks, image gen, background, deprecation, rollback, reasoning, plan delta, collab spawn, collab interaction) - 2 reliability tests (concurrent web searches, builtin command list) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fork session: - Implement fork_session() in codex_agent.rs using codex-core's ThreadManager::fork_thread() API - Advertise SessionForkCapabilities in initialize response - Fork creates a new thread with the source session's full history, assigns a new session ID, and returns modes/models/config_options - No SDK upgrade needed — the Rust ACP SDK 0.10.2 already has ForkSessionRequest/Response behind the "unstable" feature flag Usage accumulation: - Add AccumulatedUsage struct to PromptState tracking input_tokens, output_tokens, cached_input_tokens, and reasoning_output_tokens - Accumulate from TokenCountEvent.info.last_token_usage on each event - Enrich UsageUpdate notifications with Cost when credits balance is available from rate_limits.credits.balance - Cost is reported in USD using the ACP SDK's Cost type Tests: 1 new test (38 total, all passing): - test_usage_accumulation_with_cost: verifies two TokenCount events produce two UsageUpdate notifications with correct context window size, cost attached to first (with credits), no cost on second Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
18 new tests (56 total) covering every gap identified in the coverage audit. All tests pass with zero warnings. Missing slash command tests (4): - test_slash_feedback: verifies Op and notification - test_slash_mcp: verifies Op and notification - test_slash_skills: verifies Op and notification - test_slash_debug_config: verifies Op and notification Edge case tests for commands requiring arguments (3): - test_slash_rename_no_arg_falls_through: /rename with no arg falls through to default handler producing UserInput - test_slash_mention_no_arg_falls_through: same for /mention - test_slash_review_branch_no_arg_falls_through: same for /review-branch Orphaned tool call regression test (1): - test_orphaned_tool_call_events_handled_gracefully: sends ExecCommandOutputDelta, ExecCommandEnd, and TerminalInteraction for a call_id that was never registered via ExecCommandBegin — verifies no panic, clean completion CollabWaiting event test (1): - test_collab_waiting_events_surfaced: verifies exact message text for both WaitingBegin and WaitingEnd Usage accumulation edge cases (3): - test_token_count_with_none_info: TokenCount with info=None sends no UsageUpdate notification - test_token_count_with_no_context_window: TokenCount with model_context_window=None sends no UsageUpdate - test_token_count_with_unparseable_balance: credits balance "not-a-number" gracefully results in no cost field Tightened event notification assertions (5): - test_hook_events_exact_format: verifies hook started/completed message format including hook id and status message - test_image_generation_exact_content: verifies ToolCall title, kind (Other), status (InProgress), and completed update content - test_background_event_exact_format: exact string match - test_deprecation_notice_exact_format: verifies markdown bold format with summary and details - test_thread_rollback_exact_format: verifies plural "turns" Slash command notification verification (1): - test_slash_commands_generate_notifications: parametrized test covering all 8 UserInput-based slash commands, verifying both the Op text content and that AgentMessageChunk notifications are actually sent Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author
|
One important note, I made this patch so I could use Codex in Zed the way I use it in the terminal. If this PR is too big, ugly, or just not the direction y'all want things to go that is fine. No hard feelings. If y'all do want it I am willing to put in effort to bring it up to spec |
|
Am sorry for changing too much accounts the problems by fore getting emaill before. Now needed add all email in one application |
Author
What??? |
Author
|
Maintainers, are y'all interested in this or should I close? If it makes any difference to you it took Codex in Zed from practically unusable to my daily driver |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR makes the ACP bridge more complete and more trustworthy for editor clients. The changes fall into four main buckets: session lifecycle support, ACP-native command handling, richer event/UI surfacing, and reliability hardening around long-running interactions.
The first few commits were made by Claude Opus 4.6 (via the claude acp in zed) to bootstrap past the annoying Codex ACP bugs I kept running into. After I got past the majority of the bugs I switched back to Codex using 5.4-xhigh for the remainder. This PR summary was written by Codex to make review as easy as possible. I apologize for the very large PR but there was a lot of stuff blocking my normal Codex use in zed so this brings it much closer to the current state. I am opening this first as a draft to get feedback and while I test drive it in Zed. I expect to find bugs and UX problems as I use it which I will fix before making this a real PR. This gives the maintainers the opportunity to fully reject this PR if they are not interested in it or to provide critiques that will allow me to patch before making a full PR attempt.
Many of the decisions were made cross referencing the codex repo, agent protocol sdk, and the implementation designs of the claude acp repo. There are some places where I opted for UX over SDK support like in the auto compaction handling. I did not want to have 2 outstanding PRs at once so that is definitely a hot point where the implementation diverged intentionally for better UX.
I have not tested every single fix in a real workflow yet which is a major reason this is starting as a draft. I will test them all but it will take me a few days.
Why
list_sessionscalls where the client omitscwd.What Changed
initializeand implementingfork_session()from stored rollout history.list_sessionsnow targets VS Code-backed sessions, falls back to the most recent known root whencwdis missing, normalizes paths before filtering, and prefers persisted thread names over first-message-derived titles./diffACP-native by running git locally and returning staged, unstaged, and untracked changes directly./statusACP-native by formatting current session configuration locally instead of routing through the model./renameuseOp::SetThreadNameand immediately confirms the rename back to the client./mcpand/skillsuse structured backend ops and formats their async responses in ACP./stopsupport for cleaning up background terminals.plansession mode backed by Codex collaboration mode rather than treating it like a normal approval preset./mention,/feedback, and/debug-configfrom the branch’s final state.UsageUpdatenotifications with USD cost when credits balance data is available.drop(tx.send(...))paths with logged handling.unwrap()calls with recovery viainto_inner().agent-client-protocolto the SDKmainbranch to match the unstable ACP surface used by these changes.Review Guide
src/codex_agent.rsfirst for capability advertisement,list_sessionscleanup, andfork_session().src/thread.rsnext for slash-command flow,planmode handling, and MCP/skills response routing.src/thread.rsevent-handling changes after that for reliability, usage updates, and sub-agent lifecycle surfacing.Cargo.tomlandCargo.lockat the end for the ACP SDK source change.Commit Guide
63cefc5lays the reliability and event-handling foundation.9301035adds session forking plus usage/cost plumbing.7e40fd8expands coverage around the new flows and edge cases.2aa4948refines session behavior and slash-command handling into the current ACP-facing shape.6104e28finishes the UI story by surfacing live auto-compaction progress.Testing
cargo testpasses locally.60 passed, 0 failed.