Problem
When a background process is running (e.g. cargo build, cargo test), Codex enters a polling loop where each status check triggers a full API round-trip with the entire conversation history. This burns tokens/credits proportional to history size × poll count, even though no meaningful work is being done.
Root Cause (from source analysis)
The turn loop in codex.rs:4869 works as follows:
1. Model issues exec_command → local process spawns → partial output returned
2. needs_follow_up = true (stream_events_utils.rs:133)
3. Loop back → clone_history().for_prompt() sends full history to API (codex.rs:4901-4905)
4. Model sees process still running → issues write_stdin with empty input to poll
5. process_manager.rs waits MIN_EMPTY_YIELD_TIME_MS (5s) then returns "(no new output)"
6. needs_follow_up = true again → goto step 3
Each poll = 1 full API request with complete conversation history. At the 5s minimum interval, a 60-second cargo build generates up to ~12 polling turns (60s / 5s). With 250+ items in history, that is roughly 3,000 history items re-sent while nothing has changed.
The model also "reasons" about whether to wait or do something else during each poll, consuming additional output tokens for no useful purpose.
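To make the "history size × poll count" cost concrete, here is a minimal sketch of the arithmetic. The function names and the assumption that each poll appends two items to history (one tool call plus one tool output) are hypothetical and not taken from the Codex source.

```rust
/// Number of empty polls issued while a process runs for `run_secs`,
/// assuming one poll every `poll_interval_secs` (hypothetical helper).
/// Returns 0 when the interval is 0 instead of dividing by zero.
fn poll_count(run_secs: u64, poll_interval_secs: u64) -> u64 {
    run_secs.checked_div(poll_interval_secs).unwrap_or(0)
}

/// Total history items transmitted across all polls. Every poll resends the
/// whole history, and the history grows by `items_per_poll` after each poll
/// (assumed: one tool call plus one tool output).
fn items_resent(initial_items: u64, polls: u64, items_per_poll: u64) -> u64 {
    (0..polls)
        .map(|i| initial_items + i * items_per_poll)
        .sum()
}
```

With the figures from this report (250 items, 60s build, 5s interval), poll_count(60, 5) gives 12 and items_resent(250, 12, 2) gives 3132, compared with 250 items for a single request after the build finishes.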
Observed Behavior
From proxy debug logs, a single Codex session shows items growing: 237 → 239 → 242 → ... → 300+ over routine tool calls. Each turn re-transmits the entire history. During background waits, the cadence is ~10s per poll with no substantive new content.
Confirmed by Community
Issue #10957 reporter noted:
"jobs are running in background but not sure implementation is very 'token' friendly as it keep polling and reasoning about it instead of just waiting"
Proposed Solutions
Option A: Local wait before API turn (preferred)
When write_stdin returns no new output and the process hasn't exited, do not set needs_follow_up = true. Instead, wait locally (30-60s, configurable) and only call the API again when:
- The process produces new output, OR
- The process exits, OR
- A timeout expires
This could be implemented in the tool handler layer — if write_stdin response has no output and no exit code, sleep locally before returning, rather than bouncing back to the model.
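A minimal sketch of the local wait, assuming the process watcher can report events over a channel. ProcessEvent, WakeReason, and wait_locally are hypothetical names for illustration, and the real tool handler in Codex is likely async rather than built on std::sync::mpsc.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Events a background process watcher can report (hypothetical type).
#[derive(Debug, PartialEq)]
enum ProcessEvent {
    Output(String),
    Exited(i32),
}

/// Why the local wait ended and control goes back to the model.
#[derive(Debug, PartialEq)]
enum WakeReason {
    NewOutput(String),
    Exited(i32),
    TimedOut,
}

/// Block locally until the process produces output, exits, or `max_wait`
/// elapses. No API request is made while this function is waiting.
fn wait_locally(events: &Receiver<ProcessEvent>, max_wait: Duration) -> WakeReason {
    match events.recv_timeout(max_wait) {
        Ok(ProcessEvent::Output(chunk)) => WakeReason::NewOutput(chunk),
        Ok(ProcessEvent::Exited(code)) => WakeReason::Exited(code),
        Err(RecvTimeoutError::Timeout) => WakeReason::TimedOut,
        // The watcher went away without reporting an exit; treat it like a
        // timeout so the caller falls back to a normal poll.
        Err(RecvTimeoutError::Disconnected) => WakeReason::TimedOut,
    }
}
```

Under this sketch, an idle 60-second build with a 60s max_wait costs one API turn at the end instead of one per poll.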
Option B: Increase MIN_EMPTY_YIELD_TIME_MS
Change MIN_EMPTY_YIELD_TIME_MS from 5s to 30-60s for truly empty polls. This is a minimal change but still wastes one API turn per interval.
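As a rough illustration of what the larger floor buys, the sketch below clamps an empty poll's wait to a minimum and counts the resulting API turns. The 30s value is one point in the suggested range, and empty_poll_wait_ms and empty_poll_turns are hypothetical helpers; the actual clamping logic in process_manager.rs may differ.

```rust
/// Current floor for empty polls, per the analysis in this report (5s).
const MIN_EMPTY_YIELD_TIME_MS: u64 = 5_000;
/// Proposed floor (assumed value within the suggested 30-60s range).
const PROPOSED_MIN_EMPTY_YIELD_TIME_MS: u64 = 30_000;

/// Clamp the requested wait for an empty poll to at least `floor_ms`
/// (hypothetical helper).
fn empty_poll_wait_ms(requested_ms: u64, floor_ms: u64) -> u64 {
    requested_ms.max(floor_ms)
}

/// API turns spent on empty polls while a process runs for `run_ms`.
/// Returns 0 when `wait_ms` is 0 instead of dividing by zero.
fn empty_poll_turns(run_ms: u64, wait_ms: u64) -> u64 {
    run_ms.checked_div(wait_ms).unwrap_or(0)
}
```

For a 60-second build this drops the empty polls from 12 to 2, which is cheaper but still not free.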
Option C: Batch tool results
When the model issues write_stdin for a process that has no new output, return the tool result without setting needs_follow_up. The model only gets called again when the process finishes or produces output (via a notification/callback mechanism).
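The follow-up decision in this option could be sketched as a small predicate over the poll result. PollResult and its fields are a hypothetical shape, not the actual Codex type, and the notification/callback side is left out.

```rust
/// Result of a write_stdin poll (hypothetical shape).
struct PollResult {
    /// Output produced since the previous poll; empty if none.
    output: String,
    /// Exit code if the process has finished, None while it is running.
    exit_code: Option<i32>,
}

/// Decide whether the turn loop should call the model again.
/// An empty poll on a still-running process does not trigger a follow-up.
fn needs_follow_up(result: &PollResult) -> bool {
    !result.output.is_empty() || result.exit_code.is_some()
}
```

With this predicate, the "(no new output)" case no longer produces an API turn; the next turn would have to be triggered by the process watcher when output arrives or the process exits.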
Environment
- Codex CLI via Responses API proxy
- Observed on both macOS and Windows (Windows worse due to slower builds)
- History sizes: 200-300+ items typical for medium sessions
- GAC (proxy) bills per payload KB, making this especially costly
Related Issues