Background process polling wastes tokens: each write_stdin poll triggers full API turn with complete history #13733

@jitlabs-sg

Problem

When a background process is running (e.g. cargo build, cargo test), Codex enters a polling loop where each status check triggers a full API round-trip with the entire conversation history. This burns tokens/credits proportional to history size × poll count, even though no meaningful work is being done.

Root Cause (from source analysis)

The turn loop in codex.rs:4869 works as follows:

  1. Model issues exec_command → local process spawns → partial output returned
  2. needs_follow_up = true (stream_events_utils.rs:133)
  3. Loop back → clone_history().for_prompt() sends full history to API (codex.rs:4901-4905)
  4. Model sees process still running → issues write_stdin with empty input to poll
  5. process_manager.rs waits MIN_EMPTY_YIELD_TIME_MS (5s) then returns "(no new output)"
  6. needs_follow_up = true again → goto step 3
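The loop above can be modeled with a small simulation. This is a hypothetical sketch, not Codex's actual code: `poll_process` and `count_polling_turns` are made-up names, time is simulated rather than slept, and API latency is ignored. It only borrows the 5s `MIN_EMPTY_YIELD_TIME_MS` figure from step 5.

```rust
/// What the simulated tool call reports back to the turn loop.
#[derive(Debug, PartialEq)]
pub enum ToolOutcome {
    StillRunning, // corresponds to "(no new output)"
    Exited,
}

/// Assumed value, taken from the 5s figure in step 5.
pub const MIN_EMPTY_YIELD_TIME_MS: u64 = 5_000;

/// Simulated empty write_stdin poll: advances the clock by the yield time
/// and reports whether the process has finished by then.
fn poll_process(now_ms: &mut u64, finishes_at_ms: u64) -> ToolOutcome {
    *now_ms += MIN_EMPTY_YIELD_TIME_MS;
    if *now_ms >= finishes_at_ms {
        ToolOutcome::Exited
    } else {
        ToolOutcome::StillRunning
    }
}

/// Counts the API turns spent polling a process that runs for `build_ms`.
/// The final turn in which the model reads the finished result is not counted.
pub fn count_polling_turns(build_ms: u64) -> u32 {
    let mut now_ms: u64 = 0;
    let mut polling_turns: u32 = 0;
    let mut needs_follow_up = true; // step 2: set after exec_command returns
    while needs_follow_up {
        polling_turns += 1; // step 3: full history is sent to the API again
        let outcome = poll_process(&mut now_ms, build_ms); // steps 4-5
        needs_follow_up = outcome == ToolOutcome::StillRunning; // step 6
    }
    polling_turns
}
```

Under these assumptions, `count_polling_turns(60_000)` returns 12, one turn per 5s wait.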

Each poll is one full API request carrying the complete conversation history. At the 5s minimum wait, a 60-second cargo build generates up to ~12 polling turns (fewer once API latency is included). With 250+ items in history, this is extremely wasteful.

The model also "reasons" about whether to wait or do something else during each poll, consuming additional output tokens for no useful purpose.
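To put a rough number on the waste, here is a minimal cost sketch. It counts history items rather than tokens, and it assumes each poll appends a fixed number of items (for example, the tool call and its result). The function name and the per-poll growth figure are illustrative, not taken from Codex.

```rust
/// Total history items transmitted across `polls` polling turns, assuming the
/// history starts at `initial_items` and grows by `items_per_poll` each turn.
pub fn items_retransmitted(initial_items: u64, items_per_poll: u64, polls: u64) -> u64 {
    // Poll k (0-based) sends initial_items + k * items_per_poll items.
    (0..polls)
        .map(|k| initial_items + k * items_per_poll)
        .sum()
}
```

With 250 starting items, 2 new items per poll, and 12 polls, `items_retransmitted(250, 2, 12)` comes to 3,132 item transmissions for roughly one minute of waiting.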

Observed Behavior

From proxy debug logs, a single Codex session shows items growing: 237 → 239 → 242 → ... → 300+ over routine tool calls. Each turn re-transmits the entire history. During background waits, the cadence is ~10s per poll with no substantive new content.

Confirmed by Community

Issue #10957 reporter noted:

"jobs are running in background but not sure implementation is very 'token' friendly as it keep polling and reasoning about it instead of just waiting"

Proposed Solutions

Option A: Local wait before API turn (preferred)

When write_stdin returns no new output and the process hasn't exited, do not set needs_follow_up = true. Instead, wait locally (30-60s, configurable) and only call the API again when:

  • The process produces new output, OR
  • The process exits, OR
  • A timeout expires

This could be implemented in the tool handler layer — if write_stdin response has no output and no exit code, sleep locally before returning, rather than bouncing back to the model.
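A minimal sketch of what such a local wait might look like, using only the standard library. It assumes some watcher thread forwards process events over a channel. `ProcessEvent`, `WaitResult`, and `wait_locally` are hypothetical names for illustration and do not correspond to existing Codex types.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical events a process watcher might emit.
#[derive(Debug, PartialEq)]
pub enum ProcessEvent {
    Output(String),
    Exited(i32),
}

/// Why the local wait ended.
#[derive(Debug, PartialEq)]
pub enum WaitResult {
    Output(String),
    Exited(i32),
    TimedOut,
    WatcherGone,
}

/// Blocks locally until the process produces output, exits, or `max_wait`
/// elapses. No API turn is spent while this function is waiting.
pub fn wait_locally(events: &Receiver<ProcessEvent>, max_wait: Duration) -> WaitResult {
    match events.recv_timeout(max_wait) {
        Ok(ProcessEvent::Output(text)) => WaitResult::Output(text),
        Ok(ProcessEvent::Exited(code)) => WaitResult::Exited(code),
        Err(RecvTimeoutError::Timeout) => WaitResult::TimedOut,
        // All senders were dropped, so no further events can arrive.
        Err(RecvTimeoutError::Disconnected) => WaitResult::WatcherGone,
    }
}
```

The tool handler would call `wait_locally` with the configured timeout and return to the model only once it yields a result, so an idle minute costs at most one API turn instead of a dozen.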

Option B: Increase MIN_EMPTY_YIELD_TIME_MS

Change MIN_EMPTY_YIELD_TIME_MS from 5s to 30-60s for truly empty polls. This is a minimal change but still wastes one API turn per interval.

Option C: Batch tool results

When the model issues write_stdin for a process that has no new output, return the tool result without setting needs_follow_up. The model only gets called again when the process finishes or produces output (via a notification/callback mechanism).
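The follow-up decision for this option could be as small as the sketch below. `PollResult` and its fields are assumed shapes for a write_stdin result, not the actual Codex types, and the notification mechanism that later wakes the model is out of scope here.

```rust
/// Hypothetical shape of a write_stdin result; field names are assumptions.
pub struct PollResult {
    pub new_output: String,
    pub exit_code: Option<i32>,
}

/// Returns true only when the model has something new to act on:
/// fresh output or a finished process. Empty polls do not trigger a turn.
pub fn needs_follow_up(result: &PollResult) -> bool {
    !result.new_output.is_empty() || result.exit_code.is_some()
}
```

An empty poll on a running process then yields `false`, and the next model turn would be deferred until the notification arrives.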

Environment

  • Codex CLI via Responses API proxy
  • Observed on both macOS and Windows (Windows worse due to slower builds)
  • History sizes: 200-300+ items typical for medium sessions
  • GAC (proxy) bills per payload KB, making this especially costly

Labels: bug, rate-limits, session, tool-calls
