feat: add --prompt-retries for automatic retry on transient errors#196
feat: add --prompt-retries for automatic retry on transient errors#196
Conversation
When model APIs return transient errors (HTTP 400/500 surfacing as ACP internal errors), acpx now supports automatic retry with exponential backoff via the --prompt-retries flag. - Add isRetryablePromptError() to classify transient vs permanent errors - Retry only on ACP -32603 (internal) and -32700 (parse) errors - Never retry auth, permission, timeout, or session-not-found errors - Exponential backoff: 1s, 2s, 4s, 8s, capped at 10s - Skip retry if agent process crashed during prompt - Flow --prompt-retries through CLI → queue owner → prompt execution - Default: 0 (no retries, preserving existing behavior) Closes #137 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c212cca275
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| timeoutMs: task.timeoutMs, | ||
| suppressSdkConsoleErrors: task.suppressSdkConsoleErrors ?? options.suppressSdkConsoleErrors, | ||
| verbose: options.verbose, | ||
| promptRetries: options.promptRetries, |
There was a problem hiding this comment.
Pass prompt retries through queue requests
This queued prompt path always uses the queue-owner startup value for promptRetries, not a per-request value, so --prompt-retries becomes sticky to whichever process first spawned the owner. If a warm owner is already running, a new prompt --prompt-retries N invocation can silently ignore N (or inherit an older nonzero value), which makes retry behavior nondeterministic across prompts within the owner TTL window.
Useful? React with 👍 / 👎.
| try { | ||
| const promptStartedAt = Date.now(); | ||
| const promptPromise = client.prompt(activeSessionId, options.prompt); | ||
| if (attempt === 0 && options.onPromptActive) { |
There was a problem hiding this comment.
Invoke onPromptActive on every retry attempt
Calling onPromptActive only on attempt 0 drops deferred cancel handling for later attempts: if a cancel arrives during retry backoff, the queue owner can accept it as pending, but no subsequent hook call applies that pending cancel once the retry prompt starts. With --prompt-retries enabled, this can report cancellation as accepted while still running the next retry attempt.
Useful? React with 👍 / 👎.
Supersedes #142 to land the same change from an origin-hosted branch after the contributor fork PR became dirty against current
main.This replacement keeps the original prompt-retry feature and its two follow-up fixes, adds the required changelog entry, and preserves credit to the original contributors.
Original contribution:
Behavior is unchanged relative to the contributor branch:
--prompt-retriesretries transient prompt failures with exponential backoff, avoids replay after prompt side effects, and preserves strict JSON behavior.Follow-up after merge: