
enh: add feature-flagged server-side compaction #13978

Closed
cooper-oai wants to merge 67 commits into main from cooper/server-side-compaction

Conversation

@cooper-oai

@cooper-oai cooper-oai commented Mar 8, 2026

summary

  • add a default-off server_side_compaction flag for openai responses auto-compaction
  • send context_management on eligible /responses requests instead of auto-calling the legacy /compact path
  • apply streamed compaction checkpoints locally so follow-up turns reuse the inline summary without losing current-turn state
  • keep manual compaction and non-openai providers on the existing client-side flow

implementation notes

  • this only changes auto-compaction on openai-compatible responses requests
  • compact_prompt is ignored for inline auto-compaction. this is not a regression: codex already ignores it for openai auto-compaction today, and manual /compact still uses the existing prompt-based endpoint
  • codex buffers streamed compaction items and rewrites history on response.completed so later streamed items keep their order and retries do not commit partial checkpoints
  • when inline server-side compaction is available, we skip previous-model preflight compaction so openai auto-compaction uses a single path
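
The buffer-then-commit behavior in the notes above can be sketched as follows. This is a minimal illustration with made-up types, not the actual codex-rs API: streamed compaction items are held aside and only swapped into history on `response.completed`, so a failed or retried stream never commits a partial checkpoint.

```rust
// Illustrative sketch (types are assumptions, not codex-rs internals):
// buffer streamed compaction items and commit them only when the
// response completes, so retries never apply a partial checkpoint.

#[derive(Clone, Debug, PartialEq)]
enum StreamEvent {
    CompactionItem(String),
    Completed,
    Failed,
}

#[derive(Default)]
struct CompactionBuffer {
    pending: Vec<String>,
}

impl CompactionBuffer {
    /// Returns Some(checkpoint items) only on Completed; a failure
    /// discards the partial checkpoint instead of committing it.
    fn apply(&mut self, event: StreamEvent) -> Option<Vec<String>> {
        match event {
            StreamEvent::CompactionItem(item) => {
                self.pending.push(item); // buffered, not yet in history
                None
            }
            StreamEvent::Completed => Some(std::mem::take(&mut self.pending)),
            StreamEvent::Failed => {
                self.pending.clear(); // never commit partial state
                None
            }
        }
    }
}

fn main() {
    let mut buf = CompactionBuffer::default();
    assert_eq!(buf.apply(StreamEvent::CompactionItem("summary".into())), None);
    assert_eq!(
        buf.apply(StreamEvent::Completed),
        Some(vec!["summary".to_string()])
    );
}
```

Deferring the rewrite to the terminal event is what keeps later streamed items in order: nothing in history moves until the whole checkpoint is known.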

rollout

  • land behind server_side_compaction, default off
  • validate first against internal openai responses upstreams and app-server flows
  • watch codex.compaction and codex.compaction_checkpoint_applied
  • if stable, enable for a small internal cohort; manual compaction and non-openai providers continue on the existing flow

testing

  • ran unit tests
  • tested with the local binary pointed at an openai responses api upstream

@cooper-oai cooper-oai changed the title Add feature-flagged server-side compaction enh: add feature-flagged server-side compaction Mar 8, 2026
@cooper-oai cooper-oai marked this pull request as ready for review March 8, 2026 17:59
Author

Known fast-follow: while testing this branch against a Responses API upstream configured via env_key, I hit a pre-existing request-compression/auth-gating bug that already exists on main.

Today, request compression is enabled based on cached ChatGPT auth state plus provider.is_openai(), while the actual outgoing bearer token can come from provider.env_key. On some Responses API upstreams that reject compressed /v1/responses bodies, that leads to invalid_json unless --disable enable_request_compression is set.

This is not introduced by the server-side compaction change; this branch just makes the issue easier to hit when validating against non-default OpenAI-shaped upstreams. I’m treating compression-capability/auth-source alignment as a fast-follow rather than part of this PR.
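
A condensed sketch of the mismatch described above, with illustrative names rather than the actual codex-rs functions: the compression decision consults `is_openai()` plus cached ChatGPT auth state, while the bearer token may actually come from `env_key`, pointing at an upstream that rejects compressed bodies.

```rust
// Hypothetical sketch of the gating mismatch (names are illustrative):
struct Provider {
    is_openai: bool,
    env_key: Option<String>, // bearer token may come from here instead
}

// Current behavior: ignores where the outgoing token actually comes from.
fn compression_enabled_current(p: &Provider, cached_chatgpt_auth: bool) -> bool {
    p.is_openai && cached_chatgpt_auth
}

// Fast-follow idea: only compress when the request is actually using the
// ChatGPT-backed auth source, not an env_key-supplied token.
fn compression_enabled_aligned(p: &Provider, cached_chatgpt_auth: bool) -> bool {
    p.is_openai && cached_chatgpt_auth && p.env_key.is_none()
}

fn main() {
    let upstream = Provider { is_openai: true, env_key: Some("MY_API_KEY".into()) };
    assert!(compression_enabled_current(&upstream, true)); // compresses → invalid_json risk
    assert!(!compression_enabled_aligned(&upstream, true)); // respects the token source
}
```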

@cooper-oai cooper-oai requested a review from pakrym-oai March 9, 2026 18:01
shaqayeq-oai and others added 2 commits March 10, 2026 01:00
## Summary
Foundation PR only (base for PR #3).

This PR contains the SDK runtime foundation and generated artifacts:

- pinned runtime binary in `sdk/python/bin/` (`codex` or `codex.exe` by platform)
- single maintenance script: `sdk/python/scripts/update_sdk_artifacts.py`
- generated protocol/types artifacts under:
  - `sdk/python/src/codex_app_server/generated/protocol_types.py`
  - `sdk/python/src/codex_app_server/generated/schema_types.py`
  - `sdk/python/src/codex_app_server/generated/v2_all/*`
- generation-contract test wiring (`tests/test_contract_generation.py`)

## Release asset behavior
`update_sdk_artifacts.py` now:
- selects latest release by channel (`--channel stable|alpha`)
- resolves the correct asset for current OS/arch
- extracts the platform binary (`codex` on macOS/Linux, `codex.exe` on Windows)
- keeps the runtime on a single pinned binary source in `sdk/python/bin/`

## Scope boundary
- ✅ PR #2 = binary + generation pipeline + generated types foundation
- ❌ PR #2 does **not** include examples/integration logic polish (that is PR #3)

## Validation
- Ran: `python scripts/update_sdk_artifacts.py --channel stable`
- Regenerated and committed resulting generated artifacts
- Local tests pass on branch
Addresses #13586
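
The OS/arch asset-resolution step above reduces to a small mapping; this sketch only encodes the one rule the commit states (`codex` on macOS/Linux, `codex.exe` on Windows), while the asset-name format is an assumption, not the actual release naming.

```rust
// Illustrative only: the asset-name pattern is an assumption; the
// binary-name rule matches the commit message above.
fn release_asset(os: &str, arch: &str) -> (String, &'static str) {
    let binary = if os == "windows" { "codex.exe" } else { "codex" };
    (format!("{os}-{arch}"), binary)
}

fn main() {
    assert_eq!(release_asset("linux", "x86_64").1, "codex");
    assert_eq!(release_asset("windows", "x86_64").1, "codex.exe");
}
```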

This doesn't affect our CI scripts. It was user-reported.

Summary
- add `wiremock::ResponseTemplate` and `body_string_contains` imports behind `#[cfg(not(debug_assertions))]` in `codex-rs/core/tests/suite/view_image.rs` so release builds only pull the helpers they actually use
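
The cfg-gating pattern above generalizes: gate an import behind the same `cfg` as the code that uses it, so neither profile warns about an unused import. A std-only sketch (not the wiremock test itself):

```rust
// The import is gated on the same cfg as its only user, so debug
// builds never see it and release builds never leave it unused.
#[cfg(not(debug_assertions))]
use std::collections::HashSet;

#[cfg(not(debug_assertions))]
fn unique_count(items: &[&str]) -> usize {
    items.iter().collect::<HashSet<_>>().len()
}

#[cfg(debug_assertions)]
fn build_profile() -> &'static str { "debug" }

#[cfg(not(debug_assertions))]
fn build_profile() -> &'static str { "release" }

fn main() {
    #[cfg(not(debug_assertions))]
    assert_eq!(unique_count(&["a", "a", "b"]), 2);
    assert!(matches!(build_profile(), "debug" | "release"));
}
```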
etraut-openai and others added 5 commits March 10, 2026 09:57
Change the Unix shell lookup in `codex-rs/core/src/shell.rs` to use `libc::getpwuid_r()` instead of `libc::getpwuid()` when resolving the current user's shell.

Why:
- `getpwuid()` can return pointers into libc-managed shared storage
- on the musl static Linux build, concurrent callers can race on that storage
- this matches the crash pattern reported in tmux/Linux sessions with parallel shell activity

Refs:
- Fixes #13842
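
The safety difference reduces to who owns the result storage: `getpwuid()` returns a pointer into shared, libc-managed storage, while `getpwuid_r()` fills a buffer the caller supplies. A std-only analogy of the `_r` shape (not the actual libc FFI, and `lookup_shell_r` is a placeholder, not the real shell.rs code):

```rust
// Caller-owned-buffer analogy for getpwuid_r(): each caller supplies
// its own buffer, so parallel lookups can never alias shared storage.
fn lookup_shell_r<'a>(uid: u32, buf: &'a mut String) -> Option<&'a str> {
    buf.clear();
    // placeholder lookup; the real code reads the passwd entry for uid
    buf.push_str(if uid == 0 { "/bin/sh" } else { "/bin/bash" });
    Some(buf.as_str())
}

fn main() {
    let mut buf = String::new();
    assert_eq!(lookup_shell_r(0, &mut buf), Some("/bin/sh"));
}
```
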
There are some bug investigations that currently require us to ask users
for their user ID even though they've already uploaded logs and session
details via `/feedback`. This frustrates users and increases the time
for diagnosis.

This PR includes the ChatGPT user ID in the metadata uploaded for
`/feedback` (both the TUI and app-server).
Summary
- document output types for the various tool handlers and registry so
the API exposes richer descriptions
- update unified execution helpers and client tests to align with the
new output metadata
- clean up unused helpers across tool dispatch paths

Testing
- Not run (not requested)
## Summary
- run the split stdout/stderr PTY test through the normal shell helper
on every platform
- use a Windows-native command string instead of depending on Python to
emit split streams
- assert CRLF line endings on Windows explicitly

## Why this fixes the flake
The earlier PTY split-output test used a Python one-liner on Windows
while the rest of the file exercised shell-command behavior. That made
the test depend on runner-local Python availability and masked the real
Windows shell output shape. Using a native cmd-compatible command and
asserting the actual CRLF output makes the split stdout/stderr coverage
deterministic on Windows runners.
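
The explicit CRLF assertion described above can be sketched like this (the helper and line contents are illustrative, not the actual test code): pin the platform's real line-ending shape rather than normalizing it away.

```rust
// Illustrative: build the expected per-platform line endings so the
// assertion checks the real Windows CRLF shape instead of masking it.
fn expected_lines(stdout_line: &str, stderr_line: &str) -> (String, String) {
    let eol = if cfg!(windows) { "\r\n" } else { "\n" };
    (format!("{stdout_line}{eol}"), format!("{stderr_line}{eol}"))
}

fn main() {
    let (out, err) = expected_lines("to stdout", "to stderr");
    if cfg!(windows) {
        assert!(out.ends_with("\r\n") && err.ends_with("\r\n"));
    } else {
        assert!(out.ends_with('\n') && !out.ends_with("\r\n"));
    }
}
```
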
We already have a type to represent the MCP tool output, reuse it
instead of the custom McpHandlerOutput
cooper-oai and others added 23 commits March 11, 2026 02:48
…_files]

Keep current-turn inputs in local inline compaction checkpoints and remember known backend incompatibilities after a compat downgrade so later turns skip the failed inline request path.

Co-authored-by: Codex <noreply@openai.com>
…nged_files]

Co-authored-by: Codex <noreply@openai.com>
Preserve current-turn history when inline compaction downgrades fail and replace prior same-turn compaction checkpoints instead of stacking them.

Tests:
- cargo test -p codex-core codex::tests::build_server_side_compaction_replacement_history_keeps_current_turn_inputs -- --exact
- cargo test -p codex-core codex::tests::build_server_side_compaction_replacement_history_replaces_prior_same_turn_summary -- --exact
- cargo test -p codex-core codex::tests::downgrade_known_inline_compaction_error_restores_current_turn_when_fallback_fails -- --exact

Co-authored-by: Codex <noreply@openai.com>
…hanged_files]

Co-authored-by: Codex <noreply@openai.com>
Ignore compact_prompt for OpenAI inline auto-compaction, remove the legacy compat downgrade path, and keep /compact on the point-in-time endpoint. Also skip previous-model preflight remote compaction when inline server-side compaction is available.

Co-authored-by: Codex <noreply@openai.com>
Handle repeated inline compactions on turns that started from empty history by stripping leading compaction items after prefix calculation, and add regression coverage for the fresh-session case.

Co-authored-by: Codex <noreply@openai.com>
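
The fresh-session fix above amounts to a post-pass over the rebuilt history; this sketch uses made-up item types, not the codex-rs ones:

```rust
// Illustrative types: after prefix calculation, drop compaction items
// that would otherwise lead the history, so a session that started
// empty and compacts repeatedly does not stack summary on summary.
#[derive(Debug, PartialEq)]
enum HistoryItem {
    Compaction(String),
    Message(String),
}

fn strip_leading_compaction(items: &[HistoryItem]) -> &[HistoryItem] {
    let first_real = items
        .iter()
        .position(|i| !matches!(i, HistoryItem::Compaction(_)))
        .unwrap_or(items.len());
    &items[first_real..]
}

fn main() {
    let history = vec![
        HistoryItem::Compaction("old summary".into()),
        HistoryItem::Message("hello".into()),
    ];
    assert_eq!(strip_leading_compaction(&history).len(), 1);
}
```
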
…ci changed_files]

Co-authored-by: Codex <noreply@openai.com>
…ged_files]

Co-authored-by: Codex <noreply@openai.com>
…_files]

Co-authored-by: Codex <noreply@openai.com>
…ed_files]

Co-authored-by: Codex <noreply@openai.com>
…g [ci changed_files]

Co-authored-by: Codex <noreply@openai.com>
…iles]

Co-authored-by: Codex <noreply@openai.com>
…les]

Co-authored-by: Codex <noreply@openai.com>
- reduce the server-side compaction test matrix to the highest-signal cases
- add comments around the deferred checkpoint rewrite and inline/preflight split

Co-authored-by: Codex <noreply@openai.com>
Remove the redundant inline compaction request trace and clarify that streamed server-side compaction rebuilds replacement history on response.completed from the checkpoint snapshot.

Co-authored-by: Codex <noreply@openai.com>
…les]

Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Codex <noreply@openai.com>
@cooper-oai cooper-oai force-pushed the cooper/server-side-compaction branch from 6612840 to 8be76fe Compare March 11, 2026 02:56
cooper-oai and others added 2 commits March 11, 2026 03:01
…iles]

Co-authored-by: Codex <noreply@openai.com>
@github-actions
Contributor

Closing this pull request because it has had no updates for more than 14 days. If you plan to continue working on it, feel free to reopen or open a new PR.

@github-actions github-actions bot closed this Mar 26, 2026

Labels

oai PRs contributed by OpenAI employees
