
feat(core): OpenAI Responses API (/v1/responses) native support#2588

Closed
netbrah wants to merge 18 commits into QwenLM:main from netbrah:feat/openai-responses-api

Conversation

Contributor

@netbrah netbrah commented Mar 22, 2026

Summary

Full Codex-parity implementation of the OpenAI Responses API as a new provider type openai-responses, parallel to the existing openai (Chat Completions) provider. Users select it via authType: "openai-responses" in their model providers config.
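A providers config entry selecting the new type might look like this. Only authType: "openai-responses" comes from this PR; the surrounding field names are illustrative placeholders, not the project's actual settings schema:

```json
{
  "modelProviders": [
    {
      "authType": "openai-responses",
      "baseUrl": "https://example.com/v1",
      "model": "placeholder-model"
    }
  ]
}
```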

What's new

  • New openai-responses auth type — routes to /v1/responses instead of /v1/chat/completions
  • SSE event converter — parses all Responses API streaming events into Gemini-format responses
  • previous_response_id + incremental items — server-side history, only new items sent per turn
  • prompt_cache_key — conversation-scoped server caching for faster TTFT
  • Native reasoning controls — effort, summary, include: ["reasoning.encrypted_content"] with full round-trip
  • truncation: {type: "auto"} — prevents hard 400 on context overflow
  • parallel_tool_calls, tool_choice: "auto" — Codex-parity tool handling
  • Dual-path compaction — remote (/v1/responses/compact) for openai-responses, inline for all others
  • Pipeline state reset after compaction to prevent stale previous_response_id
  • Pre-work — resolveModel(), tokenEstimationScaleFactor(), Turn.getResponseText(), cleanToolSchema(), tool response dedup, abort error handling
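Taken together, the knobs above combine into a single request body. A hedged TypeScript sketch (the interface, function name, and model string are illustrative assumptions, not the PR's actual code):

```typescript
// Illustrative only: shows how the listed Responses API parameters
// might be assembled into one request body.
interface ResponsesRequest {
  model: string;
  input: unknown[]; // only the new items since the last turn
  previous_response_id?: string; // server-side history anchor
  prompt_cache_key?: string; // conversation-scoped cache key for faster TTFT
  reasoning?: { effort: string; summary: string };
  include?: string[]; // e.g. ["reasoning.encrypted_content"]
  truncation?: { type: "auto" }; // avoids a hard 400 on context overflow
  parallel_tool_calls?: boolean;
  tool_choice?: "auto";
}

function buildRequest(
  conversationId: string,
  prevId: string | undefined,
  newItems: unknown[],
): ResponsesRequest {
  return {
    model: "placeholder-model", // placeholder, not a real model name
    input: newItems,
    previous_response_id: prevId,
    prompt_cache_key: conversationId,
    reasoning: { effort: "medium", summary: "auto" },
    include: ["reasoning.encrypted_content"],
    truncation: { type: "auto" },
    parallel_tool_calls: true,
    tool_choice: "auto",
  };
}
```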

Test plan

  • 76 new unit tests across converter, pipeline, and compaction client
  • 4905 total tests passing (up from 4829)
  • Full build green (tsc + eslint + vscode companion)
  • Zero regressions to existing Chat Completions, Anthropic, or Gemini paths
  • Manual integration test against an LLM proxy with a Codex model
  • Dogfooding with qwen --auth-type openai-responses
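For context on what the converter tests cover: the converter maps Responses API SSE events into Gemini-format output. A minimal sketch under assumed event and part shapes (the real converter handles many more event types than this):

```typescript
// Assumed shapes, for illustration only.
type SseEvent = { type: string; delta?: string };
type GeminiPart = { text: string };

// Map a Responses API text-delta SSE event onto a Gemini-style part so
// downstream consumers see a uniform stream regardless of provider.
function convertEvent(event: SseEvent): GeminiPart | null {
  if (event.type === "response.output_text.delta" && event.delta !== undefined) {
    return { text: event.delta };
  }
  return null; // other event types are handled elsewhere in the full converter
}
```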

Made with Cursor

netbrah added 17 commits March 21, 2026 16:09
Adds Dockerfile.sea and supporting scripts to produce a single
self-contained Node.js SEA binary for headless Linux deployment.

- sea/sea-launch.cjs: CJS entry point that extracts embedded ESM
  bundle to a versioned temp dir and dynamically imports it
- scripts/build_binary.js: generates SEA config, blob, and injects
  into a copy of the Node binary via postject
- Dockerfile.sea: multi-stage build targeting linux/amd64 via buildx
- .gitignore: exclude .bin/ build output and local scratch files

Made-with: Cursor
Allows running multiple qwen-code instances with isolated configs.
When QWEN_CODE_HOME is set, the global config dir uses that path
instead of ~/.qwen/. Enables hermetic distribution (ontap-apex) to
use ~/.apex/.qwen/ without conflicting with personal ~/.qwen/ config.

Centralises global dir resolution through Storage.getGlobalQwenDir()
in todoWrite, memoryDiscovery, and settings.ts instead of ad-hoc
path.join(homeDir, QWEN_DIR) construction.

Made-with: Cursor
Keep .bin/ directory structure in repo but ignore build artifacts
(ontap-apex launcher, SEA binaries, etc.) that live there locally.

Made-with: Cursor
…file

TypeScript strict mode requires process.env['KEY'] not process.env.KEY
for index signature access. Also set HUSKY=0 in Dockerfile.sea to
skip git hooks during container builds without --ignore-scripts.

Made-with: Cursor
When QWEN_CODE_BRAND is set (e.g. "APEX"), the header banner and
window title use it instead of "Qwen Code" / "Qwen". Allows the
hermetic distribution to show its own branding without source changes.

Made-with: Cursor
SEA binaries can't accept V8 flags (--max-old-space-size) as argv —
they get parsed by yargs as unknown CLI arguments. Detect SEA mode
via node:sea module and skip the memory tuning relaunch entirely.

Made-with: Cursor
relaunchAppInChildProcess passes process.argv[1] as the script path,
but SEA binaries have no script — argv[1] is undefined, which gets
stringified and interpreted by yargs as a one-shot prompt, preventing
interactive mode from launching.

Made-with: Cursor
When QWEN_CODE_BRAND=APEX, the splash screen shows an APEX ASCII
art logo instead of the default QWEN logo. Extensible via brandLogos
map for future brand variants.

Made-with: Cursor
Made-with: Cursor
Add context budget trimming that truncates older conversation turns before
hitting the provider API, preventing context-window overflows. Includes
test expectations updated for the thinking-budget headroom logic that bumps
max_tokens by 8000 when thinking is enabled.
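The trimming step described above could be sketched as follows (a hypothetical shape, not the PR's implementation): drop the oldest turns until the estimated total fits the budget, always keeping the most recent turn.

```typescript
// Hypothetical sketch of context budget trimming: remove oldest turns
// until the estimated token total is within budget, never dropping the
// most recent turn.
function trimToBudget<T>(
  turns: T[],
  estimate: (t: T) => number,
  budget: number,
): T[] {
  const kept = [...turns];
  let total = kept.reduce((sum, t) => sum + estimate(t), 0);
  while (kept.length > 1 && total > budget) {
    total -= estimate(kept.shift()!); // shift() removes the oldest turn
  }
  return kept;
}
```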

Made-with: Cursor
Mark getHistory() return as readonly Content[] to prevent accidental
mutation. Update stripStartupContext to accept readonly arrays and spread
at the arena boundary where mutable Content[] is required.

Made-with: Cursor
Increase default maxAttempts from 7 to 10 and add detection for transient
SSL/TLS errors (EPROTO, DEPTH_ZERO_SELF_SIGNED_CERT, etc.) and network
errors (ECONNRESET, ETIMEDOUT) to trigger automatic retries with backoff.
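The detection described above amounts to matching Node.js error codes. A minimal sketch (function and set names are assumptions; the commit's actual code list may be longer):

```typescript
// Transient network/TLS error codes that should trigger a retry with backoff.
const TRANSIENT_CODES = new Set([
  "EPROTO",
  "DEPTH_ZERO_SELF_SIGNED_CERT",
  "ECONNRESET",
  "ETIMEDOUT",
]);

// Node.js network errors carry a string `code` property; classify by it.
function isTransient(err: { code?: string }): boolean {
  return err.code !== undefined && TRANSIENT_CODES.has(err.code);
}
```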

Made-with: Cursor
When a subagent attempts to use a tool not in its allowed list, return a
structured error with the tool name and available alternatives instead of
a generic "not found" message. Updated test expectation accordingly.

Made-with: Cursor
Detects when the model generates lazy placeholder text like
"(rest of methods ...)" or "// unchanged code ..." instead of actual
code in edit and write_file tool calls. Prevents accidental code
deletion by blocking these at validation time.

- edit.ts: Compares old_string vs new_string placeholders, blocks
  new placeholders not present in original
- write-file.ts: Blocks content with any omission placeholders
- 6 unit tests: standalone, case-insensitive, multi-line, false
  positive avoidance, inline comments, unrelated ellipsis
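The check could be as simple as a pattern match on the proposed content. An illustrative sketch (the actual patterns in edit.ts and write-file.ts may differ):

```typescript
// Matches lazy omission placeholders like "(rest of methods ...)" or
// "// unchanged code ...". Illustrative regex; the real tool uses its
// own pattern list.
const PLACEHOLDER_RE =
  /(\(\s*rest of [^)]*\.\.\.\s*\)|\/\/\s*unchanged code\s*\.\.\.)/i;

function hasOmissionPlaceholder(content: string): boolean {
  return PLACEHOLDER_RE.test(content);
}
```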

Made-with: Cursor
Add readOnlyTools boolean to MCPServerConfig. When true, all
discovered tools default to readOnlyHint: true unless the server
explicitly overrides with readOnlyHint: false in its annotations.

This enables MCP tools from read-only servers to benefit from
contiguous read-only tool parallelization.
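The precedence rule above (explicit annotation beats the server-level default) can be sketched like this. Names here are assumptions apart from readOnlyTools and readOnlyHint, which the commit introduces:

```typescript
// Minimal config shape for illustration; the real MCPServerConfig has
// more fields.
interface MCPServerConfig {
  command: string;
  readOnlyTools?: boolean;
}

// A tool's explicit readOnlyHint annotation wins; otherwise fall back to
// the server-wide readOnlyTools default.
function effectiveReadOnlyHint(
  server: MCPServerConfig,
  annotation?: boolean,
): boolean {
  return annotation ?? server.readOnlyTools ?? false;
}
```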

Fixes QwenLM#2564

Made-with: Cursor
…tree utils

Ported from gemini-cli:
- read_many_files tool with glob patterns
- JIT context discovery (auto-inject AGENTS.md from subdirectories)
- Git worktree utilities
- Tool error type extensions

Made-with: Cursor
- Read-only tool parallelization (Kind.Read, Kind.Search, Kind.Fetch)
- Tool output masking service with telemetry
- Dynamic tool output truncation based on context pressure
- Token estimation utilities (tokenCalculation.ts)
- Settings schema updates for tool output masking
- Sync script for portable settings
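The read-only parallelization above relies on grouping contiguous read-only tool calls into batches. A hedged sketch (not the actual scheduler): read-only calls extend the current batch, while any mutating call starts a new one.

```typescript
// Batch contiguous read-only tool calls so each batch can run in parallel;
// a mutating call ends the current batch. Illustrative only.
function batchContiguousReadOnly<T>(
  calls: T[],
  isReadOnly: (c: T) => boolean,
): T[][] {
  const batches: T[][] = [];
  for (const call of calls) {
    const last = batches[batches.length - 1];
    if (isReadOnly(call) && last && last.every((c) => isReadOnly(c))) {
      last.push(call); // extend the current read-only batch
    } else {
      batches.push([call]); // mutating call (or first call) opens a new batch
    }
  }
  return batches;
}
```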

Made-with: Cursor
Full Codex-parity implementation of the OpenAI Responses API as a new
provider type `openai-responses`, parallel to `openai` (Chat Completions).

New provider: AuthType.USE_OPENAI_RESPONSES
- SSE event converter (text, function calls, reasoning summaries)
- HTTP streaming pipeline with previous_response_id + incremental items
- prompt_cache_key, reasoning, verbosity, service_tier, tool_choice
- truncation: auto, parallel_tool_calls, extra_body merge
- encrypted_content round-trip for reasoning continuity

Dual-path compaction:
- Remote via POST /v1/responses/compact (openai-responses)
- Inline via existing ChatCompressionService (all other providers)
- Pipeline state reset after compaction
- Auto-compact threshold 70% -> 90% (Codex default)
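The routing logic above can be sketched in a few lines (function and parameter names are assumptions; only the endpoint and the state reset come from this PR):

```typescript
// Route compaction remotely for the Responses provider, inline for all
// others, then reset pipeline state so the next turn does not reuse a
// stale previous_response_id. Illustrative sketch.
function compact(
  provider: string,
  remote: () => void, // POST /v1/responses/compact path (per the PR)
  inline: () => void, // existing ChatCompressionService path
  resetState: () => void,
): void {
  if (provider === "openai-responses") {
    remote();
  } else {
    inline();
  }
  resetState(); // always clear pipeline state after compaction
}
```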

Pre-work: resolveModel, tokenEstimationScaleFactor, Turn.getResponseText,
cleanToolSchema, tool response dedup, abort error handling in stream consumer.

76 new tests, 4905 total passing, zero regressions.

Made-with: Cursor
netbrah force-pushed the feat/openai-responses-api branch from bd23459 to f493c6a on March 22, 2026 03:19
Contributor Author

netbrah commented Mar 22, 2026

Opened against upstream by mistake — this is a fork-internal review PR.

