feat(core): OpenAI Responses API (/v1/responses) native support #2588
Closed
netbrah wants to merge 18 commits into QwenLM:main
Adds Dockerfile.sea and supporting scripts to produce a single self-contained Node.js SEA binary for headless Linux deployment.
- sea/sea-launch.cjs: CJS entry point that extracts embedded ESM bundle to a versioned temp dir and dynamically imports it
- scripts/build_binary.js: generates SEA config, blob, and injects into a copy of the Node binary via postject
- Dockerfile.sea: multi-stage build targeting linux/amd64 via buildx
- .gitignore: exclude .bin/ build output and local scratch files
Made-with: Cursor
Allows running multiple qwen-code instances with isolated configs. When QWEN_CODE_HOME is set, the global config dir uses that path instead of ~/.qwen/. Enables hermetic distribution (ontap-apex) to use ~/.apex/.qwen/ without conflicting with personal ~/.qwen/ config. Centralises global dir resolution through Storage.getGlobalQwenDir() in todoWrite, memoryDiscovery, and settings.ts instead of ad-hoc path.join(homeDir, QWEN_DIR) construction. Made-with: Cursor
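A minimal sketch of the override described in this commit, assuming the resolution is a plain "environment variable wins, otherwise fall back to the home directory" rule. `resolveGlobalQwenDir` and its parameters are hypothetical names for illustration; the commit's actual entry point is `Storage.getGlobalQwenDir()`, whose implementation is not shown here.

```typescript
// Hypothetical stand-in for the logic behind Storage.getGlobalQwenDir().
const QWEN_DIR = ".qwen";

type Env = Record<string, string | undefined>;

function resolveGlobalQwenDir(env: Env, homeDir: string): string {
  const override = env["QWEN_CODE_HOME"]?.trim();
  // Assumption: an empty or whitespace-only value is treated as unset.
  if (override) {
    return override;
  }
  // Default location, i.e. ~/.qwen
  return `${homeDir}/${QWEN_DIR}`;
}
```

Taking the environment and home directory as parameters keeps the sketch testable without touching `process.env`; a real call site would presumably pass those in from the runtime.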
Keep .bin/ directory structure in repo but ignore build artifacts (ontap-apex launcher, SEA binaries, etc.) that live there locally. Made-with: Cursor
…file TypeScript strict mode requires process.env['KEY'] not process.env.KEY for index signature access. Also set HUSKY=0 in Dockerfile.sea to skip git hooks during container builds without --ignore-scripts. Made-with: Cursor
When QWEN_CODE_BRAND is set (e.g. "APEX"), the header banner and window title use it instead of "Qwen Code" / "Qwen". Allows the hermetic distribution to show its own branding without source changes. Made-with: Cursor
SEA binaries can't accept V8 flags (--max-old-space-size) as argv — they get parsed by yargs as unknown CLI arguments. Detect SEA mode via node:sea module and skip the memory tuning relaunch entirely. Made-with: Cursor
relaunchAppInChildProcess passes process.argv[1] as the script path, but SEA binaries have no script — argv[1] is undefined, which gets stringified and interpreted by yargs as a one-shot prompt, preventing interactive mode from launching. Made-with: Cursor
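The two SEA fixes above come down to how the child process argument list is assembled. The following is a sketch under stated assumptions, not the code in `relaunchAppInChildProcess`: `buildChildArgs` is a hypothetical helper, and the `isSea` flag is assumed to come from the `node:sea` detection the earlier commit mentions.

```typescript
// Hypothetical helper illustrating the argv handling; not the PR's code.
function buildChildArgs(
  scriptPath: string | undefined,
  userArgs: readonly string[],
  nodeFlags: readonly string[],
  isSea: boolean,
): string[] {
  if (isSea || scriptPath === undefined) {
    // A SEA binary has no script path, and V8 flags passed as argv would
    // reach yargs as unknown CLI arguments, so forward only the user args.
    return [...userArgs];
  }
  // Regular Node launch: flags first, then the script, then user args.
  return [...nodeFlags, scriptPath, ...userArgs];
}
```

The key point is that an undefined script path is never pushed into the list, so it cannot be stringified and mistaken for a one-shot prompt.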
When QWEN_CODE_BRAND=APEX, the splash screen shows an APEX ASCII art logo instead of the default QWEN logo. Extensible via brandLogos map for future brand variants. Made-with: Cursor
Add context budget trimming that truncates older conversation turns before hitting the provider API, preventing context-window overflows. Includes test expectations updated for the thinking-budget headroom logic that bumps max_tokens by 8000 when thinking is enabled. Made-with: Cursor
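One way to read the trimming described above is "keep the newest turns that fit the budget and drop the rest". The sketch below follows that reading; the commit may instead shorten individual turns. `TurnLike`, `estimateTokens`, `trimToBudget`, and the 4-characters-per-token estimate are all assumptions for illustration. Only the 8000-token thinking headroom comes from the commit message.

```typescript
// Hypothetical shapes; the real code works on Content[] history.
interface TurnLike {
  role: "user" | "model";
  text: string;
}

// Assumption: a rough 4-characters-per-token estimate.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keeps the newest turns that fit in the budget; older turns are dropped first.
function trimToBudget(history: readonly TurnLike[], budgetTokens: number): TurnLike[] {
  const kept: TurnLike[] = [];
  let used = 0;
  for (const turn of [...history].reverse()) {
    const cost = estimateTokens(turn.text);
    // Assumption: the newest turn is always kept, even if it alone exceeds the budget.
    if (used + cost > budgetTokens && kept.length > 0) {
      break;
    }
    used += cost;
    kept.unshift(turn);
  }
  return kept;
}

// From the commit message: max_tokens is bumped by 8000 when thinking is enabled.
function maxTokensWithHeadroom(maxTokens: number, thinkingEnabled: boolean): number {
  return thinkingEnabled ? maxTokens + 8000 : maxTokens;
}
```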
Mark getHistory() return as readonly Content[] to prevent accidental mutation. Update stripStartupContext to accept readonly arrays and spread at the arena boundary where mutable Content[] is required. Made-with: Cursor
Increase default maxAttempts from 7 to 10 and add detection for transient SSL/TLS errors (EPROTO, DEPTH_ZERO_SELF_SIGNED_CERT, etc.) and network errors (ECONNRESET, ETIMEDOUT) to trigger automatic retries with backoff. Made-with: Cursor
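The retry behaviour above can be sketched as follows. This is an illustration, not the PR's retry utility: `isTransient`, `retryWithBackoff`, the exponential schedule, and the 100 ms base delay are assumptions, and the error-code set lists only the codes the commit names explicitly (the "etc." is left unexpanded).

```typescript
// Codes named in the commit message; the real list may be longer.
const TRANSIENT_CODES = new Set<string>([
  "EPROTO",
  "DEPTH_ZERO_SELF_SIGNED_CERT",
  "ECONNRESET",
  "ETIMEDOUT",
]);

// Hypothetical classifier: looks at a Node-style `code` property on the error.
function isTransient(err: unknown): boolean {
  if (typeof err !== "object" || err === null) {
    return false;
  }
  const code = (err as { code?: unknown }).code;
  return typeof code === "string" && TRANSIENT_CODES.has(code);
}

// Hypothetical retry loop; maxAttempts defaults to 10 as in the commit.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 10,
  baseDelayMs = 100, // assumption: illustrative base delay
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on the last attempt or on errors that are not transient.
      if (attempt >= maxAttempts || !isTransient(err)) {
        throw err;
      }
      const delayMs = baseDelayMs * Math.pow(2, attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```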
When a subagent attempts to use a tool not in its allowed list, return a structured error with the tool name and available alternatives instead of a generic "not found" message. Updated test expectation accordingly. Made-with: Cursor
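A structured error of the kind described might look like the sketch below. The field names and message wording are hypothetical; the commit only states that the error carries the tool name and the available alternatives.

```typescript
// Hypothetical error shape; field names are illustrative only.
interface ToolNotAllowedError {
  type: "tool_not_allowed";
  toolName: string;
  availableTools: string[];
  message: string;
}

function toolNotAllowed(toolName: string, allowed: readonly string[]): ToolNotAllowedError {
  const availableTools = [...allowed].sort();
  return {
    type: "tool_not_allowed",
    toolName,
    availableTools,
    message: `Tool "${toolName}" is not available to this subagent. Available tools: ${availableTools.join(", ")}`,
  };
}
```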
Detects when the model generates lazy placeholder text like "(rest of methods ...)" or "// unchanged code ..." instead of actual code in edit and write_file tool calls. Prevents accidental code deletion by blocking these at validation time.
- edit.ts: Compares old_string vs new_string placeholders, blocks new placeholders not present in original
- write-file.ts: Blocks content with any omission placeholders
- 6 unit tests: standalone, case-insensitive, multi-line, false positive avoidance, inline comments, unrelated ellipsis
Made-with: Cursor
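The edit.ts rule above (block placeholders in new_string that were not already in old_string) can be sketched like this. The regular expression and helper names are assumptions, not the PR's detector. This simplified version only flags lines that consist of a placeholder, so it does not attempt the inline-comment cases the unit tests mention.

```typescript
// Assumption: a placeholder is a line that starts with a comment marker or "(",
// then an omission phrase, then an ellipsis. Matching is case-insensitive.
const OMISSION_LINE =
  /^\s*(?:\/\/|#|\/\*|\()\s*(?:\.\.\.\s*)?(?:rest of|unchanged|existing|remaining)\b.*\.\.\./i;

// Returns the trimmed lines of `text` that look like omission placeholders.
function findPlaceholders(text: string): string[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => OMISSION_LINE.test(line));
}

// Placeholders that appear in newString but were not already in oldString.
function newPlaceholders(oldString: string, newString: string): string[] {
  const original = new Set(findPlaceholders(oldString).map((p) => p.toLowerCase()));
  return findPlaceholders(newString).filter((p) => !original.has(p.toLowerCase()));
}
```

Under this sketch, an edit would be rejected when `newPlaceholders(old_string, new_string)` is non-empty, and a write_file call when `findPlaceholders(content)` is non-empty.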
Add readOnlyTools boolean to MCPServerConfig. When true, all discovered tools default to readOnlyHint: true unless the server explicitly overrides with readOnlyHint: false in its annotations. This enables MCP tools from read-only servers to benefit from contiguous read-only tool parallelization. Fixes QwenLM#2564 Made-with: Cursor
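The precedence rule in this commit (an explicit annotation from the server wins, otherwise the server-level default applies) can be sketched as below. The interfaces are trimmed, hypothetical stand-ins for `MCPServerConfig` and the tool annotations; only `readOnlyTools` and `readOnlyHint` come from the commit message.

```typescript
// Hypothetical, trimmed-down shapes for illustration.
interface ServerConfigLike {
  readOnlyTools?: boolean;
}

interface ToolAnnotationsLike {
  readOnlyHint?: boolean;
}

function resolveReadOnlyHint(
  config: ServerConfigLike,
  annotations?: ToolAnnotationsLike,
): boolean {
  // An explicit annotation from the server always wins, including `false`.
  if (annotations?.readOnlyHint !== undefined) {
    return annotations.readOnlyHint;
  }
  // Otherwise fall back to the server-level default (false when unset).
  return config.readOnlyTools === true;
}
```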
…tree utils
Ported from gemini-cli:
- read_many_files tool with glob patterns
- JIT context discovery (auto-inject AGENTS.md from subdirectories)
- Git worktree utilities
- Tool error type extensions
Made-with: Cursor
- Read-only tool parallelization (Kind.Read, Kind.Search, Kind.Fetch)
- Tool output masking service with telemetry
- Dynamic tool output truncation based on context pressure
- Token estimation utilities (tokenCalculation.ts)
- Settings schema updates for tool output masking
- Sync script for portable settings
Made-with: Cursor
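Read-only parallelization, as described here and in the readOnlyTools commit, amounts to grouping contiguous read-only calls into one batch while every other call runs alone and in order. The sketch below shows that grouping only. `Kind` is modelled as a string union and `ToolCallLike` and `batchCalls` are hypothetical names; the PR uses an enum with `Kind.Read`, `Kind.Search`, and `Kind.Fetch`.

```typescript
// Hypothetical stand-ins; "edit" and "execute" are illustrative non-read kinds.
type Kind = "read" | "search" | "fetch" | "edit" | "execute";

interface ToolCallLike {
  name: string;
  kind: Kind;
}

const READ_ONLY_KINDS = new Set<Kind>(["read", "search", "fetch"]);

// Groups contiguous read-only calls into one batch; each other call is its own batch.
function batchCalls(calls: readonly ToolCallLike[]): ToolCallLike[][] {
  const batches: ToolCallLike[][] = [];
  let current: ToolCallLike[] = [];
  for (const call of calls) {
    if (READ_ONLY_KINDS.has(call.kind)) {
      current.push(call);
      continue;
    }
    if (current.length > 0) {
      batches.push(current);
      current = [];
    }
    batches.push([call]);
  }
  if (current.length > 0) {
    batches.push(current);
  }
  return batches;
}
```

A scheduler could then run each batch with `Promise.all` and await batches sequentially, which preserves ordering around any mutating call.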
Full Codex-parity implementation of the OpenAI Responses API as a new provider type `openai-responses`, parallel to `openai` (Chat Completions).
New provider: AuthType.USE_OPENAI_RESPONSES
- SSE event converter (text, function calls, reasoning summaries)
- HTTP streaming pipeline with previous_response_id + incremental items
- prompt_cache_key, reasoning, verbosity, service_tier, tool_choice
- truncation: auto, parallel_tool_calls, extra_body merge
- encrypted_content round-trip for reasoning continuity
Dual-path compaction:
- Remote via POST /v1/responses/compact (openai-responses)
- Inline via existing ChatCompressionService (all other providers)
- Pipeline state reset after compaction
- Auto-compact threshold 70% -> 90% (Codex default)
Pre-work: resolveModel, tokenEstimationScaleFactor, Turn.getResponseText, cleanToolSchema, tool response dedup, abort error handling in stream consumer.
76 new tests, 4905 total passing, zero regressions.
Made-with: Cursor
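To make the "previous_response_id + incremental items" idea concrete, here is a sketch of how a request body might be assembled. It is not the PR's pipeline: `PipelineState`, `buildResponsesRequest`, the shallow `extraBody` merge, and the `"medium"` / `"auto"` reasoning values are assumptions. The field names are the request parameters the commit lists.

```typescript
// Hypothetical pipeline state tracked between turns.
interface PipelineState {
  previousResponseId?: string;
  sentItemCount: number; // number of input items the server has already seen
}

function buildResponsesRequest(
  model: string,
  allItems: readonly unknown[],
  state: PipelineState,
  conversationId: string,
  extraBody: Record<string, unknown> = {},
): Record<string, unknown> {
  const incremental = state.previousResponseId !== undefined;
  const body: Record<string, unknown> = {
    model,
    // With a previous response id, only items the server has not seen are sent.
    input: incremental ? allItems.slice(state.sentItemCount) : [...allItems],
    stream: true,
    prompt_cache_key: conversationId,
    reasoning: { effort: "medium", summary: "auto" }, // illustrative values
    include: ["reasoning.encrypted_content"],
    truncation: "auto",
    parallel_tool_calls: true,
    tool_choice: "auto",
  };
  if (incremental) {
    body["previous_response_id"] = state.previousResponseId;
  }
  // Assumption: extra_body is merged shallowly and wins on key conflicts.
  return { ...body, ...extraBody };
}
```

After each successful turn the caller would presumably store the new response id and advance `sentItemCount`, so the next request carries only the delta.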
Force-pushed from bd23459 to f493c6a
netbrah (Contributor, Author) commented:
Opened against upstream by mistake; this is a fork-internal review PR.
Summary
Full Codex-parity implementation of the OpenAI Responses API as a new provider type `openai-responses`, parallel to the existing `openai` (Chat Completions) provider. Users select it via `authType: "openai-responses"` in their model providers config.
What's new
- `openai-responses` auth type: routes to `/v1/responses` instead of `/v1/chat/completions`
- `previous_response_id` + incremental items: server-side history, only new items sent per turn
- `prompt_cache_key`: conversation-scoped server caching for faster TTFT
- Reasoning: `effort`, `summary`, `include: ["reasoning.encrypted_content"]` with full round-trip
- `truncation: {type: "auto"}`: prevents hard 400 on context overflow
- `parallel_tool_calls`, `tool_choice: "auto"`: Codex-parity tool handling
- Dual-path compaction: remote (`/v1/responses/compact`) for openai-responses, inline for all others
- Pipeline state reset after compaction (`previous_response_id`)
- Pre-work: `resolveModel()`, `tokenEstimationScaleFactor()`, `Turn.getResponseText()`, `cleanToolSchema()`, tool response dedup, abort error handling
Test plan
- `qwen --auth-type openai-responses`
Made with Cursor
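The dual-path compaction in the summary can be sketched as three small decisions: when to compact, which path to take, and what to reset afterwards. Everything below is a hypothetical illustration: the function names, the `>=` comparison, and the state shape are assumptions. The 90% threshold, the `openai-responses` auth type string, and the reset of pipeline state come from the PR text.

```typescript
// From the PR: auto-compact threshold raised from 70% to 90%.
const AUTO_COMPACT_THRESHOLD = 0.9;

// Assumption: compaction triggers when usage reaches the threshold (>=).
function shouldAutoCompact(usedTokens: number, contextWindow: number): boolean {
  return usedTokens / contextWindow >= AUTO_COMPACT_THRESHOLD;
}

// Remote compaction (POST /v1/responses/compact) only for openai-responses.
function compactionPath(authType: string): "remote" | "inline" {
  return authType === "openai-responses" ? "remote" : "inline";
}

// Hypothetical pipeline state, mirroring the incremental-items bookkeeping.
interface CompactableState {
  previousResponseId?: string;
  sentItemCount: number;
}

// After compaction the server-side chain no longer matches local history,
// so the next request starts fresh without previous_response_id.
function resetAfterCompaction(_state: CompactableState): CompactableState {
  return { previousResponseId: undefined, sentItemCount: 0 };
}
```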