fix(pipeline): handle duplicate finish_reason chunks from OpenRouter #2403

Merged
tanzhenxin merged 1 commit into QwenLM:main from simon100500:fix/duplicate-finish-chunk-tool-calls
Mar 18, 2026

@simon100500
Contributor

Fixes #2402

Problem

Some OpenRouter model providers (e.g. google/gemini-3.1-flash-lite-preview) send two consecutive SSE chunks with finish_reason: "tool_calls". The second chunk arrives after streamingToolCallParser.reset() has already been called, so it carries empty parts — no functionCall entries.

handleChunkMerging treated every finish chunk as authoritative and overwrote pendingFinishResponse with the empty duplicate, discarding the functionCall parts correctly assembled from the first finish chunk.

This caused processStreamResponse to see hasToolCall=false and throw:

Model stream ended with empty response text.

Fix

In handleChunkMerging: when a second finish chunk arrives and a pendingFinishResponse already exists, only merge usageMetadata (if present) and keep the candidates from the first finish chunk.

if (isFinishChunk) {
  if (hasPendingFinish) {
    // Duplicate finish chunk — keep candidates from first, merge only metadata
    const lastResponse = collectedGeminiResponses[...];
    if (response.usageMetadata) lastResponse.usageMetadata = response.usageMetadata;
    setPendingFinish(lastResponse);
  } else {
    collectedGeminiResponses.push(response);
    setPendingFinish(response);
  }
  return false;
}
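
For readers without the surrounding source, the branch above can be modelled as a minimal self-contained sketch. The `FinishChunkMerger` class, the simplified `StreamResponse` shape, the finish-chunk test (any candidate carrying a `finishReason`), and the `true` return value for non-finish chunks are all assumptions made for illustration; they are not the actual signatures in the pipeline.

```typescript
// Simplified stand-ins for the real response types (assumption: only the
// fields that the fix touches are modelled here).
interface Part { text?: string; functionCall?: { name: string; args: Record<string, unknown> } }
interface Candidate { finishReason?: string; content: { parts: Part[] } }
interface UsageMetadata { totalTokenCount: number }
interface StreamResponse { candidates: Candidate[]; usageMetadata?: UsageMetadata }

// Hypothetical holder for the state the real handler receives as arguments.
class FinishChunkMerger {
  readonly collected: StreamResponse[] = [];
  pendingFinish: StreamResponse | undefined;

  // Returns false for finish chunks (as in the excerpt); true otherwise (assumed).
  handle(response: StreamResponse): boolean {
    const isFinishChunk = response.candidates.some((c) => c.finishReason !== undefined);
    if (!isFinishChunk) {
      this.collected.push(response);
      return true;
    }
    if (this.pendingFinish !== undefined) {
      // Duplicate finish chunk: keep the first chunk's candidates and
      // take only the usage metadata from the duplicate, if it has any.
      if (response.usageMetadata) {
        this.pendingFinish.usageMetadata = response.usageMetadata;
      }
    } else {
      this.collected.push(response);
      this.pendingFinish = response;
    }
    return false;
  }
}

const merger = new FinishChunkMerger();
// First finish chunk: carries the assembled functionCall part.
merger.handle({
  candidates: [{
    finishReason: "tool_calls",
    content: { parts: [{ functionCall: { name: "read_file", args: { path: "a.txt" } } }] },
  }],
});
// Duplicate finish chunk: empty parts, but it does carry usage metadata.
merger.handle({
  candidates: [{ finishReason: "tool_calls", content: { parts: [] } }],
  usageMetadata: { totalTokenCount: 42 },
});

const hasToolCall = merger.pendingFinish?.candidates.some((c) =>
  c.content.parts.some((p) => p.functionCall !== undefined)) ?? false;
console.log(hasToolCall, merger.pendingFinish?.usageMetadata?.totalTokenCount); // true 42
```

In this model the duplicate is never pushed onto the collected list, so the tool call from the first finish chunk survives while the later usage metadata is still recorded.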

Testing

The existing pipeline.test.ts suite should cover regressions. A new test case can be added for the duplicate-finish-chunk scenario if desired.
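
A regression test for this scenario might look roughly like the sketch below. It does not use the real pipeline or its test harness: `replay`, `Chunk`, and the `keepFirstFinish` flag are hypothetical stand-ins that model only the finish-chunk branch, so that the same input stream can be run against the pre-fix (overwrite) and post-fix (keep first) behaviour.

```typescript
type Part = { functionCall?: { name: string } };
type Chunk = { finishReason?: string; parts: Part[]; usage?: number };

// Hypothetical reducer modelling only the finish-chunk branch of the handler.
function replay(chunks: Chunk[], keepFirstFinish: boolean): Chunk | undefined {
  let pending: Chunk | undefined;
  for (const chunk of chunks) {
    if (chunk.finishReason === undefined) continue; // non-finish chunks are out of scope here
    if (pending !== undefined && keepFirstFinish) {
      // Post-fix behaviour: keep the first chunk's parts, take only newer usage data.
      if (chunk.usage !== undefined) pending = { ...pending, usage: chunk.usage };
    } else {
      // Pre-fix behaviour (and the first finish chunk in both modes): overwrite.
      pending = chunk;
    }
  }
  return pending;
}

const hasToolCall = (c: Chunk | undefined): boolean =>
  c !== undefined && c.parts.some((p) => p.functionCall !== undefined);

// Two consecutive finish chunks; the second one is the empty duplicate.
const stream: Chunk[] = [
  { finishReason: "tool_calls", parts: [{ functionCall: { name: "read_file" } }] },
  { finishReason: "tool_calls", parts: [], usage: 42 },
];

const before = replay(stream, false); // models the original overwrite
const after = replay(stream, true);   // models the patched merge
console.log(hasToolCall(before), hasToolCall(after), after?.usage); // false true 42
```

A test of this shape fails against the overwrite behaviour and passes against the patched one, which is the property a regression test for #2402 would need.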

simon100500 added a commit to simon100500/gantt-lib-mcp that referenced this pull request Mar 15, 2026
…uter

OpenRouter sends two SSE chunks with finish_reason=tool_calls. The second,
empty chunk overwrote pendingFinishResponse and dropped the functionCall
parts, so the SDK threw "Model stream ended with empty response text".

Patch to handleChunkMerging: on a repeated finish chunk, keep the candidates
from the first one and merge only usageMetadata.

- patches/@qwen-code+sdk+0.1.5.patch: persistent patch
- package.json: postinstall: patch-package
- patch-package added to devDependencies
- PR: QwenLM/qwen-code#2403

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Mingholy added the scope/content-generation (AI content generation) label Mar 16, 2026
@Mingholy
Collaborator

Thanks for the contribution!
This is a core change and has some conflicts with #2404. I'm merging both into a single test branch to validate them. This may take some time; the PR will be merged once validation passes!
