
fix(gemini): correct streaming finish_reason for tool calls#21577

Merged
Chesars merged 1 commit into BerriAI:litellm_oss_staging_03_03_2026 from Chesars:fix/gemini-streaming-tool-calls-finish-reason
Mar 3, 2026

Conversation

Collaborator

@Chesars Chesars commented Feb 19, 2026

Relevant issues

Fixes #21041

Pre-Submission checklist

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
  • My PR passes all unit tests via make test-unit
  • My PR's scope is as isolated as possible; it solves only one specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

Gemini always returns finishReason: "STOP" regardless of whether the response contains tool calls. Per the OpenAI spec, finish_reason must be "tool_calls" when the model called a tool.

LiteLLM already maps this correctly for non-streaming responses (PR #10485), but in streaming mode the finishReason arrives in a separate chunk with no content, which _process_candidates() skips entirely, so the mapping never runs.

Fix (in ModelResponseIterator, Gemini handler layer):

  1. Track tool_calls across chunks: a new self.has_seen_tool_calls flag is set whenever any streaming choice contains delta.tool_calls
  2. Handle the final chunk with no content: when _process_candidates produces empty choices but a candidate carries a finishReason, create a StreamingChoices with the correct finish_reason (respecting prior tool_calls)
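The two steps above can be sketched as follows. This is an illustrative stand-in, not the actual LiteLLM code: the has_seen_tool_calls name mirrors the PR, but the class, chunk shapes, and mapping table here are simplified assumptions.

```python
# Simplified sketch of the fixed streaming iterator logic (assumed shapes,
# not LiteLLM's real ModelResponseIterator).
GEMINI_TO_OPENAI_FINISH = {
    "STOP": "stop",
    "SAFETY": "content_filter",
    "MAX_TOKENS": "length",
}

class StreamingFinishReasonTracker:
    def __init__(self):
        self.has_seen_tool_calls = False  # persists across chunks

    def parse_chunk(self, chunk: dict) -> dict:
        candidate = chunk["candidates"][0]
        parts = candidate.get("content", {}).get("parts", [])
        tool_calls = [p["functionCall"] for p in parts if "functionCall" in p]
        if tool_calls:
            # Step 1: remember that some earlier chunk carried tool calls.
            self.has_seen_tool_calls = True

        finish_reason = None
        raw = candidate.get("finishReason")
        if raw is not None:
            # Step 2: Gemini reports STOP even when the model called a tool,
            # and the finishReason chunk may have no content at all.
            if raw == "STOP" and self.has_seen_tool_calls:
                finish_reason = "tool_calls"
            else:
                finish_reason = GEMINI_TO_OPENAI_FINISH.get(raw, raw.lower())
        return {"tool_calls": tool_calls, "finish_reason": finish_reason}
```

The key design point is that the flag lives on the iterator, not the chunk, because Gemini delivers the tool call and the finishReason in separate chunks.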

Tests (4 new, all mock-only, no network calls):

  • Tool call stream → finish_reason="tool_calls"
  • Text-only stream → finish_reason="stop"
  • Multiple parallel tool calls → finish_reason="tool_calls"
  • Content filter → finish_reason="content_filter" preserved ✓
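The first two scenarios above can be sketched as mock-only tests. This is a hypothetical shape, not the actual test file: it drives hand-built Gemini-style chunks through a minimal local stand-in for the fixed behavior, with no network calls.

```python
# Hypothetical mock-based test sketch (assumed chunk shapes, local helper).
def final_finish_reason(chunks):
    """Minimal stand-in for the fixed iterator: tracks tool calls across
    chunks and maps the final Gemini finishReason to the OpenAI value."""
    seen_tool_calls = False
    finish = None
    for c in chunks:
        cand = c["candidates"][0]
        parts = cand.get("content", {}).get("parts", [])
        if any("functionCall" in p for p in parts):
            seen_tool_calls = True
        raw = cand.get("finishReason")
        if raw == "STOP":
            finish = "tool_calls" if seen_tool_calls else "stop"
        elif raw == "SAFETY":
            finish = "content_filter"
    return finish

def test_tool_call_stream():
    chunks = [
        {"candidates": [{"content": {"parts": [{"functionCall": {"name": "get_weather", "args": {}}}]}}]},
        {"candidates": [{"finishReason": "STOP"}]},  # finishReason arrives in a separate, content-less chunk
    ]
    assert final_finish_reason(chunks) == "tool_calls"

def test_text_only_stream():
    chunks = [
        {"candidates": [{"content": {"parts": [{"text": "Hello"}]}}]},
        {"candidates": [{"finishReason": "STOP"}]},
    ]
    assert final_finish_reason(chunks) == "stop"
```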

Gemini returns finishReason="STOP" even when tool calls are present,
and sends tool_calls and finishReason in separate streaming chunks.
The ModelResponseIterator now tracks tool_calls across chunks and
correctly maps finish_reason to "tool_calls" per the OpenAI spec.

Fixes BerriAI#21041

vercel bot commented Feb 19, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
litellm Ready Ready Preview, Comment Feb 19, 2026 6:05pm


Contributor

greptile-apps bot commented Feb 19, 2026

Greptile Summary

This PR fixes a streaming bug where Gemini's finishReason="STOP" was incorrectly mapped to finish_reason="stop" instead of finish_reason="tool_calls" when tool calls were present.

Root cause: Gemini sends tool calls and finishReason in separate chunks. The final chunk (with finishReason but no content) was skipped by _process_candidates(), causing the finish_reason to be lost entirely.

Solution:

  • Track tool calls across chunks using has_seen_tool_calls flag
  • Handle final chunks without content by creating a StreamingChoices with the correct finish_reason based on whether tool calls were seen earlier

Tests: 4 new comprehensive mock-based tests covering all scenarios (tool calls, text-only, multiple tools, content filter).

All changes are provider-specific and properly isolated within llms/vertex_ai/gemini/. No database queries, fastapi imports, or other violations of custom rules.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The fix is well-isolated to Gemini streaming logic, includes comprehensive mock-based tests (no network calls), doesn't modify critical paths or database queries, and follows the OpenAI spec correctly. The solution is minimal and focused: only 35 lines added to track state and handle the edge case. All existing tests pass.
  • No files require special attention

Important Files Changed

  • litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py: adds the has_seen_tool_calls flag and final-chunk handling logic to correctly map finish_reason="tool_calls" when tool calls arrive in separate chunks from finishReason
  • tests/test_litellm/llms/vertex_ai/gemini/test_gemini_streaming_tool_call_finish_reason.py: new test file with 4 comprehensive mock-based tests covering tool call streaming, text-only streaming, multiple tool calls, and content filtering scenarios

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Chunk arrives in ModelResponseIterator.chunk_parser] --> B{Process candidates via _process_candidates}
    B --> C{Does candidate have 'content' field?}
    C -->|Yes| D[Extract tool_calls, text, etc]
    C -->|No| E[Skip candidate - returns empty choices]
    D --> F{Does delta have tool_calls?}
    F -->|Yes| G[Set has_seen_tool_calls = True]
    F -->|No| H[Keep flag unchanged]
    G --> I[_create_streaming_choice sets finish_reason via _check_finish_reason]
    H --> I
    E --> J{Is choices empty AND candidates exist?}
    J -->|Yes| K{Check has_seen_tool_calls flag}
    J -->|No| M[Return response]
    K -->|True| L[Create choice with finish_reason='tool_calls']
    K -->|False| N[Map finishReason normally via _check_finish_reason]
    L --> M
    N --> M
```

Last reviewed commit: 5596545

Contributor

@greptile-apps greptile-apps bot left a comment


2 files reviewed, no comments


@Chesars Chesars changed the base branch from main to litellm_oss_staging_03_03_2026 March 3, 2026 18:24
@Chesars Chesars merged commit b2c7d2e into BerriAI:litellm_oss_staging_03_03_2026 Mar 3, 2026
18 of 21 checks passed
@Chesars Chesars deleted the fix/gemini-streaming-tool-calls-finish-reason branch March 3, 2026 18:25
Chesars added a commit to Chesars/litellm that referenced this pull request Mar 17, 2026
…or gemini-3.1-flash-lite-preview

Models like gemini-3.1-flash-lite-preview send the final streaming chunk
with empty content (text:"") alongside finishReason:"STOP", instead of
omitting content entirely. The existing fix (PR BerriAI#21577) only handled
chunks without content, so this case was missed.

Now, after processing candidates, if tool_calls were seen in earlier
chunks and a choice has finish_reason="stop", it is overridden to
"tool_calls" to match the OpenAI spec.

Fixes BerriAI#22900
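
The follow-up override described in that commit can be sketched as a small post-processing step. Names and the choice shape below are illustrative assumptions, not the real LiteLLM code; the real fix lives inside the Gemini streaming handler.

```python
# Sketch of the follow-up fix: some models (e.g. gemini-3.1-flash-lite-preview)
# send the final chunk with empty text ("") plus finishReason:"STOP" instead of
# omitting content, so choices ARE produced and the empty-choices path from
# PR #21577 never fires. This override runs after candidates are processed.
def override_finish_reason(choices: list[dict], has_seen_tool_calls: bool) -> list[dict]:
    for choice in choices:
        if has_seen_tool_calls and choice.get("finish_reason") == "stop":
            # Per the OpenAI spec, a stream that carried tool calls must
            # finish with "tool_calls", not "stop".
            choice["finish_reason"] = "tool_calls"
    return choices
```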


Development

Successfully merging this pull request may close these issues.

[Bug]: Gemini 3 Flash returns finish_reason="stop" instead of "tool_calls" in streaming mode with tools

1 participant