
fix(vertex): streaming finish_reason='stop' instead of 'tool_calls' for gemini-3.1-flash-lite-preview #23895

Merged
Chesars merged 1 commit into BerriAI:litellm_oss_staging_03_17_2026 from Chesars:fix/streaming-tool-call-finish-reason-empty-content
Mar 17, 2026

Conversation

Collaborator

@Chesars Chesars commented Mar 17, 2026

Relevant issues

Fixes #22900

Pre-Submission checklist

  • I have Added testing in the tests/test_litellm/ directory
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai

Type

🐛 Bug Fix

Changes

Models like gemini-3.1-flash-lite-preview send the final streaming chunk with empty content (parts: [{text: ""}]) alongside finishReason: "STOP", instead of omitting content entirely. The existing fix (#21577) only handled chunks without content, so this case was missed.

After _process_candidates runs, if has_seen_tool_calls is True and any choice has finish_reason="stop", override it to "tool_calls".
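The override described above can be sketched as a small standalone function; `StreamingChoice` and `remap_finish_reason` here are simplified stand-ins for litellm's internal types, not the actual implementation:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class StreamingChoice:
    finish_reason: Optional[str] = None

def remap_finish_reason(choices: List[StreamingChoice], has_seen_tool_calls: bool) -> List[StreamingChoice]:
    # If earlier chunks carried tool calls, a terminal "stop" really
    # means the model stopped to call tools; remap per the OpenAI spec.
    if has_seen_tool_calls:
        for choice in choices:
            if choice.finish_reason == "stop":
                choice.finish_reason = "tool_calls"
    return choices

# Final chunk: empty content plus finishReason "STOP", mapped to "stop".
choices = remap_finish_reason([StreamingChoice(finish_reason="stop")], has_seen_tool_calls=True)
print(choices[0].finish_reason)  # tool_calls
```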

fix(vertex): streaming finish_reason='stop' instead of 'tool_calls' for gemini-3.1-flash-lite-preview

Models like gemini-3.1-flash-lite-preview send the final streaming chunk
with empty content (text:"") alongside finishReason:"STOP", instead of
omitting content entirely. The existing fix (PR BerriAI#21577) only handled
chunks without content, so this case was missed.

Now, after processing candidates, if tool_calls were seen in earlier
chunks and a choice has finish_reason="stop", it is overridden to
"tool_calls" to match the OpenAI spec.

Fixes BerriAI#22900

vercel bot commented Mar 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
litellm Ready Ready Preview, Comment Mar 17, 2026 8:51pm


Contributor

greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR extends the existing Gemini streaming finish_reason fix to handle a second model behaviour: models like gemini-3.1-flash-lite-preview that send the final streaming chunk with empty content (parts: [{text: ""}]) together with finishReason: "STOP", instead of omitting content entirely. It also adds the gpt-4-0314 model to model_prices_and_context_window.json.

Key changes:

  • In ModelResponseIterator.chunk_parser, a new block (lines 3009-3012) checks, after _process_candidates has run, whether has_seen_tool_calls is True and overrides any finish_reason == "stop" to "tool_calls". This sits alongside the existing fix (lines 2981-3002) that handles the completely-content-less final chunk case.
  • A focused mock unit test is added to test_gemini_streaming_tool_call_finish_reason.py that directly exercises the new path via chunk_parser with no network calls.
  • gpt-4-0314 is added to model_prices_and_context_window.json, but the entry carries supports_tool_choice: true and supports_prompt_caching: true — both of which are incorrect for this pre-function-calling legacy snapshot and should be removed before merging.
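The test strategy described above (exercising the override directly, without network calls) can be sketched with an illustrative mini-parser; `parse_final_chunk` is a hypothetical stand-in for `chunk_parser`, not litellm's real method:

```python
def parse_final_chunk(chunk: dict, has_seen_tool_calls: bool) -> str:
    # Illustrative mini-parser: maps Gemini's finishReason to an
    # OpenAI-style finish_reason and applies the PR's override.
    finish = chunk["candidates"][0].get("finishReason")
    finish_reason = "stop" if finish == "STOP" else None
    if has_seen_tool_calls and finish_reason == "stop":
        finish_reason = "tool_calls"
    return finish_reason

# The problematic final chunk: an empty text part plus finishReason STOP.
chunk = {"candidates": [{"content": {"parts": [{"text": ""}]},
                         "finishReason": "STOP"}]}
print(parse_final_chunk(chunk, has_seen_tool_calls=True))  # tool_calls
```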

Confidence Score: 3/5

  • The streaming fix is correct, but the gpt-4-0314 JSON entry ships with incorrect capability flags that could cause API errors.
  • The core streaming logic change is minimal, well-understood, and backed by a new mock test. The risk is the unrelated model_prices_and_context_window.json addition for gpt-4-0314, which incorrectly marks the model as supporting tool choice and prompt caching — features that didn't exist in OpenAI's API when the 0314 snapshot was published. That metadata error lowers confidence.
  • model_prices_and_context_window.json — the gpt-4-0314 entry needs capability flags verified and likely corrected before merge.

Important Files Changed

Filename Overview
litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py Adds a post-_process_candidates override to remap finish_reason="stop" → "tool_calls" when has_seen_tool_calls is True, covering the case where Gemini sends an empty-content final chunk alongside finishReason="STOP". Logic is correct and well-scoped; minor robustness concern around non-final chunks that happen to carry finish_reason="stop".
model_prices_and_context_window.json Adds gpt-4-0314 model entry, but the entry incorrectly sets supports_tool_choice: true and supports_prompt_caching: true — both capabilities postdate the 0314 snapshot and could cause API errors for users of that model.
tests/test_litellm/llms/vertex_ai/gemini/test_gemini_streaming_tool_call_finish_reason.py Adds a well-structured mock unit test (test_streaming_tool_call_finish_reason_with_empty_content_in_final_chunk) that directly exercises the new code path using chunk_parser without any network calls, consistent with the project's test-isolation requirements.

Sequence Diagram

sequenceDiagram
    participant Gemini as Gemini API
    participant MRI as ModelResponseIterator
    participant PC as _process_candidates
    participant Client as LiteLLM Client

    Gemini->>MRI: Chunk 1: functionCall parts, no finishReason
    MRI->>PC: _process_candidates(candidates)
    PC-->>MRI: StreamingChoices(tool_calls=..., finish_reason=None)
    Note over MRI: has_seen_tool_calls = True
    MRI-->>Client: finish_reason=None, delta.tool_calls=[...]

    Gemini->>MRI: Chunk 2 (new case): parts=[{text:""}], finishReason=STOP
    MRI->>PC: _process_candidates(candidates)
    Note over PC: "content" key present → choice IS created<br/>finish_reason mapped to "stop"
    PC-->>MRI: StreamingChoices(finish_reason="stop", delta.content="")
    Note over MRI: NEW BLOCK: has_seen_tool_calls=True<br/>AND finish_reason=="stop"<br/>→ override to "tool_calls"
    MRI-->>Client: finish_reason="tool_calls" ✓

    Note over Gemini,Client: Previous fix (pre-PR): Chunk 2 had NO content at all
    Note over Gemini,Client: "content" key absent → _process_candidates skips → choices=[]<br/>→ existing block creates choice with finish_reason="tool_calls"

Comments Outside Diff (1)

  1. model_prices_and_context_window.json, line 16865-16877 (link)

    P1 Likely incorrect metadata for gpt-4-0314

    The newly added gpt-4-0314 entry sets "supports_tool_choice": true and "supports_prompt_caching": true, but both appear wrong for this model snapshot:

    • Function/tool calling was introduced by OpenAI with the gpt-4-0613 snapshot. The 0314 snapshot predates that feature entirely, so "supports_tool_choice": true is incorrect.
    • Prompt caching for OpenAI was introduced for more recent models (e.g. gpt-4o), not legacy snapshots like 0314.

    This looks like a copy-paste from gpt-4-0613 (or a similar entry) without verifying which capabilities the 0314 snapshot actually exposes. Shipping incorrect metadata will cause litellm to attempt to use those features against a model that doesn't support them, potentially resulting in API errors for users.

    (Please verify the actual capabilities of gpt-4-0314 against the OpenAI documentation before merging.)
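For reference, a corrected entry with the disputed flags removed might look roughly like the sketch below. The token limits and per-token prices shown are the commonly cited figures for the original gpt-4 8k snapshot, not values verified against this PR or OpenAI's current documentation:

```json
"gpt-4-0314": {
    "max_tokens": 4096,
    "max_input_tokens": 8192,
    "max_output_tokens": 4096,
    "input_cost_per_token": 0.00003,
    "output_cost_per_token": 0.00006,
    "litellm_provider": "openai",
    "mode": "chat",
    "supports_function_calling": false,
    "supports_tool_choice": false,
    "supports_prompt_caching": false
}
```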

Last reviewed commit: 0c28b47

Comment on lines +3009 to +3012
if self.has_seen_tool_calls:
    for choice in model_response.choices:
        if choice.finish_reason == "stop":
            choice.finish_reason = "tool_calls"
Contributor


P2 Override applies to every post-tool-call chunk, not just the final one

The new block overrides finish_reason == "stop" to "tool_calls" for every chunk processed after has_seen_tool_calls becomes True, not only for the final chunk.

In practice today this is safe because finish_reason is None on intermediate streaming chunks. However, it is an implicit assumption that could break if Gemini ever sends a non-terminal finishReason: "STOP" (e.g. for safety reasons) after a tool call chunk in the same streaming response. Adding a guard that the choice also has no meaningful delta content would make the intent explicit and more robust:

if self.has_seen_tool_calls:
    for choice in model_response.choices:
        if (
            choice.finish_reason == "stop"
            and getattr(getattr(choice, "delta", None), "content", None) in (None, "")
            and not getattr(getattr(choice, "delta", None), "tool_calls", None)
        ):
            choice.finish_reason = "tool_calls"
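The guarded variant suggested above can be checked with minimal stand-in objects (`Delta`, `Choice`, and `remap_guarded` are illustrative, not litellm's actual classes): a "stop" chunk carrying real text is left alone, while an empty one is remapped.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Delta:
    content: Optional[str] = None
    tool_calls: Optional[list] = None

@dataclass
class Choice:
    finish_reason: Optional[str] = None
    delta: Delta = field(default_factory=Delta)

def remap_guarded(choices: List[Choice], has_seen_tool_calls: bool) -> List[Choice]:
    # Only flip "stop" -> "tool_calls" when the chunk is effectively empty,
    # so a hypothetical non-terminal "stop" with real content is untouched.
    if has_seen_tool_calls:
        for choice in choices:
            if (
                choice.finish_reason == "stop"
                and choice.delta.content in (None, "")
                and not choice.delta.tool_calls
            ):
                choice.finish_reason = "tool_calls"
    return choices

empty = Choice(finish_reason="stop", delta=Delta(content=""))
textual = Choice(finish_reason="stop", delta=Delta(content="hello"))
remap_guarded([empty, textual], has_seen_tool_calls=True)
print(empty.finish_reason, textual.finish_reason)  # tool_calls stop
```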

Contributor

codspeed-hq bot commented Mar 17, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing Chesars:fix/streaming-tool-call-finish-reason-empty-content (0c28b47) with litellm_oss_staging_03_17_2026 (b0db75d) [1]

Open in CodSpeed

Footnotes

  1. No successful run was found on litellm_oss_staging_03_17_2026 (278c9ba) during the generation of this report, so c693800 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@Chesars Chesars merged commit 1f5a67a into BerriAI:litellm_oss_staging_03_17_2026 Mar 17, 2026
38 of 39 checks passed
@Chesars Chesars deleted the fix/streaming-tool-call-finish-reason-empty-content branch March 17, 2026 21:49

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: vertex_ai/gemini-3.1-flash-lite-preview returns "finish_reason": "stop" instead of "tool_calls" when using streaming

1 participant