Description
Check for existing issues
- I have searched the existing issues and checked that my issue is not a duplicate.
What happened?
When calling gemini/gemini-2.5-flash with all three of these options active simultaneously:
- stream=True
- tools= (one or more function/tool definitions)
- thinking={"type": "enabled", "budget_tokens": N}
…Gemini occasionally returns finish_reason=MALFORMED_FUNCTION_CALL at the API level.
LiteLLM intercepts this and silently normalizes it to finish_reason='stop' with an empty text response and zero tool calls.
The caller receives a seemingly successful response — no exception is raised, no warning is logged — but the tool call has been completely dropped. From the application's perspective, the model simply returned nothing.
Expected behaviour: LiteLLM should either:
- Raise a litellm.InternalServerError / litellm.APIError with finish_reason='MALFORMED_FUNCTION_CALL', OR
- Pass the raw finish_reason='MALFORMED_FUNCTION_CALL' through to the caller so application code can detect and retry it.
Actual behaviour: finish_reason='stop', empty content, zero tool calls. The error is invisible.
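Until this is fixed, callers can only detect the dropped call heuristically. Below is a minimal sketch of such a check; it assumes dict-style access to an OpenAI-shaped response (as LiteLLM returns), and the function name is my own, not a LiteLLM API:

```python
def looks_like_dropped_tool_call(response) -> bool:
    """Heuristic guard (not a LiteLLM API): a 'stop' finish with no text
    and no tool calls matches the pattern this bug produces, when the
    request included tools and a tool call was expected."""
    choice = response["choices"][0]
    msg = choice["message"]
    content = msg.get("content") or ""
    return (
        choice.get("finish_reason") == "stop"
        and not content.strip()
        and not msg.get("tool_calls")
    )
```

An application could retry the request when this returns True, which is the workaround the "Expected behaviour" options above would make unnecessary.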
Note: I have also raised a bug with python-genai repo to track the original failure: googleapis/python-genai#2081
Steps to Reproduce
- Set up a new venv with Python 3 and pip install litellm==1.81.13
- Run the attached script with: python3 gemini_malformed_reproducer.py --path1 --runs 10 (try with more runs if you cannot repro with 10)
gemini_malformed_reproducer.py
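For reviewers without the attached script, the triggering combination of options looks roughly like this. The tool schema below is a hypothetical placeholder standing in for the 7 widget tools; the three commented conditions are the ones from "What happened?" above:

```python
# Sketch of the request shape that triggers the bug (not the attached script).
# The "line_chart" tool schema is a hypothetical placeholder.
request_kwargs = {
    "model": "gemini/gemini-2.5-flash",
    "messages": [{
        "role": "user",
        "content": "give me a line chart of vulnerabilities over the last 4 weeks",
    }],
    "stream": True,  # condition 1: streaming
    "tools": [{      # condition 2: one or more tool definitions
        "type": "function",
        "function": {
            "name": "line_chart",
            "parameters": {"type": "object", "properties": {}},
        },
    }],
    # condition 3: thinking enabled with a token budget
    "thinking": {"type": "enabled", "budget_tokens": 8192},
}

# Requires GEMINI_API_KEY; the failure is non-deterministic, so loop the call:
# import litellm
# resp = litellm.completion(**request_kwargs)
```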
Relevant log output
(test) ➜ scripts git:(feature/ag-ui) ✗ python3 gemini_malformed_reproducer.py --path1 --runs 10
╔══════════════════════════════════════════════════════════════════════╗
║ Gemini 2.5-flash MALFORMED_FUNCTION_CALL Bug Reproducer ║
╚══════════════════════════════════════════════════════════════════════╝
Run at : 2026-02-21T10:47:06
Model : gemini-2.5-flash
Thinking budget: 8192 tokens (forced — required to trigger bug)
Tools : 7 widget tools (line_chart, bar_chart, ...)
Query : 'give me a line chart of vulnerabilities over the last 4 weeks'
Runs : 10
Fail-fast : yes (stop on first bug hit)
NOTE: Bug is non-deterministic. Use --runs 10 to collect statistics.
======================================================================
[Run 1/10] PATH 1: LiteLLM (gemini/gemini-2.5-flash) — streaming + tools + thinking, NO retry
======================================================================
model : gemini-2.5-flash
thinking_budget: 8192
query : 'give me a line chart of vulnerabilities over the last 4 weeks'
Streaming (showing notable chunks only):
--------------------------------------------------
[chunk 2] finish_reason = 'stop'
--------------------------------------------------
RESULTS:
chunks received : 2
finish_reason : 'stop'
tool calls : 0
text response : ''
⚠️ BUG CONFIRMED (hidden by LiteLLM): finish_reason='stop', empty response, no tool.
LiteLLM normalised MALFORMED_FUNCTION_CALL → 'stop' and dropped the call.
The chart was never rendered — same user-visible outcome as explicit MALFORMED.
[fail-fast] Bug reproduced — stopping. Use --no-fail-fast to continue.
======================================================================
STATISTICS (1 run)
======================================================================
PATH 1 — LiteLLM (no retry):
MALFORMED (hidden→'stop'): 1 / 1 (100%)
BUG TOTAL : 1 / 1 (100%)
Valid tool call (no bug) : 0 / 1 (0%)
BUG REPORT:
LiteLLM: https://github.com/BerriAI/litellm/issues
Include this script + output + package versions:
(run: pip show litellm google-genai)
What part of LiteLLM is this about?
SDK (litellm Python package)
What LiteLLM version are you on?
v1.81.3
Twitter / LinkedIn details
@bhaktaonline / https://www.linkedin.com/in/bhakta0007/