fix: emit input_json_delta for tool args bundled in first streaming chunk #25533

Merged
krrish-berri-2 merged 5 commits into BerriAI:litellm_oss_staging_04_13_2026_p1 from quora:kryang/fix-streaming-tool-args-upstream
Apr 14, 2026

Conversation

krisyang1125 commented Apr 10, 2026

Summary

Some providers (xAI, Gemini) include tool_call function arguments in the same streaming chunk as the function name/id. The AnthropicStreamWrapper was discarding the trigger chunk entirely when starting a new content block, which silently dropped the input_json_delta carrying tool arguments. This caused tool_use blocks to arrive with empty input {}.

The wrapper now queues the processed_chunk after content_block_start when it carries non-empty input_json_delta data. The fix is applied to both the sync __next__ and async __anext__ iteration paths. It is backward compatible: providers that send empty arguments in the first chunk (OpenAI-style) are unaffected, because the condition checks for a truthy partial_json.
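
The ordering described above can be sketched as a small pure function. This is an illustration only, not the code in streaming_iterator.py: the helper name `block_transition_events`, its parameters, and the `call_1` tool id are hypothetical, and the event dictionaries are simplified.

```python
from typing import Any, Dict, List, Optional


def block_transition_events(
    prev_index: int,
    new_block: Dict[str, Any],
    trigger_delta: Optional[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Events to emit when a chunk opens a new content block (hypothetical helper)."""
    new_index = prev_index + 1
    events: List[Dict[str, Any]] = [
        {"type": "content_block_stop", "index": prev_index},
        {"type": "content_block_start", "index": new_index, "content_block": new_block},
    ]
    # Forward the trigger chunk's delta only when it actually carries tool arguments.
    # An empty string or None is falsy, so OpenAI-style first chunks add nothing here.
    if (
        trigger_delta is not None
        and trigger_delta.get("type") == "input_json_delta"
        and trigger_delta.get("partial_json")
    ):
        events.append(
            {"type": "content_block_delta", "index": new_index, "delta": trigger_delta}
        )
    return events


# Hypothetical tool_use block; the id and name are made up for the example.
tool_block = {"type": "tool_use", "id": "call_1", "name": "get_weather", "input": {}}

# xAI/Gemini style: arguments arrive in the same chunk as the name/id.
bundled = block_transition_events(
    0, tool_block, {"type": "input_json_delta", "partial_json": '{"location":"Boston"}'}
)
# OpenAI style: the first chunk carries an empty arguments string.
empty = block_transition_events(
    0, tool_block, {"type": "input_json_delta", "partial_json": ""}
)
```

In this sketch, `bundled` ends with a content_block_delta carrying the arguments, while `empty` stops at content_block_start, which mirrors the backward-compatibility claim.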

Changes

  • litellm/llms/anthropic/experimental_pass_through/adapters/streaming_iterator.py: After emitting content_block_stop + content_block_start for a new tool_use block, also emit the trigger chunk's content_block_delta when it carries input_json_delta with non-empty partial_json. Applied to both __next__ (sync) and __anext__ (async).
  • tests/test_litellm/llms/anthropic/experimental_pass_through/adapters/test_streaming_iterator_tool_args.py: Added 4 tests covering both sync and async paths for bundled args (xAI/Gemini style) and empty args (OpenAI style).

Test plan

  • Added test_async_stream_emits_input_json_delta_for_bundled_tool_args - verifies async input_json_delta is emitted after content_block_start when tool args are bundled
  • Added test_async_stream_no_extra_delta_when_tool_args_empty - verifies async backward compatibility when args are empty (OpenAI-style)
  • Added test_sync_stream_emits_input_json_delta_for_bundled_tool_args - verifies sync input_json_delta is emitted after content_block_start when tool args are bundled
  • Added test_sync_stream_no_extra_delta_when_tool_args_empty - verifies sync backward compatibility when args are empty (OpenAI-style)
  • All 66 adapter tests pass (62 existing + 4 new)

krisyang1125 commented

@greptileai

codspeed-hq bot commented Apr 10, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing quora:kryang/fix-streaming-tool-args-upstream (6164735) with main (f2f2a91)

codecov bot commented Apr 10, 2026

Codecov Report

❌ Patch coverage is 66.66667% with 2 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...mental_pass_through/adapters/streaming_iterator.py | 66.66% | 2 Missing ⚠️ |

greptile-apps bot commented Apr 10, 2026

Greptile Summary

This PR fixes a silent data-loss bug in AnthropicStreamWrapper where providers like xAI and Gemini bundle tool arguments in the same streaming chunk as the function name/id; the wrapper was discarding that chunk's input_json_delta, leaving tool_use blocks with empty input {}. The fix queues the trigger chunk's delta after content_block_start when partial_json is non-empty, applied symmetrically to both the sync __next__ and async __anext__ paths. Four new pure-mock unit tests validate bundled-args and empty-args (backward-compatible) scenarios for both paths.

Confidence Score: 5/5

Safe to merge — the fix is minimal, correctly scoped, and all four test scenarios pass.

No P0 or P1 findings. The condition chain handles the relevant edge cases: empty-string args ("" is falsy), None args, and non-empty args. Applying the fix symmetrically to the sync and async paths is correct. Tests are pure-mock (no network calls) and cover both the happy path and the backward-compatible empty-args path; the previously noted weaknesses in the backward-compatibility tests have been addressed with exact-count list-comprehension assertions.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/llms/anthropic/experimental_pass_through/adapters/streaming_iterator.py | Adds a guard to queue input_json_delta from the trigger chunk when it carries non-empty partial_json, in both sync and async iteration paths; logic is correct and backward-compatible. |
| tests/test_litellm/llms/anthropic/experimental_pass_through/adapters/test_streaming_iterator_tool_args.py | Four new mock-only tests covering sync/async × bundled-args/empty-args scenarios; no real network calls, assertions correctly verify event ordering and exact counts. |

Sequence Diagram

sequenceDiagram
    participant Provider as Provider (xAI/Gemini)
    participant Wrapper as AnthropicStreamWrapper
    participant Client as Anthropic Client

    Provider->>Wrapper: chunk[text, content="Hello"]
    Wrapper->>Client: message_start
    Wrapper->>Client: content_block_start (text, index=0)
    Wrapper->>Client: content_block_delta (text_delta)

    Provider->>Wrapper: chunk[tool_call: name="get_weather", arguments='{"location":"Boston"}']
    Note over Wrapper: should_start_new_block=True<br/>_increment_content_block_index()
    Note over Wrapper: translate → content_block_delta (input_json_delta)
    Wrapper->>Client: content_block_stop (index=0)
    Wrapper->>Client: content_block_start (tool_use, index=1)
    Note over Wrapper: NEW: partial_json is truthy → queue delta
    Wrapper->>Client: content_block_delta (input_json_delta, partial_json='{"location":"Boston"}')

    Provider->>Wrapper: chunk[finish_reason="tool_calls"]
    Wrapper->>Client: content_block_stop (index=1)
    Wrapper->>Client: message_delta (stop_reason="tool_use")
    Wrapper->>Client: message_stop
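
To make the client-side effect of the diagram concrete, the sketch below writes the same event sequence as plain dictionaries and rebuilds the tool input the way a consumer might. It is an assumption-laden illustration: `collect_tool_inputs` is a hypothetical helper, the `call_1` id is invented, and the event shapes are simplified.

```python
import json
from typing import Any, Dict, Iterable, List


def collect_tool_inputs(events: Iterable[Dict[str, Any]]) -> Dict[int, Dict[str, Any]]:
    """Rebuild each tool_use block's input by joining its input_json_delta fragments."""
    fragments: Dict[int, List[str]] = {}
    for event in events:
        if (
            event["type"] == "content_block_start"
            and event["content_block"]["type"] == "tool_use"
        ):
            fragments.setdefault(event["index"], [])
        elif (
            event["type"] == "content_block_delta"
            and event["delta"]["type"] == "input_json_delta"
        ):
            fragments.setdefault(event["index"], []).append(event["delta"]["partial_json"])
    # A block that received no fragments decodes to an empty input.
    return {i: json.loads("".join(parts) or "{}") for i, parts in fragments.items()}


# The event order from the diagram, with simplified payloads.
events = [
    {"type": "message_start"},
    {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}},
    {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}},
    {"type": "content_block_stop", "index": 0},
    {
        "type": "content_block_start",
        "index": 1,
        "content_block": {"type": "tool_use", "id": "call_1", "name": "get_weather", "input": {}},
    },
    {
        "type": "content_block_delta",
        "index": 1,
        "delta": {"type": "input_json_delta", "partial_json": '{"location":"Boston"}'},
    },
    {"type": "content_block_stop", "index": 1},
    {"type": "message_delta", "delta": {"stop_reason": "tool_use"}},
    {"type": "message_stop"},
]

with_fix = collect_tool_inputs(events)

# Simulate the old behaviour by dropping the input_json_delta event.
without_fix = collect_tool_inputs(
    e
    for e in events
    if not (e["type"] == "content_block_delta" and e["delta"]["type"] == "input_json_delta")
)
```

Here `with_fix` recovers the Boston location for block 1, while `without_fix` yields an empty input for the same block, which is the symptom the PR describes.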

Reviews (4): Last reviewed commit: "test: make no_extra_delta tests assert e..."

Comment on lines +220 to +233
assert tool_start_idx is not None

# The event immediately after content_block_start should NOT be
# an input_json_delta from the trigger chunk (since arguments were empty).
# It should be an input_json_delta from the subsequent tool_args_chunk.
next_event = events[tool_start_idx + 1]
if (
    isinstance(next_event, dict)
    and next_event.get("type") == "content_block_delta"
    and isinstance(next_event.get("delta"), dict)
    and next_event["delta"].get("type") == "input_json_delta"
):
    # This delta must come from tool_args_chunk, not tool_name_chunk
    assert next_event["delta"].get("partial_json") == '{"location": "NYC"}'

P2: Backward-compatibility assertion is vacuous when the condition is false

The if on line 226 means that when next_event is not a content_block_delta (for example, a content_block_stop or some other event), the inner assert never runs and the test passes without verifying anything. The test name promises "no_extra_delta_when_tool_args_empty" but does not enforce that constraint. Consider asserting directly that no spurious delta was injected:

Suggested change
- assert tool_start_idx is not None
- # The event immediately after content_block_start should NOT be
- # an input_json_delta from the trigger chunk (since arguments were empty).
- # It should be an input_json_delta from the subsequent tool_args_chunk.
- next_event = events[tool_start_idx + 1]
- if (
-     isinstance(next_event, dict)
-     and next_event.get("type") == "content_block_delta"
-     and isinstance(next_event.get("delta"), dict)
-     and next_event["delta"].get("type") == "input_json_delta"
- ):
-     # This delta must come from tool_args_chunk, not tool_name_chunk
-     assert next_event["delta"].get("partial_json") == '{"location": "NYC"}'
+ next_event = events[tool_start_idx + 1]
+ # The event after content_block_start must NOT be an input_json_delta
+ # originating from the (empty-args) trigger chunk.
+ if (
+     isinstance(next_event, dict)
+     and next_event.get("type") == "content_block_delta"
+     and isinstance(next_event.get("delta"), dict)
+     and next_event["delta"].get("type") == "input_json_delta"
+ ):
+     # If there is a delta here it must come from tool_args_chunk, not the empty trigger
+     assert next_event["delta"].get("partial_json") == '{"location": "NYC"}', (
+         "Spurious empty input_json_delta emitted from trigger chunk"
+     )
+ else:
+     pass
+ # Separately, assert the args delta from tool_args_chunk is present somewhere
+ all_input_json_deltas = [
+     e for e in events
+     if isinstance(e, dict)
+     and e.get("type") == "content_block_delta"
+     and isinstance(e.get("delta"), dict)
+     and e["delta"].get("type") == "input_json_delta"
+ ]
+ assert any(
+     d["delta"].get("partial_json") == '{"location": "NYC"}' for d in all_input_json_deltas
+ ), "Expected tool_args_chunk delta to appear in events"
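
The difference between a guarded assertion and an exact-count assertion can be shown on toy event lists. This is a sketch under assumptions, not the test code from the PR: the helper names and the three sample streams are hypothetical, and the checks return booleans instead of asserting so that both outcomes can be compared side by side.

```python
from typing import Any, Dict, List

EXPECTED_ARGS = '{"location": "NYC"}'


def input_json_deltas(events: List[Dict[str, Any]]) -> List[Any]:
    """Return the partial_json payload of every input_json_delta event, in order."""
    return [
        e["delta"].get("partial_json")
        for e in events
        if e.get("type") == "content_block_delta"
        and isinstance(e.get("delta"), dict)
        and e["delta"].get("type") == "input_json_delta"
    ]


def guarded_check_passes(events: List[Dict[str, Any]], tool_start_idx: int) -> bool:
    """Guarded shape: the comparison only runs if the next event is an input_json_delta."""
    next_event = events[tool_start_idx + 1]
    if (
        next_event.get("type") == "content_block_delta"
        and isinstance(next_event.get("delta"), dict)
        and next_event["delta"].get("type") == "input_json_delta"
    ):
        return next_event["delta"].get("partial_json") == EXPECTED_ARGS
    return True  # nothing was verified, yet the check "passes"


def exact_check_passes(events: List[Dict[str, Any]]) -> bool:
    """Exact-count shape: exactly one input_json_delta, carrying the expected payload."""
    return input_json_deltas(events) == [EXPECTED_ARGS]


tool_start = {
    "type": "content_block_start",
    "index": 1,
    "content_block": {"type": "tool_use", "name": "get_weather", "input": {}},
}
args_delta = {
    "type": "content_block_delta",
    "index": 1,
    "delta": {"type": "input_json_delta", "partial_json": EXPECTED_ARGS},
}
tool_stop = {"type": "content_block_stop", "index": 1}

healthy_stream = [tool_start, args_delta, tool_stop]
dropped_stream = [tool_start, tool_stop]  # arguments lost entirely
duplicated_stream = [tool_start, args_delta, args_delta, tool_stop]  # extra delta
```

With these inputs the guarded check accepts all three streams, while the exact-count check accepts only `healthy_stream`, which is the gap the review comment points at.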

krisyang1125 commented

@greptileai

krisyang1125 commented Apr 10, 2026

Added screenshots from after the patch that fixes the issue.

[Screenshot 2026-04-10 at 16 21 06] [Screenshot 2026-04-10 at 15 53 58]

@krrish-berri-2 krrish-berri-2 changed the base branch from main to litellm_oss_staging_04_13_2026_p1 April 14, 2026 02:09
@krrish-berri-2 krrish-berri-2 merged commit bf3ed8d into BerriAI:litellm_oss_staging_04_13_2026_p1 Apr 14, 2026
50 of 51 checks passed
Sameerlite pushed a commit that referenced this pull request Apr 14, 2026
…hunk (#25533)

* fix: emit input_json_delta for tool args bundled in first streaming chunk

Some providers (xAI, Gemini) include tool_call function arguments in the
same streaming chunk as the function name/id. The AnthropicStreamWrapper
was discarding the trigger chunk entirely when starting a new content
block, which silently dropped the input_json_delta carrying tool
arguments. This caused tool_use blocks to arrive with empty input {}.

Now queue the processed_chunk after content_block_start when it carries
non-empty input_json_delta data. Backward compatible: providers that send
empty arguments in the first chunk (OpenAI-style) are unaffected since
the condition checks for truthy partial_json.

* test: add tests for input_json_delta emission on bundled tool args

Covers the fix for providers (xAI, Gemini) that bundle tool_call
arguments in the same streaming chunk as the function name/id.
Verifies the AnthropicStreamWrapper emits input_json_delta after
content_block_start, and that empty-arg chunks (OpenAI-style) are
unaffected.

* style: apply Black formatting to streaming_iterator.py

* fix: mirror input_json_delta fix to sync __next__ and add sync tests

* test: make no_extra_delta tests assert explicitly instead of passing silently
Benniphx pushed a commit to Benniphx/litellm that referenced this pull request Apr 15, 2026
…hunk (BerriAI#25533)
