fix(responses): map refusal stop_reason to incomplete status in streaming #25498

Merged
krrish-berri-2 merged 3 commits into litellm_internal_staging_04_11_2026 from litellm_fix-refusal-status-streaming2 on Apr 11, 2026

@Sameerlite
Collaborator

Summary

Fixes streaming responses API translation where Anthropic's stop_reason="refusal" was incorrectly translated to status="completed" instead of "incomplete".

Root Cause

build_base_response in streaming_chunk_builder_utils.py was unconditionally overwriting finish_reason with None from later chunks, losing the terminal content_filter value that was correctly mapped from refusal.
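
The overwrite described above can be illustrated with a minimal sketch. Both helper functions below are hypothetical and are not the actual build_base_response code; they only model a loop over per-chunk finish_reason values, with "stop" assumed as the default when no chunk supplies one.

```python
from typing import List, Optional


def last_finish_reason_buggy(reasons: List[Optional[str]]) -> Optional[str]:
    """Pre-fix behaviour (modelled): every chunk overwrites the value."""
    finish_reason: Optional[str] = "stop"  # assumed default sentinel
    for reason in reasons:
        finish_reason = reason  # a trailing None clobbers the terminal value
    return finish_reason


def last_finish_reason_fixed(reasons: List[Optional[str]]) -> str:
    """Post-fix behaviour (modelled): None values are skipped."""
    finish_reason = "stop"  # assumed default sentinel
    for reason in reasons:
        if reason is not None:
            finish_reason = reason  # last meaningful value wins
    return finish_reason


# Hypothetical stream: the terminal value arrives before a trailing chunk
# that carries no finish_reason (for example a usage-only chunk).
chunk_reasons = [None, None, "content_filter", None]
buggy_result = last_finish_reason_buggy(chunk_reasons)
fixed_result = last_finish_reason_fixed(chunk_reasons)
```

In this model the buggy loop ends with None, which a downstream mapper could then treat as a normal completion, while the fixed loop keeps the terminal value.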

Changes

  • streaming_chunk_builder_utils: Skip None finish_reason values in build_base_response so the last meaningful finish_reason wins
  • streaming_iterator: Snapshot chunks before returning pending events (sync path consistency with async)
  • streaming_handler: Treat usage-only chunks as meaningful content to preserve finish_reason metadata
  • transformation: Map finish_reason="refusal" to status="incomplete" in responses API
  • tests: Add regression tests for refusal handling in both streaming and transformation layers
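
The transformation change can be sketched as a small mapping function. The function name and the other members of the incomplete set are assumptions for illustration; only the "refusal" entry is confirmed by this PR, and the actual helper in transformation.py may differ.

```python
from typing import Optional

# Assumption: "length" and "content_filter" are illustrative members.
# Only "refusal" is stated by this PR.
INCOMPLETE_FINISH_REASONS = frozenset({"length", "content_filter", "refusal"})


def map_finish_reason_to_status(finish_reason: Optional[str]) -> str:
    """Hypothetical stand-in for the Responses API status mapping."""
    if finish_reason in INCOMPLETE_FINISH_REASONS:
        return "incomplete"
    return "completed"


refusal_status = map_finish_reason_to_status("refusal")
stop_status = map_finish_reason_to_status("stop")
```

Keeping the incomplete values in one set makes it cheap to add another terminal reason later without touching the branching logic.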

Test Plan

  • Added unit test test_transform_chat_completion_response_status_with_refusal
  • Added regression test test_usage_only_chunk_not_dropped_when_finish_reason_already_set
  • Updated test_emit_response_completed_uses_stream_finish_reason to validate natural flow
  • All existing tests pass

Made-with: Cursor

@codspeed-hq
Contributor

codspeed-hq bot commented Apr 10, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_fix-refusal-status-streaming2 (89f05b5) with main (d0e347a)

@greptile-apps
Contributor

greptile-apps bot commented Apr 10, 2026

Greptile Summary

This PR fixes a bug in the Responses API streaming translation where Anthropic's stop_reason="refusal" was incorrectly surfaced as status="completed" instead of "incomplete". The fix involves four coordinated changes: skipping None finish_reason values in build_base_response, treating usage-only chunks as meaningful content in the streaming handler, snapshotting chunks before returning pending events in the sync iterator path, and mapping "refusal" to "incomplete" in the status mapping function.

Confidence Score: 5/5

Safe to merge — all changes are targeted bug fixes with regression tests and no backwards-incompatible behaviour changes.

All four code changes are narrowly scoped corrections to an existing translation pipeline, each with a corresponding unit test. No security concerns, no schema changes, no breaking API changes. Remaining findings are P2 at most.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/litellm_core_utils/streaming_chunk_builder_utils.py | Correctly guards against None finish_reason overwriting a previously-captured meaningful value; the "stop" sentinel is preserved when no chunk provides a non-None finish_reason. |
| litellm/litellm_core_utils/streaming_handler.py | Adds chunk.get("usage") is not None to allow usage-only final chunks through when finish_reason is already set, preserving terminal metadata for downstream translators. |
| litellm/responses/litellm_completion_transformation/streaming_iterator.py | Moves chunk snapshot before the pending-events check in sync `__next__`, mirroring the async path and ensuring finish_reason is captured before output_item events are emitted. |
| litellm/responses/litellm_completion_transformation/transformation.py | Adds "refusal" to the "incomplete" branch of _map_chat_completion_finish_reason_to_responses_status; straightforward and correct. |
| tests/test_litellm/litellm_core_utils/test_streaming_handler.py | New regression test test_usage_only_chunk_not_dropped_when_finish_reason_already_set directly exercises the streaming handler fix with a mock usage-only chunk. |
| tests/test_litellm/responses/litellm_completion_transformation/test_litellm_completion_responses.py | Adds test_transform_chat_completion_response_status_with_refusal and test_emit_response_completed_uses_stream_finish_reason; both are focused, mock-only tests appropriate for this folder. |
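
The streaming_handler guard can be pictured roughly as follows. This is a hypothetical predicate over plain dicts, not the handler's real control flow; only the chunk.get("usage") is not None check is taken from the summary above, and the chunk shape is assumed.

```python
from typing import Any, Dict


def should_forward_chunk(chunk: Dict[str, Any]) -> bool:
    """Hypothetical check: forward chunks that carry text or usage data."""
    choices = chunk.get("choices") or []
    has_delta_content = any(
        (choice.get("delta") or {}).get("content") for choice in choices
    )
    # The check named in the PR: a usage-only chunk still counts as meaningful.
    has_usage = chunk.get("usage") is not None
    return has_delta_content or has_usage


usage_only_chunk = {"choices": [], "usage": {"prompt_tokens": 3}}
empty_chunk = {"choices": []}
text_chunk = {"choices": [{"delta": {"content": "hi"}}]}
```

Under this model a trailing usage-only chunk is forwarded rather than dropped, so whatever terminal metadata travels with it stays visible to downstream translators.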

Sequence Diagram

sequenceDiagram
    participant Anthropic as Anthropic Stream
    participant SH as streaming_handler
    participant SI as streaming_iterator
    participant CB as chunk_builder_utils
    participant TF as transformation

    Anthropic->>SH: chunk {stop_reason="refusal", usage={...}}
    Note over SH: is_finished=True → received_finish_reason="refusal"
    Anthropic->>SH: usage-only chunk {usage={...}, is_finished=False}
    Note over SH: NEW: usage chunk passes through (was dropped before)
    SH-->>SI: ModelResponseStream (finish_reason="refusal")
    Note over SI: _ensure_output_item_for_chunk queues OutputItemAddedEvent
    Note over SI: NEW: snapshot chunk FIRST so finish_reason is captured
    SI-->>SI: return OutputItemAddedEvent (pending)
    Note over CB: build_base_response iterates chunks
    Note over CB: NEW: skip None finish_reasons → finish_reason="refusal" preserved
    CB->>TF: finish_reason="refusal"
    Note over TF: NEW: "refusal" maps to "incomplete"
    TF-->>SI: status="incomplete"
    SI-->>SI: emit ResponseCompletedEvent(status="incomplete")
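
The snapshot ordering in the sync path can be sketched with a toy iterator. The class, its attributes, and the event strings below are hypothetical and do not mirror the real streaming_iterator; the sketch only shows why recording the chunk before an early return on pending events matters.

```python
from collections import deque
from typing import Any, Deque, Dict, Iterable, List


class SketchIterator:
    """Toy model of a sync iterator that may return a queued event early."""

    def __init__(self, chunks: Iterable[Dict[str, Any]], snapshot_first: bool):
        self._source = iter(chunks)
        self._pending: Deque[str] = deque()
        self._snapshot_first = snapshot_first
        self._item_added = False
        self.collected: List[Dict[str, Any]] = []

    def __iter__(self) -> "SketchIterator":
        return self

    def __next__(self) -> str:
        if self._pending:
            return self._pending.popleft()
        chunk = next(self._source)  # StopIteration ends the iteration
        if self._snapshot_first:
            self.collected.append(chunk)  # modelled fix: record before returning
        if not self._item_added:
            self._item_added = True
            self._pending.append("output_item.added")
        if self._pending:
            # Early return: in the old ordering the chunk is never recorded.
            return self._pending.popleft()
        if not self._snapshot_first:
            self.collected.append(chunk)
        return "delta"


terminal_chunk = {"finish_reason": "refusal"}
old_order = SketchIterator([terminal_chunk], snapshot_first=False)
old_events = list(old_order)
new_order = SketchIterator([terminal_chunk], snapshot_first=True)
new_events = list(new_order)
```

Both orderings emit the same event here, but only the snapshot-first variant retains the chunk that carries the finish_reason.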

Reviews (2): Last reviewed commit: "Merge pull request #25523 from BerriAI/m..."

Comment thread litellm/litellm_core_utils/streaming_chunk_builder_utils.py
@Sameerlite Sameerlite temporarily deployed to integration-postgres April 10, 2026 19:02 — with GitHub Actions Inactive
@krrish-berri-2 krrish-berri-2 changed the base branch from main to litellm_internal_staging_04_11_2026 April 11, 2026 15:47
@krrish-berri-2 krrish-berri-2 merged commit dc200c3 into litellm_internal_staging_04_11_2026 Apr 11, 2026
100 of 107 checks passed
@krrish-berri-2 krrish-berri-2 deleted the litellm_fix-refusal-status-streaming2 branch April 11, 2026 15:48