fix(responses): map refusal stop_reason to incomplete status in streaming #25498
…ming

Fixes streaming responses API translation where Anthropic's stop_reason="refusal" was incorrectly translated to status="completed" instead of "incomplete".

Root cause: build_base_response was unconditionally overwriting finish_reason with None from later chunks, losing the terminal content_filter value.

Changes:
- streaming_chunk_builder_utils: skip None finish_reason values in build_base_response
- streaming_iterator: snapshot chunks before returning pending events (sync path)
- streaming_handler: treat usage-only chunks as meaningful content
- transformation: map finish_reason=refusal to status=incomplete
- tests: add regression tests for refusal handling

Made-with: Cursor
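The root cause can be illustrated with a minimal sketch. Both helper functions below are hypothetical and are not the actual build_base_response code; they only model folding per-chunk finish_reason values into one final value, starting from the "stop" default.

```python
from typing import Iterable, Optional


def fold_finish_reason_buggy(reasons: Iterable[Optional[str]]) -> Optional[str]:
    """Hypothetical model of the old behaviour: every chunk overwrites the value."""
    finish_reason: Optional[str] = "stop"
    for reason in reasons:
        finish_reason = reason  # a trailing None erases the terminal value
    return finish_reason


def fold_finish_reason_fixed(reasons: Iterable[Optional[str]]) -> str:
    """Hypothetical model of the fix: None values are skipped."""
    finish_reason = "stop"
    for reason in reasons:
        if reason is not None:
            finish_reason = reason  # the last non-None value wins
    return finish_reason


# A stream whose terminal reason is followed by a chunk carrying None.
chunk_reasons = [None, "content_filter", None]
print(fold_finish_reason_buggy(chunk_reasons))  # None
print(fold_finish_reason_fixed(chunk_reasons))  # content_filter
```

When no chunk carries a non-None value, the fixed variant still returns the "stop" default.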
Greptile Summary
This PR fixes a bug in the Responses API streaming translation where Anthropic's stop_reason="refusal" was translated to status="completed" instead of "incomplete".

Confidence Score: 5/5
Safe to merge: all changes are targeted bug fixes with regression tests and no backwards-incompatible behaviour changes. All four code changes are narrowly scoped corrections to an existing translation pipeline, each with a corresponding unit test. No security concerns, no schema changes, no breaking API changes. Remaining findings are P2 at most. No files require special attention.
| Filename | Overview |
|---|---|
| litellm/litellm_core_utils/streaming_chunk_builder_utils.py | Correctly guards against None finish_reason overwriting a previously-captured meaningful value; the "stop" sentinel is preserved when no chunk provides a non-None finish_reason. |
| litellm/litellm_core_utils/streaming_handler.py | Adds chunk.get("usage") is not None to allow usage-only final chunks through when finish_reason is already set, preserving terminal metadata for downstream translators. |
| litellm/responses/litellm_completion_transformation/streaming_iterator.py | Moves chunk snapshot before the pending-events check in sync __next__, mirroring the async path and ensuring finish_reason is captured before output_item events are emitted. |
| litellm/responses/litellm_completion_transformation/transformation.py | Adds "refusal" to the "incomplete" branch of _map_chat_completion_finish_reason_to_responses_status; straightforward and correct. |
| tests/test_litellm/litellm_core_utils/test_streaming_handler.py | New regression test test_usage_only_chunk_not_dropped_when_finish_reason_already_set directly exercises the streaming handler fix with a mock usage-only chunk. |
| tests/test_litellm/responses/litellm_completion_transformation/test_litellm_completion_responses.py | Adds test_transform_chat_completion_response_status_with_refusal and test_emit_response_completed_uses_stream_finish_reason; both are focused, mock-only tests appropriate for this folder. |
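As a rough illustration of the transformation.py change, the mapping can be sketched as below. The function name and the set are hypothetical, not the real _map_chat_completion_finish_reason_to_responses_status; only the "refusal" member is confirmed by this PR, and the other members of the "incomplete" set are assumptions made for the example.

```python
from typing import Optional

# Assumption: only "refusal" is confirmed by this PR. "content_filter" and
# "length" are illustrative members of the "incomplete" branch.
INCOMPLETE_FINISH_REASONS = frozenset({"refusal", "content_filter", "length"})


def map_finish_reason_to_status(finish_reason: Optional[str]) -> str:
    """Hypothetical sketch: translate a chat-completion finish_reason
    into a Responses API status string."""
    if finish_reason in INCOMPLETE_FINISH_REASONS:
        return "incomplete"
    return "completed"


print(map_finish_reason_to_status("refusal"))  # incomplete
print(map_finish_reason_to_status("stop"))     # completed
```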
Sequence Diagram
sequenceDiagram
participant Anthropic as Anthropic Stream
participant SH as streaming_handler
participant SI as streaming_iterator
participant CB as chunk_builder_utils
participant TF as transformation
Anthropic->>SH: chunk {stop_reason="refusal", usage={...}}
Note over SH: is_finished=True → received_finish_reason="refusal"
Anthropic->>SH: usage-only chunk {usage={...}, is_finished=False}
Note over SH: NEW: usage chunk passes through (was dropped before)
SH-->>SI: ModelResponseStream (finish_reason="refusal")
Note over SI: _ensure_output_item_for_chunk queues OutputItemAddedEvent
Note over SI: NEW: snapshot chunk FIRST so finish_reason is captured
SI-->>SI: return OutputItemAddedEvent (pending)
Note over CB: build_base_response iterates chunks
Note over CB: NEW: skip None finish_reasons → finish_reason="refusal" preserved
CB->>TF: finish_reason="refusal"
Note over TF: NEW: "refusal" maps to "incomplete"
TF-->>SI: status="incomplete"
SI-->>SI: emit ResponseCompletedEvent(status="incomplete")
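The ordering issue in the sync path of the diagram can be sketched with a toy iterator. Everything here is hypothetical (class name, event strings, the snapshot_first flag) and it models only when a chunk is snapshotted relative to the early return for pending events, not the real streaming_iterator.

```python
from collections import deque
from typing import Any, Deque, Dict, Iterable, List


class SketchIterator:
    """Toy model: snapshot a chunk either before or after the pending-events check."""

    def __init__(self, chunks: Iterable[Dict[str, Any]], snapshot_first: bool) -> None:
        self._source = iter(chunks)
        self._snapshot_first = snapshot_first
        self._pending: Deque[str] = deque()
        self._item_added = False
        self.collected: List[Dict[str, Any]] = []  # used later to build the final response

    def __iter__(self) -> "SketchIterator":
        return self

    def __next__(self) -> str:
        if self._pending:
            return self._pending.popleft()
        chunk = next(self._source)
        if self._snapshot_first:
            self.collected.append(chunk)  # fixed order: capture before any early return
        if not self._item_added:
            self._item_added = True
            self._pending.append("output_item.added")
        if self._pending:
            return self._pending.popleft()  # early return for the queued event
        if not self._snapshot_first:
            self.collected.append(chunk)  # old order: skipped when returning early
        return "delta"


chunks = [{"finish_reason": "refusal"}]
old = SketchIterator(chunks, snapshot_first=False)
new = SketchIterator(chunks, snapshot_first=True)
list(old)
list(new)
print(old.collected)  # []
print(new.collected)  # [{'finish_reason': 'refusal'}]
```

With the old order, the only chunk carrying the finish_reason never reaches the collected list, so a later response build has nothing to read it from.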
Reviews (2): Last reviewed commit: "Merge pull request #25523 from BerriAI/m..."
merge main
dc200c3
into
litellm_internal_staging_04_11_2026
Summary
Fixes streaming responses API translation where Anthropic's stop_reason="refusal" was incorrectly translated to status="completed" instead of "incomplete".

Root Cause
build_base_response in streaming_chunk_builder_utils.py was unconditionally overwriting finish_reason with None from later chunks, losing the terminal content_filter value that was correctly mapped from refusal.

Changes
- Skip None finish_reason values in build_base_response so the last meaningful finish_reason wins
- Map finish_reason="refusal" to status="incomplete" in responses API

Test Plan
- test_transform_chat_completion_response_status_with_refusal
- test_usage_only_chunk_not_dropped_when_finish_reason_already_set
- test_emit_response_completed_uses_stream_finish_reason to validate natural flow
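The behaviour that test_usage_only_chunk_not_dropped_when_finish_reason_already_set guards can be approximated with a small predicate. This is a sketch under assumptions: the function name and the plain-dict chunk shape are hypothetical, and the real streaming_handler check has more conditions than shown here.

```python
from typing import Any, Dict


def is_meaningful_chunk(chunk: Dict[str, Any]) -> bool:
    """Hypothetical predicate: a chunk is worth forwarding if it carries
    content or, after the fix, usage data."""
    delta = chunk.get("delta") or {}
    has_content = bool(delta.get("content"))
    has_usage = chunk.get("usage") is not None  # the check this PR adds
    return has_content or has_usage


usage_only = {"delta": {"content": None}, "usage": {"prompt_tokens": 3, "completion_tokens": 0}}
empty = {"delta": {}}
print(is_meaningful_chunk(usage_only))  # True
print(is_meaningful_chunk(empty))       # False
```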