fix(bedrock): prevent negative streaming costs for start-only cache usage by Sameerlite · Pull Request #25846 · BerriAI/litellm

Sameerlite · 2026-04-16T07:38:45Z

Summary

merge cache usage from message_start onto buffered message_delta when message_stop omits cache fields in Bedrock Anthropic /v1/messages streams
add a defensive clamp in generic cost calc so derived text_tokens never goes negative on inconsistent usage payloads
add regression coverage asserting start-only cache streams reconstruct consistent usage and produce positive, exact cost

fixes LIT-2144

…usage

Bedrock /v1/messages streams can report cache tokens only on message_start while message_delta carries only uncached input tokens. Merge cache fields onto the final delta usage and clamp negative text-token remainders in cost calc to keep usage/cost consistent.

Made-with: Cursor
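To make the failure mode concrete, here is a minimal arithmetic sketch of how a negative text-token remainder can turn into a negative cost. It assumes a simplified cost model in which text tokens are derived as prompt tokens minus cached tokens; the function name and the per-token prices are hypothetical and are not taken from litellm or its pricing file.

```python
# Hypothetical per-token prices, chosen only for illustration.
INPUT_PRICE = 3.00 / 1_000_000
CACHE_READ_PRICE = 0.30 / 1_000_000


def naive_prompt_cost(prompt_tokens: int, cache_read_tokens: int) -> float:
    """Price cached tokens at the cache rate and the remainder at the input rate."""
    text_tokens = prompt_tokens - cache_read_tokens  # can go negative, no clamp here
    return text_tokens * INPUT_PRICE + cache_read_tokens * CACHE_READ_PRICE


# Consistent usage: prompt_tokens already includes the cached tokens.
consistent = naive_prompt_cost(prompt_tokens=2012, cache_read_tokens=2000)

# Inconsistent usage: prompt_tokens holds only the uncached input tokens,
# while the cache count was taken from a different stream event.
inconsistent = naive_prompt_cost(prompt_tokens=12, cache_read_tokens=2000)

print(f"consistent:   {consistent:.6f}")
print(f"inconsistent: {inconsistent:.6f}")
```

With these assumed numbers the second call is negative because the subtraction yields -1988 text tokens, which is the kind of remainder the clamp is meant to stop.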

greptile-apps · 2026-04-16T07:43:07Z

Greptile Summary

This PR fixes negative streaming costs on Bedrock Anthropic /v1/messages streams by (1) buffering message_delta, then merging cache usage fields from message_stop—and, when absent, from message_start—before yielding the delta, and (2) adding a defensive zero-clamp in the generic cost-calc path so derived text_tokens can never go negative. Regression tests cover both the "cache-on-start-only" and the normal "cache-on-stop" paths end-to-end.

Confidence Score: 5/5

Safe to merge; the fix is well-scoped and all new tests are mock-based with no real network calls.

All three changed files look correct. The streaming state machine properly handles both the cache-on-stop and cache-on-start-only Bedrock variants, the zero-clamp is a purely defensive guard, and the new tests provide solid regression coverage. The two open P2 concerns (missing warning log at the clamp site, overly strict cost assertion) were already raised in prior review threads and do not block merge.

No files require special attention beyond the already-discussed clamp warning log and exact cost assertion.

Important Files Changed

Filename	Overview
litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py	Adds `_merge_message_start_cache_into_delta_usage` and `_promote_message_stop_usage` to correctly merge cache-usage fields from message_start/message_stop onto message_delta before yielding; logic is sound for all described scenarios.
litellm/litellm_core_utils/llm_cost_calc/utils.py	Adds a defensive zero-clamp on derived `text_tokens` after the double-counting correction; prevents negative costs from inconsistent usage payloads.
tests/test_litellm/llms/bedrock/messages/invoke_transformations/test_anthropic_claude3_transformation.py	Adds mock-based regression tests covering start-only cache, stop-promoted cache, and end-to-end cost; all tests avoid real network calls as required. Exact-equality cost assertion (abs=1e-9) was flagged in a prior review thread.

Sequence Diagram

sequenceDiagram
    participant B as Bedrock Stream
    participant P as _promote_message_stop_usage
    participant H as bedrock_sse_wrapper / logging

    B->>P: message_start with cache usage fields
    Note over P: snapshot start_usage
    P->>H: message_start (pass-through)

    B->>P: message_delta with input and output tokens
    Note over P: buffer as pending_delta, do not yield yet

    B->>P: message_stop with optional cache fields
    Note over P: merge stop cache fields onto delta_usage
    Note over P: if still absent, merge from start_usage snapshot
    P->>H: message_delta enriched with cache usage
    P->>H: message_stop (pass-through)

    Note over H: cost calc sees consistent usage, text_tokens clamped to zero or above
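The flow in the diagram can be sketched as a small generator. This is an illustration under stated assumptions, not the PR's implementation: events are plain dicts, message_start carries usage under `message.usage`, message_delta and message_stop carry it under `usage`, and the names `promote_cache_usage` and `merge_missing_cache_fields` are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, Optional

Event = Dict[str, Any]

# Cache counters as named in Anthropic-style usage payloads.
CACHE_KEYS = ("cache_creation_input_tokens", "cache_read_input_tokens")


def merge_missing_cache_fields(target: Dict[str, Any], source: Dict[str, Any]) -> None:
    """Copy cache counters from source onto target only where target lacks them."""
    for key in CACHE_KEYS:
        if target.get(key) is None and source.get(key) is not None:
            target[key] = source[key]


def promote_cache_usage(events: Iterable[Event]) -> Iterator[Event]:
    """Buffer message_delta until message_stop, then enrich its usage and yield both."""
    start_usage: Dict[str, Any] = {}
    pending_delta: Optional[Event] = None

    for event in events:
        kind = event.get("type")
        if kind == "message_start":
            # Snapshot the usage reported at the start of the stream.
            start_usage = dict(event.get("message", {}).get("usage", {}))
            yield event
        elif kind == "message_delta":
            if pending_delta is not None:
                yield pending_delta  # flush an earlier delta unchanged
            pending_delta = event  # hold back until message_stop arrives
        elif kind == "message_stop" and pending_delta is not None:
            delta_usage = pending_delta.setdefault("usage", {})
            # Stop-level cache fields win; the start snapshot fills what is still missing.
            merge_missing_cache_fields(delta_usage, event.get("usage", {}))
            merge_missing_cache_fields(delta_usage, start_usage)
            yield pending_delta
            pending_delta = None
            yield event
        else:
            yield event

    if pending_delta is not None:
        yield pending_delta  # stream ended without message_stop
```

The sketch mutates the buffered delta in place and lets any event that arrives between the delta and the stop overtake the buffered delta; a production version would need to decide whether either behavior is acceptable.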

Reviews (2): Last reviewed commit: "fix(mypy): cast msg and pending_delta in..."

greptile-apps · 2026-04-16T07:43:10Z

+        custom_llm_provider="bedrock",
+    )
+    assert cost > 0
+    assert cost == pytest.approx(0.0093951, rel=0, abs=1e-9)


Overly strict cost assertion will break on pricing updates

rel=0, abs=1e-9 is effectively exact equality — any change to the bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0 entry in model_prices_and_context_window.json will fail this assertion regardless of whether the fix under test is still correct. The regression guard really only needs to confirm the cost is positive and in the right ballpark.

Suggested change

-    assert cost == pytest.approx(0.0093951, rel=0, abs=1e-9)
+    assert cost == pytest.approx(0.0093951, rel=1e-5)
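The difference between the two tolerances can be worked through in plain Python. The helper below is a stand-in that mirrors pytest's documented rule that, when both tolerances are given, the comparison passes if the difference is within the larger of `rel * |expected|` and `abs`; the name `within` and the drift values are assumptions made for illustration.

```python
def within(actual: float, expected: float, rel: float, abs_tol: float) -> bool:
    """Stand-in for the tolerance rule: allow the larger of the two bounds."""
    tolerance = max(rel * abs(expected), abs_tol)
    return abs(actual - expected) <= tolerance


expected = 0.0093951

# A tiny drift, e.g. from a changed rounding step (hypothetical value).
small_drift = expected + 5e-8
strict_ok = within(small_drift, expected, rel=0.0, abs_tol=1e-9)
relaxed_ok = within(small_drift, expected, rel=1e-5, abs_tol=0.0)

# A 1% price change (hypothetical) is far outside either tolerance.
price_change = expected * 1.01
price_change_ok = within(price_change, expected, rel=1e-5, abs_tol=0.0)

print(strict_ok, relaxed_ok, price_change_ok)
```

Under this rule the suggested `rel=1e-5` tolerates small numerical drift but would still fail on a real pricing update, so a looser relative tolerance or a separate `cost > 0` check may be closer to the "right ballpark" intent.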

greptile-apps · 2026-04-16T07:43:11Z

+        # Clamp to zero: inconsistent streaming usage 
+        if text_tokens < 0:
+            text_tokens = 0
        prompt_tokens_details["text_tokens"] = text_tokens


Silent clamp swallows inconsistencies without observability

Clamping to 0 is the right outcome (negative cost is worse), but when cache_hit > prompt_tokens the root cause is a signal of upstream usage-data inconsistency that is now silently discarded. A verbose_logger.warning(...) at the clamp site would make future occurrences diagnosable without changing behavior.
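A clamp that also logs might look like the sketch below. It uses the standard library `logging` module as a stand-in for the project's `verbose_logger`, and the function name, parameter names, and message wording are hypothetical.

```python
import logging

# Stand-in logger; the review suggests the project's verbose_logger instead.
logger = logging.getLogger("cost_calc_sketch")


def derive_text_tokens(prompt_tokens: int, cached_tokens: int) -> int:
    """Return prompt tokens minus cached tokens, clamped at zero with a warning."""
    text_tokens = prompt_tokens - cached_tokens
    if text_tokens < 0:
        # Keep the clamp, but leave a trace so the upstream inconsistency is diagnosable.
        logger.warning(
            "Inconsistent usage: cached_tokens=%d exceeds prompt_tokens=%d; "
            "clamping text_tokens to 0",
            cached_tokens,
            prompt_tokens,
        )
        text_tokens = 0
    return text_tokens
```

Behavior is unchanged relative to a silent clamp; the only addition is the log record emitted on the inconsistent path.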

fix(mypy): cast msg and pending_delta in _promote_message_stop_usage …

6bf90fb

…to resolve union-attr and assignment errors

ishaan-berri approved these changes Apr 18, 2026

ishaan-berri merged commit f476447 into litellm_internal_staging Apr 18, 2026
93 of 98 checks passed

ishaan-berri deleted the litellm_bedrock_cache_start_negative_cost branch April 18, 2026 18:30

ishaan-berri merged 2 commits into litellm_internal_staging from litellm_bedrock_cache_start_negative_cost
