
feat(dashscope): preserve cache_control for explicit prompt caching #25331

Merged
krrish-berri-2 merged 1 commit into BerriAI:litellm_oss_staging_04_08_2026 from silencedoctor:feat/dashscope-preserve-cache-control
Apr 9, 2026

Conversation

@silencedoctor
Contributor

@silencedoctor silencedoctor commented Apr 8, 2026

Relevant issues

Fixes #25330

Pre-Submission checklist

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

The DashScope provider inherits OpenAIGPTConfig, which strips all cache_control fields from messages and tools by default via remove_cache_control_flag_from_messages_and_tools(). This prevents users from using explicit prompt caching with DashScope-hosted models that support it.

This PR overrides remove_cache_control_flag_from_messages_and_tools() in DashScopeChatConfig to preserve cache_control fields, following the exact same pattern already used by:

  • ZAI (litellm/llms/zai/chat/transformation.py)
  • MiniMax (litellm/llms/minimax/chat/transformation.py)
  • Databricks (litellm/llms/databricks/chat/transformation.py)

This change is safe for models that don't use cache_control — if no cache_control field is present, the behavior is identical to before.
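The override pattern described above can be sketched in isolation. This is a simplified stand-in, not the merged diff: the class names below are invented for illustration, the method signature is assumed, and the base-class stripping logic only approximates what OpenAIGPTConfig actually does.

```python
from typing import Any, Dict, List, Optional, Tuple

Message = Dict[str, Any]
Tool = Dict[str, Any]


class OpenAILikeConfig:
    """Stand-in for OpenAIGPTConfig's default behavior: strip cache_control."""

    def remove_cache_control_flag_from_messages_and_tools(
        self,
        model: str,
        messages: List[Message],
        tools: Optional[List[Tool]] = None,
    ) -> Tuple[List[Message], Optional[List[Tool]]]:
        # Default: drop cache_control from every message and tool.
        for message in messages:
            message.pop("cache_control", None)
        if tools is not None:
            for tool in tools:
                tool.pop("cache_control", None)
        return messages, tools


class DashScopeLikeConfig(OpenAILikeConfig):
    """DashScope-style override: return inputs unchanged so cache_control survives."""

    def remove_cache_control_flag_from_messages_and_tools(
        self,
        model: str,
        messages: List[Message],
        tools: Optional[List[Tool]] = None,
    ) -> Tuple[List[Message], Optional[List[Tool]]]:
        return messages, tools


if __name__ == "__main__":
    msgs = [{"role": "user", "content": "hi", "cache_control": {"type": "ephemeral"}}]
    kept, _ = DashScopeLikeConfig().remove_cache_control_flag_from_messages_and_tools("m", msgs)
    print("cache_control" in kept[0])  # True: the marker is preserved
```

Because the override returns its inputs untouched, requests without any cache_control field are byte-for-byte identical to the previous behavior, which is why the change is safe for non-caching models.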

Files changed

  • litellm/llms/dashscope/chat/transformation.py — Added override method
  • tests/test_litellm/llms/dashscope/test_dashscope_chat_transformation.py — Added 2 tests for cache_control preservation in messages and tools

Verification

Beyond the unit tests, this change was validated with live 10-round multi-turn conversation tests against the DashScope API:

Explicit caching works correctly:

  • With cache_control markers on user messages, cached_tokens grows each round as conversation history accumulates, and cache_creation_input_tokens is reported on initial cache build.
  • Cache hits begin from the first round after prompt tokens exceed the 1024-token threshold.
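A minimal sketch of the message shape such a test round might send. The `{"type": "ephemeral"}` marker follows the Anthropic-style cache_control convention litellm uses for other providers; whether DashScope expects exactly this shape is an assumption here, and the text content is a placeholder.

```python
# Hypothetical request message carrying an explicit cache_control marker.
message_with_cache_marker = {
    "role": "user",
    "content": [
        {
            "type": "text",
            # Placeholder: a shared prefix long enough to cross the
            # 1024-token caching threshold mentioned above.
            "text": "Long shared context prefix (>1024 tokens)...",
            "cache_control": {"type": "ephemeral"},
        }
    ],
}
```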

Implicit caching is not affected:

  • Models relying on implicit prefix-matching caching produce identical cached_tokens values with and without this change, confirmed by running the same conversation against both the reverted codebase and the patched codebase.
  • Results were further cross-validated by comparing litellm output against direct API calls (bypassing litellm entirely via raw HTTP requests to the same /compatible-mode/v1/chat/completions endpoint). The cached_tokens matched on every round.

No regressions on non-caching models:

  • Models that do not support explicit caching were tested with cache_control present in the request. The DashScope API silently ignores the unrecognized field — no errors or behavioral changes observed.

@vercel

vercel bot commented Apr 8, 2026

The latest updates on your projects.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Apr 8, 2026 0:27am |


@greptile-apps
Contributor

greptile-apps bot commented Apr 8, 2026

Greptile Summary

This PR adds support for explicit prompt caching in the DashScope provider by overriding remove_cache_control_flag_from_messages_and_tools() in DashScopeChatConfig to preserve cache_control fields instead of stripping them. The fix follows the exact same pattern already established by ZAI, MiniMax, and Databricks providers, making it a minimal, targeted, and low-risk change.

Key changes:

  • Added remove_cache_control_flag_from_messages_and_tools() override in DashScopeChatConfig that returns messages and tools unchanged, preserving cache_control fields for DashScope-hosted models that support prompt caching.
  • Added 2 pure unit tests covering cache_control preservation in both messages and tools, with no real network calls (compliant with the tests/test_litellm/ mock-only policy).

Confidence Score: 5/5

Safe to merge — minimal, targeted override following an established pattern with appropriate unit test coverage.

The change is a one-method override that mirrors the exact same implementation already used by ZAI, MiniMax, and Databricks providers. It is backward-compatible (no cache_control = identical behavior to before), the tests are pure unit tests with no network calls, and no custom rules are violated. All remaining findings are P2 or already captured in prior review threads.

No files require special attention.

Vulnerabilities

No security concerns identified.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/llms/dashscope/chat/transformation.py | Adds remove_cache_control_flag_from_messages_and_tools override to preserve cache_control fields; two imports from the same module on separate lines (already flagged in a prior review thread). |
| tests/test_litellm/llms/dashscope/test_dashscope_chat_transformation.py | Adds two pure unit tests that directly call remove_cache_control_flag_from_messages_and_tools to verify cache_control is preserved in messages and tools; no network calls, no regressions. |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Request with cache_control fields] --> B{Provider?}
    B -->|OpenAIGPTConfig default| C[remove_cache_control_flag_from_messages_and_tools]
    C --> D[Strip cache_control from messages & tools]
    B -->|DashScopeChatConfig override| E[remove_cache_control_flag_from_messages_and_tools]
    E --> F[Return messages & tools unchanged]
    D --> G[Request sent without cache_control]
    F --> H[Request sent with cache_control preserved]
```


Comment on lines +7 to 10
```python
from litellm.types.llms.openai import ChatCompletionToolParam

from litellm.secret_managers.main import get_secret_str
from litellm.types.llms.openai import AllMessageValues
```
Contributor


P2 Duplicate import from the same module

ChatCompletionToolParam and AllMessageValues are both imported from litellm.types.llms.openai in separate statements. Per the project's style guide, these should be merged into a single import.

Suggested change:

```diff
-from litellm.types.llms.openai import ChatCompletionToolParam
-from litellm.secret_managers.main import get_secret_str
-from litellm.types.llms.openai import AllMessageValues
+from litellm.types.llms.openai import AllMessageValues, ChatCompletionToolParam
+from litellm.secret_managers.main import get_secret_str
```

Context Used: CLAUDE.md (source)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@codspeed-hq
Contributor

codspeed-hq bot commented Apr 8, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing silencedoctor:feat/dashscope-preserve-cache-control (45f155f) with main (62757ff)

Open in CodSpeed

@codecov

codecov bot commented Apr 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


DashScope inherits OpenAIGPTConfig which strips cache_control from
messages and tools by default. Override remove_cache_control_flag_from_messages_and_tools()
to preserve cache_control, following the same pattern used by ZAI, MiniMax, and Databricks.

Verified through 10-round multi-turn conversation tests:
- Explicit caching works correctly: cached_tokens grows each round from R4 onwards,
  with cache_creation_tokens reported on first cache build.
- Implicit caching is not affected: models that rely on implicit prefix-matching caching
  produce identical cached_tokens with and without this change, confirmed by comparing
  results against both the reverted codebase and direct API calls bypassing litellm.
- No errors or regressions observed on any model, including those that do not support
  explicit caching — the DashScope API silently ignores unrecognized cache_control fields.

Fixes BerriAI#25330

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@silencedoctor silencedoctor force-pushed the feat/dashscope-preserve-cache-control branch from b6319f2 to 45f155f on April 8, 2026 12:21

@krrish-berri-2 krrish-berri-2 changed the base branch from main to litellm_oss_staging_04_08_2026 April 9, 2026 04:31
@krrish-berri-2 krrish-berri-2 merged commit 4e32479 into BerriAI:litellm_oss_staging_04_08_2026 Apr 9, 2026
49 of 51 checks passed


Development

Successfully merging this pull request may close these issues.

[Feature]: DashScope provider should preserve cache_control for explicit prompt caching
