feat(dashscope): preserve cache_control for explicit prompt caching (#25331)
Greptile Summary

This PR adds support for explicit prompt caching in the DashScope provider by overriding the method that strips cache_control fields, so those fields are preserved in outgoing requests. Key changes are summarized in the file table below.
Confidence Score: 5/5

Safe to merge: a minimal, targeted override following an established pattern, with appropriate unit test coverage. The change is a one-method override that mirrors the exact same implementation already used by the ZAI, MiniMax, and Databricks providers. It is backward-compatible (no behavior change when cache_control is absent). No files require special attention.
| Filename | Overview |
|---|---|
| litellm/llms/dashscope/chat/transformation.py | Adds remove_cache_control_flag_from_messages_and_tools override to preserve cache_control fields; two imports from the same module on separate lines (already flagged in a prior review thread). |
| tests/test_litellm/llms/dashscope/test_dashscope_chat_transformation.py | Adds two pure unit tests that directly call remove_cache_control_flag_from_messages_and_tools to verify cache_control is preserved in messages and tools; no network calls, no regressions. |
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Request with cache_control fields] --> B{Provider?}
    B -->|OpenAIGPTConfig default| C[remove_cache_control_flag_from_messages_and_tools]
    C --> D[Strip cache_control from messages & tools]
    B -->|DashScopeChatConfig override| E[remove_cache_control_flag_from_messages_and_tools]
    E --> F[Return messages & tools unchanged]
    D --> G[Request sent without cache_control]
    F --> H[Request sent with cache_control preserved]
```
```python
from litellm.types.llms.openai import ChatCompletionToolParam
# ...
from litellm.secret_managers.main import get_secret_str
from litellm.types.llms.openai import AllMessageValues
```
**Duplicate import from the same module**

`ChatCompletionToolParam` and `AllMessageValues` are both imported from `litellm.types.llms.openai` in separate statements. Per the project's style guide, these should be merged into a single import.
Suggested change:

```diff
- from litellm.types.llms.openai import ChatCompletionToolParam
- from litellm.secret_managers.main import get_secret_str
- from litellm.types.llms.openai import AllMessageValues
+ from litellm.types.llms.openai import AllMessageValues, ChatCompletionToolParam
+ from litellm.secret_managers.main import get_secret_str
```
Context Used: CLAUDE.md (source)
Codecov Report: ✅ All modified and coverable lines are covered by tests.
DashScope inherits OpenAIGPTConfig, which strips cache_control from messages and tools by default. Override remove_cache_control_flag_from_messages_and_tools() to preserve cache_control, following the same pattern used by ZAI, MiniMax, and Databricks.

Verified through 10-round multi-turn conversation tests:

- Explicit caching works correctly: cached_tokens grows each round from R4 onwards, with cache_creation_tokens reported on first cache build.
- Implicit caching is not affected: models that rely on implicit prefix-matching caching produce identical cached_tokens with and without this change, confirmed by comparing results against both the reverted codebase and direct API calls bypassing litellm.
- No errors or regressions observed on any model, including those that do not support explicit caching; the DashScope API silently ignores unrecognized cache_control fields.

Fixes BerriAI#25330

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from b6319f2 to 45f155f.
Merged 4e32479 into BerriAI:litellm_oss_staging_04_08_2026.
Relevant issues
Fixes #25330
Pre-Submission checklist
- [x] Added testing in the `tests/test_litellm/` directory (adding at least 1 test is a hard requirement; see details)
- [x] PR passes `make test-unit`
- [x] Reviewed with `@greptileai` and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type
🐛 Bug Fix
Changes
The DashScope provider inherits `OpenAIGPTConfig`, which strips all `cache_control` fields from messages and tools by default via `remove_cache_control_flag_from_messages_and_tools()`. This prevents users from using explicit prompt caching with DashScope-hosted models that support it.

This PR overrides `remove_cache_control_flag_from_messages_and_tools()` in `DashScopeChatConfig` to preserve `cache_control` fields, following the exact same pattern already used by:

- ZAI (`litellm/llms/zai/chat/transformation.py`)
- MiniMax (`litellm/llms/minimax/chat/transformation.py`)
- Databricks (`litellm/llms/databricks/chat/transformation.py`)

This change is safe for models that don't use `cache_control`: if no `cache_control` field is present, the behavior is identical to before.

Files changed
- `litellm/llms/dashscope/chat/transformation.py`: added override method
- `tests/test_litellm/llms/dashscope/test_dashscope_chat_transformation.py`: added 2 tests for cache_control preservation in messages and tools

Verification
Beyond the unit tests, this change was validated with live 10-round multi-turn conversation tests against the DashScope API:
- Explicit caching works correctly: with `cache_control` markers on user messages, `cached_tokens` grows each round as conversation history accumulates, and `cache_creation_input_tokens` is reported on the initial cache build.
- Implicit caching is not affected: models that rely on implicit prefix-matching caching produce identical `cached_tokens` values with and without this change, confirmed by running the same conversation against both the reverted codebase and the patched codebase, and against DashScope's `/compatible-mode/v1/chat/completions` endpoint directly (bypassing litellm). The `cached_tokens` matched on every round.
- No regressions on non-caching models: with `cache_control` present in the request, the DashScope API silently ignores the unrecognized field; no errors or behavioral changes were observed.