fix(gemini): resolve image token undercounting in usage metadata (#22608)
Conversation
Greptile Summary

This PR fixes image token undercounting in Gemini/Vertex AI usage metadata by changing token assignment from overwriting (`=`) to accumulating (`+=`).
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/llms/gemini/image_generation/transformation.py | Changed token assignment from = to += for accumulation, added token_count key fallback and .upper() normalization. Clean and correct fix. |
| litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py | Added _get_token_count helper for dual-key lookup, updated all four token detail loops to accumulate instead of overwrite. Logic is correct and consistent. |
| tests/llm_translation/test_gemini_image_usage.py | Added regression test for image token accumulation in GoogleImageGenConfig. No network calls, uses local model cost map. Covers the core fix well. |
| tests/test_litellm/llms/vertex_ai/gemini/test_vertex_and_google_ai_studio_gemini.py | Added test for VertexGeminiConfig._calculate_usage with duplicate modality entries. Tests both tokenCount and token_count keys, and verifies cache subtraction logic. |
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[API Response with usageMetadata] --> B{Parse token details loops}
    B --> C[responseTokensDetails]
    B --> D[candidatesTokensDetails]
    B --> E[promptTokensDetails]
    B --> F[cacheTokensDetails]
    C --> G[_get_token_count: try tokenCount then token_count]
    D --> G
    E --> G
    F --> G
    G --> H[Normalize modality to UPPERCASE]
    H --> I{Accumulate tokens by modality}
    I --> |TEXT| J["field = (field or 0) + token_count"]
    I --> |IMAGE| J
    I --> |AUDIO| J
    I --> |VIDEO| J
    E --> K[prompt_*_tokens accumulated]
    F --> L[cached_*_tokens accumulated]
    K --> M[Subtract cached from prompt per modality]
    L --> M
    M --> N[Final Usage object]
    C --> N
    D --> N
```
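The final "subtract cached from prompt per modality" step in the flowchart can be sketched as below. The dictionary shape is illustrative, not the exact litellm `Usage` attributes.

```python
def subtract_cached_per_modality(prompt: dict, cached: dict) -> dict:
    """Return net (non-cached) prompt tokens per modality, treating
    missing or None entries as 0 so the arithmetic never sees None."""
    return {
        modality: (prompt.get(modality) or 0) - (cached.get(modality) or 0)
        for modality in prompt
    }


prompt_totals = {"TEXT": 10, "IMAGE": 190}
cached_totals = {"IMAGE": 40}
print(subtract_cached_per_modality(prompt_totals, cached_totals))
# {'TEXT': 10, 'IMAGE': 150}
```

Subtracting the cached portion keeps cached tokens from being billed twice: they are reported separately rather than folded into the regular prompt total.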
Last reviewed commit: de63dd8
```python
# Imports inferred from the files changed in this PR (not shown in the diff).
import os

import pytest

import litellm
from litellm.llms.gemini.image_generation.transformation import GoogleImageGenConfig
from litellm.types.utils import ImageObject, ImageResponse


def test_gemini_image_generation_accumulates_multiple_image_prompt_token_details():
    """
    Regression test: promptTokensDetails can include multiple IMAGE entries.
    These must be accumulated instead of overwritten.
    """
    previous_local_model_cost_map = os.environ.get("LITELLM_LOCAL_MODEL_COST_MAP")
    previous_model_cost = litellm.model_cost
    try:
        os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] = "True"
        litellm.model_cost = litellm.get_model_cost_map(url="")

        model = "gemini/gemini-3-pro-image-preview"
        config = GoogleImageGenConfig()

        usage_metadata = {
            "promptTokenCount": 200,
            "candidatesTokenCount": 0,
            "totalTokenCount": 200,
            "promptTokensDetails": [
                {"modality": "TEXT", "tokenCount": 10},
                {"modality": "IMAGE", "tokenCount": 90},
                {"modality": "IMAGE", "tokenCount": 100},
            ],
        }

        parsed_usage = config._transform_image_usage(usage_metadata)
        image_response = ImageResponse(
            data=[ImageObject(b64_json="fake_image_data")],
            usage=parsed_usage,
        )

        observed_cost = litellm.completion_cost(
            completion_response=image_response,
            model=model,
            custom_llm_provider="gemini",
        )

        model_info = litellm.get_model_info(model=model, custom_llm_provider="gemini")
        expected_image_tokens = 190
        expected_total_prompt_tokens = 200
        expected_prompt_cost = (
            expected_total_prompt_tokens * model_info["input_cost_per_token"]
        )

        assert parsed_usage.input_tokens_details.image_tokens == expected_image_tokens
        assert parsed_usage.input_tokens_details.text_tokens == 10
        assert observed_cost == pytest.approx(expected_prompt_cost, rel=1e-12)
    finally:
        if previous_local_model_cost_map is None:
            os.environ.pop("LITELLM_LOCAL_MODEL_COST_MAP", None)
        else:
            os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] = previous_local_model_cost_map
        litellm.model_cost = previous_model_cost
```
Missing test for vertex _calculate_usage accumulation
This test covers the GoogleImageGenConfig._transform_image_usage path (image generation), but the PR also changed token accumulation logic in VertexGeminiConfig._calculate_usage across four loops (responseTokensDetails, candidatesTokensDetails, promptTokensDetails, cacheTokensDetails). None of those accumulation changes are covered by a test with duplicate modality entries.
Consider adding a test in tests/test_litellm/llms/vertex_ai/gemini/test_vertex_and_google_ai_studio_gemini.py that passes promptTokensDetails (or candidatesTokensDetails) with multiple entries of the same modality to verify the accumulation works there too.
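Since the exact signature of `VertexGeminiConfig._calculate_usage` lives in the litellm source, here is only a self-contained sketch of the payload shape such a test would feed in, together with a local reference computation of the totals it should assert. The helper `expected_image_prompt_tokens` is hypothetical, defined here purely for illustration.

```python
# usageMetadata with duplicate IMAGE entries across both key spellings,
# plus cache details, matching the scenario the review comment describes.
usage_metadata = {
    "promptTokenCount": 200,
    "candidatesTokenCount": 120,
    "totalTokenCount": 320,
    "promptTokensDetails": [
        {"modality": "IMAGE", "tokenCount": 90},
        {"modality": "IMAGE", "token_count": 100},  # snake_case duplicate
        {"modality": "TEXT", "tokenCount": 10},
    ],
    "cacheTokensDetails": [
        {"modality": "IMAGE", "tokenCount": 40},
    ],
}


def expected_image_prompt_tokens(meta: dict) -> int:
    """Reference computation: sum IMAGE entries across both key spellings."""
    total = 0
    for detail in meta.get("promptTokensDetails", []):
        if str(detail.get("modality", "")).upper() == "IMAGE":
            total += int(detail.get("tokenCount", detail.get("token_count", 0)) or 0)
    return total


# With accumulation, both IMAGE entries count: 90 + 100 = 190.
print(expected_image_prompt_tokens(usage_metadata))  # 190
```

A real test would pass a response carrying this `usageMetadata` through `_calculate_usage` and assert the resulting per-modality fields match these reference totals.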
Fixed an issue where image tokens were being overwritten instead of accumulated in Gemini responses. Added support for both camelCase and snake_case token count keys. Fixes BerriAI#22082.
Parse tokenCount/token_count as int-safe values to satisfy mypy and avoid None/object arithmetic. Add regression test for duplicate modality accumulation in Vertex _calculate_usage.
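The "int-safe" parsing mentioned in this commit message can be sketched as below; the function name is illustrative. The point is to coerce whatever the API returned (int, numeric string, None, or a missing key) to an `int` before arithmetic, so the type checker sees `int` and runtime never adds `None`.

```python
def as_int_token_count(detail: dict) -> int:
    """Coerce a tokenCount/token_count value to int, defaulting to 0
    for missing, None, or non-numeric values."""
    raw = detail.get("tokenCount", detail.get("token_count"))
    if raw is None:
        return 0
    try:
        return int(raw)
    except (TypeError, ValueError):
        return 0


print(as_int_token_count({"tokenCount": 90}))      # 90
print(as_int_token_count({"token_count": "100"}))  # 100
print(as_int_token_count({}))                      # 0
```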
Force-pushed fc14141 to de63dd8.
Merged b3a1759 into BerriAI:litellm_oss_staging_03_05_2026.
This PR fixes the issue where image_tokens were being overwritten instead of accumulated in the usage metadata for Gemini/Vertex AI models.
Changes:
- Accumulate per-modality token counts with `+=` instead of overwriting with `=` across the token detail loops.
- Support both camelCase (`tokenCount`) and snake_case (`token_count`) keys, and normalize modality names to uppercase.
- Add regression tests covering duplicate modality entries in both `GoogleImageGenConfig` and `VertexGeminiConfig._calculate_usage`.
Verified with a standalone reproduction script (190 tokens counted vs 100 previously).
Fixes #22082