[bug-fix] return actual status code - /v1/messages/count_tokens endpoint #21352

Merged
ishaan-berri merged 7 commits into BerriAI:litellm_ishaan_april6 from otaviofbrito:chore/handle-count-tokens-errors on Apr 6, 2026
Conversation

otaviofbrito (Contributor) commented on Feb 17, 2026

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
  • My PR passes all unit tests with make test-unit
  • My PR's scope is as isolated as possible; it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable, but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🐛 Bug Fix
✅ Test

Changes

Return the actual status code from the count_tokens API, avoiding 5XX responses for bad-request errors.

The httpx.HTTPStatusError raised by httpx.Response.raise_for_status() in the async client's post call was not being handled.
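The shape of the fix can be sketched as follows. This is an illustrative stand-in, not the actual LiteLLM code: `HTTPStatusError` and `ProxyException` below are minimal local stand-ins for the httpx and LiteLLM classes, and `count_tokens_endpoint` is a hypothetical name.

```python
class HTTPStatusError(Exception):
    """Minimal stand-in for httpx.HTTPStatusError (carries the upstream status)."""
    def __init__(self, status_code: int, message: str):
        super().__init__(message)
        self.status_code = status_code

class ProxyException(Exception):
    """Minimal stand-in for LiteLLM's ProxyException."""
    def __init__(self, message: str, code: int):
        super().__init__(message)
        self.code = code

def count_tokens_endpoint(provider_count_tokens):
    # Before the fix, an upstream 4XX raised inside the provider call
    # propagated uncaught and the proxy answered with a generic 500.
    try:
        return provider_count_tokens()
    except HTTPStatusError as e:
        # Re-raise with the upstream status code so the client sees the
        # real error (429, 403, ...) instead of a 500.
        raise ProxyException(message=str(e), code=e.status_code)
```

The key point is that the upstream status code is extracted from the exception and forwarded, rather than letting a generic handler map everything to 500.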



vercel Bot commented Feb 17, 2026

The latest updates on your projects:

Project: litellm. Deployment: Ready. Actions: Preview, Comment. Updated (UTC): Apr 6, 2026 8:44pm



greptile-apps Bot commented Feb 17, 2026

Greptile Summary

This PR fixes a real bug where httpx.HTTPStatusError raised inside provider_counter.count_tokens() was propagating uncaught, causing the proxy to return 500 for upstream 4XX errors (rate limits, auth failures) at the /v1/messages/count_tokens endpoint.

  • ✅ The core fix (catching httpx.HTTPStatusError and re-raising as ProxyException) is correct and well-targeted
  • ✅ A new unit test correctly validates the 429 upstream error path with proper mocks
  • ⚠️ Regression: tools=tools and system=system were accidentally dropped from the count_tokens() call inside the new try block — token counting for requests with tools or a system prompt will silently return undercounted totals for Anthropic (and any future provider that forwards these fields)

Confidence Score: 4/5

Not safe to merge as-is — dropping tools/system from count_tokens() silently breaks tool-use and system-prompt token counting for Anthropic

Score is 4 because there is a confirmed P1 regression: tools=tools and system=system were accidentally dropped from the count_tokens() call inside the new try block. The error-handling fix itself is sound, but this side-effect must be corrected before merge.

litellm/proxy/proxy_server.py lines 9121-9127 — restore tools=tools and system=system to the count_tokens() call
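The flagged regression is orthogonal to the error handling itself: wrapping a call in try/except must not change its argument list. A hypothetical sketch of the corrected call (the helper name and the fake counter are illustrative; the parameter names follow the review):

```python
def count_tokens_safely(provider_counter, model, messages, tools=None, system=None):
    """Wrap the provider call in error handling WITHOUT dropping arguments.

    The reviewed regression was that tools=tools and system=system were
    lost when the call moved inside the new try block, so tool schemas
    and system prompts were silently excluded from the token count.
    """
    try:
        return provider_counter.count_tokens(
            model=model,
            messages=messages,
            tools=tools,    # must be forwarded, or tool schemas are not counted
            system=system,  # must be forwarded, or the system prompt is not counted
        )
    except Exception as e:
        # Placeholder for the real status-code translation in the PR.
        raise RuntimeError(f"count_tokens failed: {e}")
```

A unit test that records the forwarded kwargs (as the review suggests for tests/proxy_unit_tests/test_proxy_token_counter.py) would catch this class of regression.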

Important Files Changed

Filename Overview
litellm/proxy/proxy_server.py Adds try/except for httpx.HTTPStatusError correctly, but accidentally drops tools and system parameters from the count_tokens() call, silently breaking tool-use and system-prompt token counting
tests/proxy_unit_tests/test_proxy_token_counter.py Adds a well-structured mock test for the HTTPStatusError catch path; does not cover tools/system forwarding so the parameter regression goes undetected

Sequence Diagram

sequenceDiagram
    participant Client
    participant ProxyServer
    participant ProviderCounter
    participant UpstreamAPI

    Client->>ProxyServer: POST /v1/messages/count_tokens
    ProxyServer->>ProviderCounter: count_tokens(model, messages, ...)
    ProviderCounter->>UpstreamAPI: HTTP POST count tokens
    UpstreamAPI-->>ProviderCounter: 429 / 403 HTTP error
    ProviderCounter-->>ProxyServer: raise httpx.HTTPStatusError
    Note over ProxyServer: [NEW] try/except catches error
    ProxyServer->>ProxyServer: extract status_code from exception
    ProxyServer-->>Client: ProxyException(code=429/403)
    Note over ProxyServer: Before fix: uncaught → 500 returned




@greptile-apps greptile-apps Bot left a comment

2 files reviewed, 1 comment


Comment thread: litellm/proxy/proxy_server.py (outdated)
shivamrawat1 (Collaborator) commented:

@greptile review with the new commits.

shivamrawat1 (Collaborator) commented:

@greptile give it a score out of 5 again with the new commits.


greptile-apps Bot commented Feb 27, 2026

Additional Comments (2)

litellm/llms/vertex_ai/vertex_ai_partner_models/count_tokens/handler.py, line 148
Error responses may cause JSON parse failure

After removing the manual status check, if AsyncHTTPHandler.post() ever changes to not call raise_for_status() internally (or if this handler is used with a different HTTP client), a non-200 response would reach response.json() and could fail with a JSONDecodeError if the body isn't valid JSON — returning a confusing error instead of the actual HTTP error.

Currently this is safe because AsyncHTTPHandler.post() at litellm/llms/custom_httpx/http_handler.py:466 always calls raise_for_status() before returning. But the implicit coupling is worth noting — a comment here would help future maintainers understand why no status check is needed.
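The defensive pattern the reviewer is describing can be sketched in isolation (stdlib only; the function name and error message are hypothetical, not LiteLLM code):

```python
import json

def parse_count_tokens_response(status_code: int, body: str) -> dict:
    """Defensively parse an upstream count-tokens response body.

    If raise_for_status() were ever skipped upstream, a non-200 error
    body (often HTML or plain text) would otherwise hit json.loads()
    and fail with a confusing JSONDecodeError instead of surfacing the
    real HTTP error.
    """
    if status_code != 200:
        # Surface the real HTTP error before attempting to parse JSON.
        raise RuntimeError(f"upstream returned HTTP {status_code}: {body[:200]}")
    return json.loads(body)
```

As the review notes, a one-line comment at the call site documenting the coupling to AsyncHTTPHandler.post()'s internal raise_for_status() would achieve the same goal without the extra check.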



litellm/proxy/proxy_server.py, line 7762
Catching HTTPStatusError bypasses local tokenizer fallback

When litellm.disable_token_counter is False (the default), the existing logic at lines 7766-7779 gracefully handles provider errors by falling back to the local tokenizer via TokenCountResponse(error=True, ...). However, this new except httpx.HTTPStatusError block converts the error into a ProxyException immediately — which means HTTP errors from the provider will now always be surfaced as errors to the caller, even when the fallback-to-local-tokenizer path is available.

This is a behavior change: previously, Vertex AI partner model HTTP errors would propagate as unhandled exceptions (resulting in 500s). Other providers (Anthropic, Bedrock, Azure AI) catch errors internally and return TokenCountResponse(error=True, ...), which triggers the fallback. Consider whether the intent here is to always fail hard on HTTP errors, or whether it would be more consistent to wrap the error in a TokenCountResponse(error=True, ...) so the fallback logic at line 7766 can handle it uniformly — matching the pattern used by other provider token counters.
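The alternative the reviewer describes, letting the existing fallback logic handle provider failures uniformly, can be sketched as follows (a minimal sketch with stand-in types; `TokenCountResponse` here is a simplified local stand-in and `count_with_fallback` is a hypothetical name, not the actual proxy code):

```python
from dataclasses import dataclass

@dataclass
class TokenCountResponse:
    """Minimal stand-in for LiteLLM's TokenCountResponse."""
    total_tokens: int = 0
    error: bool = False

def count_with_fallback(provider_call, local_tokenizer_call):
    # Pattern used by other provider counters per the review: catch the
    # provider error internally and return TokenCountResponse(error=True),
    # so the proxy falls back to the local tokenizer instead of failing hard.
    try:
        resp = provider_call()
    except Exception:
        resp = TokenCountResponse(error=True)
    if resp.error:
        return local_tokenizer_call()
    return resp
```

Which behavior is correct (fail hard with the real status code, or fall back to the local tokenizer) is the design question the review leaves to the maintainers.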


gitguardian Bot commented Apr 1, 2026

⚠️ GitGuardian has uncovered 7 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard.
Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secrets in your pull request (GitGuardian id, status, secret type, commit, filename):
  • 29203064, Triggered, Zapier Webhook URL, 151c76a, litellm/proxy/_experimental/out/_next/static/chunks/a7aecb91c09b0e9a.js
  • 29203060, Triggered, Bearer Token, f3ed443, tests/test_litellm/llms/anthropic/test_anthropic_common_utils.py
  • 29203066, Triggered, Base64 Basic Authentication, f3ed443, tests/litellm/test_no_hardcoded_secrets.py
  • 29203131, Triggered, Generic High Entropy Secret, f3ed443, tests/test_litellm/llms/anthropic/test_anthropic_common_utils.py
  • 29203090, Triggered, Bearer Token, 151c76a, tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_debug.py
  • 29203092, Triggered, Generic High Entropy Secret, 151c76a, tests/test_litellm/proxy/test_api_key_masking_in_errors.py
  • 29203053, Triggered, Generic Password, f3ed443, .circleci/config.yml

@ishaan-berri ishaan-berri changed the base branch from main to litellm_ishaan_april6 April 6, 2026 20:31
@ishaan-berri ishaan-berri merged commit 8b0fadf into BerriAI:litellm_ishaan_april6 Apr 6, 2026
1 of 2 checks passed

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ otaviofbrito
❌ ishaan-berri
