feat(health-check): add BACKGROUND_HEALTH_CHECK_MAX_TOKENS env var#25344

Merged
ishaan-berri merged 3 commits into litellm_ishaan_april14 from litellm_Sameerlite/healthcheck-max-tokens
Apr 14, 2026

Conversation

@Sameerlite
Collaborator

Relevant issues

fixes LIT-2231

Pre-Submission checklist

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement (see details)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🆕 New Feature

Changes

Adds a new BACKGROUND_HEALTH_CHECK_MAX_TOKENS environment variable that sets a global default max_tokens for health check calls on explicit (non-wildcard) model deployments. Previously, max_tokens=1 was hardcoded for explicit models with no way to override it globally — only per-model via health_check_max_tokens in model_info. This is especially useful for Azure deployments that fail with max_tokens=1. Priority order: per-model health_check_max_tokens > BACKGROUND_HEALTH_CHECK_MAX_TOKENS env var > existing defaults (1 for explicit, 10 for wildcard).

@vercel

vercel Bot commented Apr 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Apr 8, 2026 4:08pm |

@codspeed-hq
Contributor

codspeed-hq Bot commented Apr 8, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_Sameerlite/healthcheck-max-tokens (b08e058) with main (62757ff)

@greptile-apps
Contributor

greptile-apps Bot commented Apr 8, 2026

Greptile Summary

This PR adds a new environment variable that provides a global fallback for max_tokens on proxy background health checks, sitting between per-model health_check_max_tokens and the hardcoded default of 1. This is a clean, focused feature that addresses Azure deployments (and others) where a very small max_tokens causes health check failures.

Key changes:

  • litellm/constants.py: New constant parsed with a try/except guard — safely falls back to None on bad env input, addressing the previous review concern about crashing the proxy at import time.
  • litellm/proxy/health_check.py: Single elif branch added to _update_litellm_params_for_health_check; runs before the wildcard guard so it applies to both explicit and wildcard models (correctly reflected in the updated docs).
  • tests/test_litellm/proxy/test_health_check_max_tokens.py: Three new tests verify global default, per-model priority override, and wildcard applicability. The two pre-existing tests do not pin the new constant to None, leaving them sensitive to the env var being set in a developer's local shell.
  • docs/my-website/docs/proxy/config_settings.md: Entry added in alphabetical order, including a note that wildcard routes are also affected when set.
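
The guarded parsing described for litellm/constants.py can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code merged in this PR: the helper name `parse_optional_int_env` and its `environ` parameter are hypothetical, introduced only so the behavior can be exercised without touching the real environment.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "BACKGROUND_HEALTH_CHECK_MAX_TOKENS"


def parse_optional_int_env(
    name: str, environ: Optional[Mapping[str, str]] = None
) -> Optional[int]:
    """Return the variable as an int, or None when unset or not an integer.

    Hypothetical helper: the real constant may be parsed inline instead.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        # Bad input should not crash the process at import time,
        # so fall back to "not configured".
        return None


# Evaluated once at import time, like a module-level constant.
BACKGROUND_HEALTH_CHECK_MAX_TOKENS = parse_optional_int_env(ENV_VAR)
```

Falling back to None rather than raising means a typo in the variable silently disables the global default, which is the trade-off the review accepted in exchange for not breaking proxy startup.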

Confidence Score: 5/5

Safe to merge — the feature is correct, well-tested, and the previous P1 concern about crashing on bad env input is resolved.

All remaining findings are P2 style suggestions (missing blank line in constants.py, existing tests not pinning the new constant to None). Neither blocks correctness or reliability. Prior P1 concern about unguarded int() conversion is fully addressed by the try/except block.

tests/test_litellm/proxy/test_health_check_max_tokens.py — pre-existing tests should pin BACKGROUND_HEALTH_CHECK_MAX_TOKENS to None to avoid environment-sensitive failures.
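
One way to pin the constant in the pre-existing tests is to patch it to None for the duration of each test. The sketch below assumes a stand-in object in place of the real module; `health_check` and `default_max_tokens` here are hypothetical and only illustrate the isolation pattern with `unittest.mock.patch.object`.

```python
import types
from unittest.mock import patch

# Hypothetical stand-in for the module holding the constant. The value 5
# simulates a developer who has the env var set in their local shell.
health_check = types.SimpleNamespace(BACKGROUND_HEALTH_CHECK_MAX_TOKENS=5)


def default_max_tokens() -> int:
    """Toy function under test: use the global default if set, else 1."""
    value = health_check.BACKGROUND_HEALTH_CHECK_MAX_TOKENS
    return value if value is not None else 1


def test_hardcoded_default_is_isolated_from_env() -> None:
    # Pin the constant to None so the assertion no longer depends on
    # whatever the surrounding environment happened to contain.
    with patch.object(health_check, "BACKGROUND_HEALTH_CHECK_MAX_TOKENS", None):
        assert default_max_tokens() == 1
    # patch.object restores the original value when the block exits.
    assert health_check.BACKGROUND_HEALTH_CHECK_MAX_TOKENS == 5
```

In a pytest suite the same effect could be had with the `monkeypatch` fixture; the important part is that the constant is patched where it is looked up.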

Vulnerabilities

No security concerns identified. The new env var only controls an integer max_tokens value sent with health-check requests; it is parsed safely with a try/except guard and falls back to None on invalid input. There is no auth, credential, or injection surface introduced.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/constants.py | Adds BACKGROUND_HEALTH_CHECK_MAX_TOKENS constant with safe try/except guarded int() conversion; prior review concern about crashing proxy on bad env var value is now resolved. |
| litellm/proxy/health_check.py | Inserts elif branch for BACKGROUND_HEALTH_CHECK_MAX_TOKENS between per-model override and the hardcoded default of 1; correctly applies to both wildcard and explicit models (documented in config_settings.md). |
| tests/test_litellm/proxy/test_health_check_max_tokens.py | Adds 3 well-structured unit tests for the new env var; existing tests lack isolation from the new constant and could fail if the env var is set in the developer's shell. |
| docs/my-website/docs/proxy/config_settings.md | Adds entry for BACKGROUND_HEALTH_CHECK_MAX_TOKENS in alphabetical order with accurate description covering both wildcard and explicit model behavior. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[_update_litellm_params_for_health_check] --> B{model_info has\nhealth_check_max_tokens?}
    B -- Yes --> C[max_tokens = health_check_max_tokens\nper-model override]
    B -- No --> D{BACKGROUND_HEALTH_CHECK_MAX_TOKENS\nenv var set?}
    D -- Yes --> E[max_tokens = BACKGROUND_HEALTH_CHECK_MAX_TOKENS\nglobal default]
    D -- No --> F{model string\ncontains wildcard *?}
    F -- No explicit model --> G[max_tokens = 1\nhardcoded default]
    F -- Yes wildcard --> H[max_tokens not set\nwildcard default = 10 handled downstream]
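
The decision order in the flowchart can be expressed as a small function. This is a sketch of the flow only, not the body of `_update_litellm_params_for_health_check`: the function name `apply_health_check_max_tokens`, its signature, and the use of `is not None` to detect a per-model value are all assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def apply_health_check_max_tokens(
    litellm_params: Dict[str, Any],
    model_info: Dict[str, Any],
    global_default: Optional[int],
) -> Dict[str, Any]:
    """Return a copy of litellm_params with max_tokens chosen by priority.

    Hypothetical helper mirroring the flowchart:
    per-model override > global default > 1 for explicit models.
    Wildcard models with no override are left without max_tokens.
    """
    params = dict(litellm_params)
    per_model = model_info.get("health_check_max_tokens")
    if per_model is not None:
        params["max_tokens"] = per_model  # per-model override
    elif global_default is not None:
        params["max_tokens"] = global_default  # global default from env var
    elif "*" not in str(params.get("model", "")):
        params["max_tokens"] = 1  # hardcoded default for explicit models
    # Wildcard route with nothing configured: leave max_tokens unset so
    # the downstream wildcard default applies.
    return params
```

Because the global-default branch sits before the wildcard check, a configured value applies to wildcard routes as well, which matches the review's note about the updated docs.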

Reviews (3): Last reviewed commit: "Fix greptile reviews"

Comment thread litellm/constants.py Outdated
Comment thread litellm/proxy/health_check.py
@krrish-berri-2
Contributor

only per-model via health_check_max_tokens in model_info. This is especially useful for Azure deployments that fail with max_tokens=1.

shouldn't the 'real fix' be to fix the max tokens for azure to be higher or bump the default to max_tokens=5?

@Sameerlite

@Sameerlite
Collaborator Author

only per-model via health_check_max_tokens in model_info. This is especially useful for Azure deployments that fail with max_tokens=1.

shouldn't the 'real fix' be to fix the max tokens for azure to be higher or bump the default to max_tokens=5?

@Sameerlite

We need this because DEFAULT_HEALTH_CHECK_PROMPT is also configurable. This change lets users control both the health check prompt and max_tokens.

@Sameerlite
Collaborator Author

Sameerlite commented Apr 13, 2026

.

@ishaan-berri ishaan-berri changed the base branch from main to litellm_ishaan_april14 April 14, 2026 17:04
@ishaan-berri ishaan-berri merged commit 9810a1b into litellm_ishaan_april14 Apr 14, 2026
93 of 97 checks passed
@ishaan-berri ishaan-berri deleted the litellm_Sameerlite/healthcheck-max-tokens branch April 14, 2026 17:04