feat(health-check): add BACKGROUND_HEALTH_CHECK_MAX_TOKENS env var by Sameerlite · Pull Request #25344 · BerriAI/litellm

Sameerlite · 2026-04-08T14:14:28Z

Relevant issues

fixes LIT-2231

Pre-Submission checklist

I have added testing in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement)
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🆕 New Feature

Changes

Adds a new BACKGROUND_HEALTH_CHECK_MAX_TOKENS environment variable that sets a global default max_tokens for health check calls on explicit (non-wildcard) model deployments. Previously, max_tokens=1 was hardcoded for explicit models with no way to override it globally — only per-model via health_check_max_tokens in model_info. This is especially useful for Azure deployments that fail with max_tokens=1. Priority order: per-model health_check_max_tokens > BACKGROUND_HEALTH_CHECK_MAX_TOKENS env var > existing defaults (1 for explicit, 10 for wildcard).

vercel · 2026-04-08T14:14:34Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
litellm	Ready	Preview, Comment	Apr 8, 2026 4:08pm

codspeed-hq · 2026-04-08T14:16:26Z

Merging this PR will not alter performance

✅ 16 untouched benchmarks

Comparing litellm_Sameerlite/healthcheck-max-tokens (b08e058) with main (62757ff)

greptile-apps · 2026-04-08T14:18:10Z

Greptile Summary

This PR adds a new environment variable that provides a global fallback for max_tokens on proxy background health checks, sitting between per-model health_check_max_tokens and the hardcoded default of 1. This is a clean, focused feature that addresses Azure deployments (and others) where a very small max_tokens causes health check failures.

Key changes:

litellm/constants.py: New constant parsed with a try/except guard — safely falls back to None on bad env input, addressing the previous review concern about crashing the proxy at import time.
litellm/proxy/health_check.py: Single elif branch added to _update_litellm_params_for_health_check; runs before the wildcard guard so it applies to both explicit and wildcard models (correctly reflected in the updated docs).
tests/test_litellm/proxy/test_health_check_max_tokens.py: Three new tests verify global default, per-model priority override, and wildcard applicability. The two pre-existing tests do not pin the new constant to None, leaving them sensitive to the env var being set in a developer's local shell.
docs/my-website/docs/proxy/config_settings.md: Entry added in alphabetical order, including a note that wildcard routes are also affected when set.
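As a rough illustration of the guarded parsing described for litellm/constants.py, the sketch below reads an optional integer from the environment and falls back to None on bad input. The helper name `parse_optional_int_env` and its `environ` parameter are assumptions made for this example; the actual code in the PR is not shown in this thread and may be structured differently.

```python
import os
from typing import Mapping, Optional


def parse_optional_int_env(
    name: str, environ: Mapping[str, str] = os.environ
) -> Optional[int]:
    """Return the env var as an int, or None if it is unset or not an integer.

    Hypothetical helper for illustration; not taken from the PR diff.
    """
    raw = environ.get(name)
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        # A malformed value must not crash the process at import time.
        return None


# Module-level constant, evaluated once at import.
BACKGROUND_HEALTH_CHECK_MAX_TOKENS = parse_optional_int_env(
    "BACKGROUND_HEALTH_CHECK_MAX_TOKENS"
)
```

Returning None rather than raising keeps a typo in the environment from taking the proxy down; the health check then falls through to the existing defaults.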

Confidence Score: 5/5

Safe to merge — the feature is correct, well-tested, and the previous P1 concern about crashing on bad env input is resolved.

All remaining findings are P2 style suggestions (missing blank line in constants.py, existing tests not pinning the new constant to None). Neither blocks correctness or reliability. Prior P1 concern about unguarded int() conversion is fully addressed by the try/except block.

tests/test_litellm/proxy/test_health_check_max_tokens.py — pre-existing tests should pin BACKGROUND_HEALTH_CHECK_MAX_TOKENS to None to avoid environment-sensitive failures.
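One way to pin the constant in a test, sketched under assumptions: the `constants` namespace and `default_max_tokens_for_explicit_model` below are stand-ins invented for this example so that it runs on its own. In the real test file the patch target would be wherever health_check.py looks up the constant, which this thread does not show.

```python
import types
from unittest import mock

# Hypothetical stand-in for the module holding the constant; the value 50
# simulates a developer who has the env var set in their local shell.
constants = types.SimpleNamespace(BACKGROUND_HEALTH_CHECK_MAX_TOKENS=50)


def default_max_tokens_for_explicit_model() -> int:
    """Stand-in for the code under test: global default if set, else 1."""
    value = constants.BACKGROUND_HEALTH_CHECK_MAX_TOKENS
    return value if value is not None else 1


def test_explicit_model_defaults_to_one() -> None:
    # Pin the constant to None so the assertion does not depend on the
    # environment the test happens to run in.
    with mock.patch.object(constants, "BACKGROUND_HEALTH_CHECK_MAX_TOKENS", None):
        assert default_max_tokens_for_explicit_model() == 1
```

`mock.patch.object` restores the original value when the `with` block exits, so the pin does not leak into other tests.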

Vulnerabilities

No security concerns identified. The new env var only controls an integer max_tokens value sent with health-check requests; it is parsed safely with a try/except guard and falls back to None on invalid input. There is no auth, credential, or injection surface introduced.

Important Files Changed

Filename	Overview
litellm/constants.py	Adds BACKGROUND_HEALTH_CHECK_MAX_TOKENS constant with safe try/except guarded int() conversion; prior review concern about crashing proxy on bad env var value is now resolved.
litellm/proxy/health_check.py	Inserts elif branch for BACKGROUND_HEALTH_CHECK_MAX_TOKENS between per-model override and the hardcoded default of 1; correctly applies to both wildcard and explicit models (documented in config_settings.md).
tests/test_litellm/proxy/test_health_check_max_tokens.py	Adds 3 well-structured unit tests for the new env var; existing tests lack isolation from the new constant and could fail if the env var is set in the developer's shell.
docs/my-website/docs/proxy/config_settings.md	Adds entry for BACKGROUND_HEALTH_CHECK_MAX_TOKENS in alphabetical order with accurate description covering both wildcard and explicit model behavior.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[_update_litellm_params_for_health_check] --> B{model_info has\nhealth_check_max_tokens?}
    B -- Yes --> C[max_tokens = health_check_max_tokens\nper-model override]
    B -- No --> D{BACKGROUND_HEALTH_CHECK_MAX_TOKENS\nenv var set?}
    D -- Yes --> E[max_tokens = BACKGROUND_HEALTH_CHECK_MAX_TOKENS\nglobal default]
    D -- No --> F{model string\ncontains wildcard *?}
    F -- No explicit model --> G[max_tokens = 1\nhardcoded default]
    F -- Yes wildcard --> H[max_tokens not set\nwildcard default = 10 handled downstream]
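The flowchart can be read as a three-step fallback. The sketch below mirrors it in Python; `apply_health_check_max_tokens` and its parameters are hypothetical names for this example and are not the signature of `_update_litellm_params_for_health_check`, which is not shown in this thread.

```python
from typing import Any, Dict, Optional


def apply_health_check_max_tokens(
    litellm_params: Dict[str, Any],
    model_info: Dict[str, Any],
    global_default: Optional[int],
) -> Dict[str, Any]:
    """Return a copy of litellm_params with max_tokens resolved per the flowchart."""
    params = dict(litellm_params)
    per_model = model_info.get("health_check_max_tokens")
    if per_model is not None:
        # 1. Per-model override wins.
        params["max_tokens"] = per_model
    elif global_default is not None:
        # 2. Global default; checked before the wildcard guard, so it
        #    applies to explicit and wildcard models alike.
        params["max_tokens"] = global_default
    elif "*" not in params.get("model", ""):
        # 3. Explicit model with nothing configured: hardcoded default.
        params["max_tokens"] = 1
    # Wildcard model with nothing configured: max_tokens stays unset here
    # and the wildcard default is applied downstream.
    return params
```

For example, `apply_health_check_max_tokens({"model": "openai/*"}, {}, 5)` sets `max_tokens` to 5, while the same call with `None` as the global default leaves `max_tokens` out of the result.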

Reviews (3): Last reviewed commit: "Fix greptile reviews"

krrish-berri-2 · 2026-04-11T16:17:44Z

> only per-model via health_check_max_tokens in model_info. This is especially useful for Azure deployments that fail with max_tokens=1.

shouldn't the 'real fix' be to fix the max tokens for azure to be higher or bump the default to max_tokens=5?

@Sameerlite

Sameerlite · 2026-04-13T12:13:38Z

> only per-model via health_check_max_tokens in model_info. This is especially useful for Azure deployments that fail with max_tokens=1.
>
> shouldn't the 'real fix' be to fix the max tokens for azure to be higher or bump the default to max_tokens=5?
>
> @Sameerlite

We need this because DEFAULT_HEALTH_CHECK_PROMPT is also configurable. This change makes sure users can control both the prompt and the max tokens.

Sameerlite · 2026-04-13T12:39:49Z

.

feat(health-check): add BACKGROUND_HEALTH_CHECK_MAX_TOKENS env var

6621c40

vercel Bot deployed to Preview April 8, 2026 14:15

greptile-apps Bot reviewed Apr 8, 2026


Comment thread litellm/constants.py Outdated

Comment thread litellm/proxy/health_check.py

Fix code qa

d8598ab

vercel Bot deployed to Preview April 8, 2026 16:05

Fix greptile reviews

b08e058

vercel Bot deployed to Preview April 8, 2026 16:08

ishaan-berri changed the base branch from main to litellm_ishaan_april14 April 14, 2026 17:04

ishaan-berri merged commit 9810a1b into litellm_ishaan_april14 Apr 14, 2026
93 of 97 checks passed

ishaan-berri deleted the litellm_Sameerlite/healthcheck-max-tokens branch April 14, 2026 17:04

ishaan-berri merged 3 commits into litellm_ishaan_april14 from litellm_Sameerlite/healthcheck-max-tokens
