feat(health-check): add BACKGROUND_HEALTH_CHECK_MAX_TOKENS env var#25344
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Greptile SummaryThis PR adds a new environment variable that provides a global fallback for Key changes:
Confidence Score: 5/5Safe to merge — the feature is correct, well-tested, and the previous P1 concern about crashing on bad env input is resolved. All remaining findings are P2 style suggestions (missing blank line in constants.py, existing tests not pinning the new constant to None). Neither blocks correctness or reliability. Prior P1 concern about unguarded int() conversion is fully addressed by the try/except block. tests/test_litellm/proxy/test_health_check_max_tokens.py — pre-existing tests should pin BACKGROUND_HEALTH_CHECK_MAX_TOKENS to None to avoid environment-sensitive failures.
|
| Filename | Overview |
|---|---|
| litellm/constants.py | Adds BACKGROUND_HEALTH_CHECK_MAX_TOKENS constant with safe try/except guarded int() conversion; prior review concern about crashing proxy on bad env var value is now resolved. |
| litellm/proxy/health_check.py | Inserts elif branch for BACKGROUND_HEALTH_CHECK_MAX_TOKENS between per-model override and the hardcoded default of 1; correctly applies to both wildcard and explicit models (documented in config_settings.md). |
| tests/test_litellm/proxy/test_health_check_max_tokens.py | Adds 3 well-structured unit tests for the new env var; existing tests lack isolation from the new constant and could fail if the env var is set in the developer's shell. |
| docs/my-website/docs/proxy/config_settings.md | Adds entry for BACKGROUND_HEALTH_CHECK_MAX_TOKENS in alphabetical order with accurate description covering both wildcard and explicit model behavior. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[_update_litellm_params_for_health_check] --> B{model_info has\nhealth_check_max_tokens?}
B -- Yes --> C[max_tokens = health_check_max_tokens\nper-model override]
B -- No --> D{BACKGROUND_HEALTH_CHECK_MAX_TOKENS\nenv var set?}
D -- Yes --> E[max_tokens = BACKGROUND_HEALTH_CHECK_MAX_TOKENS\nglobal default]
D -- No --> F{model string\ncontains wildcard *?}
F -- No explicit model --> G[max_tokens = 1\nhardcoded default]
F -- Yes wildcard --> H[max_tokens not set\nwildcard default = 10 handled downstream]
Reviews (3): Last reviewed commit: "Fix greptile reviews" | Re-trigger Greptile
shouldn't the 'real fix' be to fix the max tokens for azure to be higher or bump the default to max_tokens=5? |
We need this because we also have configurable DEFAULT_HEALTH_CHECK_PROMPT. This fix will make sure user can control both prompt and tokens |
|
. |
Relevant issues
fixes LIT-2231
Pre-Submission checklist
tests/test_litellm/directory, Adding at least 1 test is a hard requirement - see detailsmake test-unit@greptileaiand received a Confidence Score of at least 4/5 before requesting a maintainer reviewType
🆕 New Feature
Changes
Adds a new
BACKGROUND_HEALTH_CHECK_MAX_TOKENSenvironment variable that sets a global defaultmax_tokensfor health check calls on explicit (non-wildcard) model deployments. Previously,max_tokens=1was hardcoded for explicit models with no way to override it globally — only per-model viahealth_check_max_tokensinmodel_info. This is especially useful for Azure deployments that fail withmax_tokens=1. Priority order: per-modelhealth_check_max_tokens>BACKGROUND_HEALTH_CHECK_MAX_TOKENSenv var > existing defaults (1for explicit,10for wildcard).