…update and key rotation (#25552)

Two code paths in key_management_endpoints.py call hash_token() unconditionally when invalidating the user_api_key_cache after a key update. When the caller passes a pre-hashed token ID (not an sk- prefixed key), hash_token() double-hashes it, producing a cache key that does not match the actual cached entry. Cache invalidation silently fails.

This is compounded by update_cache(), which writes the stale cached key object back with a fresh 60s TTL after every successful request, preventing natural TTL expiry. The stale entry (with outdated fields like max_budget=None) persists indefinitely under load.

PR #24969 fixed this in update_key_fn but missed two other call sites:
- _process_single_key_update (bulk update path)
- _execute_virtual_key_regeneration (key rotation path)

Fix: replace hash_token() with _hash_token_if_needed() in both locations, matching the pattern already used elsewhere in the file.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
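The double-hashing failure described in this commit can be reproduced in isolation. The snippet below is a minimal sketch, not the proxy's code: it assumes the hash is a SHA-256 hex digest, uses a plain dict in place of user_api_key_cache, and the helper names are stand-ins for hash_token() and _hash_token_if_needed().

```python
import hashlib


def hash_token(token: str) -> str:
    """Stand-in hashing helper (assumption: SHA-256 hex digest)."""
    return hashlib.sha256(token.encode()).hexdigest()


def hash_token_if_needed(token: str) -> str:
    """Hash raw sk- keys only; pass pre-hashed token IDs through unchanged."""
    return hash_token(token) if token.startswith("sk-") else token


raw_key = "sk-example"          # hypothetical key value
token_id = hash_token(raw_key)  # the ID under which the entry is cached

# A plain dict stands in for the key cache; the entry holds a stale field.
cache = {token_id: {"max_budget": None}}

# Buggy path: the caller already holds the hashed ID, and it is hashed again.
double_hashed = hash_token(token_id)
cache.pop(double_hashed, None)           # no matching entry, so nothing is removed
stale_survived_buggy_path = token_id in cache

# Fixed path: the pre-hashed ID is used as-is and the entry is invalidated.
cache.pop(hash_token_if_needed(token_id), None)
stale_survived_fixed_path = token_id in cache
```

Because the conditional helper maps both a raw sk- key and its hashed ID to the same cache key, callers do not need to know which form they were given.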
… key (#25549)

The model_max_budget limiter tracks spend in one code path (async_log_success_event) and enforces budget limits in another (is_key_within_model_budget via user_api_key_auth). These two paths used different model name formats to build cache keys:
- Tracking used standard_logging_payload["model"], which is the deployment-level model name (e.g. "vertex_ai/claude-opus-4-6@default")
- Enforcement used request_data["model"], which is the model group alias (e.g. "claude-opus-4-6")

Because the cache keys never matched, the enforcement path always read None for current spend, silently allowing all requests through even after the budget was exceeded. This affected any provider that decorates model names with provider prefixes or version suffixes (Vertex AI, Bedrock, etc.).

Fix: use model_group (the user-facing alias) from StandardLoggingPayload for spend tracking, falling back to model when model_group is None. This aligns the tracking cache key with the enforcement cache key.

Fixes the same root cause reported in #15223 and #10052.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
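The key mismatch can be illustrated with a small sketch. The cache-key format, the spend_cache_key and tracking_model_name helpers, and the key hash value below are hypothetical; only the model and model_group fields and the fallback rule come from the commit message.

```python
from typing import Optional, TypedDict


class LoggingPayload(TypedDict):
    """Minimal stand-in for the two StandardLoggingPayload fields involved."""
    model: str
    model_group: Optional[str]


def spend_cache_key(key_hash: str, model_name: str) -> str:
    # Hypothetical key layout; the real format may differ.
    return f"virtual_key_spend:{key_hash}:{model_name}"


def tracking_model_name(payload: LoggingPayload) -> str:
    """Prefer the user-facing alias; fall back to model when it is None."""
    group = payload["model_group"]
    return group if group is not None else payload["model"]


payload: LoggingPayload = {
    "model": "vertex_ai/claude-opus-4-6@default",
    "model_group": "claude-opus-4-6",
}
request_data = {"model": "claude-opus-4-6"}

enforcement_key = spend_cache_key("abc123", request_data["model"])
old_tracking_key = spend_cache_key("abc123", payload["model"])
new_tracking_key = spend_cache_key("abc123", tracking_model_name(payload))

# With no model_group set, the sketch falls back to the deployment-level name.
fallback_name = tracking_model_name({"model": "gpt-4o", "model_group": None})
```

Here old_tracking_key differs from enforcement_key, which is why enforcement read no spend, while new_tracking_key matches it.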
…r schedule (#25440)

Budget table entries (team members, end-users) used duration_in_seconds() for a sliding-window reset, while keys/users/teams used calendar-aligned get_budget_reset_time(). This made "30d" and "1mo" mean different things depending on entity type.

Now both paths use get_budget_reset_time() for consistent calendar-aligned resets (e.g. "30d" → 1st of next month).

Fixes #25432

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
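To make the difference between the two schedules concrete, here is a rough sketch of a sliding-window reset next to a calendar-aligned one. Both function names are hypothetical and neither reproduces duration_in_seconds() or get_budget_reset_time(); the calendar-aligned version only covers the "1st of next month" case mentioned above.

```python
from datetime import datetime, timedelta, timezone


def sliding_window_reset(now: datetime, days: int = 30) -> datetime:
    """Sliding window: the reset lands a fixed number of days after `now`."""
    return now + timedelta(days=days)


def calendar_aligned_reset(now: datetime) -> datetime:
    """Calendar-aligned: the reset lands at midnight on the 1st of next month."""
    if now.month == 12:
        return datetime(now.year + 1, 1, 1, tzinfo=now.tzinfo)
    return datetime(now.year, now.month + 1, 1, tzinfo=now.tzinfo)


# A budget created mid-month gets different reset times under each scheme.
created_at = datetime(2025, 1, 20, 15, 30, tzinfo=timezone.utc)
sliding = sliding_window_reset(created_at)
aligned = calendar_aligned_reset(created_at)
```

For a budget created on 20 January, the sliding window resets on 19 February at the creation time of day, while the calendar-aligned schedule resets at midnight on 1 February.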
Greptile Summary
This staging PR bundles several bug fixes: aligning …
Confidence Score: 5/5
| Filename | Overview |
|---|---|
| litellm/proxy/common_utils/reset_budget_job.py | Adds reset_budget_for_keys_linked_to_budgets (resets spend on keys tied to a budget tier with no own reset schedule) and reset_budget_for_team_members (resets team-member spend counters in Redis/in-memory + DB). Also extends _reset_budget_common to flush Redis spend counters for keys and teams on budget reset. Logic is sound; imports from proxy_server are inline due to the circular-import constraint. |
| litellm/proxy/hooks/model_max_budget_limiter.py | Uses model_group (user-facing alias) instead of the deployment-level model field as the spend-tracking cache key, aligning the log path with the enforcement path in is_key_within_model_budget. No issues found. |
| litellm/proxy/management_endpoints/budget_management_endpoints.py | Adds validate_model_max_budget calls to /budget/new and /budget/update to enforce the enterprise license check and schema validation on model_max_budget. Minor: the import is placed inline inside the handler functions instead of at the top of the file, violating the project style guide. |
| tests/test_litellm/proxy/common_utils/test_reset_budget_job.py | Comprehensive new test suite for ResetBudgetJob covering keys, users, teams, end-users, budget-table resets, and the new reset_budget_for_keys_linked_to_budgets helper. Minor gap: MockLiteLLMTeamMembership is missing find_many, silently skipping the new team-member cache-reset path; and spend: {gt: 0} filter is not verified. |
| tests/test_litellm/proxy/management_endpoints/test_key_management_endpoints.py | Adds tests for cache-invalidation correctness in _process_single_key_update and _execute_virtual_key_regeneration, verifying that pre-hashed tokens are not double-hashed. Tests are isolated and use appropriate mocking. |
| tests/test_litellm/test_cost_calculator.py | Adds pricing-verification tests for baseten and wandb model entries, a dynamic-routing cost-fallback test, and an image token cost test with and without input_cost_per_image_token. All tests are self-contained mock-based unit tests. |
| tests/test_budget_management.py | Integration test verifying that /budget/new with a budget_duration sets budget_reset_at to the next calendar-aligned reset time. Lives in the tests/ root (integration tier) and makes real HTTP calls, which is appropriate for this location. |
| litellm/proxy/management_endpoints/key_management_endpoints.py | Replaces raw cache-key lookup with _hash_token_if_needed in bulk-update and key-regeneration paths to prevent double-hashing when the input is already a pre-hashed token ID. No functional regressions found. |
Sequence Diagram
sequenceDiagram
participant BudgetResetJob
participant DB as PrismaClient / DB
participant Cache as Redis / InMemoryCache
BudgetResetJob->>DB: "find budgets where reset_at <= now"
DB-->>BudgetResetJob: budgets_to_reset[]
BudgetResetJob->>BudgetResetJob: _reset_budget_reset_at_date() for each budget
BudgetResetJob->>DB: update_many budgets (new reset_at)
BudgetResetJob->>DB: find team memberships by budget_id
DB-->>BudgetResetJob: memberships[]
loop each membership
BudgetResetJob->>Cache: "set spend:team_member:{uid}:{tid} = 0"
end
BudgetResetJob->>DB: "update_many litellm_teammembership spend=0"
BudgetResetJob->>DB: "update_many litellm_verificationtoken<br/>(budget_id IN ids, budget_duration IS NULL, spend > 0) spend=0"
BudgetResetJob->>DB: find end-users by budget_id
DB-->>BudgetResetJob: endusers[]
BudgetResetJob->>BudgetResetJob: "_reset_budget_for_enduser() spend=0"
BudgetResetJob->>DB: "update_many endusers spend=0"
Reviews (4): Last reviewed commit: "Fix code qa"
    class DictLikeResult:
        def __init__(self, data):
            self._data = data
        def __iter__(self):
            return iter(self._data.items())
    mock_prisma_client.db.litellm_verificationtoken.update = AsyncMock(
        return_value=DictLikeResult({"token": "new-hashed-token", "key_name": "sk-...ab12", "user_id": "user-1"})
    )
    mock_prisma_client.db.litellm_verificationtoken.create = AsyncMock(
        return_value=None
    )
    mock_prisma_client.jsonify_object = MagicMock(side_effect=lambda data: data)

    mock_user_api_key_cache = MagicMock()
    mock_proxy_logging_obj = MagicMock()

    user_api_key_dict = UserAPIKeyAuth(
        user_role=LitellmUserRoles.PROXY_ADMIN,
        api_key="sk-admin",
DictLikeResult workaround is fragile
The inner DictLikeResult class works today because _execute_virtual_key_regeneration calls dict(updated_token), and __iter__ yields (key, value) pairs. However, if the function is later updated to call any Prisma model method (e.g., updated_token.model_dump(), updated_token.token, attribute access), the test would raise an AttributeError without a clear failure message.
A more resilient alternative is using a MagicMock with __iter__ set, or using a real LiteLLM_VerificationToken instance as the mock return value. This keeps the test aligned with the actual Prisma return type shape.
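One possible shape for the MagicMock alternative is sketched below. It is an illustration only: make_token_mock is a hypothetical helper, and the sketch assumes the code under test calls dict() on the result or reads its fields as attributes. Since a bare MagicMock auto-creates a keys attribute, dict() is likely to treat it as a mapping rather than iterate it, so the sketch configures keys and __getitem__ instead of relying on __iter__ alone.

```python
from unittest.mock import MagicMock


def make_token_mock(data: dict) -> MagicMock:
    """Build a mock that supports dict(mock) and attribute access."""
    mock = MagicMock()
    # dict() sees a `keys` attribute and takes the mapping path,
    # calling keys() and then indexing each key.
    mock.keys.return_value = data.keys()
    mock.__getitem__.side_effect = data.__getitem__
    # Expose each field as a plain attribute, e.g. mock.token.
    for name, value in data.items():
        setattr(mock, name, value)
    return mock


token_data = {
    "token": "new-hashed-token",
    "key_name": "sk-...ab12",
    "user_id": "user-1",
}
updated_token = make_token_mock(token_data)

as_dict = dict(updated_token)      # mapping-style conversion
token_attr = updated_token.token   # attribute-style access
```

Methods that are not configured, such as model_dump(), would still return a MagicMock rather than raise, so a real LiteLLM_VerificationToken instance remains the stricter option.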
…ng-encoding-format Revert "fix(embedding): omit null encoding_format for openai requests"
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 29203053 | Triggered | Generic Password | ee40da5 | .circleci/config.yml | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely, following best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection as a pre-commit hook to catch secrets before they leave your machine and to ease remediation.
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- … tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
- … make test-unit
- … @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Screenshots / Proof of Fix
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes