Litellm oss staging 04 11 2026 by krrish-berri-2 · Pull Request #25589 · BerriAI/litellm

krrish-berri-2 · 2026-04-12T02:55:55Z

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement (see details)
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

50-55 passing tests: main is stable with minor issues.

45-49 passing tests: acceptable but needs attention.

<= 40 passing tests: unstable; be careful with your merges and assess the risk.

Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:

Screenshots / Proof of Fix

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

…update and key rotation (#25552)

Two code paths in key_management_endpoints.py call hash_token() unconditionally when invalidating the user_api_key_cache after a key update. When the caller passes a pre-hashed token ID (not an sk- prefixed key), hash_token() double-hashes it, producing a cache key that does not match the actual cached entry. Cache invalidation silently fails.

This is compounded by update_cache() which writes the stale cached key object back with a fresh 60s TTL after every successful request, preventing natural TTL expiry. The stale entry (with outdated fields like max_budget=None) persists indefinitely under load.

PR #24969 fixed this in update_key_fn but missed two other call sites:
- _process_single_key_update (bulk update path)
- _execute_virtual_key_regeneration (key rotation path)

Fix: replace hash_token() with _hash_token_if_needed() in both locations, matching the pattern already used elsewhere in the file.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
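The double-hashing failure described in this commit can be illustrated with a minimal sketch. It assumes hash_token() is a plain SHA-256 hex digest and that _hash_token_if_needed() hashes only sk- prefixed values. Both functions below are simplified stand-ins rather than the LiteLLM implementations, and the dict is a stand-in for user_api_key_cache.

```python
import hashlib


def hash_token(token: str) -> str:
    """Stand-in for hash_token(): unconditional SHA-256 hex digest (assumed)."""
    return hashlib.sha256(token.encode()).hexdigest()


def hash_token_if_needed(token: str) -> str:
    """Stand-in for _hash_token_if_needed(): hash only raw sk- keys (assumed)."""
    return hash_token(token) if token.startswith("sk-") else token


raw_key = "sk-example-1234"                # hypothetical virtual key
token_id = hash_token(raw_key)             # pre-hashed ID the cache is keyed by
cache = {token_id: {"max_budget": None}}   # stale entry that should be evicted

# Buggy path: the caller passes the pre-hashed ID and it is hashed again,
# so the computed cache key matches nothing and the stale entry stays put.
buggy_cache_key = hash_token(token_id)
buggy_removed = cache.pop(buggy_cache_key, None)
stale_entry_survives = token_id in cache

# Fixed path: the pre-hashed ID passes through unchanged and the entry is evicted.
fixed_cache_key = hash_token_if_needed(token_id)
fixed_removed = cache.pop(fixed_cache_key, None)
```

In this sketch a raw sk- key and its pre-hashed ID resolve to the same cache key, which is the property the invalidation path relies on.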

… key (#25549)

The model_max_budget limiter tracks spend in one code path (async_log_success_event) and enforces budget limits in another (is_key_within_model_budget via user_api_key_auth). These two paths used different model name formats to build cache keys:
- Tracking used standard_logging_payload["model"], which is the deployment-level model name (e.g. "vertex_ai/claude-opus-4-6@default")
- Enforcement used request_data["model"], which is the model group alias (e.g. "claude-opus-4-6")

Because the cache keys never matched, the enforcement path always read None for current spend, silently allowing all requests through even after the budget was exceeded. This affected any provider that decorates model names with provider prefixes or version suffixes (Vertex AI, Bedrock, etc.).

Fix: use model_group (the user-facing alias) from StandardLoggingPayload for spend tracking, falling back to model when model_group is None. This aligns the tracking cache key with the enforcement cache key.

Fixes the same root cause reported in #15223 and #10052.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
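A small sketch of the cache-key mismatch, under stated assumptions: the cache key format, the function names, and the in-memory dict are all hypothetical and only model the behaviour described in this commit. The only details taken from the source are the two model name formats and the model_group fallback.

```python
from typing import Optional

spend_cache: dict = {}  # stand-in for the limiter's spend cache


def cache_key(key_id: str, model: str) -> str:
    # Hypothetical key format; the real format is not shown in this PR.
    return f"spend:{key_id}:{model}"


def track_spend(key_id: str, payload: dict, cost: float, use_model_group: bool) -> None:
    """Tracking path: pick the model name that goes into the cache key."""
    model = payload["model"]
    if use_model_group:
        # Fixed behaviour: prefer the alias, fall back to model when it is None.
        model = payload.get("model_group") or payload["model"]
    key = cache_key(key_id, model)
    spend_cache[key] = spend_cache.get(key, 0.0) + cost


def current_spend(key_id: str, request_data: dict) -> Optional[float]:
    """Enforcement path: always keyed by the model group alias from the request."""
    return spend_cache.get(cache_key(key_id, request_data["model"]))


payload = {"model": "vertex_ai/claude-opus-4-6@default", "model_group": "claude-opus-4-6"}
request_data = {"model": "claude-opus-4-6"}

# Before the fix: spend is recorded under the deployment name, so enforcement reads None.
track_spend("key-1", payload, 5.0, use_model_group=False)
spend_before_fix = current_spend("key-1", request_data)

# After the fix: both paths agree on the alias, so enforcement sees the spend.
track_spend("key-1", payload, 5.0, use_model_group=True)
spend_after_fix = current_spend("key-1", request_data)
```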

…r schedule (#25440)

Budget table entries (team members, end-users) used duration_in_seconds() for a sliding-window reset, while keys/users/teams used calendar-aligned get_budget_reset_time(). This made "30d" and "1mo" mean different things depending on entity type.

Now both paths use get_budget_reset_time() for consistent calendar-aligned resets (e.g. "30d" → 1st of next month).

Fixes #25432

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
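To make the difference between the two reset styles concrete, here is a minimal sketch. The two functions are hypothetical illustrations of a sliding window and a calendar-aligned monthly reset; they are not the bodies of duration_in_seconds() or get_budget_reset_time(), whose exact rules are not shown in this PR.

```python
from datetime import datetime, timedelta, timezone


def sliding_window_reset(now: datetime, days: int = 30) -> datetime:
    """Sliding window: the reset lands exactly `days` after the current time."""
    return now + timedelta(days=days)


def calendar_aligned_monthly_reset(now: datetime) -> datetime:
    """Calendar-aligned: the reset lands at midnight on the 1st of the next month."""
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=now.tzinfo)


now = datetime(2026, 4, 12, 2, 55, tzinfo=timezone.utc)
sliding = sliding_window_reset(now)            # 2026-05-12 02:55 UTC
aligned = calendar_aligned_monthly_reset(now)  # 2026-05-01 00:00 UTC
```

With the same starting time, the two styles disagree by eleven days in this example, which is the kind of inconsistency between entity types that the commit removes.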

…x-m2.5 (#25409)

vercel · 2026-04-12T02:56:02Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
litellm	Ready	Preview, Comment	Apr 14, 2026 3:37pm

codspeed-hq · 2026-04-12T02:57:33Z

Merging this PR will not alter performance

✅ 16 untouched benchmarks

Comparing litellm_oss_staging_04_11_2026 (a0e61a9) with main (e64d98f)

greptile-apps · 2026-04-12T03:01:05Z

Greptile Summary

This staging PR bundles several bug fixes: aligning model_max_budget spend-tracking cache keys to use model_group (the user-facing alias), fixing silent cache-invalidation failures caused by double-hashing token IDs in bulk-update and key-rotation endpoints, resetting spend counters for budget-tier-linked keys and team members when a budget period expires, and adding model_max_budget validation to the /budget/new and /budget/update endpoints. New model pricing entries (baseten, wandb) and accompanying unit/integration tests are included.

Confidence Score: 5/5

Safe to merge; all remaining findings are P2 style and test-coverage suggestions that do not block production correctness.
All three issues are P2: an inline import that should be module-level, a missing mock method that silently skips (but does not break) a test, and a missing assertion on a spend > 0 filter. None affect runtime behaviour or data integrity.
tests/test_litellm/proxy/common_utils/test_reset_budget_job.py — add find_many stub to MockLiteLLMTeamMembership so the new team-member cache-reset path is exercised.

Important Files Changed

Filename	Overview
litellm/proxy/common_utils/reset_budget_job.py	Adds `reset_budget_for_keys_linked_to_budgets` (resets spend on keys tied to a budget tier with no own reset schedule) and `reset_budget_for_team_members` (resets team-member spend counters in Redis/in-memory + DB). Also extends `_reset_budget_common` to flush Redis spend counters for keys and teams on budget reset. Logic is sound; imports from `proxy_server` are inline due to the circular-import constraint.
litellm/proxy/hooks/model_max_budget_limiter.py	Uses `model_group` (user-facing alias) instead of the deployment-level `model` field as the spend-tracking cache key, aligning the log path with the enforcement path in `is_key_within_model_budget`. No issues found.
litellm/proxy/management_endpoints/budget_management_endpoints.py	Adds `validate_model_max_budget` calls to `/budget/new` and `/budget/update` to enforce the enterprise license check and schema validation on `model_max_budget`. Minor: the import is placed inline inside the handler functions instead of at the top of the file, violating the project style guide.
tests/test_litellm/proxy/common_utils/test_reset_budget_job.py	Comprehensive new test suite for ResetBudgetJob covering keys, users, teams, end-users, budget-table resets, and the new `reset_budget_for_keys_linked_to_budgets` helper. Minor gap: `MockLiteLLMTeamMembership` is missing `find_many`, silently skipping the new team-member cache-reset path; and `spend: {gt: 0}` filter is not verified.
tests/test_litellm/proxy/management_endpoints/test_key_management_endpoints.py	Adds tests for cache-invalidation correctness in `_process_single_key_update` and `_execute_virtual_key_regeneration`, verifying that pre-hashed tokens are not double-hashed. Tests are isolated and use appropriate mocking.
tests/test_litellm/test_cost_calculator.py	Adds pricing-verification tests for baseten and wandb model entries, a dynamic-routing cost-fallback test, and an image token cost test with and without `input_cost_per_image_token`. All tests are self-contained mock-based unit tests.
tests/test_budget_management.py	Integration test verifying that `/budget/new` with a `budget_duration` sets `budget_reset_at` to the next calendar-aligned reset time. Lives in the `tests/` root (integration tier) and makes real HTTP calls, which is appropriate for this location.
litellm/proxy/management_endpoints/key_management_endpoints.py	Replaces raw cache-key lookup with `_hash_token_if_needed` in bulk-update and key-regeneration paths to prevent double-hashing when the input is already a pre-hashed token ID. No functional regressions found.

Sequence Diagram

sequenceDiagram
    participant BudgetResetJob
    participant DB as PrismaClient / DB
    participant Cache as Redis / InMemoryCache

    BudgetResetJob->>DB: "find budgets where reset_at <= now"
    DB-->>BudgetResetJob: budgets_to_reset[]

    BudgetResetJob->>BudgetResetJob: _reset_budget_reset_at_date() for each budget
    BudgetResetJob->>DB: update_many budgets (new reset_at)

    BudgetResetJob->>DB: find team memberships by budget_id
    DB-->>BudgetResetJob: memberships[]
    loop each membership
        BudgetResetJob->>Cache: "set spend:team_member:{uid}:{tid} = 0"
    end
    BudgetResetJob->>DB: "update_many litellm_teammembership spend=0"

    BudgetResetJob->>DB: "update_many litellm_verificationtoken<br/>(budget_id IN ids, budget_duration IS NULL, spend > 0) spend=0"

    BudgetResetJob->>DB: find end-users by budget_id
    DB-->>BudgetResetJob: endusers[]
    BudgetResetJob->>BudgetResetJob: "_reset_budget_for_enduser() spend=0"
    BudgetResetJob->>DB: "update_many endusers spend=0"
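The flow in the diagram can be approximated with an in-memory sketch. Everything here is an assumption made for illustration: the dataclasses, the run_reset function, and the plain dict used as a cache are hypothetical and do not mirror the Prisma models or the ResetBudgetJob API. Only the order of steps, the cache key pattern, and the key filter (budget_id in the reset set, no budget_duration of its own, spend > 0) come from the diagram.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Budget:
    budget_id: str
    reset_at: datetime


@dataclass
class Membership:
    user_id: str
    team_id: str
    budget_id: str
    spend: float


@dataclass
class Row:
    """Hypothetical stand-in for a key or end-user row."""
    name: str
    budget_id: str
    spend: float
    budget_duration: Optional[str] = None  # only meaningful for keys


def run_reset(now, next_reset, budgets, memberships, keys, end_users, cache):
    # 1. Find budgets that are due and move their reset_at forward.
    due_ids = {b.budget_id for b in budgets if b.reset_at <= now}
    for b in budgets:
        if b.budget_id in due_ids:
            b.reset_at = next_reset
    # 2. Zero team-member spend in the cache, then in the "DB".
    for m in memberships:
        if m.budget_id in due_ids:
            cache[f"spend:team_member:{m.user_id}:{m.team_id}"] = 0
            m.spend = 0.0
    # 3. Zero keys tied to a due budget that have no reset schedule of their own.
    for k in keys:
        if k.budget_id in due_ids and k.budget_duration is None and k.spend > 0:
            k.spend = 0.0
    # 4. Zero end-users tied to a due budget.
    for e in end_users:
        if e.budget_id in due_ids:
            e.spend = 0.0
    return due_ids


now = datetime(2026, 4, 14, tzinfo=timezone.utc)
next_reset = datetime(2026, 5, 1, tzinfo=timezone.utc)
budgets = [Budget("b1", now - timedelta(hours=1)), Budget("b2", now + timedelta(days=1))]
memberships = [Membership("u1", "t1", "b1", 3.0)]
keys = [Row("k1", "b1", 2.0), Row("k2", "b1", 4.0, "30d"), Row("k3", "b2", 1.0)]
end_users = [Row("e1", "b1", 7.0)]
cache = {"spend:team_member:u1:t1": 3.0}

reset_ids = run_reset(now, next_reset, budgets, memberships, keys, end_users, cache)
```

In this sketch the key with its own budget_duration keeps its spend, since it is expected to be reset on its own schedule.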

Reviews (4): Last reviewed commit: "Fix code qa"

greptile-apps · 2026-04-12T03:01:12Z

+    class DictLikeResult:
+        def __init__(self, data):
+            self._data = data
+        def __iter__(self):
+            return iter(self._data.items())
+    mock_prisma_client.db.litellm_verificationtoken.update = AsyncMock(
+        return_value=DictLikeResult({"token": "new-hashed-token", "key_name": "sk-...ab12", "user_id": "user-1"})
+    )
+    mock_prisma_client.db.litellm_verificationtoken.create = AsyncMock(
+        return_value=None
+    )
+    mock_prisma_client.jsonify_object = MagicMock(side_effect=lambda data: data)
+
+    mock_user_api_key_cache = MagicMock()
+    mock_proxy_logging_obj = MagicMock()
+
+    user_api_key_dict = UserAPIKeyAuth(
+        user_role=LitellmUserRoles.PROXY_ADMIN,
+        api_key="sk-admin",


DictLikeResult workaround is fragile

The inner DictLikeResult class works today because _execute_virtual_key_regeneration calls dict(updated_token), and __iter__ yields (key, value) pairs. However, if the function is later updated to call a Prisma model method (e.g., updated_token.model_dump()) or to use attribute access (e.g., updated_token.token), the test would raise an AttributeError without a clear failure message.

A more resilient alternative is using a MagicMock with __iter__ set, or using a real LiteLLM_VerificationToken instance as the mock return value. This keeps the test aligned with the actual Prisma return type shape.
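One way the MagicMock variant could look is sketched below. It is an illustration, not the change made in this PR. Because dict() prefers a keys() method whenever the object has one, and MagicMock creates attributes on demand, this sketch configures keys and __getitem__ rather than relying on __iter__ alone. The field values are copied from the test above; configuring model_dump is an assumption about what a future caller might use.

```python
from unittest.mock import MagicMock

row = {"token": "new-hashed-token", "key_name": "sk-...ab12", "user_id": "user-1"}

updated_token = MagicMock()
# dict(obj) takes the mapping path when obj has keys(), so wire up both
# keys() and item lookup to the underlying row.
updated_token.keys.return_value = row.keys()
updated_token.__getitem__.side_effect = row.__getitem__
# Attribute access and model-style methods keep working if the code under
# test starts using them later.
updated_token.token = row["token"]
updated_token.model_dump.return_value = dict(row)

as_dict = dict(updated_token)
```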

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

gitguardian · 2026-04-14T15:25:00Z

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request

GitGuardian id	GitGuardian status	Secret	Commit	Filename
29203053	Triggered	Generic Password	`ee40da5`	.circleci/config.yml

🛠 Guidelines to remediate hardcoded secrets

Understand the implications of revoking this secret by investigating where it is used in your code.
Replace and store your secret safely. Learn here the best practices.
Revoke and rotate this secret.
If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider

following these best practices for managing and storing secrets including API keys and other credentials
install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.

🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

dkindlund and others added 5 commits April 11, 2026 19:36

fix(embedding): omit null encoding_format for openai requests (#25395)

e3d160f

feat(model):add wandb model offerings to include kimi-k2.5 and minima…

ee06b92

…x-m2.5 (#25409)

krrish-berri-2 requested a review from Sameerlite April 12, 2026 02:55

greptile-apps Bot reviewed Apr 12, 2026

Sameerlite added 2 commits April 14, 2026 20:36

Revert "fix(embedding): omit null encoding_format for openai requests (…

e6771fe

…#25395)" This reverts commit e3d160f.

Merge pull request #25698 from BerriAI/revert-25395-fix/25388-embeddi…

3d567c3

…ng-encoding-format Revert "fix(embedding): omit null encoding_format for openai requests"

Sameerlite had a problem deploying to integration-postgres April 14, 2026 15:07 — with GitHub Actions Error

Sameerlite temporarily deployed to integration-postgres April 14, 2026 15:07 — with GitHub Actions Inactive

vercel Bot deployed to Preview April 14, 2026 15:09 View deployment

Sameerlite added 2 commits April 14, 2026 20:46

Fix bulk update tests

f6e526c

Fix budget reset test

ef94f5f

Sameerlite had a problem deploying to integration-postgres April 14, 2026 15:20 — with GitHub Actions Error

Sameerlite had a problem deploying to integration-postgres April 14, 2026 15:21 — with GitHub Actions Error

vercel Bot deployed to Preview April 14, 2026 15:22 View deployment

Merge branch 'main' into litellm_oss_staging_04_11_2026

ee40da5

Sameerlite temporarily deployed to integration-postgres April 14, 2026 15:25 — with GitHub Actions Inactive

Sameerlite had a problem deploying to integration-postgres April 14, 2026 15:25 — with GitHub Actions Error

Sameerlite temporarily deployed to integration-postgres April 14, 2026 15:25 — with GitHub Actions Inactive

Sameerlite had a problem deploying to integration-postgres April 14, 2026 15:25 — with GitHub Actions Error

vercel Bot deployed to Preview April 14, 2026 15:26 View deployment

Fix code qa

a0e61a9

Sameerlite temporarily deployed to integration-postgres April 14, 2026 15:31 — with GitHub Actions Inactive

Sameerlite temporarily deployed to integration-postgres April 14, 2026 15:32 — with GitHub Actions Inactive

Sameerlite had a problem deploying to integration-postgres April 14, 2026 15:32 — with GitHub Actions Error

vercel Bot deployed to Preview April 14, 2026 15:37 View deployment

Sameerlite approved these changes Apr 14, 2026

Sameerlite merged commit b8f7d61 into main Apr 14, 2026
100 of 108 checks passed

Sameerlite deleted the litellm_oss_staging_04_11_2026 branch April 14, 2026 18:04


Sameerlite merged 11 commits into main from litellm_oss_staging_04_11_2026

6 participants
