fix(proxy): fix virtual key projected-spend soft budget alerts#25838

Merged
ryan-crabbe-berri merged 1 commit into litellm_internal_staging from
litellm_fix-virtual-key-projected-spend-alert
Apr 16, 2026

Conversation

ryan-crabbe-berri (Collaborator) commented Apr 16, 2026

Summary

  • The projected-spend alert in _update_key_cache (proxy_server.py) read from existing_spend_obj.litellm_budget_table["soft_budget"] — a nested dict that is never populated for virtual keys. The combined_view SQL maps budget fields to flat top-level attributes (soft_budget, max_budget, etc.) but never constructs the dict. This made the projected-spend check dead code that silently short-circuited on every request.
  • When unblocked by reading from the correct flat field, the code crashed with a Pydantic ValidationError because _get_projected_spend_over_limit returns a date object but CallInfo.projected_exceeded_date expects str.
  • Also marks team soft budget email alerts as an enterprise feature in docs.
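The first bullet can be illustrated with a small stand-in object. This is only a sketch: `KeyView` and both helper functions are hypothetical names, not the real `LiteLLM_VerificationTokenView` model or the actual guard in `_update_key_cache`. It assumes the nested `litellm_budget_table` is left as `None` for virtual keys while the flat `soft_budget` attribute is filled in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeyView:
    """Hypothetical stand-in for the cached virtual-key row (not the real model)."""
    spend: float = 0.0
    soft_budget: Optional[float] = None          # flat field, filled by the SQL view
    litellm_budget_table: Optional[dict] = None  # nested dict, left unset for keys


def soft_budget_from_nested(view: KeyView) -> Optional[float]:
    # Old-style read: yields None whenever the nested dict is missing,
    # so any check built on it never runs.
    if view.litellm_budget_table is None:
        return None
    return view.litellm_budget_table.get("soft_budget")


def soft_budget_from_flat(view: KeyView) -> Optional[float]:
    # New-style read: uses the attribute that is actually populated.
    return view.soft_budget


view = KeyView(spend=0.00005, soft_budget=0.0001)
print(soft_budget_from_nested(view))
print(soft_budget_from_flat(view))
```

With the nested read the budget looks unset even though the key has one, which is consistent with the check silently short-circuiting on every request.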

Fix

  • Read from existing_spend_obj.soft_budget (the flat field that IS populated via the combined_view SQL mapping) instead of existing_spend_obj.litellm_budget_table["soft_budget"]
  • Stringify projected_exceeded_date before passing to CallInfo
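The stringify step can be sketched without Pydantic. `AlertInfo` below is a hypothetical stand-in for `CallInfo`, with a manual type check that roughly imitates what a validating model does for a `str`-typed field. The date value is illustrative and not taken from the PR.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AlertInfo:
    """Hypothetical stand-in for CallInfo; only the one field matters here."""
    projected_exceeded_date: Optional[str] = None

    def __post_init__(self) -> None:
        # Imitate a strict string check on the field.
        value = self.projected_exceeded_date
        if value is not None and not isinstance(value, str):
            raise TypeError(f"expected str, got {type(value).__name__}")


raw = date(2026, 4, 30)  # illustrative value only

# Passing the raw date object is rejected by the stand-in check.
try:
    AlertInfo(projected_exceeded_date=raw)
    rejected = False
except TypeError:
    rejected = True

# str() on a date yields its ISO form (YYYY-MM-DD), which the field accepts.
info = AlertInfo(projected_exceeded_date=str(raw))
print(rejected, info.projected_exceeded_date)
```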

Screenshots

before
Screenshot 2026-04-15 at 9 43 30 PM

after
Screenshot 2026-04-15 at 9 43 49 PM
Screenshot 2026-04-15 at 9 45 12 PM

Test plan

  • Start proxy with mock model, create key with soft_budget=0.0001 assigned to a user with email
  • Make 15 requests (~$0.0000135/req) to cross the soft budget threshold
  • Verify spend increments correctly in cache (was broken before — update_cache crashed and spend stopped accumulating)
  • Verify email alert delivered to mailpit (local SMTP) with correct subject and recipient
  • Verify no Pydantic ValidationError in logs
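As a quick sanity check on the numbers in the test plan, the sketch below computes the accumulated spend after 15 requests and the first request at which accumulated spend passes the soft budget. It uses the approximate per-request cost quoted above and looks only at actual spend; the alert itself is driven by projected spend, so it may fire at a different point.

```python
from decimal import Decimal

soft_budget = Decimal("0.0001")
cost_per_request = Decimal("0.0000135")  # approximate figure from the test plan

# Total spend after the 15 requests in the test plan.
total_after_15 = cost_per_request * 15

# Count requests until accumulated spend first exceeds the soft budget.
requests_to_cross = 0
spend = Decimal("0")
while spend <= soft_budget:
    spend += cost_per_request
    requests_to_cross += 1

print(total_after_15, requests_to_cross)
```

Fifteen requests leave roughly twice the soft budget in accumulated spend, so the threshold is comfortably crossed within the test run.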

Closes #20324

…d alerts

The projected-spend alert in _update_key_cache read from
existing_spend_obj.litellm_budget_table["soft_budget"], but the nested
dict is never populated for virtual keys (the combined_view SQL maps
budget fields to flat top-level attributes instead). This made the
check dead code — it silently short-circuited on every request, and
when unblocked, crashed update_cache with a Pydantic ValidationError
because _get_projected_spend_over_limit returns a date object but
CallInfo.projected_exceeded_date expects str.

Fixes: read from the flat existing_spend_obj.soft_budget field that IS
populated, and stringify projected_exceeded_date.

Also marks team soft budget email alerts as enterprise in docs.

Closes #20324

greptile-apps Bot (Contributor) commented Apr 16, 2026

Greptile Summary

This PR fixes two bugs in the virtual key projected-spend soft budget alerting path in _update_key_cache. The original code read from existing_spend_obj.litellm_budget_table["soft_budget"] — a nested dict that is never populated for virtual keys via the combined_view SQL — making the entire check dead code. It also passed a date object to CallInfo.projected_exceeded_date (typed Optional[str]), causing a Pydantic ValidationError if the path ever executed. The fix correctly reads from the flat soft_budget field and stringifies the date. The docs change adds an enterprise callout to the team soft budget alerts page.

Confidence Score: 5/5

Safe to merge — both changes are targeted bug fixes with no backward-incompatible impact.

Both fixes are straightforward and correct: reading from the populated flat field instead of the empty nested dict, and converting a date to string for a string-typed field. The only findings are P2: a pre-existing unimplemented cooldown placeholder (now more observable since the alert path is unblocked) and missing unit tests. Neither blocks merge.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/proxy/proxy_server.py | Fixes the virtual key projected-spend soft budget check: reads soft_budget from the flat LiteLLM_VerificationTokenView field (which is populated) instead of the never-populated litellm_budget_table dict, and stringifies the date return value from _get_projected_spend_over_limit to satisfy CallInfo.projected_exceeded_date: Optional[str]. |
| docs/my-website/docs/proxy/ui_team_soft_budget_alerts.md | Adds an enterprise callout at the top of the team soft budget alerts doc page, marking the feature as requiring an enterprise license. |

Sequence Diagram

sequenceDiagram
    participant Req as Incoming Request
    participant UC as update_cache()
    participant KC as _update_key_cache()
    participant Cache as user_api_key_cache
    participant PS as _is_projected_spend_over_limit()
    participant GP as _get_projected_spend_over_limit()
    participant Alert as proxy_logging_obj.budget_alerts()

    Req->>UC: response_cost
    UC->>KC: token, response_cost
    KC->>Cache: async_get_cache(hashed_token)
    Cache-->>KC: LiteLLM_VerificationTokenView
    KC->>KC: new_spend = existing_spend + response_cost
    KC->>PS: current_spend=new_spend, soft_budget_limit=obj.soft_budget
    Note over PS: BEFORE: read from litellm_budget_table dict (never populated) AFTER: read from flat soft_budget field
    PS-->>KC: True / False
    alt projected spend over limit
        KC->>GP: current_spend=new_spend, soft_budget_limit=obj.soft_budget
        GP-->>KC: (projected_spend, date_object)
        KC->>KC: projected_exceeded_date = str(date_object)
        Note over KC: BEFORE: passed raw date → Pydantic ValidationError AFTER: str() converts to ISO string
        KC->>Alert: CallInfo(projected_exceeded_date=str, ...)
        Alert-->>Req: email alert dispatched
    end
    KC->>Cache: update spend in cache

Comments Outside Diff (1)

  1. litellm/proxy/proxy_server.py, line 1904 (link)

    P2 Cooldown is never actually set

    The # set cooldown on alert comment at line 1904 is a placeholder with no implementation — soft_budget_cooldown is never flipped to True anywhere in the codebase. This means every request after the threshold is crossed will re-trigger the projected-spend alert rather than rate-limiting notifications. Now that the alert path is unblocked by this fix, this pre-existing gap becomes observable in production. Consider setting existing_spend_obj.soft_budget_cooldown = True (and updating the cache) after firing the alert to suppress repeated notifications within the same cache TTL window.
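The suggestion above could look roughly like the following. This is a sketch under stated assumptions, not the project's implementation: `KeyState` and `record_spend` are hypothetical, the over-limit test is simplified to actual spend rather than projected spend, and how or when the flag is cleared again is left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class KeyState:
    """Hypothetical stand-in for the cached key object."""
    spend: float = 0.0
    soft_budget: float = 0.0
    soft_budget_cooldown: bool = False


def record_spend(state: KeyState, cost: float, send_alert: Callable[[float], None]) -> None:
    state.spend += cost
    # Simplified check; the real code compares projected spend to the budget.
    over = state.spend > state.soft_budget
    if over and not state.soft_budget_cooldown:
        send_alert(state.spend)
        # Flip the flag so later requests do not re-send the alert.
        state.soft_budget_cooldown = True


sent: List[float] = []
state = KeyState(soft_budget=0.0001)
for _ in range(15):
    record_spend(state, 0.0000135, sent.append)

print(len(sent))
```

Without the flag flip, every request after the threshold would append another alert; with it, only the first crossing does.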

Reviews (1): Last reviewed commit: "fix(proxy): use flat soft_budget field f..."

Comment on lines 1882 to 1885
  projected_spend, projected_exceeded_date = _get_projected_spend_over_limit(
      current_spend=new_spend,
-     soft_budget_limit=existing_spend_obj.litellm_budget_table.get(
-         "soft_budget", None
-     ),
+     soft_budget_limit=existing_spend_obj.soft_budget,
  )  # type: ignore

P2 No unit test for the corrected code path

The CLAUDE.md template asks for at least one test in tests/litellm/. The existing tests in test_proxy_utils.py cover _get_projected_spend_over_limit in isolation but not the _update_key_cache path that reads soft_budget from the flat field. A minimal test with a mocked LiteLLM_VerificationTokenView (with soft_budget set and litellm_budget_table=None) would guard against the regression re-appearing.

yuneng-berri self-requested a review April 16, 2026 04:56
ryan-crabbe-berri merged commit 2dd060b into litellm_internal_staging Apr 16, 2026
97 of 100 checks passed
ryan-crabbe-berri deleted the litellm_fix-virtual-key-projected-spend-alert branch April 16, 2026 05:22

Development

Successfully merging this pull request may close these issues.

[Bug]: Email budget alerts not working for virtual keys with soft_budget (v1.80.11-v1.80.15)

2 participants