
feat: configurable multi-threshold budget alerts for virtual keys #25989

Merged
ryan-crabbe-berri merged 13 commits into litellm_internal_staging from litellm_feat-multi_threshold_budget_alerts
Apr 18, 2026

Conversation

@ryan-crabbe-berri
Collaborator

@ryan-crabbe-berri ryan-crabbe-berri commented Apr 18, 2026

Summary

Adds configurable multi-threshold budget alerts for virtual keys. Users can define multiple spend thresholds (e.g. 50%, 75%, 100%) each with their own list of email recipients, replacing the single hardcoded 80% alert.

Per-key configuration (via /key/generate or /key/update metadata):

{
  "metadata": {
    "max_budget_alert_emails": {
      "50": ["finance@co.com"],
      "75": ["finance@co.com", "bu_lead@co.com"],
      "100": ["finance@co.com", "bu_lead@co.com", "cto@co.com"]
    }
  }
}

Global fallback for all keys (via config yaml):

litellm_settings:
  default_key_max_budget_alert_emails:
    "50": ["finance@co.com"]
    "75": ["finance@co.com", "bu_lead@co.com"]
  • Per-key metadata takes priority over the global setting
  • Key owner's email is auto-included and deduplicated at every threshold
  • When no map is configured, existing single 80% threshold behavior is preserved unchanged
  • Each threshold has its own dedup cache key (24hr TTL) to prevent duplicate sends
  • alerting: ["email"] must be enabled in general_settings
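The rules above can be sketched as a small pure function. This is an illustrative sketch, not the code merged in the PR: the name `crossed_thresholds` and its signature are hypothetical, and it assumes a threshold counts as crossed once spend is at or above that percentage of `max_budget`.

```python
from typing import Dict, List, Optional


def crossed_thresholds(
    spend: float,
    max_budget: float,
    alert_emails: Dict[str, List[str]],
    owner_email: Optional[str] = None,
) -> Dict[float, List[str]]:
    """Map each crossed threshold percentage to its deduplicated recipients."""
    result: Dict[float, List[str]] = {}
    for raw_pct, emails in alert_emails.items():
        try:
            pct = float(raw_pct)
        except (TypeError, ValueError):
            continue  # malformed (non-numeric) threshold keys are skipped
        if spend < max_budget * pct / 100.0:
            continue  # threshold not crossed yet
        recipients = list(emails)
        if owner_email:
            recipients.append(owner_email)  # key owner is auto-included
        # dict.fromkeys removes duplicates while keeping first-seen order
        result[pct] = list(dict.fromkeys(recipients))
    return result
```

With spend at 80 of a 100 budget, only the 50% and 75% entries are returned, and an owner address that already appears in a list is not repeated.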

Changes

  • litellm/proxy/_types.py — Added max_budget_alert_emails field to CallInfo
  • litellm/proxy/auth/auth_checks.py — New path reads threshold map from key metadata (with global fallback), passes it through to email handler
  • litellm/__init__.py — Added default_key_max_budget_alert_emails module variable
  • base_email.py — Added _parse_email_list helper, _handle_multi_threshold_max_budget_alert method, extended send_max_budget_alert_email with optional threshold_pct/recipient_emails params

Test plan

  • Multi-threshold sends emails for all crossed thresholds
  • Dedup cache prevents re-sending already-sent thresholds
  • Owner email auto-included and deduplicated
  • Malformed threshold keys (non-numeric) are skipped
  • Empty email list for a threshold sends only to owner
  • Old 80% single-threshold path preserved when no map is set
  • Global fallback config used when key has no per-key map
  • Per-key metadata overrides global fallback
  • Auth layer attaches map to CallInfo correctly
  • Old path does not fire below 80% threshold

Users can set metadata.max_budget_alert_emails as a JSON map of threshold
percentages to email recipients on virtual keys. When configured, the email
handler loops over each threshold, checks per-threshold dedup cache, and
sends to the configured recipients (auto-including the key owner's email).

When no map is set, the existing single 80% threshold behavior is preserved
unchanged. Teams support is out of scope for this v0.

_virtual_key_max_budget_check raises BudgetExceededError when spend
crosses max_budget, which meant the 100% threshold in the multi-threshold
email config never got a chance to fire — the request that pushes spend
over 100% was rejected before the alert check ran. Reorder so the alert
check runs first; enforcement still raises right after.
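The ordering fix in this commit message can be illustrated with a minimal sketch. The names `BudgetExceeded` and `check_key_budget` are hypothetical stand-ins, and the `spend >= max_budget` condition is an assumption about when enforcement triggers.

```python
from typing import Callable


class BudgetExceeded(Exception):
    """Local stand-in for the proxy's budget-exceeded error."""


def check_key_budget(
    spend: float,
    max_budget: float,
    alert_check: Callable[[float, float], None],
) -> None:
    # Run the alert check first so a 100% threshold can still fire
    # on the request that pushes spend over the budget.
    alert_check(spend, max_budget)
    # Enforcement still raises right after the alert check.
    if spend >= max_budget:
        raise BudgetExceeded(f"spend {spend} reached max_budget {max_budget}")
```

If the two statements were swapped, the exception would be raised before `alert_check` ever saw the crossing request.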
@greptile-apps
Contributor

greptile-apps Bot commented Apr 18, 2026

Greptile Summary

This PR adds configurable multi-threshold budget alerts for virtual keys, replacing the single hardcoded 80% email alert with a per-threshold recipient map that can be set per-key or via a global fallback. All P0/P1 issues flagged in the previous review round (non-list value crashes in the merge, empty-recipient IndexError, test assertion mismatch, task-per-request overhead) have been addressed. Remaining findings are minor style/design P2s.

Confidence Score: 5/5

Safe to merge; all prior P0/P1 blockers resolved, remaining findings are P2 style and design suggestions.

All critical issues from previous review rounds have been fixed: non-list value coercion, empty-recipient guard, test assertions, and task pre-filtering. The only open items are a set() vs dict.fromkeys ordering inconsistency, an unused test import, and a documentation/description mismatch around additive-vs-override merge semantics — none of which affect runtime correctness.

litellm/proxy/auth/auth_checks.py — the additive merge semantics vs the PR description's 'per-key takes priority' claim should be clarified before this becomes user-facing documentation.

Important Files Changed

| Filename | Overview |
| --- | --- |
| enterprise/litellm_enterprise/enterprise_callbacks/send_emails/base_email.py | Adds _parse_email_list helper, _handle_multi_threshold_max_budget_alert method, and extends send_max_budget_alert_email with optional threshold_pct/recipient_emails params; old 80% path fully preserved; empty-list guard in place; minor: set() dedup loses order. |
| litellm/proxy/auth/auth_checks.py | Adds _parse_email_list, _normalize_alert_emails, _merge_budget_alert_email_configs, and _virtual_key_max_budget_alert_check; non-list values are now safely coerced; min-pct pre-filter avoids unnecessary tasks; merge semantics are additive (contradicting PR description 'per-key takes priority'). |
| litellm/proxy/_types.py | Adds max_budget_alert_emails: Optional[Dict[str, List[str]]] field to CallInfo; type accurately reflects the normalized form produced by _normalize_alert_emails. |
| litellm/__init__.py | Adds default_key_max_budget_alert_emails: Optional[Dict[str, list]] = None module-level variable for global fallback config. |
| litellm/proxy/auth/user_api_key_auth.py | Reorders key budget alert check to run before max-budget enforcement, enabling 100% threshold alerts to fire on the crossing request; import added for _virtual_key_max_budget_alert_check. |
| tests/test_litellm/enterprise/enterprise_callbacks/send_emails/test_base_email.py | Comprehensive new tests for multi-threshold sends, dedup cache, owner auto-include, malformed key skipping, empty list, old path preservation; unused TestClient import. |
| tests/test_litellm/proxy/auth/test_auth_checks.py | New tests for multi-threshold map attachment, old path (below/above 80%), global fallback, and per-key additive merge; assertions now correctly match additive merge semantics. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Request arrives] --> B[_virtual_key_max_budget_alert_check]
    B --> C{max_budget_alert_emails configured?}
    C -- No --> D[Old 80% single-threshold path]
    D --> E{spend >= 80% of max_budget AND spend < max_budget?}
    E -- No --> F[Return — no alert]
    E -- Yes --> G[asyncio.create_task budget_alerts]
    G --> H[send_max_budget_alert_email old single-recipient path]
    C -- Yes --> I[_merge_budget_alert_email_configs global + per-key additive merge]
    I --> J{spend >= min configured threshold?}
    J -- No --> F
    J -- Yes --> K[asyncio.create_task budget_alerts]
    K --> L[_handle_multi_threshold_max_budget_alert]
    L --> M{For each threshold: spend >= threshold amount?}
    M -- No --> N[Skip threshold]
    M -- Yes --> O{Cache hit?}
    O -- Yes --> N
    O -- No --> P[Build recipient list = configured emails + owner]
    P --> Q{Any recipients?}
    Q -- No --> R[Log warning and skip]
    Q -- Yes --> S[send_max_budget_alert_email multi-recipient path]
    S --> T[Set cache SENT TTL 24h]

Reviews (10): Last reviewed commit: "Merge remote-tracking branch 'origin/lit..."
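The multi-threshold branch of the flowchart can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the merged handler: `send_threshold_alerts`, the plain-dict cache of expiry timestamps, and the cache key format are all hypothetical, and the recipient lists are assumed to already include the key owner.

```python
import logging
import time
from typing import Callable, Dict, List, Optional

logger = logging.getLogger(__name__)

ALERT_TTL_SECONDS = 24 * 60 * 60  # 24hr TTL, as stated in the PR description


def send_threshold_alerts(
    key_id: str,
    spend: float,
    max_budget: float,
    recipients_by_pct: Dict[float, List[str]],
    cache: Dict[str, float],
    send: Callable[[float, List[str]], None],
    now: Optional[float] = None,
) -> List[float]:
    """Send one alert per crossed threshold; return the thresholds sent."""
    now = time.time() if now is None else now
    sent: List[float] = []
    for pct in sorted(recipients_by_pct):
        if spend < max_budget * pct / 100.0:
            continue  # threshold not crossed: skip
        cache_key = f"max_budget_alert:{key_id}:{pct}"  # hypothetical key format
        if cache.get(cache_key, 0.0) > now:
            continue  # cache hit: already sent within the TTL
        recipients = recipients_by_pct[pct]
        if not recipients:
            logger.warning("no recipients for %s%% threshold; skipping", pct)
            continue
        send(pct, recipients)
        cache[cache_key] = now + ALERT_TTL_SECONDS  # mark as sent until expiry
        sent.append(pct)
    return sent
```

Because each threshold has its own cache key, an alert already sent for 50% does not suppress a later alert for 75%.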

Comment thread enterprise/litellm_enterprise/enterprise_callbacks/send_emails/base_email.py Outdated
Comment thread litellm/proxy/_types.py
Comment thread litellm/proxy/auth/auth_checks.py
…reeting

- Add `default_key_max_budget_alert_emails` litellm_settings config as
  global fallback for all virtual keys (per-key metadata takes priority)
- Fix crash when key has no user_id/user_email by passing recipient email
  to _get_email_params (same pattern as team soft budget path)
- Use owner email for greeting, falling back to key_alias or token
- Rename setting from default_max_budget_alert_emails to
  default_key_max_budget_alert_emails for clarity
…n, task pre-filter

- Guard empty recipients in _handle_multi_threshold_max_budget_alert:
  log warning and skip instead of falling through to old path error loop
- Widen max_budget_alert_emails type to Dict[str, Union[str, List[str]]]
  to match _parse_email_list runtime behavior (accepts comma-separated strings)
- Pre-filter asyncio.create_task with min threshold check to avoid
  unnecessary task allocation on every request when spend is below
  all configured thresholds
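The pre-filter in the last bullet amounts to comparing spend against the smallest configured threshold before scheduling any background work. A minimal sketch, assuming a hypothetical helper name `should_schedule_alert` and string percentage keys as in the PR description:

```python
from typing import Dict, List


def should_schedule_alert(
    spend: float,
    max_budget: float,
    alert_emails: Dict[str, List[str]],
) -> bool:
    """Return True only if spend has reached the lowest configured threshold."""
    pcts = []
    for raw_pct in alert_emails:
        try:
            pcts.append(float(raw_pct))
        except (TypeError, ValueError):
            continue  # ignore malformed keys
    if not pcts or max_budget <= 0:
        return False
    # Below the minimum threshold no alert can fire, so no task is needed.
    return spend >= max_budget * min(pcts) / 100.0
```

A caller would wrap its `asyncio.create_task(...)` call in `if should_schedule_alert(...)`, so requests below every threshold allocate nothing.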

@veria-ai veria-ai Bot left a comment

Medium: Authenticated users can send emails to arbitrary recipients via key metadata

This PR introduces configurable multi-threshold budget alerts where email recipients are read from key metadata (max_budget_alert_emails). Since key metadata is user-controlled (any team admin can set it via /key/generate or /key/update), an authenticated user can trigger alert emails to arbitrary external addresses through the organization's email infrastructure (Resend, SendGrid, or SMTP).

  • medium: arbitrary email relay via user-controlled metadata — litellm/proxy/auth/auth_checks.py
  • low: HTML injection in email body via key_alias — enterprise/litellm_enterprise/enterprise_callbacks/send_emails/base_email.py

Comment thread litellm/proxy/auth/auth_checks.py Outdated
alert_threshold,
owner_email = user_obj.user_email if user_obj else None
alert_email_config = (valid_token.metadata or {}).get(
"max_budget_alert_emails"

Medium: Arbitrary email relay via user-controlled metadata

max_budget_alert_emails is read from valid_token.metadata, which is settable by any user with key-create or key-update permissions (team admins, key owners). An attacker can set this to arbitrary external email addresses, then deliberately trigger a budget threshold crossing to have the proxy send emails on their behalf through the org's email provider (Resend/SendGrid/SMTP). This turns the proxy into an email relay.

Consider either:

  1. Restricting max_budget_alert_emails to admin-only (don't read it from user-settable metadata, only from operator config or a dedicated admin field), or
  2. Validating recipient addresses against a team/org membership list before sending.
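Option 2 could look something like the sketch below. It is not part of the PR: `restrict_recipients` is a hypothetical helper, and how the set of allowed addresses (for example team members plus the key owner) is obtained is left as an assumption.

```python
from typing import Dict, Iterable, List


def restrict_recipients(
    alert_emails: Dict[str, List[str]],
    allowed: Iterable[str],
) -> Dict[str, List[str]]:
    """Drop any configured recipient that is not in the allowed set."""
    allowed_set = {addr.strip().lower() for addr in allowed}
    return {
        pct: [e for e in emails if e.strip().lower() in allowed_set]
        for pct, emails in alert_emails.items()
    }
```

Thresholds are kept even when every recipient is filtered out, so the owner-only fallback described in the test plan would still apply.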

Comment thread enterprise/litellm_enterprise/enterprise_callbacks/send_emails/base_email.py Outdated
html.escape() the greeting (user_email/key_alias/token fallback) before
inserting into HTML email body to prevent HTML injection via key_alias.
- Add early return guard in _handle_multi_threshold_max_budget_alert
  for None max_budget_alert_emails and max_budget
- Add explicit type annotation on alert_email_config in auth_checks
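The greeting escape mentioned in this commit can be sketched with the standard library's `html.escape`. The function name `build_greeting` and the surrounding markup are hypothetical; only the fallback order (owner email, then key_alias, then token) comes from the commit messages.

```python
import html
from typing import Optional


def build_greeting(
    user_email: Optional[str],
    key_alias: Optional[str],
    token: str,
) -> str:
    """Build an HTML greeting, escaping user-controlled values."""
    name = user_email or key_alias or token
    # html.escape neutralises <, >, & and quotes so a crafted key_alias
    # cannot inject markup into the email body.
    return f"<p>Hi {html.escape(name)},</p>"
```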
Comment thread tests/test_litellm/proxy/auth/test_auth_checks.py Outdated
test_virtual_key_max_budget_alert_check_per_key_overrides_global asserted
override semantics but the implementation does additive merge. Renamed test
and updated assertion to match: per-key and global thresholds are unioned,
not replaced.
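The additive semantics described here can be illustrated with a short sketch. `merge_alert_configs` is a hypothetical name, and unioning the recipients when both configs define the same threshold is an assumption; the commit message only states that thresholds are unioned.

```python
from typing import Dict, List, Optional


def merge_alert_configs(
    global_cfg: Optional[Dict[str, List[str]]],
    per_key_cfg: Optional[Dict[str, List[str]]],
) -> Dict[str, List[str]]:
    """Union thresholds (and, by assumption, recipients) from both configs."""
    merged: Dict[str, List[str]] = {}
    for cfg in (global_cfg or {}, per_key_cfg or {}):
        for pct, emails in cfg.items():
            bucket = merged.setdefault(str(pct), [])
            for email in emails:
                if email not in bucket:  # keep order, drop duplicates
                    bucket.append(email)
    return merged
```

Under override semantics the per-key map would instead replace the global one outright, which is what the original test name implied.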
The lazy import from litellm_enterprise inside _normalize_alert_emails
coupled the core proxy auth path to an optional package. Core should not
depend on enterprise, even lazily — it hides the dependency from static
analysis and inverts the intended layering.

Duplicate the 7-line parser locally. It's pure and unlikely to drift; the
enterprise copy stays where it is for its own callers.
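A parser of roughly this shape is small enough to duplicate. The sketch below is an approximation, not the merged code: it assumes the helper accepts either a list or a comma-separated string (as an earlier commit message states) and coerces anything else to an empty list.

```python
from typing import Any, List


def parse_email_list(value: Any) -> List[str]:
    """Normalize a list or comma-separated string into a list of emails."""
    if isinstance(value, str):
        parts = value.split(",")
    elif isinstance(value, (list, tuple)):
        parts = [p for p in value if isinstance(p, str)]
    else:
        return []  # non-list, non-string values are coerced to empty
    return [p.strip() for p in parts if p.strip()]
```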

@veria-ai veria-ai Bot left a comment

Medium: Arbitrary email relay via user-controlled key metadata

This PR adds multi-threshold budget alert emails for virtual keys. The recipient list is read from valid_token.metadata.max_budget_alert_emails, which is settable by any user with key-create or key-update permissions. There is no validation that the email addresses belong to the key owner, their team, or any known entity — an attacker can point alerts at arbitrary external addresses, turning the proxy's email infrastructure into a relay for spam or phishing.

_merge_budget_alert_email_configs(
global_cfg=litellm.default_key_max_budget_alert_emails,
per_key_cfg=(valid_token.metadata or {}).get(
"max_budget_alert_emails"

Medium: Arbitrary email relay via user-controlled metadata

max_budget_alert_emails is pulled from valid_token.metadata, which any user with key-create or key-update permissions can set to arbitrary external addresses. When the key's spend crosses a threshold, emails are sent to those addresses via send_email(to_email=recipient_emails) with no validation.

An attacker can set metadata.max_budget_alert_emails to {"1": ["victim@external.com"]}, spend $0.01 on a $1 budget, and the proxy sends an HTML email to victim@external.com from the proxy's email domain. This is useful for phishing (the email comes from the proxy's configured sender address) or for abusing the proxy as a spam relay.

Consider validating that recipient addresses belong to known users in the system (e.g., team members or the key owner), or restrict this metadata field to admin-only configuration.


@veria-ai veria-ai Bot left a comment

Medium: Arbitrary email relay via user-controlled key metadata

This PR adds configurable multi-threshold budget alerts, allowing email recipients to be specified per key via metadata.max_budget_alert_emails. The email addresses sourced from key metadata are passed directly to the email sending subsystem without any validation or restriction, allowing any authenticated user with key-create or key-update permissions to use the proxy as an email relay to arbitrary external addresses.

_merge_budget_alert_email_configs(
global_cfg=litellm.default_key_max_budget_alert_emails,
per_key_cfg=(valid_token.metadata or {}).get(
"max_budget_alert_emails"

Medium: Arbitrary email relay via user-controlled metadata

max_budget_alert_emails is read directly from valid_token.metadata, which any user with key-create or key-update permissions can set. An attacker can populate this field with arbitrary external email addresses (e.g., spam targets), then deliberately spend up to the threshold to trigger the proxy to send emails on their behalf. There is no validation that the addresses belong to the organization or are otherwise authorized.

Consider either restricting this metadata field to admin-only writes (filtering it out during key creation/update for non-admin users), or validating recipient addresses against an allowlist (e.g., same domain as the organization, or addresses already registered in the system).

send_max_budget_alert_email previously guarded with `is not None`, which
accepts `[]` and then crashes on `recipient_emails[0]` inside
_get_email_params. The current caller (_handle_multi_threshold_max_budget_alert)
already filters empty lists upstream, but the public method signature makes
no such guarantee — a future caller passing [] would hit IndexError.

Switch to truthiness so both None and [] fall through to the single-recipient
path.
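The difference between the two guards is easy to reproduce in isolation. Both functions below are hypothetical reductions of the pattern described, not the actual method bodies.

```python
from typing import List, Optional


def first_recipient_unsafe(recipient_emails: Optional[List[str]]) -> Optional[str]:
    # `is not None` lets an empty list through, so indexing can fail.
    if recipient_emails is not None:
        return recipient_emails[0]
    return None


def first_recipient_safe(recipient_emails: Optional[List[str]]) -> Optional[str]:
    # Truthiness treats both None and [] as "no explicit recipients".
    if recipient_emails:
        return recipient_emails[0]
    return None
```

Calling `first_recipient_unsafe([])` raises `IndexError`, while `first_recipient_safe([])` returns `None` and lets the caller fall through to the single-recipient path.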
@yuneng-berri yuneng-berri self-requested a review April 18, 2026 22:07
@gitguardian

gitguardian Bot commented Apr 18, 2026

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
| --- | --- | --- | --- | --- |
| 29203065 | Triggered | JSON Web Token | c8b7c1b | tests/test_litellm/proxy/test_litellm_pre_call_utils.py |
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@codecov

codecov Bot commented Apr 18, 2026

Codecov Report

❌ Patch coverage is 73.52941% with 9 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| litellm/proxy/auth/auth_checks.py | 72.72% | 9 Missing ⚠️ |


@ryan-crabbe-berri ryan-crabbe-berri merged commit 67bf18d into litellm_internal_staging Apr 18, 2026
96 of 99 checks passed
@ryan-crabbe-berri ryan-crabbe-berri deleted the litellm_feat-multi_threshold_budget_alerts branch April 18, 2026 23:21