feat: configurable multi-threshold budget alerts for virtual keys #25989
Users can set metadata.max_budget_alert_emails as a JSON map of threshold percentages to email recipients on virtual keys. When configured, the email handler loops over each threshold, checks per-threshold dedup cache, and sends to the configured recipients (auto-including the key owner's email). When no map is set, the existing single 80% threshold behavior is preserved unchanged. Teams support is out of scope for this v0.
_virtual_key_max_budget_check raises BudgetExceededError when spend crosses max_budget, which meant the 100% threshold in the multi-threshold email config never got a chance to fire — the request that pushes spend over 100% was rejected before the alert check ran. Reorder so the alert check runs first; enforcement still raises right after.
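A minimal sketch of the reordering described above, assuming enforcement triggers at `spend >= max_budget`. `check_key_budget` and the `fired` list are hypothetical stand-ins for illustration, not the proxy's actual functions:

```python
from typing import List


class BudgetExceededError(Exception):
    """Simplified stand-in for the proxy's budget error (hypothetical)."""


def check_key_budget(spend: float, max_budget: float, fired: List[int]) -> None:
    # Step 1: the alert check runs first, so the 100% threshold is recorded
    # even on the request that pushes spend over the budget.
    if spend >= max_budget:
        fired.append(100)
    # Step 2: enforcement still rejects the request right afterwards.
    # Assumption: the real check may use a different comparison.
    if spend >= max_budget:
        raise BudgetExceededError(f"spend {spend} >= max_budget {max_budget}")
```

With the two steps in the opposite order, the exception would be raised before `fired` was ever updated, which is the behavior this commit removes.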
Greptile Summary
This PR adds configurable multi-threshold budget alerts for virtual keys, replacing the single hardcoded 80% email alert with a per-threshold recipient map that can be set per-key or via a global fallback. All P0/P1 issues flagged in the previous review round (non-list value crashes in the merge, empty-recipient …
Confidence Score: 5/5
Safe to merge; all prior P0/P1 blockers resolved, remaining findings are P2 style and design suggestions.
All critical issues from previous review rounds have been fixed: non-list value coercion, empty-recipient guard, test assertions, and task pre-filtering. The only open items are a set() vs dict.fromkeys ordering inconsistency, an unused test import, and a documentation/description mismatch around additive-vs-override merge semantics. None of these affect runtime correctness.
litellm/proxy/auth/auth_checks.py: the additive merge semantics vs the PR description's 'per-key takes priority' claim should be clarified before this becomes user-facing documentation.
| Filename | Overview |
|---|---|
| enterprise/litellm_enterprise/enterprise_callbacks/send_emails/base_email.py | Adds _parse_email_list helper, _handle_multi_threshold_max_budget_alert method, and extends send_max_budget_alert_email with optional threshold_pct/recipient_emails params; old 80% path fully preserved; empty-list guard in place; minor: set() dedup loses order. |
| litellm/proxy/auth/auth_checks.py | Adds _parse_email_list, _normalize_alert_emails, _merge_budget_alert_email_configs, and _virtual_key_max_budget_alert_check; non-list values are now safely coerced; min-pct pre-filter avoids unnecessary tasks; merge semantics are additive (contradicting PR description 'per-key takes priority'). |
| litellm/proxy/_types.py | Adds max_budget_alert_emails: Optional[Dict[str, List[str]]] field to CallInfo; type accurately reflects the normalized form produced by _normalize_alert_emails. |
| litellm/__init__.py | Adds default_key_max_budget_alert_emails: Optional[Dict[str, list]] = None module-level variable for global fallback config. |
| litellm/proxy/auth/user_api_key_auth.py | Reorders key budget alert check to run before max-budget enforcement, enabling 100% threshold alerts to fire on the crossing request; import added for _virtual_key_max_budget_alert_check. |
| tests/test_litellm/enterprise/enterprise_callbacks/send_emails/test_base_email.py | Comprehensive new tests for multi-threshold sends, dedup cache, owner auto-include, malformed key skipping, empty list, old path preservation; unused TestClient import. |
| tests/test_litellm/proxy/auth/test_auth_checks.py | New tests for multi-threshold map attachment, old path (below/above 80%), global fallback, and per-key additive merge; assertions now correctly match additive merge semantics. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Request arrives] --> B[_virtual_key_max_budget_alert_check]
B --> C{max_budget_alert_emails configured?}
C -- No --> D[Old 80% single-threshold path]
D --> E{spend >= 80% of max_budget AND spend < max_budget?}
E -- No --> F[Return — no alert]
E -- Yes --> G[asyncio.create_task budget_alerts]
G --> H[send_max_budget_alert_email old single-recipient path]
C -- Yes --> I[_merge_budget_alert_email_configs global + per-key additive merge]
I --> J{spend >= min configured threshold?}
J -- No --> F
J -- Yes --> K[asyncio.create_task budget_alerts]
K --> L[_handle_multi_threshold_max_budget_alert]
L --> M{For each threshold: spend >= threshold amount?}
M -- No --> N[Skip threshold]
M -- Yes --> O{Cache hit?}
O -- Yes --> N
O -- No --> P[Build recipient list = configured emails + owner]
P --> Q{Any recipients?}
Q -- No --> R[Log warning and skip]
Q -- Yes --> S[send_max_budget_alert_email multi-recipient path]
S --> T[Set cache SENT TTL 24h]
Reviews (10): Last reviewed commit: "Merge remote-tracking branch 'origin/lit..."
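The multi-threshold branch of the flowchart can be sketched roughly as below. This is an illustration under stated assumptions, not the handler's actual code: `thresholds_to_send` is a hypothetical name, a plain `set` stands in for the 24h-TTL dedup cache, and the real cache key presumably also identifies the key being alerted on.

```python
from typing import Dict, List, Optional, Set


def thresholds_to_send(
    spend: float,
    max_budget: float,
    alert_emails: Dict[str, List[str]],
    owner_email: Optional[str],
    sent_cache: Set[str],
) -> Dict[str, List[str]]:
    """Return {threshold: recipients} for every alert that should fire now."""
    to_send: Dict[str, List[str]] = {}
    for pct_str, recipients in alert_emails.items():
        try:
            pct = float(pct_str)
        except ValueError:
            continue  # malformed threshold key: skip it
        if spend < max_budget * pct / 100.0:
            continue  # threshold not reached yet
        if pct_str in sent_cache:
            continue  # already sent for this threshold (dedup cache hit)
        extra = [owner_email] if owner_email else []
        # dict.fromkeys removes duplicates while keeping insertion order.
        merged = list(dict.fromkeys(recipients + extra))
        if not merged:
            continue  # no recipients: the real handler logs a warning here
        to_send[pct_str] = merged
        sent_cache.add(pct_str)  # stand-in for "set cache SENT, TTL 24h"
    return to_send
```

Calling it twice with the same cache returns an empty mapping the second time, which mirrors the per-threshold dedup behavior.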
…reeting
- Add `default_key_max_budget_alert_emails` litellm_settings config as global fallback for all virtual keys (per-key metadata takes priority)
- Fix crash when key has no user_id/user_email by passing recipient email to _get_email_params (same pattern as team soft budget path)
- Use owner email for greeting, falling back to key_alias or token
- Rename setting from default_max_budget_alert_emails to default_key_max_budget_alert_emails for clarity
…n, task pre-filter
- Guard empty recipients in _handle_multi_threshold_max_budget_alert: log warning and skip instead of falling through to old path error loop
- Widen max_budget_alert_emails type to Dict[str, Union[str, List[str]]] to match _parse_email_list runtime behavior (accepts comma-separated strings)
- Pre-filter asyncio.create_task with min threshold check to avoid unnecessary task allocation on every request when spend is below all configured thresholds
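The minimum-threshold pre-filter from the last bullet might look roughly like this. `should_schedule_alert_task` is a hypothetical helper name; the real check lives inline in the auth path and may differ in detail:

```python
from typing import Dict, List


def should_schedule_alert_task(
    spend: float, max_budget: float, alert_emails: Dict[str, List[str]]
) -> bool:
    """Cheap synchronous check run before creating the alert task."""
    pcts = []
    for key in alert_emails:
        try:
            pcts.append(float(key))
        except (TypeError, ValueError):
            continue  # ignore malformed threshold keys
    if not pcts or max_budget <= 0:
        return False
    # Only schedule work once spend has reached the lowest threshold.
    return spend >= max_budget * min(pcts) / 100.0
```

The caller would wrap its `asyncio.create_task(...)` call in `if should_schedule_alert_task(...)`, so requests below every threshold allocate no task at all.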
Medium: Authenticated users can send emails to arbitrary recipients via key metadata
This PR introduces configurable multi-threshold budget alerts where email recipients are read from key metadata (max_budget_alert_emails). Since key metadata is user-controlled (any team admin can set it via /key/generate or /key/update), an authenticated user can trigger alert emails to arbitrary external addresses through the organization's email infrastructure (Resend, SendGrid, or SMTP).
- medium: arbitrary email relay via user-controlled metadata — litellm/proxy/auth/auth_checks.py
- low: HTML injection in email body via key_alias — enterprise/litellm_enterprise/enterprise_callbacks/send_emails/base_email.py
    alert_threshold,
    owner_email = user_obj.user_email if user_obj else None
    alert_email_config = (valid_token.metadata or {}).get(
        "max_budget_alert_emails"
Medium: Arbitrary email relay via user-controlled metadata
max_budget_alert_emails is read from valid_token.metadata, which is settable by any user with key-create or key-update permissions (team admins, key owners). An attacker can set this to arbitrary external email addresses, then deliberately trigger a budget threshold crossing to have the proxy send emails on their behalf through the org's email provider (Resend/SendGrid/SMTP). This turns the proxy into an email relay.
Consider either:
- Restricting `max_budget_alert_emails` to admin-only (don't read it from user-settable metadata, only from operator config or a dedicated admin field), or
- Validating recipient addresses against a team/org membership list before sending.
html.escape() the greeting (user_email/key_alias/token fallback) before inserting into HTML email body to prevent HTML injection via key_alias.
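A minimal sketch of the escaping fix, assuming the greeting is interpolated into an HTML string. `build_greeting` and the markup are hypothetical; only `html.escape` is the standard-library call named in the commit:

```python
import html
from typing import Optional


def build_greeting(
    user_email: Optional[str], key_alias: Optional[str], token: str
) -> str:
    # Fallback order from the commit message: owner email, key_alias, token.
    name = user_email or key_alias or token
    # Escape before interpolation so markup in key_alias renders as text.
    return f"<p>Hi {html.escape(name)},</p>"
```

A `key_alias` such as `<b>x</b>` then appears literally in the email body instead of being interpreted as markup.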
- Add early return guard in _handle_multi_threshold_max_budget_alert for None max_budget_alert_emails and max_budget
- Add explicit type annotation on alert_email_config in auth_checks
test_virtual_key_max_budget_alert_check_per_key_overrides_global asserted override semantics but the implementation does additive merge. Renamed test and updated assertion to match: per-key and global thresholds are unioned, not replaced.
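To make the additive semantics concrete, here is a sketch of a union-style merge. `merge_alert_configs` is a hypothetical simplification of `_merge_budget_alert_email_configs`; in particular, coercing a bare string to a one-element list and dropping other non-list values is an assumption about how the coercion works:

```python
from typing import Any, Dict, List, Optional


def merge_alert_configs(
    global_cfg: Optional[Dict[str, Any]], per_key_cfg: Optional[Dict[str, Any]]
) -> Dict[str, List[str]]:
    """Union thresholds and recipients from both configs (no override)."""
    merged: Dict[str, List[str]] = {}
    for cfg in (global_cfg or {}, per_key_cfg or {}):
        for pct, emails in cfg.items():
            if isinstance(emails, str):
                emails = [emails]  # assumed coercion of a bare string
            elif not isinstance(emails, list):
                continue  # assumed handling of other non-list values
            bucket = merged.setdefault(str(pct), [])
            for email in emails:
                if email not in bucket:
                    bucket.append(email)  # keep first-seen order
    return merged
```

Under override semantics a per-key `"80"` entry would replace the global one; here both recipient lists survive, which is what the renamed test asserts.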
The lazy import from litellm_enterprise inside _normalize_alert_emails coupled the core proxy auth path to an optional package. Core should not depend on enterprise, even lazily — it hides the dependency from static analysis and inverts the intended layering. Duplicate the 7-line parser locally. It's pure and unlikely to drift; the enterprise copy stays where it is for its own callers.
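For reference, a small pure parser of the kind described might look like this. It is a sketch, not the actual `_parse_email_list`: the accepted input shapes (comma-separated string or list) come from the commit messages, while the trimming and dedup details are assumptions.

```python
from typing import Any, List


def parse_email_list(value: Any) -> List[str]:
    """Normalize a comma-separated string or a list into a clean email list."""
    if isinstance(value, str):
        items = value.split(",")
    elif isinstance(value, (list, tuple)):
        items = [v for v in value if isinstance(v, str)]
    else:
        return []  # None or unsupported types yield no recipients
    cleaned = [item.strip() for item in items]
    # dict.fromkeys dedups while preserving order, unlike set().
    return list(dict.fromkeys(e for e in cleaned if e))
```

Because the function has no imports beyond `typing` and no side effects, keeping a local copy in core avoids the enterprise dependency at little maintenance cost.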
Medium: Arbitrary email relay via user-controlled key metadata
This PR adds multi-threshold budget alert emails for virtual keys. The recipient list is read from valid_token.metadata.max_budget_alert_emails, which is settable by any user with key-create or key-update permissions. There is no validation that the email addresses belong to the key owner, their team, or any known entity — an attacker can point alerts at arbitrary external addresses, turning the proxy's email infrastructure into a relay for spam or phishing.
    _merge_budget_alert_email_configs(
        global_cfg=litellm.default_key_max_budget_alert_emails,
        per_key_cfg=(valid_token.metadata or {}).get(
            "max_budget_alert_emails"
Medium: Arbitrary email relay via user-controlled metadata
max_budget_alert_emails is pulled from valid_token.metadata, which any user with key-create or key-update permissions can set to arbitrary external addresses. When the key's spend crosses a threshold, emails are sent to those addresses via send_email(to_email=recipient_emails) with no validation.
An attacker can set metadata.max_budget_alert_emails to {"1": ["victim@external.com"]}, spend $0.01 on a $1 budget, and the proxy sends an HTML email to victim@external.com from the proxy's email domain. This is useful for phishing (the email comes from the proxy's configured sender address) or for abusing the proxy as a spam relay.
Consider validating that recipient addresses belong to known users in the system (e.g., team members or the key owner), or restrict this metadata field to admin-only configuration.
Medium: Arbitrary email relay via user-controlled key metadata
This PR adds configurable multi-threshold budget alerts, allowing email recipients to be specified per key via metadata.max_budget_alert_emails. The email addresses sourced from key metadata are passed directly to the email sending subsystem without any validation or restriction, allowing any authenticated user with key-create or key-update permissions to use the proxy as an email relay to arbitrary external addresses.
    _merge_budget_alert_email_configs(
        global_cfg=litellm.default_key_max_budget_alert_emails,
        per_key_cfg=(valid_token.metadata or {}).get(
            "max_budget_alert_emails"
Medium: Arbitrary email relay via user-controlled metadata
max_budget_alert_emails is read directly from valid_token.metadata, which any user with key-create or key-update permissions can set. An attacker can populate this field with arbitrary external email addresses (e.g., spam targets), then deliberately spend up to the threshold to trigger the proxy to send emails on their behalf. There is no validation that the addresses belong to the organization or are otherwise authorized.
Consider either restricting this metadata field to admin-only writes (filtering it out during key creation/update for non-admin users), or validating recipient addresses against an allowlist (e.g., same domain as the organization, or addresses already registered in the system).
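One way the allowlist suggestion could be realized is sketched below. Everything here is hypothetical: `filter_allowed_recipients`, the `allowed_domains` set, and the `known_emails` set are illustrative names, and the PR itself does not implement such a check.

```python
from typing import Iterable, List, Set


def filter_allowed_recipients(
    recipients: Iterable[str], allowed_domains: Set[str], known_emails: Set[str]
) -> List[str]:
    """Keep only addresses on an allowed domain or already known to the system."""
    allowed: List[str] = []
    for email in recipients:
        addr = email.strip().lower()
        local, sep, domain = addr.rpartition("@")
        if not sep or not local or not domain:
            continue  # not shaped like local@domain
        if addr in known_emails or domain in allowed_domains:
            allowed.append(addr)
    return allowed
```

Applied to the recipient list just before sending, this would drop an address such as `victim@external.com` unless an operator had explicitly allowed its domain. Both sets are expected to hold lowercase values.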
send_max_budget_alert_email previously guarded with `is not None`, which accepts `[]` and then crashes on `recipient_emails[0]` inside _get_email_params. The current caller (_handle_multi_threshold_max_budget_alert) already filters empty lists upstream, but the public method signature makes no such guarantee — a future caller passing [] would hit IndexError. Switch to truthiness so both None and [] fall through to the single-recipient path.
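The difference between the two guards is easy to see in isolation. `pick_recipients` below is a hypothetical reduction of the branch inside `send_max_budget_alert_email`, not the method itself:

```python
from typing import List, Optional


def pick_recipients(
    recipient_emails: Optional[List[str]], owner_email: str
) -> List[str]:
    # Truthiness check: both None and [] fall through to the
    # single-recipient path. An `is not None` check would accept []
    # and a later recipient_emails[0] would raise IndexError.
    if recipient_emails:
        return recipient_emails
    return [owner_email]
```

An empty list now behaves exactly like a missing argument, so the method no longer relies on callers filtering upstream.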
…reshold_budget_alerts
…itellm_feat-multi_threshold_budget_alerts
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
|---|---|---|---|---|
| 29203065 | Triggered | JSON Web Token | c8b7c1b | tests/test_litellm/proxy/test_litellm_pre_call_utils.py |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection as a pre-commit hook to catch secrets before they leave your machine and to ease remediation.
Codecov Report
❌ Patch coverage is …
67bf18d into litellm_internal_staging
Summary
Adds configurable multi-threshold budget alerts for virtual keys. Users can define multiple spend thresholds (e.g. 50%, 75%, 100%) each with their own list of email recipients, replacing the single hardcoded 80% alert.
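Per-key configuration (via `/key/generate` or `/key/update` metadata):

    {
      "metadata": {
        "max_budget_alert_emails": {
          "50": ["finance@co.com"],
          "75": ["finance@co.com", "bu_lead@co.com"],
          "100": ["finance@co.com", "bu_lead@co.com", "cto@co.com"]
        }
      }
    }

Global fallback for all keys (via config yaml):

`alerting: ["email"]` must be enabled in `general_settings`

Changes
- `litellm/proxy/_types.py`: Added `max_budget_alert_emails` field to `CallInfo`
- `litellm/proxy/auth/auth_checks.py`: New path reads threshold map from key metadata (with global fallback), passes it through to email handler
- `litellm/__init__.py`: Added `default_key_max_budget_alert_emails` module variable
- `base_email.py`: Added `_parse_email_list` helper, `_handle_multi_threshold_max_budget_alert` method, extended `send_max_budget_alert_email` with optional `threshold_pct`/`recipient_emails` params

Test plan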
litellm/proxy/_types.py— Addedmax_budget_alert_emailsfield toCallInfolitellm/proxy/auth/auth_checks.py— New path reads threshold map from key metadata (with global fallback), passes it through to email handlerlitellm/__init__.py— Addeddefault_key_max_budget_alert_emailsmodule variablebase_email.py— Added_parse_email_listhelper,_handle_multi_threshold_max_budget_alertmethod, extendedsend_max_budget_alert_emailwith optionalthreshold_pct/recipient_emailsparamsTest plan