feat: multiple concurrent budget windows per API key and team #24883
ishaan-berri merged 28 commits into litellm_ishaan_april3
Greptile Summary
All seven P1 issues flagged in the previous review have been resolved.
Remaining minor items (P2): a React key anti-pattern (array index used as the React key) and a duplicate-duration edge case, both in BudgetWindowsEditor.tsx.
Confidence Score: 5/5
Safe to merge: all prior P1 issues are resolved and only minor P2 style/edge-case findings remain. All seven P1 issues from the previous review cycle are demonstrably fixed in the current commit. The duplicate-duration edge case requires user misconfiguration to trigger. Neither P2 finding affects the primary happy path or data integrity for correctly configured keys.
Affected file: ui/litellm-dashboard/src/components/key_team_helpers/BudgetWindowsEditor.tsx (duplicate duration validation and React key stability).
| Filename | Overview |
|---|---|
| litellm/proxy/auth/auth_checks.py | Added _virtual_key_multi_budget_check and _team_multi_budget_check with stable duration-based counter keys, correct model_dump() coercion, and both integrated into common_checks(). |
| litellm/proxy/proxy_server.py | Per-window Redis increment logic is clean: isinstance(str) guard added, iterates by duration, handles both dict and Pydantic objects. Cold-start seeding limitation is documented. |
| litellm/proxy/common_utils/reset_budget_job.py | Reset job correctly uses duration-keyed counters, resets only expired windows, logs Redis failures via verbose_proxy_logger.warning rather than silently swallowing exceptions. |
| litellm/exceptions.py | BudgetExceededError now sets self.status_code = 429; previously it inherited from Exception with no status code, causing HTTP 400 responses. |
| litellm/proxy/auth/auth_exception_handler.py | Hardcoded code=400 replaced with code=429 for BudgetExceededError handling — budget-exceeded responses now correctly return 429. |
| tests/test_litellm/proxy/auth/test_multi_budget_windows.py | Five focused unit tests with async mocks only (no real network calls); covers empty limits, under-budget, first/second window exceeded, and Pydantic object coercion. |
| ui/litellm-dashboard/src/components/key_team_helpers/BudgetWindowsEditor.tsx | Shared component correctly extracted; minor issues: array index used as React key, and no guard against duplicate budget_duration values that would double-count Redis spend. |
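The duplicate-duration case matters because counters are keyed by budget_duration: two windows with the same duration would share one counter and be incremented twice per request. A minimal sketch of a guard, assuming windows are plain dicts with a budget_duration field; the helper name find_duplicate_durations is hypothetical and is not part of the PR:

```python
from collections import Counter
from typing import Any, Dict, List


def find_duplicate_durations(windows: List[Dict[str, Any]]) -> List[str]:
    """Return budget_duration values that appear more than once (first-seen order).

    Hypothetical helper: the PR itself only flags the missing guard in the UI.
    """
    counts = Counter(w["budget_duration"] for w in windows)
    return [duration for duration, n in counts.items() if n > 1]


windows = [
    {"budget_duration": "1h", "max_budget": 10},
    {"budget_duration": "1h", "max_budget": 20},
    {"budget_duration": "30d", "max_budget": 500},
]
duplicates = find_duplicate_durations(windows)
if duplicates:
    print(f"Rejecting budget_limits, duplicate durations: {duplicates}")
```

The same check could run in the editor component before onChange fires; the Python form is used here only to keep the sketch close to the backend code discussed in this PR.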
Sequence Diagram
sequenceDiagram
participant Client
participant AuthLayer as Auth Layer (user_api_key_auth)
participant CommonChecks as common_checks()
participant Redis as spend_counter_cache (Redis)
participant DB as Prisma DB
Client->>AuthLayer: API Request
AuthLayer->>CommonChecks: validate token
CommonChecks->>Redis: get_current_spend(window:{duration}) per window
Redis-->>CommonChecks: window_spend
alt any window_spend >= max_budget
CommonChecks-->>AuthLayer: BudgetExceededError (HTTP 429)
AuthLayer-->>Client: 429 Budget Exceeded
else all windows under budget
CommonChecks-->>AuthLayer: valid_token
AuthLayer-->>Client: LLM Response
AuthLayer->>Redis: async_increment_cache(window:{duration}) per window
end
note over DB,Redis: Background reset_budget_job
DB->>DB: find keys/teams with budget_limits
DB-->>Redis: set counter=0.0 for expired windows
DB->>DB: update reset_at for expired windows
Reviews (5): Last reviewed commit: "refactor: create_key_button imports Budg..."
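The enforcement branch of the sequence diagram can be sketched as follows. This is a simplified illustration, not the code in auth_checks.py: BudgetExceededError here is a local stand-in (only the 429 status code is taken from the PR), and check_budget_windows plus the get_current_spend callable signature are assumptions.

```python
from typing import Any, Callable, Dict, List


class BudgetExceededError(Exception):
    """Local stand-in for litellm's exception; only status_code=429 comes from the PR."""

    def __init__(self, message: str) -> None:
        super().__init__(message)
        self.status_code = 429


def check_budget_windows(
    hashed_token: str,
    budget_limits: List[Dict[str, Any]],
    get_current_spend: Callable[[str], float],
) -> None:
    """Raise BudgetExceededError as soon as any window's spend reaches its max_budget."""
    for window in budget_limits:
        max_budget = window.get("max_budget")
        if max_budget is None:
            continue  # assumption: a window without a cap is not enforced
        # Duration-based counter key, as described in the file overview table.
        counter_key = f"spend:key:{hashed_token}:window:{window['budget_duration']}"
        spend = get_current_spend(counter_key)
        if spend >= max_budget:
            raise BudgetExceededError(
                f"Budget window {window['budget_duration']} exceeded: "
                f"spend={spend}, max_budget={max_budget}"
            )


# Usage with an in-memory dict standing in for the Redis-backed counter cache.
spends = {"spend:key:abc:window:1h": 2.0, "spend:key:abc:window:30d": 500.0}
limits = [
    {"budget_duration": "1h", "max_budget": 10.0},
    {"budget_duration": "30d", "max_budget": 500.0},
]
try:
    check_budget_windows("abc", limits, lambda key: spends.get(key, 0.0))
except BudgetExceededError as err:
    print(err.status_code, err)
```

The hourly window is under budget, so the loop continues to the monthly window, which has reached its cap and triggers the 429 path.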
for i in range(len(key_budget_limits)):
    await spend_counter_cache.async_increment_cache(
        key=f"spend:key:{hashed_token}:window:{i}",
        value=response_cost,
Counter indices become misaligned after budget_limits update or window removal
Redis counters are keyed by list position (window:{i}). If a user later updates budget_limits by removing a window, inserting one, or reordering — the existing Redis counters no longer map to the correct windows.
Concrete example:
- Create key with [{"1h", $10}, {"30d", $500}] → Redis: window:0 = hourly spend, window:1 = monthly spend
- User removes the hourly window → DB now has [{"30d", $500}]
- window:0 in Redis still holds the old (near-$0, recently-reset) hourly counter, but the code now reads it as the monthly counter, effectively resetting the monthly budget to near-zero
There is no code in prepare_key_update_data or _set_budget_reset_at that clears or remaps the old Redis counters when budget_limits changes. A stable per-window identity (e.g., a UUID or a key derived from budget_duration) should be used instead of the list index, or old counters must be explicitly cleared on every budget_limits update.
The same issue exists for team budget windows in the team update path.
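The misalignment in the example above can be reproduced with a small sketch that compares index-based and duration-based key derivation. The helper names index_keys and duration_keys are hypothetical; only the key formats are taken from the review comment.

```python
from typing import Any, Dict, List


def index_keys(token: str, windows: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map each window's duration to a counter key derived from list position (fragile)."""
    return {
        w["budget_duration"]: f"spend:key:{token}:window:{i}"
        for i, w in enumerate(windows)
    }


def duration_keys(token: str, windows: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map each window's duration to a counter key derived from the duration (stable)."""
    return {
        w["budget_duration"]: f"spend:key:{token}:window:{w['budget_duration']}"
        for w in windows
    }


before = [
    {"budget_duration": "1h", "max_budget": 10},
    {"budget_duration": "30d", "max_budget": 500},
]
after = [{"budget_duration": "30d", "max_budget": 500}]  # hourly window removed

# Index-based: the monthly window silently moves from window:1 to window:0.
print(index_keys("abc", before)["30d"], "->", index_keys("abc", after)["30d"])
# Duration-based: the monthly window keeps the same counter key.
print(duration_keys("abc", before)["30d"], "->", duration_keys("abc", after)["30d"])
```

A duration-derived key assumes at most one window per duration on a given key or team; a UUID per window would lift that restriction at the cost of storing the identifier in budget_limits.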
        value=response_cost,
    )

if team_id is not None:
    await _init_and_increment_spend_counter(
len(key_budget_limits) returns string length when value is a JSON string
key_obj comes from user_api_key_cache. If the cached object is a plain dict, its budget_limits field can be a raw JSON-serialized string (e.g. the output of json.dumps(initialized_windows) stored without further parsing). Calling len(...) on a string returns its character count — potentially 40-80 — rather than the number of budget windows (1 or 2). This causes the code to iterate over the wrong range and create dozens of spurious Redis counter keys.
reset_budget_windows already guards against this correctly with raw if isinstance(raw, list) else json.loads(raw). The same normalization should be applied to key_budget_limits (and team_budget_limits a few lines below) before calling range(len(...)).
if isinstance(key_budget_limits, str):
    key_budget_limits = json.loads(key_budget_limits)

return

from litellm.proxy.proxy_server import get_current_spend
No-op else branch does not convert BudgetLimitEntry to dict
The line:
w: dict = window if isinstance(window, dict) else window  # type: ignore[assignment]
is a no-op: both branches assign window unchanged. The intent was clearly to call .model_dump() in the else-branch, so that subscript access (w["max_budget"], w["budget_duration"]) works regardless of whether the window is a plain dict or a BudgetLimitEntry Pydantic model.
Suggested fix:
return
from litellm.proxy.proxy_server import get_current_spend
w: dict = window if isinstance(window, dict) else window.model_dump()  # type: ignore[union-attr]
The same pattern appears in _team_multi_budget_check and should be fixed there too.
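The two normalization problems raised in these comments (a JSON-serialized string in place of a list, and Pydantic entries in place of dicts) could be handled in one place. The sketch below is an assumption about how such a helper might look; normalize_budget_limits is a hypothetical name, and FakeBudgetLimitEntry stands in for the real BudgetLimitEntry model so that the example runs without Pydantic installed.

```python
import json
from typing import Any, Dict, List


class FakeBudgetLimitEntry:
    """Stand-in for the Pydantic BudgetLimitEntry; only model_dump() is mimicked."""

    def __init__(self, budget_duration: str, max_budget: float) -> None:
        self.budget_duration = budget_duration
        self.max_budget = max_budget

    def model_dump(self) -> Dict[str, Any]:
        return {"budget_duration": self.budget_duration, "max_budget": self.max_budget}


def normalize_budget_limits(raw: Any) -> List[Dict[str, Any]]:
    """Return budget windows as a list of plain dicts, whatever form the cache held."""
    if isinstance(raw, str):
        raw = json.loads(raw)  # cached dicts may hold the raw JSON string
    if not isinstance(raw, list):
        return []  # None or an unexpected type: treat as "no windows"
    return [w if isinstance(w, dict) else w.model_dump() for w in raw]


as_string = json.dumps([{"budget_duration": "1h", "max_budget": 10}])
print(len(as_string), "characters, but", len(normalize_budget_limits(as_string)), "window")

as_models = [FakeBudgetLimitEntry("30d", 500.0)]
print(normalize_budget_limits(as_models)[0]["max_budget"])
```

Calling the helper once before both the auth check and the spend increment would keep the two call sites from drifting apart again.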
        window["reset_at"] = get_budget_reset_time(
            budget_duration=window["budget_duration"]
        ).isoformat()
        changed = True
if changed:
    await self.prisma_client.db.litellm_verificationtoken.update(
Silent except Exception: pass hides Redis reset failures
When the Redis counter reset fails, the in-memory cache is already set to 0.0 but the Redis cache still holds the old spend value. On the next check, get_current_spend would read the un-reset Redis value and incorrectly continue blocking the key/team even though the window has expired.
At a minimum the exception should be logged:
except Exception as e:
    verbose_proxy_logger.warning(
        "Failed to reset budget window counter %s in Redis: %s",
        counter_key, e,
    )
The same pattern appears in the Teams section of reset_budget_windows.
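A runnable sketch of the logged-failure pattern, under stated assumptions: the in-memory cache is a plain dict, FailingRedis is a test double that always raises, and the standard logging module replaces verbose_proxy_logger. Only the async_set_cache(key=..., value=...) call shape is taken from the diff in this PR.

```python
import asyncio
import logging
from typing import Any, Dict, Optional

logger = logging.getLogger("budget_reset_sketch")


class FailingRedis:
    """Test double: every write fails, as if Redis were unreachable."""

    async def async_set_cache(self, key: str, value: float) -> None:
        raise ConnectionError("redis unavailable")


async def reset_window_counter(
    counter_key: str,
    in_memory: Dict[str, float],
    redis_cache: Optional[Any],
) -> bool:
    """Reset one window counter; return False if the Redis write failed."""
    in_memory[counter_key] = 0.0
    if redis_cache is None:
        return True
    try:
        await redis_cache.async_set_cache(key=counter_key, value=0.0)
        return True
    except Exception as e:
        # Surface the failure instead of swallowing it, so stale Redis
        # counters can be noticed and investigated.
        logger.warning(
            "Failed to reset budget window counter %s in Redis: %s", counter_key, e
        )
        return False


memory = {"spend:key:abc:window:1h": 9.5}
ok = asyncio.run(reset_window_counter("spend:key:abc:window:1h", memory, FailingRedis()))
print(ok, memory)
```

Returning a boolean is a design choice of this sketch; it lets a caller decide whether to retry or to leave reset_at unchanged when Redis could not be updated.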
        valid_token=valid_token,
    )

# 3.1. Multi-window budget check for team
with tracer.trace("litellm.proxy.auth.common_checks.team_multi_budget_check"):
Multi-budget checks are split across two call sites inconsistently
_team_multi_budget_check is invoked inside common_checks() (step "3.1"), while _virtual_key_multi_budget_check is invoked directly inside _user_api_key_auth_builder() (check "4.1") instead of inside common_checks() alongside the other virtual-key checks.
This asymmetry means any code path that calls common_checks() directly (e.g., custom auth hooks or tests) would skip the per-key window check. Both checks should live in common_checks() for consistency.
const BUDGET_WINDOW_OPTIONS = [
  { value: "1h", label: "Hourly" },
  { value: "24h", label: "Daily" },
  { value: "7d", label: "Weekly" },
  { value: "30d", label: "Monthly" },
];

function BudgetWindowsEditor({
  value,
  onChange,
}: {
  value: Array<{ budget_duration: string; max_budget: number | null }>;
  onChange: (v: Array<{ budget_duration: string; max_budget: number | null }>) => void;
}) {
  const addWindow = () => {
    onChange([...value, { budget_duration: "24h", max_budget: null }]);
  };

  const removeWindow = (idx: number) => {
    onChange(value.filter((_, i) => i !== idx));
  };

  const updateWindow = (idx: number, field: string, fieldValue: any) => {
    const updated = value.map((w, i) => (i === idx ? { ...w, [field]: fieldValue } : w));
    onChange(updated);
  };

  return (
    <div>
      {value.map((window, idx) => (
        <div key={idx} style={{ display: "flex", gap: 8, marginBottom: 8, alignItems: "center" }}>
          <Select
            value={window.budget_duration}
            onChange={(v) => updateWindow(idx, "budget_duration", v)}
            style={{ width: 120 }}
          >
            {BUDGET_WINDOW_OPTIONS.map((opt) => (
              <Select.Option key={opt.value} value={opt.value}>
                {opt.label}
              </Select.Option>
            ))}
          </Select>
          <NumericalInput
            step={0.01}
            min={0}
            value={window.max_budget ?? undefined}
            onChange={(v: number | null) => updateWindow(idx, "max_budget", v)}
            placeholder="Max $ (e.g. 10.00)"
            style={{ width: 160 }}
          />
          <span
            onClick={() => removeWindow(idx)}
            style={{ cursor: "pointer", color: "#ff4d4f", fontSize: 16, lineHeight: 1 }}
            title="Remove"
          >
            ✕
          </span>
        </div>
      ))}
      <Button
        size="xs"
        variant="secondary"
        onClick={(e: React.MouseEvent) => { e.preventDefault(); addWindow(); }}
      >
        + Add Window
      </Button>
    </div>
  );
}

/**
 * ─────────────────────────────────────────────────────────────────────────
 * @deprecated
BudgetWindowsEditor component is duplicated across two files
The BudgetWindowsEditor component (and BUDGET_WINDOW_OPTIONS constant) is defined identically in both create_key_button.tsx and key_edit_view.tsx. Any future change must be applied in two places.
This should be extracted into a shared component file, e.g., ui/litellm-dashboard/src/components/key_team_helpers/BudgetWindowsEditor.tsx, and imported in both consumers.
…n 500 Prisma rejects nullable JSON fields (Json? without @default) when passed as Python None — it needs the field omitted entirely so the DB stores NULL via the column's nullable constraint. This was breaking /v2/login because the UI session key creation path hit the upsert with budget_limits=None.
reset_at <= now. Only the expired windows are reset; other windows are untouched.
"""
from litellm.proxy.common_utils.timezone_utils import get_budget_reset_time
from litellm.proxy.proxy_server import spend_counter_cache
Check notice: Code scanning / CodeQL
Cyclic import (Note)
Copilot Autofix (AI, 21 days ago)
In general, to break a cyclic import you move shared or low-level concepts into a module that both sides can import, or you pass required objects in as parameters instead of importing them from the higher-level module. Here, ResetBudgetJob (in common_utils) should not import proxy_server to reach spend_counter_cache; instead, code in proxy_server (or wherever the job is constructed) should supply the cache object to ResetBudgetJob.
Concretely, we can modify ResetBudgetJob so that it takes a spend_counter_cache dependency via its constructor and stores it on self. Then, in reset_budget_windows, we remove the in-function import of spend_counter_cache from proxy_server and replace all uses with self.spend_counter_cache. This breaks the cycle because reset_budget_job.py no longer imports proxy_server. To keep existing functionality, we only touch the snippet we see: add an optional spend_counter_cache attribute usage (assuming the instance is created correctly elsewhere) and remove the problematic import. We must not add any new imports beyond standard library or modify existing imports, so we keep everything else unchanged.
@@ -558,8 +558,9 @@
  reset_at <= now. Only the expired windows are reset; other windows are untouched.
  """
  from litellm.proxy.common_utils.timezone_utils import get_budget_reset_time
- from litellm.proxy.proxy_server import spend_counter_cache

+ spend_counter_cache = getattr(self, "spend_counter_cache", None)
+
  now = datetime.utcnow()

  # --- Keys ---
@@ -585,20 +585,21 @@
  counter_key = (
      f"spend:key:{key.token}:window:{window['budget_duration']}"
  )
- spend_counter_cache.in_memory_cache.set_cache(
-     key=counter_key, value=0.0
- )
- if spend_counter_cache.redis_cache is not None:
-     try:
-         await spend_counter_cache.redis_cache.async_set_cache(
-             key=counter_key, value=0.0
-         )
-     except Exception as redis_err:
-         verbose_proxy_logger.warning(
-             "Failed to reset Redis counter %s: %s",
-             counter_key,
-             redis_err,
-         )
+ if spend_counter_cache is not None:
+     spend_counter_cache.in_memory_cache.set_cache(
+         key=counter_key, value=0.0
+     )
+     if spend_counter_cache.redis_cache is not None:
+         try:
+             await spend_counter_cache.redis_cache.async_set_cache(
+                 key=counter_key, value=0.0
+             )
+         except Exception as redis_err:
+             verbose_proxy_logger.warning(
+                 "Failed to reset Redis counter %s: %s",
+                 counter_key,
+                 redis_err,
+             )
  window["reset_at"] = get_budget_reset_time(
      budget_duration=window["budget_duration"]
  ).isoformat()
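The autofix reads the cache with getattr, which silently skips the reset if nothing ever set the attribute. The constructor-injection approach described in the Copilot text might look like the sketch below. ResetBudgetJobSketch and InMemoryCounterCache are hypothetical names, and the real ResetBudgetJob constructor takes other arguments that are omitted here.

```python
from typing import Dict, Optional


class InMemoryCounterCache:
    """Minimal stand-in for the spend counter cache (in-memory layer only)."""

    def __init__(self) -> None:
        self.store: Dict[str, float] = {}

    def set_cache(self, key: str, value: float) -> None:
        self.store[key] = value


class ResetBudgetJobSketch:
    """Receives the cache from its creator instead of importing proxy_server."""

    def __init__(self, spend_counter_cache: Optional[InMemoryCounterCache] = None) -> None:
        self.spend_counter_cache = spend_counter_cache

    def reset_counter(self, counter_key: str) -> bool:
        """Reset one counter; return False when no cache was injected."""
        if self.spend_counter_cache is None:
            return False
        self.spend_counter_cache.set_cache(key=counter_key, value=0.0)
        return True


# The module that owns the cache (proxy_server in this PR) would pass it in.
cache = InMemoryCounterCache()
cache.store["spend:key:abc:window:1h"] = 7.25
job = ResetBudgetJobSketch(spend_counter_cache=cache)
print(job.reset_counter("spend:key:abc:window:1h"), cache.store)
```

With explicit injection the dependency is visible at construction time, so a missing cache shows up where the job is created rather than as a skipped reset.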
- Add status_code=429 to BudgetExceededError class - auth_exception_handler hardcoded code=400 → code=429
- BudgetLimitEntry objects must be coerced via model_dump() not left as-is - Move _virtual_key_multi_budget_check into common_checks (was asymmetric with _team_multi_budget_check which already lived there)
Guard with isinstance check + json.loads() before iterating per-window Redis counters in increment_spend_counters
…ndows Log Redis counter reset failures as warnings so they are observable
5 tests covering: no budget_limits passes, under budget passes, over hourly window raises 429, over monthly window raises 429, BudgetLimitEntry objects coerced without KeyError
        await spend_counter_cache.async_increment_cache(
            key=f"spend:key:{hashed_token}:window:{i}",
            value=response_cost,
        )

if team_id is not None:
    await _init_and_increment_spend_counter(
        counter_key=f"spend:team:{team_id}",
        source_cache_key=f"team_id:{team_id}",
        increment=response_cost,
    )

    # Increment per-window budget counters for multi-budget teams
    team_obj = await user_api_key_cache.async_get_cache(key=f"team_id:{team_id}")
    if team_obj is not None:
        team_budget_limits = getattr(team_obj, "budget_limits", None) or (
            team_obj.get("budget_limits") if isinstance(team_obj, dict) else None
        )
        if isinstance(team_budget_limits, str):
            team_budget_limits = json.loads(team_budget_limits)
        if isinstance(team_budget_limits, list):
            for i in range(len(team_budget_limits)):
                await spend_counter_cache.async_increment_cache(
                    key=f"spend:team:{team_id}:window:{i}",
                    value=response_cost,
Per-window counters not seeded from DB on cold start
The single-window counter path uses _init_and_increment_spend_counter, which seeds the Redis counter from the cached object's accumulated .spend on first access (when the key is absent in Redis). This prevents under-counting after a Redis restart.
The per-window increments here use bare async_increment_cache calls with no equivalent seeding logic. After a Redis flush or restart, all window:{i} counters reset to zero. get_current_spend will return only the cost of the most recent request, so the budget check treats the window as nearly empty — a key can spend up to its full max_budget again within the same window period, bypassing enforcement.
There is no per-window accumulated spend stored in the DB (only reset_at timestamps), so perfect re-seeding is not straightforward. At a minimum, this limitation should be documented in the function docstring; a more complete fix would store a per-window cumulative spend value in the budget_limits JSON and re-seed from it on counter miss.
The same gap applies to the team window counter increments a few lines below.
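The more complete fix suggested above (persist a per-window cumulative spend and re-seed from it on a counter miss) could be sketched as follows. This is an illustration only: the PR does not store per-window spend, so the persisted_spend argument and the increment_window_counter helper are assumptions, and a dict stands in for Redis.

```python
from typing import Dict, Optional


def increment_window_counter(
    cache: Dict[str, float],
    counter_key: str,
    increment: float,
    persisted_spend: Optional[float],
) -> float:
    """Increment a per-window counter, seeding it from persisted spend on a miss.

    persisted_spend is hypothetical: it represents a cumulative per-window value
    that would have to be stored alongside reset_at in the budget_limits JSON.
    """
    if counter_key not in cache:
        # Cold start (e.g. after a Redis flush): start from the last persisted
        # value rather than from zero.
        cache[counter_key] = persisted_spend or 0.0
    cache[counter_key] += increment
    return cache[counter_key]


redis_like: Dict[str, float] = {}  # empty, as after a restart
total = increment_window_counter(
    redis_like, "spend:key:abc:window:30d", increment=0.25, persisted_spend=9.5
)
print(total)
```

A real implementation would need the seed-and-increment step to be atomic across proxy instances, which a plain dict does not model.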
@greptile-apps all open comments are now resolved. Here's a summary of every fix:
- P1 – Counter index misalignment (windows reordered/removed)
- P1 – HTTP 400 instead of 429 on budget exceeded
- P1 – No-op else branch (BudgetLimitEntry objects not coerced)
- P1 –
- P1 –
- P2 – Silent
- P2 – 5 unit tests added in
Please re-review when you get a chance.
  import { useQueryClient } from "@tanstack/react-query";
  import { Accordion, AccordionBody, AccordionHeader, Button, Col, Grid, Text, TextInput, Title } from "@tremor/react";
- import { Button as Button2, Form, Input, Modal, Radio, Select, Switch, Tag, Tooltip } from "antd";
+ import { Button as Button2, Form, Input, InputNumber, Modal, Radio, Select, Switch, Tag, Tooltip } from "antd";
Check notice: Code scanning / CodeQL
Unused variable, import, function or class (Note)
Copilot Autofix (AI, 21 days ago)
In general, unused imports should be removed to reduce clutter and avoid confusion. Here, we should delete InputNumber from the destructuring import from "antd" on line 12.
The best minimal fix without changing existing functionality is:
- Edit ui/litellm-dashboard/src/components/organisms/create_key_button.tsx.
- In the import { Button as Button2, Form, Input, InputNumber, Modal, Radio, Select, Switch, Tag, Tooltip } from "antd"; line, remove InputNumber, from the import list.
- No other code changes, new methods, or additional imports are needed.
@@ -9,7 +9,7 @@
  import { InfoCircleOutlined } from "@ant-design/icons";
  import { useQueryClient } from "@tanstack/react-query";
  import { Accordion, AccordionBody, AccordionHeader, Button, Col, Grid, Text, TextInput, Title } from "@tremor/react";
- import { Button as Button2, Form, Input, InputNumber, Modal, Radio, Select, Switch, Tag, Tooltip } from "antd";
+ import { Button as Button2, Form, Input, Modal, Radio, Select, Switch, Tag, Tooltip } from "antd";
  import debounce from "lodash/debounce";
  import React, { useCallback, useEffect, useState } from "react";
  import { rolesWithWriteAccess } from "../../utils/roles";
  import { InfoCircleOutlined } from "@ant-design/icons";
  import { TextInput, Button as TremorButton } from "@tremor/react";
- import { Form, Input, Select, Switch, Tooltip } from "antd";
+ import { Button as AntButton, Form, Input, InputNumber, Select, Switch, Tooltip } from "antd";
Check notice: Code scanning / CodeQL
Unused variable, import, function or class (Note)
Copilot Autofix (AI, 21 days ago)
To fix unused imports, remove only the specific identifiers that are not used anywhere in the file, while keeping the rest of the import statement intact. This avoids changing runtime behavior and only cleans up dead code.
In this case, we should edit ui/litellm-dashboard/src/components/templates/key_edit_view.tsx and modify the antd import on line 8 to drop Button as AntButton and InputNumber from the imported names. The resulting import will still bring in Form, Input, Select, Switch, and Tooltip, which are presumably used in the component. No new methods, imports, or definitions are required; we are only pruning unused specifiers from an existing import.
@@ -5,7 +5,7 @@
  import PolicySelector from "@/components/policies/PolicySelector";
  import { InfoCircleOutlined } from "@ant-design/icons";
  import { TextInput, Button as TremorButton } from "@tremor/react";
- import { Button as AntButton, Form, Input, InputNumber, Select, Switch, Tooltip } from "antd";
+ import { Form, Input, Select, Switch, Tooltip } from "antd";
  import { useEffect, useState } from "react";
  import { rolesWithWriteAccess } from "../../utils/roles";
  import AgentSelector from "../agent_management/AgentSelector";
#25109)
* feat: multiple concurrent budget windows per API key and team (#24883)
* feat(proxy): add BudgetLimitEntry type and wire budget_limits into key/team models
* feat(schema): add budget_limits Json column to VerificationToken and TeamTable
* feat(migrations): add migration for budget_limits column on keys and teams
* feat(keys): initialize budget_limits windows with reset_at on key create/update
* feat(teams): initialize budget_limits windows with reset_at on team create/update
* feat(auth): add _virtual_key_multi_budget_check and _team_multi_budget_check
* feat(auth): call multi-budget checks from common_checks for keys and teams
* feat(proxy): increment per-window Redis spend counters after each request
* feat(budget): reset individual budget windows on schedule via reset_budget_job
* feat(ui): add hourly option to BudgetDurationDropdown
* feat(ui): add budget_limits field to KeyResponse type
* feat(ui): add Budget Windows editor to key edit view
* feat(ui): add Budget Windows editor to create key form
* fix(proxy): strip budget_limits=None before Prisma upsert to fix login 500
  Prisma rejects nullable JSON fields (Json? without @default) when passed as Python None; it needs the field omitted entirely so the DB stores NULL via the column's nullable constraint. This was breaking /v2/login because the UI session key creation path hit the upsert with budget_limits=None.
* ui(key-edit): use antd InputNumber+Button for budget windows, add reset hints
* ui(create-key): use antd InputNumber+Button for budget windows, add reset hints
* docs(users): add multiple budget windows section with API + dashboard walkthrough
* fix: BudgetExceededError returns HTTP 429 instead of 400
  - Add status_code=429 to BudgetExceededError class
  - auth_exception_handler hardcoded code=400 → code=429
* fix: no-op else branch in multi-budget auth checks causes KeyError
  - BudgetLimitEntry objects must be coerced via model_dump() not left as-is
  - Move _virtual_key_multi_budget_check into common_checks (was asymmetric with _team_multi_budget_check which already lived there)
* fix: len() on JSON string returns char count not window count
  Guard with isinstance check + json.loads() before iterating per-window Redis counters in increment_spend_counters
* fix: silent except:pass hides Redis reset failures in reset_budget_windows
  Log Redis counter reset failures as warnings so they are observable
* test: add unit tests for multi-budget window enforcement
  5 tests covering: no budget_limits passes, under budget passes, over hourly window raises 429, over monthly window raises 429, BudgetLimitEntry objects coerced without KeyError
* fix: key per-window counters stable across reorders (duration key, not index)
* fix: team+key per-window spend increments use duration key, not index
* fix: budget window reset uses duration key; log failures instead of swallowing
* refactor: extract BudgetWindowsEditor to shared component
* refactor: key_edit_view imports BudgetWindowsEditor from shared component
* refactor: create_key_button imports BudgetWindowsEditor from shared component
---------
Co-authored-by: Ishaan Jaffer <ishaanjaffer0324@gmail.com>
* fix(reset_budget_job): extract _reset_expired_window helper to fix PLR0915 too many statements
* feat(skills): Skills Registry & Hub - register skills, browse in AI Hub, public skill hub (#25118)
* feat(skills): add domain and namespace fields to plugin types
* feat(skills): store and return domain/namespace inside manifest_json
* feat(skills): add /public/skill_hub endpoint for unauthenticated access
* feat(skills): whitelist /public/skill_hub from auth requirements
* feat(skills): add domain, namespace to Plugin and RegisterPluginRequest types
* feat(skills): smart URL parser - paste github URL, auto-detect source type and name
* feat(skills): replace enable toggle with Public badge, make rows clickable
* feat(skills): add skill detail view with Overview and How to Use tabs
* feat(skills): add MakeSkillPublicForm modal for publishing skills to the hub
* feat(skills): rename panel to Skills, wire in skill detail view on row click
* feat(skills): add skill hub table columns - name, description, domain, source, status
* feat(skills): add SkillHubDashboard with stats row, domain dropdown filter, and table
* feat(skills): add Skill Hub tab to AI Hub with Select Skills to Make Public button
* feat(skills): move Skills to top-level nav item directly under MCP Servers
* feat(skills): add skillHubPublicCall and NEXT_PUBLIC_BASE_URL support
* feat(skills): add Skill Hub tab to public AI Hub page
* feat(skills): add skills page routing in main app router
* feat(skills): add /skills page route
* chore: update package-lock after npm install
* docs(skills): add Skills Gateway doc page with mermaid architecture diagram
* docs(skills): add Skills Gateway to sidebar under Agent & MCP Gateway
* docs(skills): add loom walkthrough video to Skills Gateway doc
* chore: fixes
---------
Co-authored-by: Ishaan Jaffer <ishaanjaffer0324@gmail.com>
Co-authored-by: Yuneng Jiang <yuneng@berri.ai>
Hi @ishaan-berri, I get this error with this PR:
Relevant issues
Pre-Submission checklist
- tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
- make test-unit
- @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review
Type
🆕 New Feature
Changes
Adds budget_limits, a JSON array of independent budget windows on a virtual key or team. Useful for enforcing e.g. a $10/hour cap AND a $500/month cap simultaneously without blocking legitimate bursty use.
How it works
New field: budget_limits: [{budget_duration, max_budget}] on keys and teams. Each window tracks spend independently in Redis and resets on its own schedule. The existing max_budget / budget_duration single-window fields are untouched.
API demo (works today against the proxy):
Backend changes
- _types.py: BudgetLimitEntry Pydantic model; budget_limits field on GenerateRequestBase, UserAPIKeyAuth, TeamBase, UpdateTeamRequest; added to set_model_info JSON-string parser
- schema.prisma + migration: budget_limits Json? column on LiteLLM_VerificationToken and LiteLLM_TeamTable
- key_management_endpoints.py / team_endpoints.py: initialize reset_at per window on create/update
- auth_checks.py: _virtual_key_multi_budget_check() and _team_multi_budget_check() check each window against its Redis counter and raise BudgetExceededError if exceeded
- user_api_key_auth.py: calls both checks from common_checks()
- proxy_server.py: increments spend:key:{token}:window:{i} and spend:team:{team_id}:window:{i} Redis counters after each request
- reset_budget_job.py: reset_budget_windows() resets only expired windows, updates reset_at for each, leaves non-expired windows untouched
UI changes
- BudgetDurationDropdown component