[April 6th] - Ishaan#25238

Closed
ishaan-berri wants to merge 8 commits into main from litellm_ishaan_april6

Conversation

@ishaan-berri
Contributor

…ATE (#25227)

  • Add STALE_OBJECT_CLEANUP_BATCH_SIZE constant

Configurable batch limit (default 1000) for stale managed object cleanup, preventing unbounded UPDATE queries from hitting 300K+ rows at once.

  • Batch-limit stale managed object cleanup with single bounded SQL query

Two fixes to _cleanup_stale_managed_objects:

  1. Replace unbounded update_many with a single execute_raw using a subquery LIMIT, capping each poll cycle to STALE_OBJECT_CLEANUP_BATCH_SIZE rows. Zero rows loaded into Python memory — everything stays in Postgres. Uses the same PostgreSQL raw-SQL pattern as spend_log_cleanup.py (the proxy requires PostgreSQL per schema.prisma).

  2. Extract _expire_stale_rows as a separate method for testability.

Keeps the file_purpose='response' filter to avoid incorrectly expiring long-running batch or fine-tune jobs that legitimately exceed the staleness cutoff.
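
The batch-size constant described above could be sketched as follows. This is an illustrative reconstruction (the actual definition lives in litellm/constants.py):

```python
import os

# Sketch of the batch-size constant: env-var override with a max(1, ...) guard
# so a misconfigured value of 0 or less can never disable the bound.
STALE_OBJECT_CLEANUP_BATCH_SIZE = max(
    1, int(os.getenv("STALE_OBJECT_CLEANUP_BATCH_SIZE", "1000"))
)
```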

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement (see details).
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable, but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

@vercel

vercel Bot commented Apr 6, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: litellm | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Apr 7, 2026 0:27am


@CLAassistant

CLAassistant commented Apr 6, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 3 committers have signed the CLA.

✅ otaviofbrito
❌ ishaan-berri
❌ github-actions[bot]
You have signed the CLA already but the status is still pending? Let us recheck it.

@codspeed-hq
Contributor

codspeed-hq Bot commented Apr 6, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_ishaan_april6 (1c238b6) with main (d132b1b)

Open in CodSpeed

@greptile-apps
Contributor

greptile-apps Bot commented Apr 6, 2026

Greptile Summary

This PR adds three related features: (1) per-entity multi-window budget tracking (budget_limits) for both keys and teams, (2) per-team-member model scoping (allowed_models on the budget table, default_team_member_models on the team table), and (3) a batched stale managed-object cleanup for the responses API.

Several issues were flagged in the prior review round and remain open (merge conflict markers in test_bedrock_common_utils.py, budget_limits absent from all three schema.prisma files, raw SQL in _expire_stale_rows, tools/system dropped from the count_tokens call). This round surfaces one new finding:

  • reset_budget_windows (P1): The new reset_budget_job.py method loads all keys and teams with budget_limits in a single unbounded find_many (no take/skip) and issues one update() per dirty record inside the loop. Both patterns violate CLAUDE.md's DB rules (bound large result sets with cursor pagination; batch writes rather than per-row calls).
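
A CLAUDE.md-compliant shape for that read path is cursor pagination with a hard page bound. The sketch below is illustrative, not the PR's code: `fetch_page(cursor, limit)` stands in for a Prisma-style `find_many(take=limit, cursor=...)` call, and a real fix would also batch the per-page writes instead of updating row by row.

```python
from typing import Any, Callable, List, Optional

def paginate(
    fetch_page: Callable[[Optional[Any], int], List[Any]],
    page_size: int,
) -> List[Any]:
    """Drain a large result set in bounded pages instead of one unbounded query."""
    rows: List[Any] = []
    cursor = None
    while True:
        page = fetch_page(cursor, page_size)
        if not page:
            break
        rows.extend(page)
        cursor = page[-1]          # resume after the last row of this page
        if len(page) < page_size:  # a short page means the set is drained
            break
    return rows
```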

Confidence Score: 4/5

Not ready to merge — open P0/P1 findings from prior rounds (merge conflict breaks test file, budget_limits missing from Prisma schema breaks all multi-budget-window runtime paths) plus a new P1 unbounded-query/N+1 write pattern in reset_budget_windows.

Score of 4 is appropriate: there is one confirmed new P1 finding (unbounded find_many + N+1 updates in reset_budget_windows), and prior-round P1/P0 items (merge conflict markers in test file, budget_limits schema drift, raw SQL) are still open per file inspection. Multiple P1s keep the score at 4 rather than 5.

  • litellm/proxy/common_utils/reset_budget_job.py (new N+1 / unbounded query)
  • tests/test_litellm/llms/bedrock/test_bedrock_common_utils.py (merge conflict)
  • litellm/proxy/schema.prisma, schema.prisma, and litellm-proxy-extras/litellm_proxy_extras/schema.prisma (budget_limits column missing)
  • enterprise/litellm_enterprise/proxy/common_utils/check_responses_cost.py (raw SQL)
  • litellm/proxy/proxy_server.py (tools/system dropped from count_tokens)

Important Files Changed

  • litellm/proxy/common_utils/reset_budget_job.py: Added reset_budget_windows() with unbounded find_many (no take/skip) and N+1 individual update calls inside a for-loop, violating CLAUDE.md DB rules for large result sets and batch writes
  • enterprise/litellm_enterprise/proxy/common_utils/check_responses_cost.py: Extracted _expire_stale_rows using raw SQL execute_raw with batch LIMIT; raw SQL pattern flagged in prior review as violating CLAUDE.md
  • litellm/proxy/_types.py: Added BudgetLimitEntry, budget_limits on GenerateRequestBase/TeamBase/UpdateTeamRequest/LiteLLM_VerificationToken, allowed_models on LiteLLM_BudgetTable; all syntactically valid in current HEAD
  • litellm/proxy/auth/auth_checks.py: Added _check_team_member_model_access, _team_multi_budget_check, _virtual_key_multi_budget_check; all correctly follow the existing get_team_membership caching pattern
  • litellm/proxy/proxy_server.py: Added per-window spend counter increments for budget_limits keys/teams; tools/system params dropped from count_tokens call (flagged in prior review)
  • tests/test_litellm/llms/bedrock/test_bedrock_common_utils.py: Unresolved git merge conflict markers at lines 37-43 prevent file from parsing as valid Python — all tests in module fail at collection time (flagged in prior review)
  • litellm/constants.py: Added STALE_OBJECT_CLEANUP_BATCH_SIZE constant with env-var override, max(1,...) guard, and sensible default of 1000; clean addition
  • litellm-proxy-extras/litellm_proxy_extras/schema.prisma: Added allowed_models to LiteLLM_BudgetTable and default_team_member_models to LiteLLM_TeamTable; budget_limits still absent from both LiteLLM_TeamTable and LiteLLM_VerificationToken (flagged in prior review)

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Incoming Request] --> B[user_api_key_auth]
    B --> C[common_checks]
    C --> D{team_object present?}
    D -- Yes --> E[_team_max_budget_check]
    D -- Yes --> F[_team_multi_budget_check]
    F --> F1[get_current_spend\nspend:team:ID:window:DURATION]
    F1 --> F2{spend >= max_budget?}
    F2 -- Yes --> ERR[BudgetExceededError]
    D -- Yes, with user_id --> G[_check_team_member_model_access]
    G --> G1[get_team_membership\ncache-first DB lookup]
    G1 --> G2{allowed_models non-empty?}
    G2 -- Yes --> G3[_can_object_call_model]
    G3 -- denied --> ERR
    C --> H{valid_token present?}
    H -- Yes --> I[_virtual_key_multi_budget_check]
    I --> I1[get_current_spend\nspend:key:TOKEN:window:DURATION]
    I1 --> I2{spend >= max_budget?}
    I2 -- Yes --> ERR
    C --> J[Request proceeds]

    K[Response complete] --> L[increment_spend_counters]
    L --> M[increment key spend counter]
    L --> N[increment per-window key counters\nspend:key:TOKEN:window:DURATION]
    L --> O[increment team spend counter]
    L --> P[increment per-window team counters\nspend:team:ID:window:DURATION]

    Q[ResetBudgetJob poll] --> R[reset_budget_windows]
    R --> R1[find_many keys with budget_limits\n⚠️ no take/skip limit]
    R1 --> R2[for each expired window\nreset Redis counter\nupdate DB one-by-one\n⚠️ N+1 writes]
    R --> R3[find_many teams with budget_limits\n⚠️ no take/skip limit]
    R3 --> R4[for each expired window\nreset Redis counter\nupdate DB one-by-one\n⚠️ N+1 writes]

Reviews (6): Last reviewed commit: "fix(tests): update upsert tests to refle..."

Comment on lines +49 to +64
return await self.prisma_client.db.execute_raw(
    """
    UPDATE "LiteLLM_ManagedObjectTable"
    SET "status" = 'stale_expired'
    WHERE "id" IN (
        SELECT "id" FROM "LiteLLM_ManagedObjectTable"
        WHERE "file_purpose" = 'response'
          AND "status" NOT IN ('completed', 'complete', 'failed', 'expired', 'cancelled', 'stale_expired')
          AND "created_at" < $1::timestamptz
        ORDER BY "created_at" ASC
        LIMIT $2
    )
    """,
    cutoff,
    batch_size,
)

P2 Raw SQL bypasses ORM layer

The CLAUDE.md rule says: "Do not write raw SQL for proxy DB operations. Use Prisma model methods instead of execute_raw / query_raw". A Prisma-native implementation avoids hand-written SQL, keeps the code testable with simple mocks, and removes schema-drift risk — while still bounding the batch:

async def _expire_stale_rows(self, cutoff: datetime, batch_size: int) -> int:
    stale = await self.prisma_client.db.litellm_managedobjecttable.find_many(
        where={
            "file_purpose": "response",
            "status": {"not_in": ["completed", "complete", "failed", "expired", "cancelled", "stale_expired"]},
            "created_at": {"lt": cutoff},
        },
        order={"created_at": "asc"},
        take=batch_size,
        select={"id": True},
    )
    if not stale:
        return 0
    await self.prisma_client.db.litellm_managedobjecttable.update_many(
        where={"id": {"in": [r.id for r in stale]}},
        data={"status": "stale_expired"},
    )
    return len(stale)

This is two DB round-trips instead of one, but it stays in the ORM layer and matches how the rest of the proxy interacts with the DB. Note: spend_log_cleanup.py also uses execute_raw as a precedent for DELETE … WHERE … IN (SELECT … LIMIT n) — this is a style nudge rather than a hard blocker, but worth aligning with the stated convention.

Context Used: CLAUDE.md (source)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +36 to +64
async def _expire_stale_rows(
    self, cutoff: datetime, batch_size: int
) -> int:
    """Execute the bounded UPDATE that marks stale rows as 'stale_expired'.

    Isolated so it can be swapped / mocked in tests without touching the
    orchestration logic in ``_cleanup_stale_managed_objects``.

    Uses PostgreSQL syntax (``$1::timestamptz``, ``LIMIT``, double-quoted
    identifiers) which is the only dialect the proxy supports — every
    ``schema.prisma`` in the repo sets ``provider = "postgresql"``.
    Same pattern as ``spend_log_cleanup.py``.
    """
    return await self.prisma_client.db.execute_raw(
        """
        UPDATE "LiteLLM_ManagedObjectTable"
        SET "status" = 'stale_expired'
        WHERE "id" IN (
            SELECT "id" FROM "LiteLLM_ManagedObjectTable"
            WHERE "file_purpose" = 'response'
              AND "status" NOT IN ('completed', 'complete', 'failed', 'expired', 'cancelled', 'stale_expired')
              AND "created_at" < $1::timestamptz
            ORDER BY "created_at" ASC
            LIMIT $2
        )
        """,
        cutoff,
        batch_size,
    )

P2 No tests added for the new method

The PR's pre-submission checklist shows the test checkbox unchecked, and CLAUDE.md states: "Adding at least 1 test is a hard requirement". The _expire_stale_rows docstring explicitly calls out that it was "Isolated so it can be swapped / mocked in tests without touching the orchestration logic" — but no tests were added.

At minimum, a unit test in tests/test_litellm/ that mocks prisma_client.db.execute_raw should verify:

  1. The batch cap (STALE_OBJECT_CLEANUP_BATCH_SIZE) is passed correctly.
  2. The affected-row count is returned and triggers the warning log.
  3. Zero rows → no warning emitted.

Without tests the batch-size guard and the refactored flow cannot be automatically regressed against.
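
The minimal test the review asks for could be sketched like this. The class name here is a hypothetical stand-in with the same `_expire_stale_rows` body; a real test would import the actual class from check_responses_cost.py:

```python
import asyncio
from datetime import datetime, timezone
from unittest.mock import AsyncMock

class FakeCleanup:
    """Hypothetical stand-in mirroring the PR's _expire_stale_rows body."""
    def __init__(self, prisma_client):
        self.prisma_client = prisma_client

    async def _expire_stale_rows(self, cutoff, batch_size):
        # one bounded raw UPDATE; the driver returns the affected-row count
        return await self.prisma_client.db.execute_raw("...", cutoff, batch_size)

async def run_test():
    prisma = AsyncMock()
    prisma.db.execute_raw = AsyncMock(return_value=7)  # DB reports 7 rows expired
    job = FakeCleanup(prisma)
    cutoff = datetime(2026, 4, 1, tzinfo=timezone.utc)
    n = await job._expire_stale_rows(cutoff, batch_size=1000)
    # the batch cap must be forwarded as the LIMIT bind parameter
    assert prisma.db.execute_raw.await_args.args[-1] == 1000
    return n
```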

…nt (#21352)

* return actual status code - /count_tokens endpoint

* Apply suggestion from @greptile-apps[bot]

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* fix greptile suggestion

* rollback file

* add test case

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: ishaan-berri <155045088+ishaan-berri@users.noreply.github.com>
Comment on lines +9120 to +9127
try:
    result = await provider_counter.count_tokens(
        model_to_use=model_to_use or "",
        messages=messages,  # type: ignore
        contents=contents,
        deployment=deployment,
        request_model=request.model,
    )

P1 tools and system parameters silently dropped from token counting call

The refactor to add the httpx.HTTPStatusError handler also accidentally removed tools=tools and system=system from the count_tokens call. Both variables are captured from the request at lines 9064-9065 and are actively used by the Anthropic and Gemini token counters to include tool definitions and system prompts in the server-side token count.

With this change:

  • Requests that include tools will receive an underestimated token count (tool definitions can be hundreds/thousands of tokens)
  • Requests that include a system prompt will similarly receive an undercount

The BaseTokenCounter.count_tokens abstract signature explicitly accepts both params (tools: Optional[List[Dict[str, Any]]] = None, system: Optional[Any] = None), and AnthropicTokenCounter passes them straight through to anthropic_count_tokens_handler.handle_count_tokens_request.

Suggested change

Before:
try:
    result = await provider_counter.count_tokens(
        model_to_use=model_to_use or "",
        messages=messages,  # type: ignore
        contents=contents,
        deployment=deployment,
        request_model=request.model,
    )

After:
try:
    result = await provider_counter.count_tokens(
        model_to_use=model_to_use or "",
        messages=messages,  # type: ignore
        contents=contents,
        deployment=deployment,
        request_model=request.model,
        tools=tools,
        system=system,
    )

…#24950)

* fix(bedrock): strip [1m]/[200k] context window suffixes before cost lookup

* test(bedrock): add test for [1m] context window suffix stripping in cost lookup

* schema: add allowed_models to BudgetTable, default_team_member_models to TeamTable

* migration: add allowed_models and default_team_member_models columns

* types: add allowed_models to TeamMemberAddRequest, TeamMemberUpdateRequest, UpdateTeamRequest

* utils: add allowed_models param to add_new_member, persist to budget table

* common_utils: add allowed_models to _upsert_budget_and_membership

* team endpoints: seed allowed_models on member_add, persist on member_update and team/update

* auth: enforce per-member allowed_models at request time

* networking: add allowed_models to Member type and teamMemberUpdateCall

* TeamMemberTab: add Model Scope column showing per-member allowed_models

* EditMembership: add Allowed Models multi-select field

* TeamInfo: add default_team_member_models field in Settings tab

* chore: sync schema.prisma copies from root

* fix(team_member_update): update existing budget in-place instead of creating new one

When a member already has a budget_id, patch only the fields the caller
provided rather than always creating a fresh budget record.  The old
code ignored existing_budget_id entirely, so updating only allowed_models
silently dropped the stored max_budget / tpm_limit / rpm_limit values.

* fix(auth): pass llm_router to _check_team_member_model_access

Without the router, _can_object_call_model cannot resolve wildcard model
names (e.g. openai/*) or access-group names in allowed_models, causing
legitimate requests to be denied.  Thread the existing llm_router from
_run_common_checks through to the new member-scope check.

* feat(ui): add Team Member Settings accordion to Create Team modal

Groups default_team_member_models, member budget/key duration, and
tpm/rpm defaults into a single collapsible section. The model picker
is filtered to only show the models selected for the team, and the
copy distinguishes it from the team-level Models field.

* feat(ui): consolidate Team Member Settings into accordion in edit team form

Moves default_team_member_models + per-member budget/key/tpm/rpm fields
into a collapsible "Team Member Settings" panel. Keeps the top-level
form focused on team-wide settings (team models, team budget, tpm/rpm).

* fix(ui): use tremor Accordion for Team Member Settings in edit team form

* fix(ui): move Team Member Settings accordion above budget fields in Create Team

* chore: fixes

---------

Co-authored-by: Ishaan Jaffer <ishaanjaffer0324@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yuneng Jiang <yuneng@berri.ai>
Comment on lines +37 to +43
<<<<<<< worktree-rustling-wishing-kite
assert base_model == "anthropic.claude-3-5-sonnet-20240620-v1:0"

=======
assert base_model == "anthropic.claude-haiku-4-5-20251001-v1:0"

>>>>>>> main

P0 Unresolved git merge conflict markers

Lines 37–43 contain live conflict markers (<<<<<<< worktree-rustling-wishing-kite, =======, >>>>>>> main). Python cannot parse this file, so every test in test_bedrock_common_utils.py fails with a SyntaxError at import time.

The main-branch version is correct — the model under test is bedrock/us-gov.anthropic.claude-haiku-4-5-20251001-v1:0, so stripping the us-gov. cross-region prefix should yield anthropic.claude-haiku-4-5-20251001-v1:0. The worktree assertion (anthropic.claude-3-5-sonnet-20240620-v1:0) refers to a completely different model and would be wrong.

Resolve by keeping only the correct assertion:

Suggested change

Before:
<<<<<<< worktree-rustling-wishing-kite
assert base_model == "anthropic.claude-3-5-sonnet-20240620-v1:0"
=======
assert base_model == "anthropic.claude-haiku-4-5-20251001-v1:0"
>>>>>>> main

After:
assert base_model == "anthropic.claude-haiku-4-5-20251001-v1:0"

#25109)

* feat: multiple concurrent budget windows per API key and team (#24883)

* feat(proxy): add BudgetLimitEntry type and wire budget_limits into key/team models

* feat(schema): add budget_limits Json column to VerificationToken and TeamTable

* feat(migrations): add migration for budget_limits column on keys and teams

* feat(keys): initialize budget_limits windows with reset_at on key create/update

* feat(teams): initialize budget_limits windows with reset_at on team create/update

* feat(auth): add _virtual_key_multi_budget_check and _team_multi_budget_check

* feat(auth): call multi-budget checks from common_checks for keys and teams

* feat(proxy): increment per-window Redis spend counters after each request

* feat(budget): reset individual budget windows on schedule via reset_budget_job

* feat(ui): add hourly option to BudgetDurationDropdown

* feat(ui): add budget_limits field to KeyResponse type

* feat(ui): add Budget Windows editor to key edit view

* feat(ui): add Budget Windows editor to create key form

* fix(proxy): strip budget_limits=None before Prisma upsert to fix login 500

Prisma rejects nullable JSON fields (Json? without @default) when passed as
Python None — it needs the field omitted entirely so the DB stores NULL via
the column's nullable constraint. This was breaking /v2/login because the UI
session key creation path hit the upsert with budget_limits=None.
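
The workaround in this commit amounts to dropping nullable-Json fields that are None before handing the payload to Prisma. A minimal sketch, with an illustrative helper name (not the PR's actual code):

```python
# Json? columns must be omitted entirely rather than passed as Python None,
# so strip such keys from the write payload before the upsert.
def strip_none_json_fields(data: dict, json_fields=("budget_limits",)) -> dict:
    return {k: v for k, v in data.items() if v is not None or k not in json_fields}
```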

* ui(key-edit): use antd InputNumber+Button for budget windows, add reset hints

* ui(create-key): use antd InputNumber+Button for budget windows, add reset hints

* docs(users): add multiple budget windows section with API + dashboard walkthrough

* fix: BudgetExceededError returns HTTP 429 instead of 400

- Add status_code=429 to BudgetExceededError class
- auth_exception_handler hardcoded code=400 → code=429

* fix: no-op else branch in multi-budget auth checks causes KeyError

- BudgetLimitEntry objects must be coerced via model_dump() not left as-is
- Move _virtual_key_multi_budget_check into common_checks (was asymmetric
  with _team_multi_budget_check which already lived there)

* fix: len() on JSON string returns char count not window count

Guard with isinstance check + json.loads() before iterating per-window
Redis counters in increment_spend_counters
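
The guard this commit describes can be sketched as a small coercion helper (an illustrative reconstruction, not the PR's code): budget_limits may come back from the DB as an already-parsed list or as a raw JSON string, and len() on the string counts characters, not windows.

```python
import json

def coerce_windows(raw):
    # Accept a parsed list, a JSON string, or None/empty; always return a list.
    if not raw:
        return []
    return raw if isinstance(raw, list) else json.loads(raw)
```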

* fix: silent except:pass hides Redis reset failures in reset_budget_windows

Log Redis counter reset failures as warnings so they are observable

* test: add unit tests for multi-budget window enforcement

5 tests covering: no budget_limits passes, under budget passes,
over hourly window raises 429, over monthly window raises 429,
BudgetLimitEntry objects coerced without KeyError

* fix: key per-window counters stable across reorders (duration key, not index)

* fix: team+key per-window spend increments use duration key, not index

* fix: budget window reset uses duration key; log failures instead of swallowing

* refactor: extract BudgetWindowsEditor to shared component

* refactor: key_edit_view imports BudgetWindowsEditor from shared component

* refactor: create_key_button imports BudgetWindowsEditor from shared component

---------

Co-authored-by: Ishaan Jaffer <ishaanjaffer0324@gmail.com>

* fix(reset_budget_job): extract _reset_expired_window helper to fix PLR0915 too many statements

* feat(skills): Skills Registry & Hub — register skills, browse in AI Hub, public skill hub (#25118)

* feat(skills): add domain and namespace fields to plugin types

* feat(skills): store and return domain/namespace inside manifest_json

* feat(skills): add /public/skill_hub endpoint for unauthenticated access

* feat(skills): whitelist /public/skill_hub from auth requirements

* feat(skills): add domain, namespace to Plugin and RegisterPluginRequest types

* feat(skills): smart URL parser — paste github URL, auto-detect source type and name

* feat(skills): replace enable toggle with Public badge, make rows clickable

* feat(skills): add skill detail view with Overview and How to Use tabs

* feat(skills): add MakeSkillPublicForm modal for publishing skills to the hub

* feat(skills): rename panel to Skills, wire in skill detail view on row click

* feat(skills): add skill hub table columns — name, description, domain, source, status

* feat(skills): add SkillHubDashboard with stats row, domain dropdown filter, and table

* feat(skills): add Skill Hub tab to AI Hub with Select Skills to Make Public button

* feat(skills): move Skills to top-level nav item directly under MCP Servers

* feat(skills): add skillHubPublicCall and NEXT_PUBLIC_BASE_URL support

* feat(skills): add Skill Hub tab to public AI Hub page

* feat(skills): add skills page routing in main app router

* feat(skills): add /skills page route

* chore: update package-lock after npm install

* docs(skills): add Skills Gateway doc page with mermaid architecture diagram

* docs(skills): add Skills Gateway to sidebar under Agent & MCP Gateway

* docs(skills): add loom walkthrough video to Skills Gateway doc

* chore: fixes

---------

Co-authored-by: Ishaan Jaffer <ishaanjaffer0324@gmail.com>
Co-authored-by: Yuneng Jiang <yuneng@berri.ai>
@ishaan-berri ishaan-berri had a problem deploying to integration-redis-postgres April 6, 2026 21:02 — with GitHub Actions Failure
@ishaan-berri ishaan-berri had a problem deploying to integration-postgres April 6, 2026 21:02 — with GitHub Actions Failure
Comment thread litellm/proxy/_types.py
Comment on lines +1764 to +1768
budget_limits: Optional[List[BudgetLimitEntry]] = (
None # multiple concurrent budget windows
default_team_member_models: Optional[List[str]] = (
None # default allowed_models seeded onto new team members
)

P0 Syntax Error: missing closing ) on budget_limits field

The budget_limits field opens a parenthesis on line 1764 that is never closed before default_team_member_models begins on line 1766. The only ) on line 1768 is intended for default_team_member_models, so Python's implicit line continuation keeps the budget_limits expression open and sees an annotation statement (default_team_member_models: ...) inside a value expression — which raises SyntaxError: invalid syntax at import time. This makes the entire proxy unlaunchable.

Suggested change

Before:
budget_limits: Optional[List[BudgetLimitEntry]] = (
    None  # multiple concurrent budget windows
default_team_member_models: Optional[List[str]] = (
    None  # default allowed_models seeded onto new team members
)

After:
budget_limits: Optional[List[BudgetLimitEntry]] = (
    None  # multiple concurrent budget windows
)
default_team_member_models: Optional[List[str]] = (
    None  # default allowed_models seeded onto new team members
)

For keys and teams with budget_limits, reset any individual windows where
reset_at <= now. Only the expired windows are reset; other windows are untouched.
"""
from litellm.proxy.proxy_server import spend_counter_cache
Comment on lines +599 to +644
    all_keys = await self.prisma_client.db.litellm_verificationtoken.find_many(
        where={"budget_limits": {"not": None}}  # type: ignore[arg-type]
    )
    for key in all_keys:
        raw = key.budget_limits  # type: ignore[attr-defined]
        if not raw:
            continue
        windows: list = raw if isinstance(raw, list) else json.loads(raw)
        changed = False
        for window in windows:
            counter_key = f"spend:key:{key.token}:window:{window['budget_duration']}"
            if await ResetBudgetJob._reset_expired_window(
                window, counter_key, spend_counter_cache, now
            ):
                changed = True
        if changed:
            await self.prisma_client.db.litellm_verificationtoken.update(
                where={"token": key.token},
                data={"budget_limits": json.dumps(windows)},  # type: ignore[arg-type]
            )
except Exception as e:
    verbose_proxy_logger.exception(
        "Failed to reset budget windows for keys: %s", e
    )

# --- Teams ---
try:
    all_teams = await self.prisma_client.db.litellm_teamtable.find_many(
        where={"budget_limits": {"not": None}}  # type: ignore[arg-type]
    )
    for team in all_teams:
        raw = team.budget_limits  # type: ignore[attr-defined]
        if not raw:
            continue
        windows = raw if isinstance(raw, list) else json.loads(raw)
        changed = False
        for window in windows:
            counter_key = f"spend:team:{team.team_id}:window:{window['budget_duration']}"
            if await ResetBudgetJob._reset_expired_window(
                window, counter_key, spend_counter_cache, now
            ):
                changed = True
        if changed:
            await self.prisma_client.db.litellm_teamtable.update(
                where={"team_id": team.team_id},
                data={"budget_limits": json.dumps(windows)},  # type: ignore[arg-type]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 budget_limits absent from all Prisma schema files — feature broken at runtime

The migration 20260401000000_add_budget_limits/migration.sql correctly adds budget_limits JSONB to LiteLLM_VerificationToken and LiteLLM_TeamTable in the database. However, none of the three Prisma schema files (schema.prisma, litellm/proxy/schema.prisma, litellm-proxy-extras/litellm_proxy_extras/schema.prisma) declare this field for those two models — confirmed by reading the current file, neither model block contains budget_limits.

Because Prisma generates its client from the schema (not from live DB introspection), every budget_limits reference in reset_budget_windows() will fail at runtime:

  • find_many(where={"budget_limits": {"not": None}}) — Prisma rejects unknown filter fields with a validation error
  • key.budget_limits / team.budget_limits attribute accesses — returned objects never carry this field (always None or AttributeError)
  • update(data={"budget_limits": json.dumps(windows)}) — Prisma rejects unknown write fields

All of these failures are silently swallowed by the surrounding except Exception blocks, so budget windows will never reset and per-window spend counters will accumulate indefinitely. The # type: ignore[arg-type] / # type: ignore[attr-defined] annotations throughout confirm the author is aware the field is absent from the Prisma schema.

Fix: Add budget_limits Json? to LiteLLM_VerificationToken and LiteLLM_TeamTable in all three schema.prisma files and regenerate the Prisma client. Per CLAUDE.md, schema changes must be kept in sync across all copies.
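
The schema-side fix is a one-field addition per model, sketched below. Field placement is illustrative; it must be applied identically in all three schema.prisma copies and the Prisma client regenerated:

```prisma
model LiteLLM_VerificationToken {
  // ...existing fields unchanged...
  budget_limits Json?
}

model LiteLLM_TeamTable {
  // ...existing fields unchanged...
  budget_limits Json?
}
```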

