
Tag query fix #25094

Merged

ishaan-berri merged 5 commits into BerriAI:litellm_ishaan_april4 from harish876:tag-query-fix on Apr 4, 2026

Conversation

@harish876 harish876 (Contributor) commented Apr 3, 2026

Relevant issues

Improves DailyTagSpend write fanout/QPS imbalance by decoupling tag spend flushes from the main spend scheduler and running tag commits at a longer interval.

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link: TBD

  • CI run for the last commit
    Link: TBD

  • Merge / cherry-pick CI run
    Links: TBD

Type

🐛 Bug Fix
🧹 Refactoring
🚄 Infrastructure
✅ Test

Changes

  • Split DailyTagSpend flushes into a separate scheduler job instead of running through the main update_spend flow.
  • Configured tag spend job interval to run at a longer cadence using DAILY_TAG_SPEND_BATCH_MULTIPLIER, reducing tag-write burst pressure.
  • Kept request-path enqueue behavior unchanged, so tag spend data still accumulates per request and is eventually flushed.
  • Added and updated tests for:
    • update_daily_tag_spend delegation behavior
    • non-raising/logging behavior on tag flush errors
    • retry-path behavior for tag spend DB writes in the DB writer layer
  • Observed improvement target: DailyTagSpend call rate now tracks closer to DailyTeamSpend, improving QPS parity and reducing tag table overrepresentation.
  • The Docker dev build for the latest version was failing due to a missing g++ dependency.
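The interval math behind the decoupled job can be sketched as follows. The constant value and the `int(batch_writing_interval * DAILY_TAG_SPEND_BATCH_MULTIPLIER)` computation mirror what this PR describes for `proxy_server.py`; the helper function itself is illustrative, not LiteLLM's actual code.

```python
# Illustrative sketch, not LiteLLM's actual code: the dedicated
# DailyTagSpend flush job runs at a longer cadence than the main
# spend job, derived from the multiplier added in this PR.

DAILY_TAG_SPEND_BATCH_MULTIPLIER = 2.3  # litellm/constants.py (per this PR)

def tag_spend_interval(batch_writing_interval: int) -> int:
    """Interval for the dedicated tag spend flush job, in seconds."""
    return int(batch_writing_interval * DAILY_TAG_SPEND_BATCH_MULTIPLIER)

# With a 10s main batch-writing interval, the tag job fires every 23s,
# so tag writes burst less often than user/team spend writes.
print(tag_spend_interval(10))  # 23
```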

The fix reduces the QPS. The numbers below show the most frequently run queries tracked by Postgres over a 4-5 minute window under a simulated load test. The validation criterion is that the ratio of database calls between DailyTagSpend and DailyTeamSpend should be close to 1, meaning queries against these tables achieve roughly the same QPS.

DailyTagSpend Query Performance Analysis

Before Fix

Configuration avg_qps_since_reset
Without Redis Transaction Buffer 0.31182049273784314215
With Redis Transaction Buffer 0.32670871516733775359

With Adjusted Batch Interval Fix

Configuration avg_qps_since_reset
Without Redis Transaction Buffer 0.16589514881214770545
With Redis Transaction Buffer 0.14892439673581266593

Summary

  • Before Fix: the Redis buffer showed a ~4.8% increase in QPS (0.327 vs 0.312)
  • After Batch Interval Fix: the Redis buffer achieved a ~10.2% reduction in QPS (0.149 vs 0.166)
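As a quick sanity check, the percentage deltas in the summary can be recomputed from the full-precision `avg_qps_since_reset` figures in the tables above:

```python
# Recompute the summary deltas from the measured QPS averages.
before_no_redis = 0.31182049273784314215
before_redis = 0.32670871516733775359
after_no_redis = 0.16589514881214770545
after_redis = 0.14892439673581266593

def pct_change(new: float, old: float) -> float:
    """Percentage change of `new` relative to `old`."""
    return (new - old) / old * 100

# Before the fix the Redis buffer path was slightly higher than the
# plain path; after the batch-interval fix it is lower.
print(round(pct_change(before_redis, before_no_redis), 1))  # 4.8
print(round(pct_change(after_redis, after_no_redis), 1))    # -10.2
```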

Before Fix

Full Logs:

litellm=# select
  calls,
  left(query, 50) as query_prefix
from pg_stat_statements
where query ilike 'insert%'
order by calls desc
limit 5;
 calls |                    query_prefix                    
-------+----------------------------------------------------
    52 | INSERT INTO "public"."LiteLLM_DailyTagSpend" ("id"
    25 | INSERT INTO "public"."LiteLLM_DailyTeamSpend" ("id
    25 | INSERT INTO "public"."LiteLLM_DailyUserSpend" ("id
     1 | INSERT INTO "public"."LiteLLM_SpendLogs" ("model_i
     1 | INSERT INTO "public"."LiteLLM_SpendLogs" ("call_ty
(5 rows)


litellm=# SELECT
    left(query, 25) AS query_prefix,
    calls / NULLIF(
        EXTRACT(EPOCH FROM (now() - (
            SELECT stats_reset
            FROM pg_stat_statements_info
        ))),
        0
    ) AS avg_qps_since_reset
FROM pg_stat_statements
WHERE query LIKE 'INSERT INTO "public"."LiteLLM_DailyTagSpend" ("id"%'
ORDER BY calls DESC
LIMIT 10;
       query_prefix        |  avg_qps_since_reset   
---------------------------+------------------------
 INSERT INTO "public"."Lit | 0.31182049273784314215
(1 row)

Using Redis Transaction Buffer

litellm=# select
  calls,                            
  left(query, 50) as query_prefix
from pg_stat_statements
where query ilike 'insert%'
order by calls desc
limit 15;
 calls |                    query_prefix                    
-------+----------------------------------------------------
    50 | INSERT INTO "public"."LiteLLM_DailyTagSpend" ("id"
    25 | INSERT INTO "public"."LiteLLM_DailyUserSpend" ("id
    25 | INSERT INTO "public"."LiteLLM_DailyTeamSpend" ("id
(15 rows)

litellm=# SELECT
    left(query, 25) AS query_prefix,
    calls / NULLIF(
        EXTRACT(EPOCH FROM (now() - (
            SELECT stats_reset
            FROM pg_stat_statements_info
        ))),
        0
    ) AS avg_qps_since_reset
FROM pg_stat_statements
WHERE query LIKE 'INSERT INTO "public"."LiteLLM_DailyTagSpend" ("id"%'
ORDER BY calls DESC
LIMIT 10;
       query_prefix        |  avg_qps_since_reset   
---------------------------+------------------------
 INSERT INTO "public"."Lit | 0.32670871516733775359
(1 row)

With Adjusted Batch Interval Fix

Full Logs:

litellm=# select
  calls,
  left(query, 50) as query_prefix
from pg_stat_statements
where query ilike 'insert%'
order by calls desc
limit 5;
 calls |                    query_prefix                    
-------+----------------------------------------------------
    53 | INSERT INTO "public"."LiteLLM_DailyUserSpend" ("id
    53 | INSERT INTO "public"."LiteLLM_DailyTeamSpend" ("id
    52 | INSERT INTO "public"."LiteLLM_DailyTagSpend" ("id"
     1 | INSERT INTO "public"."LiteLLM_SpendLogs" ("respons
     1 | INSERT INTO "public"."LiteLLM_SpendLogs" ("mcp_nam
(5 rows)

litellm=# SELECT
    left(query, 25) AS query_prefix,
    calls / NULLIF(
        EXTRACT(EPOCH FROM (now() - (
            SELECT stats_reset
            FROM pg_stat_statements_info
        ))),
        0
    ) AS avg_qps_since_reset
FROM pg_stat_statements
WHERE query LIKE 'INSERT INTO "public"."LiteLLM_DailyTagSpend" ("id"%'
ORDER BY calls DESC
LIMIT 10;
       query_prefix        |  avg_qps_since_reset   
---------------------------+------------------------
 INSERT INTO "public"."Lit | 0.16589514881214770545
(1 row)

Using Redis Transaction Buffer

litellm=# select
  calls,                            
  left(query, 50) as query_prefix
from pg_stat_statements
where query ilike 'insert%'
order by calls desc
limit 15;
 calls |                    query_prefix                    
-------+----------------------------------------------------
    50 | INSERT INTO "public"."LiteLLM_DailyTeamSpend" ("id
    50 | INSERT INTO "public"."LiteLLM_DailyUserSpend" ("id
    50 | INSERT INTO "public"."LiteLLM_DailyTagSpend" ("id"
     1 | INSERT INTO "public"."LiteLLM_SpendLogs" ("end_use

litellm=# SELECT
    left(query, 25) AS query_prefix,
    calls / NULLIF(
        EXTRACT(EPOCH FROM (now() - (
            SELECT stats_reset
            FROM pg_stat_statements_info
        ))),
        0
    ) AS avg_qps_since_reset
FROM pg_stat_statements
WHERE query LIKE 'INSERT INTO "public"."LiteLLM_DailyTagSpend" ("id"%'
ORDER BY calls DESC
LIMIT 10;
       query_prefix        |  avg_qps_since_reset   
---------------------------+------------------------
 INSERT INTO "public"."Lit | 0.14892439673581266593
(1 row)

@vercel vercel Bot commented Apr 3, 2026

The latest updates on your projects.

Project Deployment Actions Updated (UTC)
litellm Ready Ready Preview, Comment Apr 4, 2026 6:41am


@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@codspeed-hq codspeed-hq Bot commented Apr 3, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing harish876:tag-query-fix (e9336a4) with main (a5322c6)


@harish876 harish876 marked this pull request as ready for review April 4, 2026 05:21
@greptile-apps greptile-apps Bot commented Apr 4, 2026

Greptile Summary

This PR decouples DailyTagSpend flushes from the main spend scheduler by introducing a dedicated APScheduler job (update_daily_tag_spend) that runs at a 2.3× longer interval (DAILY_TAG_SPEND_BATCH_MULTIPLIER). The change removes daily_tag_spend_update_queue from the main store_in_memory_spend_updates_in_redis pipeline and adds two new instance methods — _commit_daily_tag_spend_to_db and _commit_daily_tag_spend_to_db_with_redis — to DBSpendUpdateWriter that are called exclusively by the new scheduler job. A Docker dev build fix (g++ dependency) is included as a minor side-fix.

Key changes:

  • db_spend_update_writer.py: Tag spend removed from _commit_spend_updates_to_db_without_redis_buffer; new _commit_daily_tag_spend_to_db[_with_redis] methods added, both correctly guarded and following the existing lock-then-drain pattern.
  • redis_update_buffer.py: Tag queue parameter removed from store_in_memory_spend_updates_in_redis; new store_in_memory_daily_tag_spend_updates_in_redis and get_all_daily_tag_spend_update_transactions_from_redis_buffer methods added for the Redis-buffered path.
  • proxy_server.py: New scheduler job registered at int(batch_writing_interval * DAILY_TAG_SPEND_BATCH_MULTIPLIER) seconds.
  • utils.py: New update_daily_tag_spend coroutine as the scheduler entry point; main update_spend docstring updated to note the tag spend omission.
  • Tests added in test_update_daily_tag_spend.py (delegation, error suppression, Redis-path branching, and retry logic) and test_redis_update_buffer.py (pipeline push/drain, early-return, no-Redis sentinel).
  • Missing test coverage: store_in_memory_daily_tag_spend_updates_in_redis and get_all_daily_tag_spend_update_transactions_from_redis_buffer — the two new Redis-path tag spend methods in redis_update_buffer.py — have no corresponding tests despite being on a critical data path when use_redis_transaction_buffer=true.
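The "lock-then-drain" pattern the summary refers to can be illustrated with an in-memory stand-in for the Redis buffer. All class and method names here are hypothetical stand-ins, not LiteLLM's actual API:

```python
# Illustrative lock-then-drain sketch with an in-memory stand-in for
# the Redis transaction buffer; names are hypothetical.
import asyncio

class FakeRedisBuffer:
    def __init__(self) -> None:
        self._locked = False
        self.buffer: list = []

    async def acquire_lock(self, job_name: str) -> bool:
        # Only one pod wins the lock per flush cycle.
        if self._locked:
            return False
        self._locked = True
        return True

    async def drain(self) -> list:
        # Atomically take everything currently buffered.
        txns, self.buffer = self.buffer, []
        return txns

async def flush_tag_spend(buffer: FakeRedisBuffer, db_writes: list) -> None:
    # Pods that lose the lock skip this cycle entirely.
    if not await buffer.acquire_lock("daily_tag_spend_update_job"):
        return
    txns = await buffer.drain()
    if txns:
        db_writes.append(txns)  # stands in for the batched DB upsert

writes: list = []
buf = FakeRedisBuffer()
buf.buffer.append({"tag": "prod-tag", "spend": 0.01})
asyncio.run(flush_tag_spend(buf, writes))
print(len(writes))  # 1: the lock holder performed the single batched write
```

This is why the review stresses test coverage for the Redis-path methods: a bug in the drain step would silently drop buffered tag spend on exactly the multi-pod path the lock exists to serialize.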

Confidence Score: 4/5

PR is safe to merge; architecture is sound and the performance improvement is well-evidenced, but quality gaps from the previous review cycle remain open.

The core logic — decoupling tag spend into a separate scheduler job with its own Redis lock — is correct and consistent with existing patterns. The non-Redis and Redis paths both follow established idioms. All new code paths have at least some test coverage. The score is held at 4 rather than 5 because (1) the two new Redis-specific tag-spend methods in redis_update_buffer.py have zero test coverage (new finding, distinct from the previously flagged test gap), and (2) the DAILY_TAG_SPEND_BATCH_MULTIPLIER constant remains a plain float literal with no env-var override — previously flagged and still unaddressed. Both are P2 quality issues that don't block correctness but are meaningful enough to warrant a maintainer pass before merge.

tests/test_litellm/proxy/db/db_transaction_queue/test_redis_update_buffer.py — needs tests for store_in_memory_daily_tag_spend_updates_in_redis and get_all_daily_tag_spend_update_transactions_from_redis_buffer

Important Files Changed

Filename Overview
litellm/proxy/db/db_spend_update_writer.py Tag spend removed from main spend handler; two new instance methods (_commit_daily_tag_spend_to_db and _commit_daily_tag_spend_to_db_with_redis) correctly implement the decoupled flush path, following the existing lock-then-drain Redis pattern.
litellm/proxy/db/db_transaction_queue/redis_update_buffer.py Tag queue removed from main pipeline method; two new Redis methods added for the separate tag spend path — no tests cover these new methods.
litellm/proxy/proxy_server.py New scheduler job registered for tag spend at 2.3× main interval; inline import used to break circular dependency (consistent with existing pattern in same function).
litellm/proxy/utils.py New update_daily_tag_spend scheduler entry point correctly delegates to the Redis or non-Redis path and suppresses exceptions with error logging.
litellm/constants.py New DAILY_TAG_SPEND_BATCH_MULTIPLIER=2.3 constant added; DB_DAILY_TAG_SPEND_UPDATE_JOB_NAME added for lock identity — multiplier is a plain float literal with no env-var override.
tests/proxy_unit_tests/test_update_daily_tag_spend.py Four unit tests added: delegation to correct commit method, error suppression, Redis-path branching, and retry-then-succeed behavior — all use mocks, no real network calls.
tests/test_litellm/proxy/db/db_transaction_queue/test_redis_update_buffer.py Pipeline push/drain tests updated to reflect removal of tag queue from main method; new Redis pipeline tests added, but store_in_memory_daily_tag_spend_updates_in_redis and get_all_daily_tag_spend_update_transactions_from_redis_buffer lack coverage.
docker/Dockerfile.dev Adds g++ to build-stage apt-get install to fix dev image build failure.

Sequence Diagram

sequenceDiagram
    participant Req as Incoming Request
    participant Queue as daily_tag_spend_update_queue
    participant MainSched as Main Scheduler<br/>(every N seconds)
    participant TagSched as Tag Scheduler<br/>(every 2.3xN seconds)
    participant Redis as Redis Buffer
    participant DB as PostgreSQL LiteLLM_DailyTagSpend

    Req->>Queue: enqueue tag spend transaction

    Note over MainSched: update_spend() runs
    MainSched->>MainSched: _commit_spend_updates_to_db (skips daily_tag_spend_update_queue)

    Note over TagSched: update_daily_tag_spend() runs
    alt use_redis_transaction_buffer=false
        TagSched->>Queue: flush_and_get_aggregated_daily_spend_update_transactions()
        Queue-->>TagSched: aggregated transactions
        TagSched->>DB: update_daily_tag_spend() with retries
    else use_redis_transaction_buffer=true
        TagSched->>Queue: flush_and_get_aggregated_daily_spend_update_transactions()
        Queue-->>TagSched: aggregated transactions
        TagSched->>Redis: store_in_memory_daily_tag_spend_updates_in_redis()
        TagSched->>TagSched: acquire_lock(DB_DAILY_TAG_SPEND_UPDATE_JOB_NAME)
        alt lock acquired
            TagSched->>Redis: get_all_daily_tag_spend_update_transactions_from_redis_buffer()
            Redis-->>TagSched: aggregated transactions
            TagSched->>DB: update_daily_tag_spend() with retries
        end
    end

Reviews (3): Last reviewed commit: "resolving circular import error flagged ..."

Comment thread litellm/constants.py
Comment on lines +1396 to +1398
# The number of tag entries are higher than number of user, team entries. This leads to a higher QPS.
# This will run tag spcific tasks at a later time to smooth QPS
DAILY_TAG_SPEND_BATCH_MULTIPLIER = 2.3

P2 Hardcoded magic number, not env-var configurable — inconsistent with codebase pattern

Every other tuneable constant in this file is read from an environment variable (e.g. DEFAULT_FLUSH_INTERVAL_SECONDS, DEFAULT_BATCH_SIZE), allowing operators to adjust behaviour without a code change. DAILY_TAG_SPEND_BATCH_MULTIPLIER = 2.3 is a plain float literal with no override mechanism.

There is also a typo in the comment: "spcific" → "specific".

Suggested change
# The number of tag entries are higher than number of user, team entries. This leads to a higher QPS.
# This will run tag spcific tasks at a later time to smooth QPS
DAILY_TAG_SPEND_BATCH_MULTIPLIER = 2.3
# The number of tag entries are higher than number of user, team entries. This leads to a higher QPS.
# This will run tag-specific tasks at a longer cadence to smooth QPS.
DAILY_TAG_SPEND_BATCH_MULTIPLIER = float(
    os.getenv("DAILY_TAG_SPEND_BATCH_MULTIPLIER", 2.3)
)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +43 to +95
async def test_daily_tag_spend_retries_then_succeeds():
    prisma_client = MagicMock()
    proxy_logging_obj = MagicMock()

    mock_batcher = MagicMock()
    mock_table = MagicMock()
    mock_batcher.litellm_dailytagspend = mock_table

    # Fail entering batch context 3 times with retryable DB errors, then succeed.
    prisma_client.db.batch_.return_value.__aenter__ = AsyncMock(
        side_effect=[
            httpx.ConnectError("x"),
            httpx.ConnectError("x"),
            httpx.ConnectError("x"),
            mock_batcher,
        ]
    )

    daily_spend_transactions: Dict[str, DailyTagSpendTransaction] = {
        "k": {
            "tag": "prod-tag",
            "date": "2026-04-03",
            "api_key": "key-1",
            "model": "gpt-4o",
            "model_group": None,
            "custom_llm_provider": "openai",
            "mcp_namespaced_tool_name": "",
            "endpoint": "",
            "prompt_tokens": 10,
            "completion_tokens": 5,
            "cache_read_input_tokens": 0,
            "cache_creation_input_tokens": 0,
            "spend": 0.01,
            "api_requests": 1,
            "successful_requests": 1,
            "failed_requests": 0,
            "request_id": None,
        }
    }

    with patch("asyncio.sleep", new_callable=AsyncMock) as sleep_mock, patch(
        "random.uniform", return_value=0
    ):
        await DBSpendUpdateWriter.update_daily_tag_spend(
            n_retry_times=3,
            prisma_client=prisma_client,
            proxy_logging_obj=proxy_logging_obj,
            daily_spend_transactions=daily_spend_transactions,
        )

    assert prisma_client.db.batch_.return_value.__aenter__.await_count == 4
    assert sleep_mock.await_count == 3
    mock_table.upsert.assert_called_once()

P2 Test exercises update_daily_tag_spend static method directly, but the new scheduler path goes through _commit_daily_tag_spend_to_db

test_daily_tag_spend_retries_then_succeeds calls DBSpendUpdateWriter.update_daily_tag_spend(...) directly with a hand-crafted daily_spend_transactions dict. This tests the underlying write/retry behaviour well, but there is no test that exercises the actual new code path triggered by the scheduler:

update_daily_tag_spend (utils.py)
  └─ _commit_daily_tag_spend_to_db
       └─ daily_tag_spend_update_queue.flush_and_get_aggregated_daily_spend_update_transactions()
       └─ DBSpendUpdateWriter.update_daily_tag_spend(...)

In particular, the flush_and_get_aggregated_daily_spend_update_transactions call is never exercised, so a bug where the queue is empty (e.g. because the main job already drained it to Redis) would never be caught by these tests. Consider adding a test that pre-populates proxy_logging_obj.db_spend_update_writer.daily_tag_spend_update_queue and verifies that the full chain from update_daily_tag_spend (in utils.py) actually writes to the DB.
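A minimal shape for the suggested end-to-end test might look like the following. The litellm objects are replaced with mocks, and `_commit_daily_tag_spend_to_db` below is a simplified stand-in for the real method, so this is a sketch of the idea, not a drop-in addition:

```python
# Sketch of the end-to-end test the review asks for. The writer and
# queue are mocks; _commit_daily_tag_spend_to_db is a simplified
# stand-in mirroring the call chain described above.
import asyncio
from unittest.mock import AsyncMock, MagicMock

async def _commit_daily_tag_spend_to_db(writer) -> None:
    # Drain the per-request queue, then delegate to the DB-write helper.
    queue = writer.daily_tag_spend_update_queue
    txns = await queue.flush_and_get_aggregated_daily_spend_update_transactions()
    if txns:
        await writer.update_daily_tag_spend(daily_spend_transactions=txns)

def test_full_chain_writes_queued_transactions() -> None:
    writer = MagicMock()
    queue = writer.daily_tag_spend_update_queue
    queue.flush_and_get_aggregated_daily_spend_update_transactions = AsyncMock(
        return_value={"k": {"tag": "prod-tag", "spend": 0.01}}
    )
    writer.update_daily_tag_spend = AsyncMock()

    asyncio.run(_commit_daily_tag_spend_to_db(writer))

    # The queue was drained exactly once and its contents reached the writer.
    queue.flush_and_get_aggregated_daily_spend_update_transactions.assert_awaited_once()
    writer.update_daily_tag_spend.assert_awaited_once_with(
        daily_spend_transactions={"k": {"tag": "prod-tag", "spend": 0.01}}
    )
```

A real version would import the actual `update_daily_tag_spend` entry point from `litellm/proxy/utils.py` and pre-populate a real `daily_tag_spend_update_queue`, which would also catch the empty-queue case the review mentions.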

Comment thread litellm/proxy/proxy_server.py Fixed
@ishaan-berri ishaan-berri left a comment

lgtm

@ishaan-berri ishaan-berri changed the base branch from main to litellm_ishaan_april4 April 4, 2026 16:51
@ishaan-berri ishaan-berri merged commit f4e69f4 into BerriAI:litellm_ishaan_april4 Apr 4, 2026
20 of 62 checks passed
ishaan-berri pushed a commit that referenced this pull request Apr 4, 2026
* feat(tag-spend): implement separate scheduler job for daily tag spend updates

* fix(docker): add g++ to build dependencies in Dockerfile

* initial test cases. TODO: check scheduler init and test cases in proxy_server related to it

* resolved QPS issue when redis transaction buffer is enabled

* resolving circular import error flagged by greptile
ishaan-berri added a commit that referenced this pull request Apr 4, 2026
* added support for metadata (#24261)

* added support for metadata

* fix: PR review - meta truthiness, BlobResourceContents mimeType, add Blob+empty meta tests

Made-with: Cursor

* pyproject to .25

* feat(teams): resolve access group models/MCPs/agents in team endpoints

Add access_group_models, access_group_mcp_server_ids, and
access_group_agent_ids to /team/info and /v2/team/list responses.
These fields contain resources inherited from access groups, kept
separate from direct assignments so the UI can distinguish the source.

Backend: _resolve_access_group_resources() helper resolves access
group resources via existing _get_*_from_access_groups() functions.

UI: Teams table and detail view show direct models as blue badges
and access-group-sourced models as green badges.

* perf(teams): single-pass access group resolution + asyncio.gather in list endpoint

- Fetch each access group object once and extract all 3 resource fields
  in a single pass instead of 3 separate calls (3N → N lookups)
- Use asyncio.gather to resolve access groups across teams concurrently
  in list_team_v2 instead of sequential awaits
- Add 5 unit tests for _resolve_access_group_resources

* docs: add default_team_params to config reference and update examples

- Add default_team_params to litellm_settings reference table in
  config_settings.md with all sub-fields documented
- Update self_serve.md and msft_sso.md examples to include
  team_member_permissions, tpm_limit, and rpm_limit
- Fix misleading comment that implied default_team_params only applies
  to SSO auto-created teams — it applies to all /team/new calls

* docs: clarify that models sub-field only applies to SSO auto-created teams

* fix: lazy import get_access_object to break cyclic import + short-circuit all-proxy-models display

- Remove get_access_object from module-level import in team_endpoints.py
  and use a lazy _get_access_object wrapper to avoid cyclic dependency
- Add _prisma_client is None early-exit guard in _resolve_access_group_resources
- Short-circuit UI to show "All Proxy Models" when team.models is empty
  or contains "all-proxy-models", skipping access group model resolution

* add: making organizations a select instead of read only badges

* fix(ui): only send organization_id when changed and use raw initial value

* fix(ui): add paginated team search to usage page filter

Replace the static team dropdown on the usage page with a new
TeamMultiSelect component that uses the paginated v2/team/list
endpoint with debounced server-side search and infinite scroll.

* fix(ui): fix imports and update placeholder for team multi select

* fix(ui): wire team_id filter to key alias dropdown on Virtual Keys tab

The Key Alias dropdown on the Virtual Keys page was showing aliases from
all teams regardless of which team was selected. The team_id was never
passed through the frontend chain to the backend /key/aliases endpoint.

- Backend: add optional team_id query param to /key/aliases endpoint
- networking.tsx: add team_id param to keyAliasesCall
- useKeyAliases: accept and forward team_id to API call and query key
- filter.tsx: pass allFilters context to custom filter components
- PaginatedKeyAliasSelect: read Team ID from allFilters and pass to hook

* fix(tests): correct mock targets in TestResolveAccessGroupResources

Three tests were patching the non-existent `get_access_object` instead
of `_get_access_object` (the lazy-import wrapper), causing AttributeError.
Also added missing `prisma_client` mock so tests get past the early-exit
guard and actually exercise the resolution logic.

* fix: use direct attribute access with or [] fallback in _resolve_access_group_resources

Replace getattr(ag, "field", []) with ag.field or [] for cleaner
access and safe handling if a field is None.

* fix(ui): remove model source legend from team detail view

The blue/green color distinction is self-explanatory; the legend added
visual clutter without providing enough value.

* fix(ui): add missing access_group fields to TeamData.team_info type

The TeamData interface was missing access_group_models,
access_group_mcp_server_ids, and access_group_agent_ids fields,
causing a TypeScript build failure.

* perf(teams): batch-fetch access groups in single DB query

Replace per-ID _resolve_access_group_resources loop with a single
find_many call that deduplicates IDs across all teams. Removes the
N+1 query pattern on cold cache for the team list endpoint.

* refactor(proxy): extract helpers to fix PLR0915 violations

Extract `_apply_non_admin_alias_scope` from `key_aliases`,
`_resolve_team_access_group_resources` from `team_info`, and
`_enforce_list_team_v2_access` from `list_team_v2` to bring each
function under ruff's 50-statement limit. No behavior changes.

* test(ui): update tests to match new team_id / access-group signatures

- useKeyAliases, PaginatedKeyAliasSelect: add trailing `undefined` to
  spy matchers for the new `team_id` param on `useInfiniteKeyAliases`
  and `keyAliasesCall`.
- EntityUsage: mock new `TeamMultiSelect` child so QueryClientProvider
  is not required for team-entity tests.
- ModelsCell: replace the overflow-accordion test with one that
  verifies the new collapse-on-`all-proxy-models` behavior (no
  accordion, single badge).

* fix(ui): send null (not '') for cleared organization_id on team update

AntD <Select allowClear> returns undefined when the user clears the
selection. Coalescing to "" caused the team-update payload to carry
organization_id: "" instead of null, relying on the backend to coerce
it. Send null directly so the intent is explicit at the source.

* poetry

* chore: regen poetry.lock for litellm-proxy-extras 0.4.64 bump

* chore: update Next.js build artifacts (2026-04-04 17:55 UTC, node v22.16.0)

---------

Co-authored-by: shivam <shivam@uni.minerva.edu>
Co-authored-by: Ryan Crabbe <ryan@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>

* Tag query fix (#25094)

* feat(tag-spend): implement separate scheduler job for daily tag spend updates

* fix(docker): add g++ to build dependencies in Dockerfile

* initial test cases. TODO: check scheduler init and test cases in proxy_server related to it

* resolved QPS issue when redis transaction buffer is enabled

* resolving circular import error flagged by greptile

* fix(mypy): use Optional[str] for api_base in PydanticAI provider to match superclass signature

---------

Co-authored-by: Shivam Rawat <shivam@berri.ai>
Co-authored-by: shivam <shivam@uni.minerva.edu>
Co-authored-by: Ryan Crabbe <ryan@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>
Co-authored-by: Harish <harishgokul01@gmail.com>
Co-authored-by: Ishaan Jaffer <ishaan@berri.ai>
fede-kamel pushed a commit to fede-kamel/litellm that referenced this pull request Apr 5, 2026
* added support for metadata (BerriAI#24261)

* added support for metadata

* fix: PR review - meta truthiness, BlobResourceContents mimeType, add Blob+empty meta tests

Made-with: Cursor

* pyproject to .25

* feat(teams): resolve access group models/MCPs/agents in team endpoints

Add access_group_models, access_group_mcp_server_ids, and
access_group_agent_ids to /team/info and /v2/team/list responses.
These fields contain resources inherited from access groups, kept
separate from direct assignments so the UI can distinguish the source.

Backend: _resolve_access_group_resources() helper resolves access
group resources via existing _get_*_from_access_groups() functions.

UI: Teams table and detail view show direct models as blue badges
and access-group-sourced models as green badges.

* perf(teams): single-pass access group resolution + asyncio.gather in list endpoint

- Fetch each access group object once and extract all 3 resource fields
  in a single pass instead of 3 separate calls (3N → N lookups)
- Use asyncio.gather to resolve access groups across teams concurrently
  in list_team_v2 instead of sequential awaits
- Add 5 unit tests for _resolve_access_group_resources

* docs: add default_team_params to config reference and update examples

- Add default_team_params to litellm_settings reference table in
  config_settings.md with all sub-fields documented
- Update self_serve.md and msft_sso.md examples to include
  team_member_permissions, tpm_limit, and rpm_limit
- Fix misleading comment that implied default_team_params only applies
  to SSO auto-created teams — it applies to all /team/new calls

* docs: clarify that models sub-field only applies to SSO auto-created teams

* fix: lazy import get_access_object to break cyclic import + short-circuit all-proxy-models display

- Remove get_access_object from module-level import in team_endpoints.py
  and use a lazy _get_access_object wrapper to avoid cyclic dependency
- Add _prisma_client is None early-exit guard in _resolve_access_group_resources
- Short-circuit UI to show "All Proxy Models" when team.models is empty
  or contains "all-proxy-models", skipping access group model resolution

* add: making organizations a select instead of read only badges

* fix(ui): only send organization_id when changed and use raw initial value

* fix(ui): add paginated team search to usage page filter

Replace the static team dropdown on the usage page with a new
TeamMultiSelect component that uses the paginated v2/team/list
endpoint with debounced server-side search and infinite scroll.

* fix(ui): fix imports and update placeholder for team multi select

* fix(ui): wire team_id filter to key alias dropdown on Virtual Keys tab

The Key Alias dropdown on the Virtual Keys page was showing aliases from
all teams regardless of which team was selected. The team_id was never
passed through the frontend chain to the backend /key/aliases endpoint.

- Backend: add optional team_id query param to /key/aliases endpoint
- networking.tsx: add team_id param to keyAliasesCall
- useKeyAliases: accept and forward team_id to API call and query key
- filter.tsx: pass allFilters context to custom filter components
- PaginatedKeyAliasSelect: read Team ID from allFilters and pass to hook

* fix(tests): correct mock targets in TestResolveAccessGroupResources

Three tests were patching the non-existent `get_access_object` instead
of `_get_access_object` (the lazy-import wrapper), causing AttributeError.
Also added missing `prisma_client` mock so tests get past the early-exit
guard and actually exercise the resolution logic.

* fix: use direct attribute access with or [] fallback in _resolve_access_group_resources

Replace getattr(ag, "field", []) with ag.field or [] for cleaner
access and safe handling if a field is None.

* fix(ui): remove model source legend from team detail view

The blue/green color distinction is self-explanatory; the legend added
visual clutter without providing enough value.

* fix(ui): add missing access_group fields to TeamData.team_info type

The TeamData interface was missing access_group_models,
access_group_mcp_server_ids, and access_group_agent_ids fields,
causing a TypeScript build failure.

* perf(teams): batch-fetch access groups in single DB query

Replace per-ID _resolve_access_group_resources loop with a single
find_many call that deduplicates IDs across all teams. Removes the
N+1 query pattern on cold cache for the team list endpoint.

* refactor(proxy): extract helpers to fix PLR0915 violations

Extract `_apply_non_admin_alias_scope` from `key_aliases`,
`_resolve_team_access_group_resources` from `team_info`, and
`_enforce_list_team_v2_access` from `list_team_v2` to bring each
function under ruff's 50-statement limit. No behavior changes.

* test(ui): update tests to match new team_id / access-group signatures

- useKeyAliases, PaginatedKeyAliasSelect: add trailing `undefined` to
  spy matchers for the new `team_id` param on `useInfiniteKeyAliases`
  and `keyAliasesCall`.
- EntityUsage: mock new `TeamMultiSelect` child so QueryClientProvider
  is not required for team-entity tests.
- ModelsCell: replace the overflow-accordion test with one that
  verifies the new collapse-on-`all-proxy-models` behavior (no
  accordion, single badge).

* fix(ui): send null (not '') for cleared organization_id on team update

AntD <Select allowClear> returns undefined when the user clears the
selection. Coalescing to "" caused the team-update payload to carry
organization_id: "" instead of null, relying on the backend to coerce
it. Send null directly so the intent is explicit at the source.

* poetry

* chore: regen poetry.lock for litellm-proxy-extras 0.4.64 bump

* chore: update Next.js build artifacts (2026-04-04 17:55 UTC, node v22.16.0)

---------

Co-authored-by: shivam <shivam@uni.minerva.edu>
Co-authored-by: Ryan Crabbe <ryan@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>

* Tag query fix (BerriAI#25094)

* feat(tag-spend): implement separate scheduler job for daily tag spend updates

* fix(docker): add g++ to build dependencies in Dockerfile

* initial test cases; TODO: check scheduler init and related test cases in proxy_server

* resolved QPS issue when redis transaction buffer is enabled

* resolving circular import error flagged by greptile

* fix(mypy): use Optional[str] for api_base in PydanticAI provider to match superclass signature
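The decoupled-scheduler idea behind the commits above can be sketched with plain asyncio: two independent periodic jobs, with the tag-spend flush running at a multiple of the base spend-flush interval. Constant names and values are illustrative — `DAILY_TAG_SPEND_BATCH_MULTIPLIER` mirrors the setting this PR introduces, and the base interval is shortened so the demo finishes quickly:

```python
import asyncio

# Illustrative config: tag-spend flushes run at a multiple of the base interval.
BASE_FLUSH_INTERVAL = 0.05             # seconds; shortened for the demo
DAILY_TAG_SPEND_BATCH_MULTIPLIER = 3   # tag commits run 3x less often

flushes = {"spend": 0, "tag_spend": 0}

async def periodic_flush(name: str, interval: float, stop: asyncio.Event):
    # Each job sleeps on its own cadence; request-path enqueue is unaffected.
    while not stop.is_set():
        await asyncio.sleep(interval)
        flushes[name] += 1

async def main():
    stop = asyncio.Event()
    tasks = [
        asyncio.create_task(periodic_flush("spend", BASE_FLUSH_INTERVAL, stop)),
        asyncio.create_task(periodic_flush(
            "tag_spend",
            BASE_FLUSH_INTERVAL * DAILY_TAG_SPEND_BATCH_MULTIPLIER,
            stop,
        )),
    ]
    await asyncio.sleep(1.0)
    stop.set()
    await asyncio.gather(*tasks)

asyncio.run(main())
```

Running the two flushes as separate tasks means a longer tag cadence no longer delays the main spend writes, which is the QPS-imbalance fix described in the PR summary.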

---------

Co-authored-by: Shivam Rawat <shivam@berri.ai>
Co-authored-by: shivam <shivam@uni.minerva.edu>
Co-authored-by: Ryan Crabbe <ryan@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>
Co-authored-by: Harish <harishgokul01@gmail.com>
Co-authored-by: Ishaan Jaffer <ishaan@berri.ai>
harish876 added a commit to harish876/litellm that referenced this pull request Apr 8, 2026