fix(proxy): use find_unique directly to avoid 401 on nonexistent key#24064
fix(proxy): use find_unique directly to avoid 401 on nonexistent key#24064Jah-yee wants to merge 5 commits intoBerriAI:mainfrom
Conversation
- LangSmith reads cost exclusively from outputs['usage_metadata']['total_cost'] - Previously, response_cost was computed but never exposed to LangSmith - Now populates usage_metadata with token counts and total_cost from payload Fixes BerriAI#24001
- Fix reasoning_tokens lookup: use completion_tokens_details (Chat Completions API) before output_tokens_details (Responses API) - Add support for prompt_tokens_details (cached_tokens, cache_creation_tokens) in OTEL spans This affects all OTEL-based callbacks: Langfuse, Arize, Phoenix, Weave. Fixes: BerriAI#23990
- Add usage_metadata dict to outputs so LangSmith can display costs - Fixes issue where Cost column always shows '-' despite response_cost being computed - Maps prompt_tokens, completion_tokens, total_tokens, and response_cost to LangSmith format Fixes BerriAI#24001
- Replace prisma_client.get_data() with direct find_unique() in _get_and_validate_existing_key() and update_key_fn() - Fix incorrect error message from 'Team not found' to 'Key not found' Fixes BerriAI#24063
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
|
Greptile SummaryThis PR's stated goal — replacing However, the PR bundles many additional, unrelated changes that each carry their own risk:
The PR should be split or the backwards-incompatible additions should be protected behind flags before merging. Confidence Score: 2/5
|
| Filename | Overview |
|---|---|
| langsmith.py | Accidentally committed 561-line duplicate of litellm/integrations/langsmith.py at the repository root — should not exist and must be removed. |
| litellm/proxy/management_endpoints/key_management_endpoints.py | Core fix (replacing get_data() with find_unique()) is correct, but the PR bundles many unrelated, backwards-incompatible behaviour changes (user_id auto-assignment, stricter team validation for non-admins, new admin-access checks for block/unblock) without feature flags, plus a redundant DB lookup in _check_key_admin_access(). |
| litellm/integrations/langsmith.py | Adds usage_metadata (including input_cost, output_cost, total_cost) to LangSmith run outputs so LangSmith's cost column is populated — straightforward and well-scoped. |
| litellm/integrations/arize/_utils.py | Fixes reasoning token lookup to check both completion_tokens_details and output_tokens_details, and adds handling for prompt_tokens_details (cached/cache-creation tokens) — correct and non-breaking. |
| litellm/llms/vertex_ai/common_utils.py | Adds a deepcopy of the schema before mutation in add_object_type() to prevent callers' original schema from being modified — correct defensive fix. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[POST /key/update] --> B{Authorization header valid?}
B -- No --> C[401 Unauthorized]
B -- Yes --> D[find_unique on body key]
D --> E{Key exists in DB?}
E -- No --> F[404 Key not found]
E -- Yes --> G[_validate_update_key_data]
G --> H{max_budget changed?}
H -- Yes --> I[_check_key_admin_access\nextra DB lookup]
H -- No --> J[prepare_key_update_data]
I --> J
J --> K[Update DB row]
K --> L[200 Updated key response]
Last reviewed commit: "fix(proxy): use find..."
| @@ -0,0 +1,561 @@ | |||
| #### What this does #### | |||
There was a problem hiding this comment.
Accidentally committed duplicate file at repository root
This file (langsmith.py) appears to have been accidentally committed to the repository root. The canonical Langsmith integration lives at litellm/integrations/langsmith.py.
This root-level copy is an older version — it notably lacks the usage_metadata fix that is being added to litellm/integrations/langsmith.py in this same PR. Keeping both will cause confusion about which file is authoritative and may shadow or conflict with the real integration in certain import contexts.
This file should be removed from the PR.
| if not _is_proxy_admin and data.user_id is None: | ||
| data.user_id = user_api_key_dict.user_id | ||
| verbose_proxy_logger.warning( | ||
| "key/generate: auto-assigning user_id=%s for non-admin caller", | ||
| user_api_key_dict.user_id, | ||
| ) |
There was a problem hiding this comment.
Backwards-incompatible
user_id auto-assignment for non-admin callers
Non-admin users calling /key/generate without specifying a user_id will now silently have the caller's user_id injected into the request. Previously, they could create unbound keys with no user_id. This is a behaviour change that is not gated behind a feature flag.
Per the project's backwards-compatibility policy, silent mutation of request fields for existing callers should be controlled by a user-configurable flag (e.g. litellm.auto_assign_user_id_on_key_generate). Without this, any automation or integration that relied on creating unowned keys as a non-admin user will start receiving keys tied to their own account, potentially breaking access-control assumptions.
| if not _is_proxy_admin and data.user_id is None: | |
| data.user_id = user_api_key_dict.user_id | |
| verbose_proxy_logger.warning( | |
| "key/generate: auto-assigning user_id=%s for non-admin caller", | |
| user_api_key_dict.user_id, | |
| ) | |
| if not _is_proxy_admin and data.user_id is None and litellm.auto_assign_user_id_on_key_generate: | |
| data.user_id = user_api_key_dict.user_id | |
| verbose_proxy_logger.warning( | |
| "key/generate: auto-assigning user_id=%s for non-admin caller", | |
| user_api_key_dict.user_id, | |
| ) |
Rule Used: What: avoid backwards-incompatible changes without... (source)
| if not _is_proxy_admin: | ||
| raise HTTPException( | ||
| status_code=400, | ||
| detail=f"Team not found for team_id={data.team_id}. Non-admin users cannot create keys for non-existent teams.", | ||
| ) | ||
|
|
There was a problem hiding this comment.
Backwards-incompatible enforcement for non-admin users referencing non-existent teams
Previously, when team_table is None (team not found) and litellm.key_generation_settings is None, key_generation_check() would return True, allowing any caller to assign an arbitrary team_id. The new code raises a 400 for non-admin callers.
This is a breaking change for existing users who create keys with a team_id that doesn't exist in the team table (a common pattern when bootstrapping or using team IDs as logical tags without full team table management). It should be controlled by a feature flag rather than being enforced unconditionally.
The same applies to the equivalent guard in generate_key_fn() at line 1279 — both changes together silently break existing non-admin workflows.
Rule Used: What: avoid backwards-incompatible changes without... (source)
| return | ||
|
|
||
| ui_settings = await get_ui_settings_cached() | ||
| if ui_settings.get("disable_custom_api_keys", False) is True: |
There was a problem hiding this comment.
is True identity check may miss truthy non-boolean config values
Using is True is a strict identity check — it will evaluate to False if the setting is stored as a truthy non-boolean value such as 1, "true", or "yes" (common when config is sourced from environment variables or YAML). A regular truthy check is safer here:
| if ui_settings.get("disable_custom_api_keys", False) is True: | |
| if ui_settings.get("disable_custom_api_keys", False): |
|
|
||
| # Look up the target key to find its team | ||
| target_key_row = await prisma_client.db.litellm_verificationtoken.find_unique( | ||
| where={"token": hashed_token} | ||
| ) |
There was a problem hiding this comment.
Redundant DB lookup in
_check_key_admin_access()
When called from _validate_update_key_data(), the key row has already been fetched from the database (as existing_key_row). The hashed_key passed in is just existing_key_row.token. Despite this, _check_key_admin_access() performs another find_unique call on the same table to retrieve the identical row.
Every /key/update call that changes max_budget therefore makes two separate database round-trips for the same record. Consider adding an optional key_row parameter to allow reuse of the already-fetched object and skip the redundant query.
Summary
Failure Path (Before Fix)
/key/updatereturns401for a nonexistent bodykeyeven when theAuthorizationheader contains a valid master key. This happens becauseprisma_client.get_data()treats the token as an auth credential and raises a 401 when it doesn't exist in the DB.Fix
Replace
prisma_client.get_data()with direct Prismafind_unique()in both_get_and_validate_existing_key()andupdate_key_fn(), matching the pattern used in the/key/blockand/key/unblockfix (PR #23977). Also fixes an incorrect "Team not found" error message inupdate_key_fn.Testing
Fixes #24063