feat(guardrails): per-team opt-out for specific global guardrails #25575
Greptile Summary
This PR adds per-guardrail opt-out for global (…
Several edge-case concerns were surfaced in earlier review cycles (kill-switch toggle re-enabling opted-out globals, error state not guarded, duplicate entries for legacy teams); addressing those before merge would improve reliability of the opt-out feature.
Confidence Score: 4/5
Backend logic is safe; UI has open edge-case concerns from prior review cycles that could silently corrupt a team's opt-out configuration.
The backend (custom_guardrail.py, litellm_pre_call_utils.py) is correct and well-covered by unit tests. New findings are P2 only. However, earlier rounds flagged several P1 UI issues (error-state data loss, kill-switch toggle re-enabling previously opted-out globals) that appear unaddressed in the current code, keeping the score at 4 rather than 5.
Files needing attention: ui/litellm-dashboard/src/components/team/TeamInfo.tsx (kill-switch toggle, error state, and legacy-team duplicate-entry edge cases).

| Filename | Overview |
|---|---|
| ui/litellm-dashboard/src/components/team/TeamInfo.tsx | Major UI overhaul: grouped select, per-guardrail opt-out save logic, kill-switch interaction, and GuardrailSettingsView integration; several edge cases flagged in prior review remain (error state, kill-switch toggle, duplicates). |
| litellm/integrations/custom_guardrail.py | Adds get_opted_out_global_guardrails_from_metadata and a per-guardrail skip in should_run_guardrail; renames disable_global_guardrail key to plural form (breaking change noted in prior review). |
| litellm/proxy/litellm_pre_call_utils.py | Correctly propagates opted_out_global_guardrails from team_metadata to the request metadata with an isinstance list guard; mirrors the existing disable_global_guardrails pattern. |
| ui/litellm-dashboard/src/app/(dashboard)/hooks/guardrails/useGuardrails.ts | Refactored to expose full guardrail objects and derived globalGuardrailNames/optionalGuardrailNames sets via React Query select; breaking change to hook return type handled correctly in consumers. |
| ui/litellm-dashboard/src/components/GuardrailSettingsView.tsx | New presentational component for displaying global vs team-specific guardrail state; handles kill-switch, opt-outs, and empty states correctly. |
| tests/test_litellm/integrations/test_custom_guardrail.py | Adds comprehensive tests for opted_out_global_guardrails covering root/litellm_metadata/metadata paths, malformed inputs, non-global guardrail immunity; existing tests updated from singular to plural key. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Incoming Request] --> B[litellm_pre_call_utils\nPropagate team metadata]
B --> C{team has\nopted_out_global_guardrails?}
C -- Yes --> D[Write to litellm_metadata.\nopted_out_global_guardrails]
C -- No --> E[Write to litellm_metadata.\ndisable_global_guardrails]
D --> E
E --> F[should_run_guardrail]
F --> G{default_on = True?}
G -- No --> H[Check requested_guardrails list]
G -- Yes --> I{guardrail_name in\nopted_out_global_guardrails?}
I -- Yes --> J[Return False\nSkip guardrail]
I -- No --> K{disable_global_guardrail\n= True?}
K -- Yes --> J
K -- No --> L{Event hook matches?}
L -- Yes --> M[Run Guardrail]
L -- No --> N[Return False]
H --> O{guardrail in\nrequested list?}
O -- Yes --> L
O -- No --> N
Reviews (7): Last reviewed commit: "test(ui/team): fix guardrails overview t..."
… with opt-out list
Renames the new per-guardrail opt-out field from `disabled_global_guardrails` to `opted_out_global_guardrails` to eliminate the one-character collision with the legacy `disable_global_guardrails` boolean kill switch. Adds a type guard on the new gate so a misnamed bool can't crash the guardrail check.
Filters duplicates out of the team-edit guardrail display for legacy teams that have a global name persisted in `metadata.guardrails` from before this PR.
Drops the unused `isGuardrailsLoading` and `guardrailsError` destructures left in AddModelForm after the hook refactor.
Adds Python tests for the new gate behavior (root, litellm_metadata, metadata, non-matching name, empty list, malformed bool value, opt-in coexistence) and extends useGuardrails.test.ts to exercise the global / optional partition logic that the rebuilt hook performs in its `select` transform.
Wires the legacy kill switch and the new opt-out list together in the team edit form so they can never fall out of sync:
- Toggling the kill switch reactively updates the Guardrails Select via `onValuesChange`: switch on strips all globals from the selection, switch off re-adds them. Existing opt-in extras are preserved either way.
- When the switch is on, global options in the Select are individually disabled (greyed out) so the user can still manage opt-in guardrails but cannot accidentally re-enable a global the kill switch is bypassing.
- The save handler writes both fields together: `disable_global_guardrails` reflects the switch, and `opted_out_global_guardrails` is set to either every global (when the switch is on) or the user's explicit opt-outs.
- `effectiveGuardrails` for the form's initialValues honors the kill switch on legacy teams so the form opens in a state that matches what the runtime gate is actually doing. This fixes the visual lie where chips appeared active while the switch was bypassing them.
The backend gate already reads the list as the primary path with the bool as a fallback, so untouched legacy teams keep working until they get edited, at which point they migrate naturally.
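The ordering described here (opt-out list first, boolean kill switch as fallback) can be sketched roughly as below. This is a minimal standalone illustration, not the code in `custom_guardrail.py`: `should_skip_global_guardrail` and its parameters are hypothetical names, and the event-hook check is left out.

```python
from typing import Any, Dict


def should_skip_global_guardrail(
    guardrail_name: str, default_on: bool, metadata: Dict[str, Any]
) -> bool:
    """Return True when a global guardrail should be skipped for this request.

    Hypothetical helper: assumes the team fields have already been copied
    into `metadata` by the pre-call step.
    """
    # Opt-outs and the kill switch only ever apply to global (default_on) guardrails.
    if not default_on:
        return False

    # Primary path: the per-guardrail opt-out list. The isinstance check is
    # the "type guard" so that a misnamed bool cannot crash the `in` test.
    opted_out = metadata.get("opted_out_global_guardrails")
    if isinstance(opted_out, list) and guardrail_name in opted_out:
        return True

    # Fallback: the legacy all-or-nothing kill switch.
    return metadata.get("disable_global_guardrails") is True
```

With this shape, a legacy team that only has `disable_global_guardrails: true` stored still skips every global, while an edited team is matched by name through the list.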
Rename disable_global_guardrail → disable_global_guardrails to match the key name used by litellm_pre_call_utils.py, the API endpoints, and the UI when propagating key/team metadata. The singular form was introduced in PR #16983 and has never matched the plural form written by the rest of the codebase, so the feature silently did nothing.
Re-applies fix originally from #25488. Original commit could not be merged due to missing signature.
Co-Authored-By: Remi Mabon <remi.mabon@redcare-pharmacy.com>
Add disabled_global_guardrails list field to team metadata that selectively skips named globals at request time. Coexists with the existing disable_global_guardrails boolean kill switch: the boolean kills all globals (including future ones), the new list selectively skips named ones (new globals auto-apply). The field lives in the team metadata JSON column; no schema migration.
- litellm/proxy/litellm_pre_call_utils.py: propagate the field from team_metadata into per-request data.metadata
- litellm/integrations/custom_guardrail.py: new get_disabled_global_guardrails_from_metadata helper plus a scoped-to-globals early return at the top of should_run_guardrail
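The two backend steps in this commit (copying the field from team metadata onto the request, then reading it back at gate time) might look roughly like the sketch below. The function names, the `OPT_OUT_KEY` constant, and the exact lookup order are assumptions made for illustration; the real helpers live in the two files listed above and may differ.

```python
from typing import Any, Dict, List

# Field name as used in this commit; a later commit in this PR renames it
# to "opted_out_global_guardrails".
OPT_OUT_KEY = "disabled_global_guardrails"


def propagate_team_opt_outs(
    team_metadata: Dict[str, Any], request_metadata: Dict[str, Any]
) -> None:
    """Copy the team's opt-out list onto the request metadata (hypothetical helper)."""
    value = team_metadata.get(OPT_OUT_KEY)
    # Only a real list is propagated; malformed values are ignored.
    if isinstance(value, list):
        request_metadata[OPT_OUT_KEY] = value


def get_opt_outs(data: Dict[str, Any]) -> List[str]:
    """Read the opt-out list from the request root, then litellm_metadata / metadata."""
    root_value = data.get(OPT_OUT_KEY)
    if isinstance(root_value, list):
        return root_value
    metadata = data.get("litellm_metadata") or data.get("metadata") or {}
    value = metadata.get(OPT_OUT_KEY) if isinstance(metadata, dict) else None
    # Anything that is not a list (for example a stray bool) means "no opt-outs".
    return value if isinstance(value, list) else []
```

Because the field is plain JSON inside the existing metadata column, nothing here depends on a schema change.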
Extend useGuardrails to return the full guardrail objects plus derived globalGuardrailNames / optionalGuardrailNames sets via React Query's select option, instead of just an array of names. Update its existing consumer (AddModelForm) to extract names from the new shape.
The previous shape was tailored to AddModelForm's single use case (populate a Select with names). The team info per-guardrail opt-out work needs default_on per guardrail to split globals from non-globals, which the old shape couldn't provide. Consolidating into the existing hook gives both consumers one source of truth and one React Query cache entry instead of two parallel fetches.
- useGuardrails.ts: rewrite return type, derive global/optional sets in select(); preserve the existing query key and auth-gate semantics
- AddModelForm.tsx: extract names from data?.guardrails.map(...)
- AddModelForm.test.tsx: update mock to return the new shape (also fixes a pre-existing shape mismatch in the mock)
- useGuardrails.test.ts: update 3 assertions to read names via data?.guardrails.map(...) instead of asserting against the flat array
Replace the team info page's tags-mode guardrails Select with a grouped
multi-select that splits guardrails into "Global" (default_on=true) and
"Other" sections. On submit, derive both metadata fields from the single
selection array:
- metadata.guardrails: non-global opt-ins (additive)
- metadata.disabled_global_guardrails: globals the team has opted out of
The legacy disable_global_guardrails kill-switch toggle stays alongside
the multiselect and remains semantically distinct: it kills all global
guardrails including any added in the future, while the new list lets
new globals auto-apply unless explicitly excluded.
Other changes:
- Switch from a manual useEffect/useState guardrails fetch to the
consolidated useGuardrails hook
- Replace stale "Select existing guardrails or enter new ones" help
text with a single tooltip on each section's info icon
- Render selected chips with green/blue color coding via tagRender so
global vs non-global is visible without opening the dropdown
- Drop the redundant [Global] tag from inside the OptGroup options
(the group label already conveys it)
- Update the read-only display to show the effective set (globals not
opted out + additive opt-ins) with [Global] suffix on globals
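The submit-time derivation of the two metadata fields from one selection array comes down to a set partition. The actual implementation is TypeScript in TeamInfo.tsx; the sketch below models the same idea in Python, with `derive_guardrail_metadata` and its arguments being hypothetical names.

```python
from typing import Dict, List, Set


def derive_guardrail_metadata(
    selected: List[str], global_names: Set[str]
) -> Dict[str, List[str]]:
    """Split one selection array into the two stored metadata fields."""
    selected_set = set(selected)
    return {
        # Additive opt-ins: everything selected that is not a global.
        "guardrails": [name for name in selected if name not in global_names],
        # Opt-outs: every global the team did NOT keep selected.
        # Sorted only to make the output deterministic in this sketch.
        "disabled_global_guardrails": sorted(global_names - selected_set),
    }
```

For example, with globals `{"pii", "toxicity"}` and a selection of `["pii", "custom"]`, the team keeps `pii`, opts out of `toxicity`, and adds `custom` as an extra. A global introduced later is absent from the stored opt-out list, so it applies automatically.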
… form The team edit form computed its `guardrails` initialValues from `globalGuardrailNames`, which is empty until `useGuardrails` resolves. Because Ant Design's `<Form initialValues>` is read once on mount, a user who clicked Edit before the query resolved would see globals missing from the Select and could save that stale state, silently opting the team out of every global guardrail. Gate the form on `!isGuardrailsLoading` so initialValues always sees the resolved set.
…echeck The new effectiveGuardrails computation derived its element type from `info.metadata?.guardrails`, which is implicitly typed `any[]`, so the downstream `.map((name) => ...)` callback inferred `name` as implicit `any` and broke the production typecheck.
835d951 to
30c2357
Compare
      if "disable_global_guardrails" in data:
          return data["disable_global_guardrails"]
      metadata = data.get("litellm_metadata") or data.get("metadata", {})
    - if "disable_global_guardrail" in metadata:
    -     return metadata["disable_global_guardrail"]
    + if "disable_global_guardrails" in metadata:
    +     return metadata["disable_global_guardrails"]
      return False
Breaking rename of disable_global_guardrail to disable_global_guardrails
get_disable_global_guardrail now exclusively checks for the plural key disable_global_guardrails, dropping the old singular form. Any caller who was passing disable_global_guardrail: true (singular) directly in request data or litellm_metadata/metadata will silently find the kill switch no longer working after this deploy. The old test cases were explicitly exercising this direct-request path with the singular key, confirming it was the supported API contract.
The team-metadata path was already writing the plural form (via litellm_pre_call_utils.py), so there's a legitimate fix here — but it should be additive rather than a hard rename:
    def get_disable_global_guardrail(self, data: dict) -> Optional[bool]:
        """Returns True if the global guardrail should be disabled"""
        # Check plural (new) then singular (backwards compat)
        if "disable_global_guardrails" in data:
            return data["disable_global_guardrails"]
        if "disable_global_guardrail" in data:  # legacy key
            return data["disable_global_guardrail"]
        metadata = data.get("litellm_metadata") or data.get("metadata", {})
        if "disable_global_guardrails" in metadata:
            return metadata["disable_global_guardrails"]
        if "disable_global_guardrail" in metadata:  # legacy key
            return metadata["disable_global_guardrail"]
        return False
Rule Used: What: avoid backwards-incompatible changes without... (source)
Addresses a11y feedback — global vs. non-global guardrails were distinguished only by color (green vs. blue). Adds GlobalOutlined next to global guardrails in (1) the selected-chip tagRender, (2) the dropdown OptGroup label, and (3) the team info read view badge.
…am-specific sections Replaces the flat guardrails list with two subsections under the Guardrails card, so the global vs. team-specific distinction is carried by the section headers instead of per-badge markers. The kill-switch state now renders in place of the Global subsection as "Bypassed for this team", and the separate "Disable Global Guardrails" field with its confusing "Disabled - Global guardrails active" badge is removed.
The Team Settings tab's read view listed every team field except guardrails. Adds a Guardrails entry after Status with the same Global / Team-specific subsections used on the Overview tab, so the kill switch state and per-section membership are visible without entering edit mode.
…m views Pulls the Global / Team-specific subsection rendering out of TeamInfo.tsx into a shared GuardrailSettingsView component with card and inline variants, used on both the team Overview tab (inside the existing Tremor Card) and the Team Settings tab read view. The Global subsection header now carries a GlobalOutlined icon, and since the icon is load-bearing the edit-form chip coloring is simplified to a single blue instead of green/blue.
    {isEditing && isGuardrailsLoading ? (
      <div className="p-4">Loading...</div>
    ) : isEditing ? (
Error state not guarded — silent opt-out data loss
The loading check gates on isGuardrailsLoading but not on the error state. When the guardrails query fails (network error, service down, etc.), isGuardrailsLoading becomes false while guardrailsData stays undefined, so globalGuardrailNames falls back to new Set<string>(). If the user opens the edit form in that state and saves — even without making changes — optedOutGlobalGuardrails is computed as [] and overwrites whatever was previously stored, silently re-enabling any globals the team had opted out of.
You'd also need to destructure isError: isGuardrailsError from the useGuardrails() call on line 183.
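The failure mode described in this comment can be modeled outside React. The sketch below is a Python illustration under assumptions: `opt_outs_to_save` is a hypothetical function, and `None` stands in for "the guardrails query has not resolved or has failed". The idea is simply to refuse to recompute the opt-out list when the set of globals is unknown.

```python
from typing import List, Optional, Set


def opt_outs_to_save(
    selected: List[str],
    global_names: Optional[Set[str]],
    previously_stored: List[str],
) -> List[str]:
    """Decide which opt-out list to persist when the edit form is saved."""
    # Unknown globals (loading or error): keep what was stored rather than
    # overwriting it with an empty list computed from an empty set.
    if global_names is None:
        return list(previously_stored)
    return sorted(global_names - set(selected))
```

In the UI the equivalent guard would more likely block the form from rendering or saving at all while the query is in an error state; the point of the sketch is only that an empty fallback set must not be treated as "no globals exist".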
Converts GuardrailSettingsView from @tremor/react (Badge, Text) to antd (Tag, plain spans) as part of the Tremor migration. Also captures the "no new Tremor imports" rule in CLAUDE.md and expands the existing note in AGENTS.md with the specific antd equivalents and the yellow→gold gotcha.
    form.setFieldValue(
      "guardrails",
      checked ? nonGlobals : [...Array.from(globalGuardrailNames), ...nonGlobals],
    );
Kill-switch toggle ON → OFF silently re-enables previously opted-out globals
When a team has, say, guardrail A opted out and B active, the form initializes with guardrails = [B, ...]. Toggling the kill switch ON removes B; toggling it back OFF calls [...Array.from(globalGuardrailNames), ...nonGlobals], which unconditionally adds all global guardrails — including A — back to the selection. Saving at that point computes optedOutGlobalGuardrails = [] and silently re-enables A for this team.
The fix is to snapshot which globals were selected before the kill switch was enabled and restore only those on disable:
    onValuesChange={(changedValues) => {
      if ("disable_global_guardrails" in changedValues) {
        const checked = changedValues.disable_global_guardrails === true;
        const current = (form.getFieldValue("guardrails") || []) as string[];
        const nonGlobals = current.filter((n) => !globalGuardrailNames.has(n));
        const prevGlobals = current.filter((n) => globalGuardrailNames.has(n));
        form.setFieldValue(
          "guardrails",
          checked ? nonGlobals : [...prevGlobals, ...nonGlobals],
        );
      }
    }}
Using prevGlobals (globals that were selected before the toggle) instead of Array.from(globalGuardrailNames) preserves the team's existing per-guardrail opt-out state across a kill-switch round-trip.
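One caveat worth checking before adopting this: if `current` is read from the form after the switch-on step has already stripped the globals, it contains no globals by the time the switch is turned off, so filtering it leaves nothing to restore. Keeping the snapshot outside the form value avoids that. The sketch below models the round-trip in Python; the class and method names are hypothetical, and in the React component the snapshot could plausibly live in a ref.

```python
from typing import List, Optional, Set


class GuardrailSelection:
    """Toy model of the Guardrails Select value plus a kill-switch snapshot."""

    def __init__(self, selected: List[str], global_names: Set[str]) -> None:
        self.selected = list(selected)
        self.global_names = set(global_names)
        # Globals that were selected just before the kill switch was enabled.
        self._snapshot: Optional[List[str]] = None

    def toggle_kill_switch(self, checked: bool) -> None:
        non_globals = [n for n in self.selected if n not in self.global_names]
        if checked:
            # Remember the selected globals BEFORE stripping them.
            self._snapshot = [n for n in self.selected if n in self.global_names]
            self.selected = non_globals
        else:
            # Restore only what was snapshotted. With no snapshot (the form
            # opened with the switch already on), fall back to all globals.
            if self._snapshot is not None:
                restored = self._snapshot
            else:
                restored = sorted(self.global_names)
            self.selected = restored + non_globals
            self._snapshot = None
```

With globals A and B, A opted out, and a selection of `["B", "extra"]`, toggling on and then off returns to `["B", "extra"]`, so A stays opted out.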
Updates the expected header text to "Guardrails Settings" to match GuardrailSettingsView's rendering, and moves the mock guardrails from team_info.guardrails (legacy top-level path that nothing reads) to team_info.metadata.guardrails where the component actually looks. Also tightens the assertion to verify the individual guardrail names appear, not just the section header.
Summary
Adds a per-guardrail opt-out for global (default_on) guardrails on a team, alongside the existing all-or-nothing kill switch. Backed by a new `disabled_global_guardrails: List[str]` field on team metadata that the request-time gate checks before running each global guardrail.
- `CustomGuardrail.should_run_guardrail` now consults `disabled_global_guardrails` and skips the guardrail when its name is present and `default_on=True`.
- `litellm_pre_call_utils` propagates the field from team metadata onto the request.
- … `metadata.guardrails` (extras) and `metadata.disabled_global_guardrails` (globals NOT chosen).
- `useGuardrails` now exposes the full guardrail objects plus derived `globalGuardrailNames` / `optionalGuardrailNames` sets via React Query's `select`. The old `useGuardrailsList.ts` (which only returned names) is removed and `AddModelForm` updated to consume the new shape.
CircleCI: https://app.circleci.com/pipelines/github/BerriAI/litellm/73458/workflows/3356a97c-841f-4c61-83ef-67527878801f
Screenshots
Test plan
- `useGuardrails` hook tests pass (10 tests)
- `TeamInfo` tests pass (25 tests)
- … `disabled_global_guardrails` saves as `[]` and globals still apply