feat(guardrails): optional skip system message in unified guardrail inputs#25481
Conversation
…nputs Made-with: Cursor
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Greptile SummaryThis PR adds an optional Confidence Score: 5/5Safe to merge — feature is opt-in, backward compatible, and no P0/P1 issues found. All remaining findings from prior rounds are P2 or lower (inline import comment, Anthropic test coverage gap). Core logic is sound: task_mappings continue to reference original message indices so write-back is correct, tri-state semantics are consistently implemented across both handlers and all UI forms, and update_in_memory_litellm_params propagates the new field correctly via Pydantic dict. No files require special attention.
|
| Filename | Overview |
|---|---|
| litellm/llms/base_llm/guardrail_translation/utils.py | Adds resolution helper and filter function; inline import litellm is justified to avoid circular import but warrants a comment (flagged in prior review). |
| litellm/llms/openai/chat/guardrail_translation/handler.py | Correctly applies skip flag to both texts_to_check (via _extract_inputs) and structured_messages; task_mappings remain consistent since msg_idx still refers to the original list. |
| litellm/llms/anthropic/chat/guardrail_translation/handler.py | Skip flag applied correctly to both _extract_input_text_and_images and structured_messages; Anthropic system is a top-level field (not a message), so the flag primarily filters the OpenAI-converted structured_messages. |
| litellm/proxy/guardrails/guardrail_registry.py | Sets skip_system_message_in_guardrail on the callback via setattr after init; update_in_memory_litellm_params (via vars() iteration) also covers the update path since the field is declared in LitellmParams. |
| litellm/types/guardrails.py | Adds skip_system_message_in_guardrail: Optional[bool] = None to BaseLitellmParams with clear docstring; tri-state semantics (None = inherit) are well-defined. |
| tests/test_litellm/proxy/guardrails/guardrail_hooks/unified_guardrails/test_unified_guardrail.py | Unit tests cover utility functions, per-guardrail override semantics, and the OpenAI handler integration; Anthropic handler path is not directly tested (flagged in prior review). |
| ui/litellm-dashboard/src/components/guardrails/guardrail_info_helpers.tsx | Adds SkipSystemMessageChoice type and two mapping helpers; logic is correct and properly handles the tri-state. |
| ui/litellm-dashboard/src/components/guardrails/add_guardrail_form.tsx | Adds skip_system_message_choice selector; omits key for inherit on create, which correctly results in None default on backend. |
| ui/litellm-dashboard/src/components/guardrails/edit_guardrail_form.tsx | Adds skip_system_message_choice to edit form; deletes key for inherit while guardrail_info.tsx sends null — both lead to None on backend for full-replace semantics. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["Guardrail pre-call hook"] --> B["effective_skip_system_message_for_guardrail(guardrail)"]
B --> C{Per-guardrail\nskip set?}
C -->|"skip_system_message_in_guardrail = True/False"| D["Use per-guardrail value"]
C -->|"None (inherit)"| E["Read litellm.skip_system_message_in_guardrail"]
D --> F{skip = True?}
E --> F
F -->|Yes| G["Filter system msgs from texts_to_check"]
F -->|Yes| H["openai_messages_without_system(structured_messages)"]
F -->|No| I["Include all messages in texts_to_check"]
F -->|No| J["Pass full messages as structured_messages"]
G --> K["GenericGuardrailAPIInputs\n(filtered texts + structured_messages)"]
H --> K
I --> K
J --> K
K --> L["guardrail.apply_guardrail(inputs)"]
L --> M["Apply responses back via task_mappings\n(original message indices preserved)"]
M --> N["LLM receives unchanged full messages"]
Reviews (3): Last reviewed commit: "fix(guardrails): type structured_message..." | Re-trigger Greptile
Add a tri-state control (inherit / yes / no) when creating or editing guardrails so admins can set litellm_params.skip_system_message_in_guardrail without YAML. Table edit merges existing litellm_params before PUT to avoid wiping content-filter and other provider fields. Document the dashboard flow in the guardrails quick start with a screenshot. Made-with: Cursor
Use AllMessageValues in openai_messages_without_system and cast adapter request messages so GenericGuardrailAPIInputs matches TypedDict. Made-with: Cursor
c13be44
into
litellm_internal_staging_04_11_2026
Summary
Adds optional exclusion of
role: systemmessages from unified guardrail evaluation inputs (flattenedtextsandstructured_messages) on OpenAI Chat Completions and Anthropic Messages, while leaving the request payload to the LLM unchanged.Fixes LIT-2382
Configuration
litellm_settings.skip_system_message_in_guardrail(also setslitellm.skip_system_message_in_guardrail).litellm_params.skip_system_message_in_guardrail(true/false/ omit to inherit). Registry sets the attribute on the callback after init.Implementation
litellm/llms/base_llm/guardrail_translation/utils.py— resolution helper and OpenAI-shaped message filter.litellm_settingstable in config reference.