Skip to content

feat(guardrails): optional skip system message in unified guardrail inputs#25481

Merged
krrish-berri-2 merged 3 commits intolitellm_internal_staging_04_11_2026from
litellm_skip-system-message-unified-guardrails
Apr 11, 2026
Merged

feat(guardrails): optional skip system message in unified guardrail inputs#25481
krrish-berri-2 merged 3 commits intolitellm_internal_staging_04_11_2026from
litellm_skip-system-message-unified-guardrails

Conversation

@Sameerlite
Copy link
Copy Markdown
Collaborator

@Sameerlite Sameerlite commented Apr 10, 2026

Summary

Adds optional exclusion of role: system messages from unified guardrail evaluation inputs (flattened texts and structured_messages) on OpenAI Chat Completions and Anthropic Messages, while leaving the request payload to the LLM unchanged.

Fixes LIT-2382

Configuration

  • Global: litellm_settings.skip_system_message_in_guardrail (also sets litellm.skip_system_message_in_guardrail).
  • Per guardrail: litellm_params.skip_system_message_in_guardrail (true / false / omit to inherit). Registry sets the attribute on the callback after init.

Implementation

  • litellm/llms/base_llm/guardrail_translation/utils.py — resolution helper and OpenAI-shaped message filter.
  • OpenAI and Anthropic chat guardrail translation handlers apply the flag when building inputs.
  • Docs: guardrails quick start + litellm_settings table in config reference.
image

@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 10, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
litellm Ready Ready Preview, Comment Apr 10, 2026 7:00pm

Request Review

@codspeed-hq
Copy link
Copy Markdown
Contributor

codspeed-hq Bot commented Apr 10, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_skip-system-message-unified-guardrails (2abfaa5) with main (d0e347a)

Open in CodSpeed

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 10, 2026

Greptile Summary

This PR adds an optional skip_system_message_in_guardrail flag that excludes role: system messages from unified guardrail evaluation inputs (texts and structured_messages) while leaving the LLM payload intact. The feature supports both global (litellm_settings.skip_system_message_in_guardrail) and per-guardrail override semantics, and is surfaced in the dashboard UI as a tri-state selector (Yes / No / Use global default).

Confidence Score: 5/5

Safe to merge — feature is opt-in, backward compatible, and no P0/P1 issues found.

All remaining findings from prior rounds are P2 or lower (inline import comment, Anthropic test coverage gap). Core logic is sound: task_mappings continue to reference original message indices so write-back is correct, tri-state semantics are consistently implemented across both handlers and all UI forms, and update_in_memory_litellm_params propagates the new field correctly via Pydantic dict.

No files require special attention.

Important Files Changed

Filename Overview
litellm/llms/base_llm/guardrail_translation/utils.py Adds resolution helper and filter function; inline import litellm is justified to avoid circular import but warrants a comment (flagged in prior review).
litellm/llms/openai/chat/guardrail_translation/handler.py Correctly applies skip flag to both texts_to_check (via _extract_inputs) and structured_messages; task_mappings remain consistent since msg_idx still refers to the original list.
litellm/llms/anthropic/chat/guardrail_translation/handler.py Skip flag applied correctly to both _extract_input_text_and_images and structured_messages; Anthropic system is a top-level field (not a message), so the flag primarily filters the OpenAI-converted structured_messages.
litellm/proxy/guardrails/guardrail_registry.py Sets skip_system_message_in_guardrail on the callback via setattr after init; update_in_memory_litellm_params (via vars() iteration) also covers the update path since the field is declared in LitellmParams.
litellm/types/guardrails.py Adds skip_system_message_in_guardrail: Optional[bool] = None to BaseLitellmParams with clear docstring; tri-state semantics (None = inherit) are well-defined.
tests/test_litellm/proxy/guardrails/guardrail_hooks/unified_guardrails/test_unified_guardrail.py Unit tests cover utility functions, per-guardrail override semantics, and the OpenAI handler integration; Anthropic handler path is not directly tested (flagged in prior review).
ui/litellm-dashboard/src/components/guardrails/guardrail_info_helpers.tsx Adds SkipSystemMessageChoice type and two mapping helpers; logic is correct and properly handles the tri-state.
ui/litellm-dashboard/src/components/guardrails/add_guardrail_form.tsx Adds skip_system_message_choice selector; omits key for inherit on create, which correctly results in None default on backend.
ui/litellm-dashboard/src/components/guardrails/edit_guardrail_form.tsx Adds skip_system_message_choice to edit form; deletes key for inherit while guardrail_info.tsx sends null — both lead to None on backend for full-replace semantics.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["Guardrail pre-call hook"] --> B["effective_skip_system_message_for_guardrail(guardrail)"]
    B --> C{Per-guardrail\nskip set?}
    C -->|"skip_system_message_in_guardrail = True/False"| D["Use per-guardrail value"]
    C -->|"None (inherit)"| E["Read litellm.skip_system_message_in_guardrail"]
    D --> F{skip = True?}
    E --> F
    F -->|Yes| G["Filter system msgs from texts_to_check"]
    F -->|Yes| H["openai_messages_without_system(structured_messages)"]
    F -->|No| I["Include all messages in texts_to_check"]
    F -->|No| J["Pass full messages as structured_messages"]
    G --> K["GenericGuardrailAPIInputs\n(filtered texts + structured_messages)"]
    H --> K
    I --> K
    J --> K
    K --> L["guardrail.apply_guardrail(inputs)"]
    L --> M["Apply responses back via task_mappings\n(original message indices preserved)"]
    M --> N["LLM receives unchanged full messages"]
Loading

Reviews (3): Last reviewed commit: "fix(guardrails): type structured_message..." | Re-trigger Greptile

Comment thread litellm/llms/base_llm/guardrail_translation/utils.py
Comment thread litellm/__init__.py Dismissed
Add a tri-state control (inherit / yes / no) when creating or editing
guardrails so admins can set litellm_params.skip_system_message_in_guardrail
without YAML. Table edit merges existing litellm_params before PUT to avoid
wiping content-filter and other provider fields.

Document the dashboard flow in the guardrails quick start with a screenshot.

Made-with: Cursor
Use AllMessageValues in openai_messages_without_system and cast adapter
request messages so GenericGuardrailAPIInputs matches TypedDict.

Made-with: Cursor
@Sameerlite Sameerlite temporarily deployed to integration-postgres April 10, 2026 18:58 — with GitHub Actions Inactive
@Sameerlite Sameerlite temporarily deployed to integration-postgres April 10, 2026 18:58 — with GitHub Actions Inactive
@Sameerlite Sameerlite temporarily deployed to integration-postgres April 10, 2026 18:58 — with GitHub Actions Inactive
@krrish-berri-2 krrish-berri-2 changed the base branch from main to litellm_internal_staging_04_11_2026 April 11, 2026 15:53
@krrish-berri-2 krrish-berri-2 merged commit c13be44 into litellm_internal_staging_04_11_2026 Apr 11, 2026
104 of 108 checks passed
@krrish-berri-2 krrish-berri-2 deleted the litellm_skip-system-message-unified-guardrails branch April 11, 2026 15:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants