
fix(proxy): preserve dict guardrail HTTPException.detail + bedrock context#25558

Merged
krrish-berri-2 merged 1 commit into BerriAI:litellm_internal_staging_04_11_2026 from michelligabriele:fix/bedrock-guardrail-error-clarity
Apr 11, 2026

Conversation

@michelligabriele
Collaborator

Title

fix(proxy): preserve dict guardrail HTTPException.detail + bedrock context

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable, but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🐛 Bug Fix

Changes

The bug

When a guardrail raises HTTPException(status_code=..., detail={dict}) — as Bedrock Guardrails (and any guardrail that wants to return structured violation context) does — the proxy collapses that dict into a string via str() at two error-handling sites in litellm/proxy/common_request_processing.py. The result is a Python-repr blob (single quotes, escaped commas) inside error.message on both the streaming SSE error frame and the non-streaming JSON response. The wire format is technically valid JSON wrapping an invalid JSON message field — unparseable by any client.

For a Bedrock guardrail block on a streaming request, the user previously saw:

data: {"error": {"message": "{'error': 'Violated guardrail policy', 'bedrock_guardrail_response': '...'}", "code": 400}}

data: [DONE]
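
The failure mode is easy to reproduce in isolation. A minimal sketch (payload values illustrative, not the proxy's actual code path):

```python
import json

# A dict detail like the one Bedrock raises (values illustrative)
detail = {"error": "Violated guardrail policy", "bedrock_guardrail_response": "blocked"}

# The old error path did str(detail): a Python repr with single quotes
blob = str(detail)
frame = json.dumps({"error": {"message": blob, "code": 400}})

# The frame itself is valid JSON, but the message field inside it is not
message = json.loads(frame)["error"]["message"]
try:
    json.loads(message)
    parseable = True
except json.JSONDecodeError:
    parseable = False

assert parseable is False  # the client cannot recover the structured detail
```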

Two related defects would keep the error unclear even after a serialization-only fix:

  • The dispatcher (ProxyLogging) never enriches the error with the originating guardrail's name or lifecycle stage. Even with clean serialization, the user can't tell which of their configured guardrails fired or at what stage.
  • The Bedrock guardrail in particular discards the rich assessments list returned by apply_guardrail, keeping only the canned outputs[].text. The user can't tell whether they tripped PII detection, a topic policy, a content filter, or a custom word list.

This PR addresses all three layers in one commit so the customer-facing error is actually clear, not just technically parseable.

The fix — three layers, one commit

L1 — Centralize dict-detail serialization at the proxy boundary

litellm/proxy/common_request_processing.py

  • New module-level helper _serialize_http_exception_detail(detail) -> Tuple[str, Optional[dict]]. Documented fallback chain so the dominant guardrail shapes both round-trip cleanly:
    1. detail['error'] if str (Bedrock-style flat)
    2. detail['error']['message'] if detail['error'] is a dict with a str message (PANW Prisma AIRS-style nested)
    3. detail['message'] if str
    4. json.dumps(detail) — JSON, never Python repr
  • Wired into both error sites:
    • Streaming: the create_response() exception branch. The SSE error frame is rebuilt to mirror ProxyException.to_dict() exactly, so streaming and non-streaming surfaces emit byte-identical error objects.
    • Non-streaming: the _handle_llm_api_exception() HTTPException branch.
  • ProxyException itself is not modified — the # DO NOT MODIFY THIS constraint at _types.py:3398 is respected.

L2 — Enrich HTTPException.detail with guardrail name + lifecycle stage at the dispatcher

litellm/proxy/utils.py

  • New module-level helper _enrich_http_exception_with_guardrail_context(exc, callback). Mutates the exception's detail dict in place via setdefault to add guardrail_name and guardrail_mode, taken from the callback instance. Uses setdefault so guardrails that already populate these fields explicitly win over the inferred defaults. No-op for non-HTTPException, non-dict-detail, or callbacks without guardrail_name. Never raises.
  • Two ProxyLogging static-method helpers:
    • _run_guardrail_task_with_enrichment(callback, coro) — wraps an awaited coroutine, enriches on except, re-raises.
    • _wrap_streaming_iterator_with_enrichment(callback, gen) — wraps a chained async generator and enriches on iteration error. Needed because async_post_call_streaming_iterator_hook builds wrapped generators rather than awaiting coroutines, so exceptions raise during the consumer's async for rather than at construction. Each layer of the chain attributes its own callback.
  • Wired into all four guardrail hook dispatch sites in ProxyLogging:
    • _process_guardrail_callback (pre_call) — direct enrichment in the existing except block.
    • during_call_hook — both branches (apply_guardrail unified path + bare callback.async_moderation_hook path) wrap the task in _run_guardrail_task_with_enrichment.
    • post_call_success_hook — both branches wrap the inner await in a try/except that enriches and re-raises (the existing outer try/except at the loop level only re-raises, so per-callback attribution requires inner wrapping).
    • async_post_call_streaming_iterator_hook — all 3 branches (regular async_post_call_streaming_iterator_hook + apply_guardrail unified + fallback) wrap each chained generator in _wrap_streaming_iterator_with_enrichment.

This means the other 27 guardrail hook implementations get L1 (clean serialization) and L2 (guardrail name + lifecycle stage) for free, with no per-provider changes.
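
The two L2 helpers can be sketched as follows. This is a self-contained reconstruction from the description above (with a stand-in HTTPException so it runs without FastAPI); names mirror the PR, details may differ from the merged code:

```python
import asyncio


class HTTPException(Exception):  # stand-in for fastapi.HTTPException
    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def _enrich_http_exception_with_guardrail_context(exc, callback):
    """Add guardrail_name/guardrail_mode to a dict detail in place; never raises."""
    if not isinstance(exc, HTTPException):
        return
    detail = getattr(exc, "detail", None)
    if not isinstance(detail, dict):
        return
    name = getattr(callback, "guardrail_name", None)
    if name:
        detail.setdefault("guardrail_name", name)  # explicit values win
    mode = getattr(callback, "event_hook", None)
    if mode:
        detail.setdefault("guardrail_mode", str(mode))


async def _wrap_streaming_iterator_with_enrichment(callback, gen):
    """Re-yield chunks; attribute a mid-stream exception to this layer's callback."""
    try:
        async for item in gen:
            yield item
    except Exception as exc:
        _enrich_http_exception_with_guardrail_context(exc, callback)
        raise


class FakeGuardrail:  # hypothetical callback for the demo
    guardrail_name = "my-bedrock-guard"
    event_hook = "post_call"


async def demo():
    async def blocked_stream():
        yield "chunk-1"  # exception raises mid-iteration, not at construction
        raise HTTPException(400, {"error": "Violated guardrail policy"})

    wrapped = _wrap_streaming_iterator_with_enrichment(FakeGuardrail(), blocked_stream())
    try:
        async for _ in wrapped:
            pass
    except HTTPException as exc:
        return exc.detail


detail = asyncio.run(demo())
```

The generator wrapper is the piece that makes streaming attribution work: the exception surfaces during the consumer's async for, so only a layer wrapped around the generator itself can name the callback that owns it.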

L3 — Surface Bedrock assessments at the source

litellm/proxy/guardrails/guardrail_hooks/bedrock_guardrails.py

  • New private method _extract_blocked_assessments(response) -> List[dict] on BedrockGuardrail. Walks the same five policy categories that the existing _should_raise_guardrail_blocked_exception() already iterates (topicPolicy, contentPolicy, wordPolicy × {customWords, managedWordLists}, sensitiveInformationPolicy × {piiEntities, regexes}, contextualGroundingPolicy) and emits a structured list of {policy, matches} entries. Each match preserves the originating sub-key (piiEntities, regexes, customWords, etc.) under category, plus type/action/match where available. Mirrors the existing iteration so future maintenance keeps the two methods in sync naturally.
  • _get_http_exception_for_blocked_guardrail() extended to also add guardrailIdentifier and guardrailVersion (both already on self from the constructor) plus the assessments list to the detail dict. Existing error and bedrock_guardrail_response keys are preserved at the top level — additive only, no rename, no restructure. The disable_exception_on_block=True GuardrailInterventionNormalStringError branch is untouched.

How the three layers compose

L3 puts rich provider-specific content into HTTPException.detail. L2 enriches that same detail dict with the dispatcher-level context (which guardrail, which stage). L1 makes sure all of it survives the trip to the client without being stringified, on both the streaming and non-streaming surfaces.

Final wire shape

For a Bedrock PII block, both surfaces now return (byte-identical error object):

{
  "error": {
    "message": "Violated guardrail policy",
    "type": "None",
    "param": "None",
    "code": "400",
    "provider_specific_fields": {
      "error": "Violated guardrail policy",
      "bedrock_guardrail_response": "Sorry, the model cannot answer this question. Prompt is blocked",
      "guardrailIdentifier": "<id>",
      "guardrailVersion": "<version>",
      "assessments": [
        {
          "policy": "sensitiveInformationPolicy",
          "matches": [
            {"category": "piiEntities", "type": "NAME", "match": "<matched-term>", "action": "BLOCKED"}
          ]
        }
      ],
      "guardrail_name": "<configured-name>",
      "guardrail_mode": "post_call"
    }
  }
}

Streaming wraps this in data: {...}\n\ndata: [DONE]\n\n and non-streaming returns it as the response body. The user now sees: which guardrail blocked them, at what lifecycle stage, the Bedrock guardrail identifier (so they can find it in the AWS console), and the exact assessment that fired down to the matched term and policy sub-category.
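
From the client side, the streaming frame is now parseable end to end; a minimal sketch (frame contents abbreviated and illustrative):

```python
import json

# An SSE error frame as emitted post-fix (abbreviated, values illustrative)
frame = ('data: {"error": {"message": "Violated guardrail policy", '
         '"code": "400", "provider_specific_fields": '
         '{"guardrail_name": "my-guard", "guardrail_mode": "post_call"}}}')

# Strip the SSE "data: " prefix, then it is plain JSON all the way down
payload = json.loads(frame[len("data: "):])
err = payload["error"]
print(err["provider_specific_fields"]["guardrail_name"])  # my-guard
```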

Backwards-compatibility notes

  • Streaming SSE error frame gains type, param, and stringifies code to align with ProxyException.to_dict() (which is already the contract on the non-streaming surface). Realistically nothing was parsing the prior Python-repr blob inside message (it was invalid JSON), so this is not a regression in any meaningful sense — but flagging it explicitly. Two existing test assertions in test_common_request_processing.py are updated for the new shape (test_create_streaming_response_generator_raises_unexpected_exception, test_create_streaming_response_generator_raises_http_exception).
  • Bedrock detail dict is purely additive: existing error and bedrock_guardrail_response keys remain at the top level. Any existing client parsing those is unaffected. The new fields (guardrailIdentifier, guardrailVersion, assessments) appear alongside, never replacing.
  • L2 in-place mutation uses setdefault throughout, so guardrails that explicitly set guardrail_name or guardrail_mode in their detail dict (none currently do, but it's possible) win over the inferred values. The helper never raises and is safe to call on non-HTTPException, non-dict-detail, or callbacks without guardrail_name.
  • Performance: all three layers add a handful of dict ops on the error path only — no impact on the success path.
  • Rollback: revert the commit to fully restore prior behavior. No migrations, no schema changes, no dependency changes.

Tests added

L1 — tests/test_litellm/proxy/test_common_request_processing.py:

  • test_serialize_http_exception_detail_helper — direct unit coverage for all branches of the helper (str / flat-dict / nested-error-dict / top-level-message-dict / opaque-dict / non-str non-dict).
  • test_create_streaming_response_http_exception_dict_detail_bedrock_shape — full Bedrock dict detail survives as provider_specific_fields in the SSE frame.
  • test_create_streaming_response_http_exception_dict_detail_nested_error_shape — PANW-style {"error": {"message": ...}} shape extracts error.message while preserving the full payload.
  • New TestHandleLLMApiExceptionDictDetail class — non-streaming branch coverage (dict detail preserved + string detail unchanged).
  • Updated 2 existing assertions for the new SSE shape.

L2 — tests/test_litellm/proxy/test_proxy_utils.py:

  • 5 new direct unit tests on _enrich_http_exception_with_guardrail_context (dict-detail enriches / string-detail noop / setdefault does not overwrite / non-HTTPException noop / callback without guardrail_name noop).

L3 — tests/test_litellm/proxy/guardrails/guardrail_hooks/test_bedrock_guardrails.py:

  • 4 new tests on _extract_blocked_assessments (PII entity / multiple policies / only-anonymized empty / no-assessments empty).
  • 2 new end-to-end tests on _get_http_exception_for_blocked_guardrail (with assessments and identifier / no blocked assessments omits the field).

All targeted suites pass locally:

tests/test_litellm/proxy/test_common_request_processing.py                       83 passed
tests/test_litellm/proxy/test_proxy_utils.py                                     21 passed
tests/test_litellm/proxy/guardrails/guardrail_hooks/test_bedrock_guardrails.py   25 passed

Out of scope (deliberate)

  • ProxyException itself — the # DO NOT MODIFY THIS constraint is respected.
  • The other 27 guardrail hook files — they get L1 + L2 for free. L3-equivalent provider-specific assessment surfacing is a follow-up per provider.
  • The Zscaler-specific branch from [Bug]: when LLM reponse got blocked by Zguard, the user facing message contain unexpected fallback #20610 — once L1+L2 are in, it becomes redundant; removing it is a separate cleanup PR.
  • GuardrailInterventionNormalStringError fallback in the Bedrock helper — separate disable_exception_on_block=True code path, out of scope.
  • Refactoring ProxyLogging dispatch to use _execute_guardrail_hook for all four hook types — the current per-site wrap is the smallest possible change; full unification is a separate refactor.

@vercel

vercel bot commented Apr 11, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: litellm — Deployment: Ready — Actions: Preview, Comment — Updated (UTC): Apr 11, 2026 0:11am


@codecov

codecov bot commented Apr 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@greptile-apps
Contributor

greptile-apps bot commented Apr 11, 2026

Greptile Summary

This PR fixes three compounding issues when a guardrail raises HTTPException(detail={...}): the dict was str()-mangled into a Python-repr blob on both the streaming SSE frame and non-streaming JSON response, the dispatcher never attributed which guardrail fired or at what lifecycle stage, and Bedrock specifically dropped its rich assessment data. The fix is a clean three-layer composition — centralized serialization helper (L1), in-place detail enrichment at every guardrail dispatch site (L2), and structured assessment extraction in the Bedrock hook (L3) — with comprehensive mock-based tests for each layer.

  • The SSE error.code field changes from an integer to a string (e.g. 400 → "400") to align with ProxyException.to_dict(). Clients doing strict integer comparison against the streaming error code will see different output; this is acknowledged in the PR but is not gated by a feature flag per the project style guide.

Confidence Score: 5/5

Safe to merge; all remaining findings are P2 style suggestions on a niche edge case and an acknowledged backwards-incompatible format alignment.

The three-layer fix is well-structured, additive, and comprehensively tested with 16 new mock-only tests. The one P2 concern (streaming path calling bare json.dumps on a dict that could contain a Mode Pydantic object when tag-based mode selection is used) is a niche edge case that does not affect standard Bedrock or string-mode guardrail configurations. All other findings are style or backwards-compat notes already documented in the PR.

litellm/proxy/utils.py — _enrich_http_exception_with_guardrail_context stores event_hook (which can be a Mode Pydantic model) directly into the dict that later passes through bare json.dumps in the streaming error path.

Important Files Changed

Filename Overview
litellm/proxy/common_request_processing.py New _serialize_http_exception_detail helper correctly extracts message and preserves dict payload; both streaming and non-streaming error sites wired consistently.
litellm/proxy/utils.py L2 enrichment helpers are clean; event_hook could be a Mode Pydantic object that is not JSON-serializable via the plain json.dumps in the streaming error path.
litellm/proxy/guardrails/guardrail_hooks/bedrock_guardrails.py New _extract_blocked_assessments mirrors the existing _should_raise_guardrail_blocked_exception iteration correctly; additive changes to _get_http_exception_for_blocked_guardrail preserve existing keys.
tests/test_litellm/proxy/test_common_request_processing.py Two existing assertions updated for new SSE shape (intentional); five new tests cover L1 helper branches and both Bedrock/PANW dict shapes end-to-end.
tests/test_litellm/proxy/test_proxy_utils.py Five new unit tests cover all branches of _enrich_http_exception_with_guardrail_context including setdefault non-overwrite and no-op cases.
tests/test_litellm/proxy/guardrails/guardrail_hooks/test_bedrock_guardrails.py Six new tests cover L3 assessment extraction (PII, multi-policy, ANONYMIZED-only, no assessments) and end-to-end _get_http_exception_for_blocked_guardrail; all mock-based with no network calls.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Guardrail raises HTTPException\ndetail = dict] --> B{Lifecycle stage}
    B --> |pre_call| C[_process_guardrail_callback\nexcept block]
    B --> |during_call| D[_run_guardrail_task_with_enrichment]
    B --> |post_call| E[post_call_success_hook\ntry/except wrapper]
    B --> |streaming| F[_wrap_streaming_iterator_with_enrichment]
    C --> G[_enrich_http_exception_with_guardrail_context\nadd guardrail_name + guardrail_mode via setdefault]
    D --> G
    E --> G
    F --> G
    G --> H{Request type}
    H --> |streaming| I[create_response exception branch\n_serialize_http_exception_detail]
    H --> |non-streaming| J[_handle_llm_api_exception\n_serialize_http_exception_detail]
    I --> K[SSE error frame\nmessage + type + param + code + provider_specific_fields]
    J --> L[ProxyException\nmessage + provider_specific_fields]
    M[BedrockGuardrail raises\nHTTPException] --> N[L3: _get_http_exception_for_blocked_guardrail\nadds guardrailIdentifier + guardrailVersion\n+ assessments from _extract_blocked_assessments]
    N --> A

Reviews (1): Last reviewed commit: "fix(proxy): preserve dict guardrail HTTP..."

Comment thread litellm/proxy/utils.py
Comment on lines +316 to +326
if not isinstance(exc, HTTPException):
    return
detail = getattr(exc, "detail", None)
if not isinstance(detail, dict):
    return
guardrail_name = getattr(callback, "guardrail_name", None)
if guardrail_name:
    detail.setdefault("guardrail_name", guardrail_name)
event_hook = getattr(callback, "event_hook", None)
if event_hook:
    detail.setdefault("guardrail_mode", event_hook)

P2 guardrail_mode may store a non-serializable value in the streaming error path

callback.event_hook can be a Mode Pydantic model (for tag-based mode selection) or a List[GuardrailEventHooks] — neither of which is serializable by the standard json.dumps call in create_response's error_gen_message. When a streaming request hits such a guardrail, the inner json.dumps({'error': error_obj}) would raise TypeError: Object of type Mode is not JSON serializable, causing the stream to break rather than returning a clean 400 error frame.

Consider coercing event_hook to a plain string before storing it:

event_hook = getattr(callback, "event_hook", None)
if event_hook is not None:
    if isinstance(event_hook, list):
        mode_str: Any = [str(h) for h in event_hook]
    else:
        mode_str = str(event_hook)  # works for GuardrailEventHooks (str enum) and Mode
    detail.setdefault("guardrail_mode", mode_str)

Comment on lines 886 to 932
@@ -919,13 +922,130 @@ async def test_create_streaming_response_generator_raises_http_exception(
         expected_error_data = {
             "error": {
                 "message": "Content blocked by guardrail",
-                "code": 400,
+                "type": "None",
+                "param": "None",
+                "code": "400",
             }
         }
         assert len(content) == 2
         assert content[0] == f"data: {json.dumps(expected_error_data)}\n\n"
         assert content[1] == "data: [DONE]\n\n"

P2 Existing test assertions updated for new SSE frame shape

Two assertions were changed to match the new streaming error frame format — "code" is now a string ("500" / "400") instead of an integer, and "type" / "param" fields were added. These updates reflect the intentional alignment with ProxyException.to_dict() shape and are documented in the PR's backwards-compatibility notes, so they don't weaken coverage. Worth noting for reviewers: clients that previously compared error.code === 400 (strict integer equality) will see different output.

Rule Used: What: Flag any modifications to existing tests and... (source)

@krrish-berri-2 krrish-berri-2 changed the base branch from main to litellm_internal_staging_04_11_2026 April 11, 2026 16:40
@krrish-berri-2 krrish-berri-2 merged commit 363f9fe into BerriAI:litellm_internal_staging_04_11_2026 Apr 11, 2026
48 of 50 checks passed