
fix: ensure litellm_metadata is attached to pre_call guardrail to align with post_call guardrail#25641

Merged
krrish-berri-2 merged 5 commits into BerriAI:litellm_oss_staging_04_14_2026_p1 from GraySwanAI:fix/send-litellm-metadata-guardrail-precall
Apr 15, 2026

Conversation

Contributor

@agrattan0820 agrattan0820 commented Apr 13, 2026

Relevant issues

N/A

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement (see details)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible; it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Screenshots / Proof of Fix

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes


vercel Bot commented Apr 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
litellm | Ready | Preview, Comment | Apr 14, 2026 6:02pm


@agrattan0820
Contributor Author

@greptileai

@agrattan0820 agrattan0820 changed the title fix: ensure litellm_metadata is attached to pre_call guardrail to align with post_call guardrail Apr 13, 2026
Contributor

codspeed-hq Bot commented Apr 13, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing GraySwanAI:fix/send-litellm-metadata-guardrail-precall (9b5cec0) with main (4a71583)

Open in CodSpeed


codecov Bot commented Apr 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


Contributor

greptile-apps Bot commented Apr 13, 2026

Greptile Summary

This PR fixes a parity gap where litellm_metadata (user auth context) was available to post_call guardrails via user_api_key_dict but absent from the data dict passed to pre_call and during_call guardrails. A new module-level helper _ensure_litellm_metadata is introduced and called in async_pre_call_hook and async_moderation_hook before process_input_messages, reusing the existing BaseTranslation.transform_user_api_key_dict_to_metadata static method. Three focused unit tests cover the new helper and the GraySwan payload path.

Confidence Score: 5/5

  • Safe to merge — targeted fix with good test coverage and no functional correctness issues.
  • The only finding is a P2 style violation (inline import vs. module-level import per CLAUDE.md). The logic itself is correct, the fix aligns pre/during-call guardrail behavior with post-call, and all three new tests are pure unit tests with no network calls.
  • No files require special attention beyond the inline import style fix in unified_guardrail.py.

Important Files Changed

litellm/proxy/guardrails/guardrail_hooks/unified_guardrail/unified_guardrail.py
    Adds a module-level _ensure_litellm_metadata helper and calls it in async_pre_call_hook and async_moderation_hook before process_input_messages, aligning pre-call metadata availability with post-call. One style issue: the BaseTranslation import is inline inside the if block instead of at module level.

tests/test_litellm/proxy/guardrails/guardrail_hooks/test_grayswan.py
    Adds three new unit tests: forwarding of litellm_metadata through _prepare_payload, population of litellm_metadata from UserAPIKeyAuth, and a no-op when litellm_metadata is already present. Long assertion lines reformatted for Black compliance. All tests are pure mocks with no network calls.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Proxy
    participant UnifiedLLMGuardrails
    participant _ensure_litellm_metadata
    participant GuardrailTranslation
    participant LLM

    Client->>Proxy: Request
    Proxy->>UnifiedLLMGuardrails: async_pre_call_hook(data, user_api_key_dict)
    Note over UnifiedLLMGuardrails: NEW: _ensure_litellm_metadata called
    UnifiedLLMGuardrails->>_ensure_litellm_metadata: data, user_api_key_dict
    _ensure_litellm_metadata-->>UnifiedLLMGuardrails: data['litellm_metadata'] populated
    UnifiedLLMGuardrails->>GuardrailTranslation: process_input_messages(data)
    GuardrailTranslation-->>UnifiedLLMGuardrails: modified data (with litellm_metadata)
    UnifiedLLMGuardrails-->>Proxy: modified data

    Proxy->>LLM: LLM call
    LLM-->>Proxy: response

    Proxy->>UnifiedLLMGuardrails: async_post_call_success_hook(data, user_api_key_dict, response)
    UnifiedLLMGuardrails->>GuardrailTranslation: process_output_response(response, user_api_key_dict, request_data)
    Note over GuardrailTranslation: user_api_key_dict always available here
    GuardrailTranslation-->>UnifiedLLMGuardrails: processed response
    UnifiedLLMGuardrails-->>Proxy: response
    Proxy-->>Client: Response

Reviews (4): Last reviewed commit: "revert: undo usage of `_guardrail_litell..."

Comment on lines +55 to +74
def _ensure_litellm_metadata(self, data: dict) -> None:
    """
    Populate data['litellm_metadata'] from data['metadata']['user_api_key_auth']
    if not already present.

    This makes user API key info available to guardrails during pre-call
    processing, matching the behavior of process_output_response which
    receives user_api_key_dict directly.
    """
    if "litellm_metadata" in data:
        return
    metadata = data.get("metadata")
    if not isinstance(metadata, dict):
        return
    user_api_key_auth = metadata.get("user_api_key_auth")
    if user_api_key_auth is None:
        return
    user_metadata = self.transform_user_api_key_dict_to_metadata(user_api_key_auth)
    if user_metadata:
        data["litellm_metadata"] = user_metadata

P2 Dead code — instance method never called in production

BaseTranslation._ensure_litellm_metadata is defined here and tested in isolation, but no code path actually calls self._ensure_litellm_metadata(data) — not OpenAIChatCompletionsHandler.process_input_messages, not any other subclass, and not the hooks in unified_guardrail.py (which call the separate module-level _ensure_litellm_metadata(data, user_api_key_dict) instead). This leaves two parallel implementations with different signatures that serve the same goal, making it unclear which one should be used and when. Consider either wiring this method into subclasses' process_input_messages implementations, or removing it and consolidating the logic in unified_guardrail.py.
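The contract both parallel implementations are trying to satisfy can be sketched in isolation. This is a hypothetical, standalone simplification, not the PR's actual code: the real helper delegates to BaseTranslation.transform_user_api_key_dict_to_metadata, which is stubbed out here, and the field names in the stub are illustrative.

```python
# Standalone sketch of the helper's contract (hypothetical simplification;
# the real code lives in unified_guardrail.py and reuses
# BaseTranslation.transform_user_api_key_dict_to_metadata).
from typing import Any, Dict


def transform_user_api_key_dict_to_metadata(user_api_key_dict: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for the real static method: keep only the user-auth fields
    # guardrails care about (field names here are illustrative).
    keys = ("user_api_key_hash", "user_api_key_alias", "user_api_key_team_id")
    return {k: v for k, v in user_api_key_dict.items() if k in keys and v is not None}


def ensure_litellm_metadata(data: Dict[str, Any], user_api_key_dict: Dict[str, Any]) -> None:
    """Populate data['litellm_metadata'] from user_api_key_dict if absent."""
    if "litellm_metadata" in data:
        return  # caller-supplied metadata wins; no-op
    user_metadata = transform_user_api_key_dict_to_metadata(user_api_key_dict)
    if user_metadata:
        data["litellm_metadata"] = user_metadata


data = {"messages": [{"role": "user", "content": "hi"}]}
ensure_litellm_metadata(data, {"user_api_key_alias": "team-key", "token": "sk-..."})
print(data["litellm_metadata"])  # {'user_api_key_alias': 'team-key'}
```

Consolidating on one such function, with one signature, would resolve the ambiguity the reviewer flags.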

Comment on lines +58 to +60
from litellm.llms.base_llm.guardrail_translation.base_translation import (
    BaseTranslation,
)
@agrattan0820 agrattan0820 marked this pull request as ready for review April 13, 2026 17:10

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2d4ea48fec

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


def _ensure_litellm_metadata(data: dict, user_api_key_dict: UserAPIKeyAuth) -> None:
    """Populate data['litellm_metadata'] from user_api_key_dict if absent."""
    if "litellm_metadata" not in data:

P1: Avoid forcing litellm_metadata on pre-call request payloads

Adding litellm_metadata unconditionally for unified pre/during-call guardrails changes downstream metadata selection logic: get_metadata_variable_name_from_kwargs() prefers litellm_metadata whenever that key exists, while several logging/redaction paths (e.g. should_redact_message_logging) expect request headers in metadata["headers"]. For routes that normally use metadata (like /chat/completions), this causes header-based redaction controls to be skipped whenever unified guardrails run, so requests that should be redacted can be logged unredacted.
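The failure mode Codex describes can be illustrated with a minimal sketch. The function and header names below mirror the ones mentioned in the review (get_metadata_variable_name_from_kwargs, metadata["headers"]), but the implementations and the redaction header itself are hypothetical simplifications, not LiteLLM's actual code:

```python
# Hypothetical sketch of the described failure mode: a selection helper that
# prefers 'litellm_metadata' whenever the key exists, so injecting that key
# silently changes which container redaction flags are read from.
from typing import Any, Dict


def get_metadata_variable_name(kwargs: Dict[str, Any]) -> str:
    # Simplified stand-in for get_metadata_variable_name_from_kwargs.
    return "litellm_metadata" if "litellm_metadata" in kwargs else "metadata"


def should_redact(kwargs: Dict[str, Any]) -> bool:
    # Redaction flag is read from headers inside the *selected* metadata dict
    # (header name here is illustrative, not LiteLLM's real header).
    metadata = kwargs.get(get_metadata_variable_name(kwargs)) or {}
    headers = metadata.get("headers", {})
    return headers.get("x-redact-logging") == "true"


# Legacy route: headers live under data["metadata"] -> redaction applies.
kwargs = {"metadata": {"headers": {"x-redact-logging": "true"}}}
print(should_redact(kwargs))  # True

# After a pre-call guardrail injects litellm_metadata, selection switches
# containers and the header-based flag is skipped.
kwargs["litellm_metadata"] = {"user_api_key_alias": "team-key"}
print(should_redact(kwargs))  # False
```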



@ishaan-berri ishaan-berri left a comment


I see _guardrail_litellm_metadata nowhere in the codebase. Why are we using this?

        user_api_key_dict
    )
    if user_metadata:
        data["_guardrail_litellm_metadata"] = user_metadata

is this qa'ed with an e2e call to a real LLM API? i suspect this would get sent straight to it

instead wouldn't it be better to just create a litellm_metadata field if missing?

@agrattan0820
Contributor Author

Ah, apologies. I made the change based off of Codex's comment. Reverted it back to how I initially had it which was just using litellm_metadata!


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9b5cec0ff4


            CallTypes(call_type)
        ]()

        _ensure_litellm_metadata(data, user_api_key_dict)

P1: Avoid injecting litellm_metadata on metadata-based routes

On legacy proxy routes (for example /v1/chat/completions), request headers are attached under data["metadata"]["headers"], but this new _ensure_litellm_metadata(...) call adds a litellm_metadata key before the request is logged. The redaction path still selects the metadata container using get_metadata_variable_name_from_kwargs, which prefers litellm_metadata whenever that key exists, so header-controlled redaction flags are skipped and sensitive prompts/responses can be logged unredacted when unified guardrails run. Fresh evidence from the current tree: litellm_core_utils/redact_messages.py still reads headers from the selected metadata field (lines 223-233), while litellm_pre_call_utils.py still writes headers to metadata on these routes (lines 977-1001).


@krrish-berri-2 krrish-berri-2 changed the base branch from main to litellm_oss_staging_04_14_2026_p1 April 15, 2026 03:08
@krrish-berri-2 krrish-berri-2 merged commit 9ff6c63 into BerriAI:litellm_oss_staging_04_14_2026_p1 Apr 15, 2026
48 of 51 checks passed
@agrattan0820 agrattan0820 deleted the fix/send-litellm-metadata-guardrail-precall branch April 15, 2026 21:18
krrish-berri-2 pushed a commit that referenced this pull request Apr 16, 2026
…ne (#25777)

* feat(proxy): add NO_OPENAPI env var to disable /openapi.json endpoint (#25696)

* feat(proxy): add NO_OPENAPI env var to disable /openapi.json endpoint - Fixes #25538

* test(proxy): add tests for _get_openapi_url

---------

Co-authored-by: Progressive-engg <lov.kumari55@gmail.com>

* feat(prometheus): add api_provider label to spend metric (#25693)

* feat(prometheus): add api_provider label to spend metric

Add `api_provider` to `litellm_spend_metric` labels so users can
build Grafana dashboards that break down spend by cloud provider
(e.g. bedrock, anthropic, openai, azure, vertex_ai).

The `api_provider` label already exists in UserAPIKeyLabelValues and
is populated from `standard_logging_payload["custom_llm_provider"]`,
but was not included in the spend metric's label list.

* add api_provider to requests metric + add test

Address review feedback:
- Add api_provider to litellm_requests_metric too (same call-site as
  spend metric, keeps label sets in sync)
- Add test_api_provider_in_spend_and_requests_metrics following the
  existing pattern in test_prometheus_labels.py

* fix: ensure `litellm_metadata` is attached to `pre_call` guardrail to align with `post_call` guardrail (#25641)

* fix: ensure `litellm_metadata` is attached to pre_call to align with post_call

* refactor: remove unused BaseTranslation._ensure_litellm_metadata

* refactor: module level imports for ensure_litellm_metadata and CodeQL

* fix: update based off of Codex comment

* revert: undo usage of `_guardrail_litellm_metadata`

* feat: add pricing entry for openrouter/google/gemini-3.1-flash-lite-preview (#25610)

* fix(bedrock): skip synthetic tool injection for json_object with no schema (#25740)

When response_format={"type": "json_object"} is sent without a JSON
schema, _create_json_tool_call_for_response_format builds a tool with an
empty schema (properties: {}). The model follows the empty schema and
returns {} instead of the actual JSON the caller asked for.

This patch:
- Skips synthetic json_tool_call injection when no schema is provided.
  The model already returns JSON when the prompt asks for it.
- Fixes finish_reason: after _filter_json_mode_tools strips all
  synthetic tool calls, finish_reason stays "tool_calls" instead of
  "stop". Callers (like the OpenAI SDK) misinterpret this as a pending
  tool invocation.

json_schema requests with an explicit schema are unchanged.

Co-authored-by: Claude <noreply@anthropic.com>
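The decision rule in that Bedrock patch can be sketched as a small predicate. The helper names below are hypothetical, not Bedrock's actual transformation code; only the rule itself (skip synthetic tool injection when json_object carries no schema) comes from the commit message above:

```python
# Hedged sketch of the Bedrock json-mode decision described above
# (hypothetical helper names): only inject a synthetic json-mode tool when
# the caller supplied a real schema. An empty-schema tool steers the model
# into returning {}.
from typing import Any, Dict, Optional


def extract_schema(response_format: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    if not response_format:
        return None
    return (response_format.get("json_schema") or {}).get("schema")


def should_inject_json_tool(response_format: Optional[Dict[str, Any]]) -> bool:
    # json_object with no schema: skip injection; the model already returns
    # JSON when the prompt asks for it.
    return extract_schema(response_format) is not None


print(should_inject_json_tool({"type": "json_object"}))  # False
print(should_inject_json_tool(
    {"type": "json_schema",
     "json_schema": {"schema": {"type": "object", "properties": {"a": {"type": "string"}}}}}
))  # True
```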

* fix(utils): allowed_openai_params must not forward unset params as None

`_apply_openai_param_overrides` iterated `allowed_openai_params` and
unconditionally wrote `optional_params[param] = non_default_params.pop(param, None)`
for each entry. If the caller listed a param name but did not actually
send that param in the request, the pop returned `None` and `None` was
still written to `optional_params`. The openai SDK then rejected it as
a top-level kwarg:

    AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking'

Reproducer (from #25697):

    allowed_openai_params = ["chat_template_kwargs", "enable_thinking"]
    body = {"chat_template_kwargs": {"enable_thinking": False}}

Here `enable_thinking` is only present nested inside
`chat_template_kwargs`, so the helper should forward
`chat_template_kwargs` and leave `enable_thinking` alone. Instead it
wrote `optional_params["enable_thinking"] = None`.

Fix: only forward a param if it was actually present in
`non_default_params`. Behavior is unchanged for the happy path (param
sent → still forwarded), and the explicit `None` leakage is gone.

Adds a regression test exercising the helper in isolation so the test
does not depend on any provider-specific `map_openai_params` plumbing.

Fixes #25697
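The fix described in this commit message can be sketched as a simplified version of the helper. This is a minimal standalone reimplementation for illustration, not litellm's actual `_apply_openai_param_overrides`:

```python
# Minimal sketch of the fix above (simplified reimplementation): only
# forward a param listed in allowed_openai_params if the caller actually
# sent it, instead of leaking `None` via pop(param, None).
from typing import Any, Dict, List


def apply_openai_param_overrides(
    optional_params: Dict[str, Any],
    non_default_params: Dict[str, Any],
    allowed_openai_params: List[str],
) -> Dict[str, Any]:
    for param in allowed_openai_params:
        # Buggy version: optional_params[param] = non_default_params.pop(param, None)
        if param in non_default_params:
            optional_params[param] = non_default_params.pop(param)
    return optional_params


# Reproducer from the commit message: enable_thinking is only nested inside
# chat_template_kwargs, so it must not be forwarded as a top-level None.
out = apply_openai_param_overrides(
    {},
    {"chat_template_kwargs": {"enable_thinking": False}},
    ["chat_template_kwargs", "enable_thinking"],
)
print(out)  # {'chat_template_kwargs': {'enable_thinking': False}}
```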

---------

Co-authored-by: lovek629 <59618812+lovek629@users.noreply.github.com>
Co-authored-by: Progressive-engg <lov.kumari55@gmail.com>
Co-authored-by: Ori Kotek <ori.k@codium.ai>
Co-authored-by: Alexander Grattan <51346343+agrattan0820@users.noreply.github.com>
Co-authored-by: Mohana Siddhartha Chivukula <103447836+iamsiddhu3007@users.noreply.github.com>
Co-authored-by: Amiram Mizne <amiramm@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Sameerlite pushed a commit that referenced this pull request Apr 16, 2026
…ne (#25777)
