#25337)
* fix(vertex_ai): normalize Gemini finish_reason enum through map_finish_reason in streaming handler
  In the legacy vertex_ai SDK streaming path, the raw Gemini finish_reason enum name (e.g. "STOP", "MAX_TOKENS") was stored directly into self.received_finish_reason without being mapped to OpenAI-compatible values. The finish_reason_handler then compared against lowercase "stop", so the case mismatch prevented the tool_call override from ever firing. This fix applies map_finish_reason() so all Gemini enum names are normalized before storage.
* refactor: use module-level map_finish_reason import; drop redundant inline import
  map_finish_reason is already imported at module scope (line 49) via `from .core_helpers import map_finish_reason, process_response_headers`; the inline import added in the previous commit was redundant. Addressed Greptile review feedback.
* test: add unit tests for Gemini legacy vertex finish_reason normalisation
  Added tests to ensure finish_reason normalization for Gemini legacy vertex tool calls and stop reasons.
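The case-mismatch bug described above can be illustrated with a minimal sketch. The mapping table and function below are simplified stand-ins, not litellm's actual `map_finish_reason` implementation:

```python
# Simplified sketch: normalize raw Gemini finish_reason enum names to
# OpenAI-compatible values before storing them, so later lowercase
# comparisons (e.g. against "stop") match as intended.
GEMINI_TO_OPENAI = {
    "STOP": "stop",
    "MAX_TOKENS": "length",
    "SAFETY": "content_filter",
    "RECITATION": "content_filter",
}

def map_finish_reason(raw: str) -> str:
    """Map a provider finish_reason to an OpenAI-compatible value."""
    return GEMINI_TO_OPENAI.get(raw, raw.lower())

# Before the fix, the raw enum name "STOP" was stored directly, so a
# comparison against lowercase "stop" never matched. After the fix:
received_finish_reason = map_finish_reason("STOP")  # "stop"
```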
* fix: remove leading space from license public_key.pem
  PEM must begin with -----BEGIN; a leading ASCII space breaks cryptography.load_pem_public_key on older cryptography (e.g. 41.x), causing OpenSSL "no start line" / deserialize errors.
  Made-with: Cursor
* test: assert license public_key.pem loads as valid PEM
  Regression guard for leading whitespace before -----BEGIN, which breaks load_pem_public_key on older cryptography (e.g. 41.x).
  Made-with: Cursor
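A regression guard of the kind described can be sketched without the cryptography dependency; the check below only captures the well-formedness property that strict PEM parsers require (the real test presumably calls load_pem_public_key on the actual file):

```python
# Sketch: a strict PEM parser requires the file to start with the
# "-----BEGIN" header; any leading whitespace breaks older parsers
# with "no start line" errors.
def is_wellformed_pem(pem_text: str) -> bool:
    """True if the PEM body starts exactly at the BEGIN header."""
    return pem_text.startswith("-----BEGIN")

good = "-----BEGIN PUBLIC KEY-----\nMIIB...\n-----END PUBLIC KEY-----\n"
bad = " " + good  # leading ASCII space, as in the bug being fixed

assert is_wellformed_pem(good)
assert not is_wellformed_pem(bad)
```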
…25331) DashScope inherits OpenAIGPTConfig, which strips cache_control from messages and tools by default. Override remove_cache_control_flag_from_messages_and_tools() to preserve cache_control, following the same pattern used by ZAI, MiniMax, and Databricks.
Verified through 10-round multi-turn conversation tests:
- Explicit caching works correctly: cached_tokens grows each round from R4 onwards, with cache_creation_tokens reported on first cache build.
- Implicit caching is not affected: models that rely on implicit prefix-matching caching produce identical cached_tokens with and without this change, confirmed by comparing results against both the reverted codebase and direct API calls bypassing litellm.
- No errors or regressions observed on any model, including those that do not support explicit caching — the DashScope API silently ignores unrecognized cache_control fields.
Fixes #25330
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
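The override pattern described above can be sketched as follows. The class and method names follow the PR description; the real base class lives in litellm and its default behavior is approximated here:

```python
# Sketch of the inheritance pattern: the base config strips cache_control
# before sending messages upstream; the DashScope subclass overrides the
# hook to keep cache_control intact, since DashScope supports explicit caching.
class OpenAIGPTConfig:
    def remove_cache_control_flag_from_messages_and_tools(self, messages, tools=None):
        # Default: drop cache_control from every message.
        for m in messages:
            m.pop("cache_control", None)
        return messages, tools

class DashScopeChatConfig(OpenAIGPTConfig):
    def remove_cache_control_flag_from_messages_and_tools(self, messages, tools=None):
        # DashScope understands explicit caching, so pass messages through.
        return messages, tools

msgs = [{"role": "user", "content": "hi", "cache_control": {"type": "ephemeral"}}]
kept, _ = DashScopeChatConfig().remove_cache_control_flag_from_messages_and_tools(msgs)
assert "cache_control" in kept[0]
```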
…ai/gpt-oss-120b (#25263)
* fix: expose reasoning effort fields in get_model_info and add together_ai/gpt-oss-120b
  - litellm/utils.py: pass supports_none_reasoning_effort and supports_xhigh_reasoning_effort through _get_model_info_helper so get_model_info() returns them (previously silently dropped). Fixes #25096.
  - model_prices_and_context_window.json: add together_ai/openai/gpt-oss-120b with supports_reasoning: true so reasoning_effort is accepted for this model without requiring drop_params. Fixes #25132.
* fix: consolidate duplicate together_ai/openai/gpt-oss-120b entry and sync backup file
* fix: link commit to GitHub account for CLA verification
Co-authored-by: Austin Varga <austin@knowmi.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Greptile Summary
This PR adds a new PromptGuard guardrail integration (prompt injection detection, PII redaction, topic filtering) along with a Vertex AI streaming fix (normalising proto enum finish_reason values), a DashScope cache_control override, a PEM whitespace fix, and model metadata updates.
Confidence Score: 4/5
Safe to merge after addressing the redact silent pass-through bug in PromptGuard. One P1 bug: when PromptGuard returns decision='redact' without a redacted_messages payload, the original (potentially PII-containing) content is forwarded to the LLM unmodified and no warning is emitted. All other changes — streaming fix, DashScope override, PEM fix, model data updates — are correct and well-tested.
Affected file: litellm/proxy/guardrails/guardrail_hooks/promptguard/promptguard.py (redact decision silent pass-through)
| Filename | Overview |
|---|---|
| litellm/proxy/guardrails/guardrail_hooks/promptguard/promptguard.py | New PromptGuard guardrail integration; silent pass-through bug when redact decision carries no redacted_messages |
| litellm/proxy/guardrails/guardrail_hooks/promptguard/`__init__.py` | Standard guardrail initializer and registry wiring for PromptGuard; looks correct |
| litellm/litellm_core_utils/streaming_handler.py | Bug fix: wraps raw Vertex AI proto enum finish_reason name through map_finish_reason() so tool_calls override works correctly |
| litellm/llms/dashscope/chat/transformation.py | Overrides remove_cache_control_flag_from_messages_and_tools to preserve cache_control for DashScope provider |
| litellm/proxy/auth/public_key.pem | Fixes leading whitespace on -----BEGIN PUBLIC KEY----- header that caused PEM load failure |
| litellm/types/proxy/guardrails/guardrail_hooks/promptguard.py | Defines Pydantic config model for PromptGuard with api_key, api_base, and block_on_error fields |
| tests/test_litellm/proxy/guardrails/guardrail_hooks/test_promptguard.py | Comprehensive mock-only test suite for PromptGuard guardrail; no real network calls |
Sequence Diagram
sequenceDiagram
    participant Proxy
    participant PromptGuardGuardrail
    participant PromptGuardAPI
    participant LLM
    Proxy->>PromptGuardGuardrail: apply_guardrail(inputs, "request")
    PromptGuardGuardrail->>PromptGuardAPI: POST /api/v1/guard
    alt API error
        PromptGuardAPI-->>PromptGuardGuardrail: Exception
        alt block_on_error=True
            PromptGuardGuardrail-->>Proxy: GuardrailRaisedException
        else block_on_error=False
            PromptGuardGuardrail-->>Proxy: original inputs
        end
    else decision=allow
        PromptGuardAPI-->>PromptGuardGuardrail: decision:allow
        PromptGuardGuardrail-->>Proxy: inputs unchanged
        Proxy->>LLM: original messages
    else decision=block
        PromptGuardAPI-->>PromptGuardGuardrail: decision:block
        PromptGuardGuardrail-->>Proxy: GuardrailRaisedException
    else decision=redact with redacted_messages
        PromptGuardAPI-->>PromptGuardGuardrail: decision:redact + redacted_messages
        PromptGuardGuardrail-->>Proxy: inputs with redacted content
        Proxy->>LLM: redacted messages
    else decision=redact WITHOUT redacted_messages
        PromptGuardAPI-->>PromptGuardGuardrail: decision:redact only
        Note over PromptGuardGuardrail: Bug: original inputs returned silently
        PromptGuardGuardrail-->>Proxy: original inputs unredacted
        Proxy->>LLM: unredacted messages
    end
Reviews (4): Last reviewed commit: "Merge pull request #25616 from BerriAI/m..."
    from litellm.types.llms.openai import ChatCompletionToolParam
    from litellm.secret_managers.main import get_secret_str
    from litellm.types.llms.openai import AllMessageValues
Duplicate import from the same module
ChatCompletionToolParam and AllMessageValues are both imported from litellm.types.llms.openai in two separate statements. Consolidate them into one import per the project's style conventions.
Suggested change:

    - from litellm.types.llms.openai import ChatCompletionToolParam
      from litellm.secret_managers.main import get_secret_str
    - from litellm.types.llms.openai import AllMessageValues
    + from litellm.types.llms.openai import AllMessageValues, ChatCompletionToolParam
* Add PromptGuard guardrail integration
Add PromptGuard as a first-class guardrail vendor in LiteLLM's proxy,
supporting prompt injection detection, PII redaction, topic filtering,
entity blocklists, and hallucination detection via PromptGuard's
/api/v1/guard API endpoint.
Backend:
- Add PROMPTGUARD to SupportedGuardrailIntegrations enum
- Implement PromptGuardGuardrail (CustomGuardrail subclass) with
apply_guardrail handling allow/block/redact decisions
- Add Pydantic config model with api_key, api_base, ui_friendly_name
- Auto-discovered via guardrail_hooks/promptguard/__init__.py registries
Frontend:
- Add PromptGuard partner card to Guardrail Garden with eval scores
- Add preset configuration for quick setup
- Add logo to guardrailLogoMap
Tests:
- 30 unit tests covering configuration, allow/block/redact actions,
request payload construction, error handling, config model, and
registry wiring
* Fix redact path and init ordering per review feedback
- P1: Update structured_messages (not just texts) when PromptGuard
returns a redact decision, so PII redaction is effective for the
primary LLM message path
- P2: Validate credentials before allocating the HTTPX client so
resources aren't acquired if PromptGuardMissingCredentials is raised
- Add tests for structured_messages redaction and texts-only redaction
* Harden PromptGuard integration: fail-open, event hooks, images, docs
- Add block_on_error config (default fail-closed, configurable fail-open)
- Declare supported_event_hooks (pre_call, post_call) like other vendors
- Forward images from GenericGuardrailAPIInputs to PromptGuard API
- Wrap API call in try/except for resilient error handling
- Add comprehensive documentation page with config examples
- Register docs page in sidebar alongside other guardrail providers
- Expand test suite from 32 to 40 tests covering new functionality
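The fail-open vs fail-closed behavior controlled by block_on_error can be sketched as below. GuardrailRaisedException and the API call are simplified stand-ins for litellm's actual types:

```python
# Sketch of the block_on_error semantics described above:
# - block_on_error=True (default, fail-closed): an API failure raises a
#   guardrail exception so the request never reaches the LLM unguarded.
# - block_on_error=False (fail-open): the original inputs pass through.
class GuardrailRaisedException(Exception):
    pass

def apply_guardrail(inputs, call_api, block_on_error=True):
    try:
        result = call_api(inputs)
    except Exception as e:
        if block_on_error:
            # Fail-closed: tell the caller which guardrail failed.
            raise GuardrailRaisedException(f"promptguard: {e}")
        return inputs  # fail-open: forward original inputs unchanged
    return result

def failing_api(_):
    raise RuntimeError("timeout")

# Fail-open path returns the inputs untouched:
assert apply_guardrail({"texts": ["hi"]}, failing_api, block_on_error=False) == {"texts": ["hi"]}
```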
* Fix dict[str, Any] -> Dict[str, Any] for Python 3.8 compat
* Address remaining Greptile feedback: timeout, redact guard
- Add explicit 10s timeout to async_handler.post() to prevent
indefinite hangs when PromptGuard API is unresponsive
- Guard redact path: only update inputs["texts"] when the key
was originally present, avoiding phantom key injection
- Add test: redact with structured_messages only does not create
texts key (41 tests total)
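The effect of the explicit timeout can be illustrated without a real HTTP client. The PR passes a 10-second timeout to the client's post(); here asyncio.wait_for stands in for that behavior against a deliberately hung fake API:

```python
# Sketch: bounding an unresponsive guardrail API call with a timeout so a
# hung upstream cannot stall the request indefinitely. The hung_api_call
# coroutine is a fake; the real code sets timeout on the HTTP post call.
import asyncio

async def hung_api_call():
    await asyncio.sleep(3600)  # simulates an unresponsive PromptGuard API

async def guarded_call(timeout: float):
    try:
        return await asyncio.wait_for(hung_api_call(), timeout=timeout)
    except asyncio.TimeoutError:
        return "timed out"

result = asyncio.run(guarded_call(timeout=0.01))
```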
* Fix CI lint: black formatting, add PromptGuardConfigModel to LitellmParams
- Reformat promptguard.py to match CI black version (parenthesization)
- Add PromptGuardConfigModel as base class of LitellmParams for proper
Pydantic schema validation, consistent with all other guardrail vendors
- Use litellm_params.block_on_error directly (now a typed field)
* Address Greptile review: redact path, null decision, error context
- P1: Filter _extract_texts_from_messages to user-role messages only,
preventing system/assistant content from being injected into texts
- P1: Strengthen test_redact_updates_structured_messages assertion from
weak `in` check to strict equality, catching the injection bug
- P2: Use `result.get("decision") or "allow"` to handle explicit null
decision values (not just absent keys)
- P2: Wrap bare exception re-raise in GuardrailRaisedException so the
caller knows which guardrail failed (block_on_error=True path)
- P2: Add static Promptguard entry in guardrail_provider_map so the
preset works before populateGuardrailProviderMap is called
- Add test for explicit null decision treated as allow
* Fix black formatting: collapse f-string in error message
merge main
    if decision == "redact":
        redacted = result.get("redacted_messages")
        if redacted:
            if structured_messages:
                inputs["structured_messages"] = redacted
            if "texts" in inputs:
                extracted = self._extract_texts_from_messages(
                    redacted,
                )
                if extracted:
                    inputs["texts"] = extracted
    return inputs
Silent redaction bypass when redacted_messages is absent
When PromptGuard returns decision: "redact" but redacted_messages is null or missing, the if redacted: guard silently falls through and the original unredacted inputs are returned unchanged — the PII/sensitive content passes to the LLM with no log warning and no error. This undermines the security guarantee of the redact path regardless of block_on_error.
Suggested change (add an explicit warning on the fall-through):

    if decision == "redact":
        redacted = result.get("redacted_messages")
        if redacted:
            if structured_messages:
                inputs["structured_messages"] = redacted
            if "texts" in inputs:
                extracted = self._extract_texts_from_messages(
                    redacted,
                )
                if extracted:
                    inputs["texts"] = extracted
        else:
            verbose_proxy_logger.warning(
                "PromptGuard returned decision='redact' but no "
                "redacted_messages; original content will be used."
            )
    return inputs
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- I have added testing in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement - see details)
- My PR passes all unit tests on make test-unit
- My PR was reviewed by @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes