[Infra] Merge dev branch by yuneng-berri · Pull Request #25983 · BerriAI/litellm

yuneng-berri · 2026-04-17T23:17:07Z

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

50-55 passing tests: main is stable with minor issues.

45-49 passing tests: acceptable but needs attention

<= 40 passing tests: unstable; be careful with your merges and assess the risk.

Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:

Screenshots / Proof of Fix

Type

🚄 Infrastructure

Changes

Validate org admin role against all requested organizations instead of returning on first match. Scope team list queries to the caller's permitted organizations when filtering by user_id.

Verify that an org admin of org-A cannot operate on org-B, and that an admin of both orgs can operate on both.

…g consistency Read guardrail control flags (disable_global_guardrails, opted_out_global_guardrails) from admin-configured key metadata instead of the request body. This ensures callers cannot override admin security policies. Fix tag-based routing to enforce strict tag checks regardless of whether the request includes tags. Fix budget limiter to use the same dynamic metadata key resolution as the tag router for consistent tag extraction.

…olution Extract _get_admin_metadata() in CustomGuardrail to deduplicate metadata lookup. Hoist tag resolution above the deployment loop in budget limiter. Update stale comment in tag routing.

…ging from dynamic params Include user_api_key_team_metadata alongside user_api_key_metadata in _get_admin_metadata() so team-level guardrail settings are respected. Key-level settings take precedence over team-level. Remove turn_off_message_logging from _supported_callback_params so it cannot be set via request metadata. Admin controls logging globally or via key/team configuration. Update tests to verify user-injected guardrail flags are ignored while admin-configured flags are respected.

…g removal Verify turn_off_message_logging is no longer extracted from request kwargs since it is now admin-only.

…lied URLs Add validate_url() utility that resolves DNS once, validates all IPs against private network ranges, and rewrites the URL to connect to the validated IP directly. Prevents DNS rebinding by pinning to the resolved IP. Disable follow_redirects to prevent redirect-based SSRF bypasses. Applied to all user-supplied URL entry points: - Image URL fetching in chat completions - Token counter image dimension fetching - RAG file ingestion - MCP OpenAPI spec loading

Add safe_get() and async_safe_get() helpers that validate each redirect hop before following. For HTTPS, rely on TLS certificate binding instead of URL rewriting. Simplify call sites to use the new helpers.

…ests Check URL scheme before calling safe_get in token counter to avoid unnecessary DNS resolution on base64-encoded image data. Add 14 unit tests for validate_url covering blocked networks, scheme validation, URL rewriting, and DNS failure handling.

…sabled _is_blocked_ip now returns True (blocked) for unparseable addresses instead of False (allowed). HTTPS URLs are rewritten to validated IPs when ssl_verify is disabled, closing the DNS rebinding window that exists without TLS certificate binding.

Read Location header directly instead of response.next_request (which is None when follow_redirects=False). Resolve relative redirect URLs with httpx.URL.join(). Remove unused imports.

Pass follow_redirects through in HTTPHandler.get() — previously the parameter was accepted but never forwarded to the underlying httpx client, making sync redirect protection ineffective. Include port in Host header when non-default (e.g. example.com:8080). Fix redirect loop to read Location header directly instead of response.next_request (which is None when follow_redirects=False).

SDK core modules (image_handling, token_counter) should not import from litellm.proxy. Move url_utils.py to litellm_core_utils/ so bare SDK installs without proxy dependencies still work.

Greptile P1: six tests in test_url_utils.py performed real DNS lookups to example.com, violating the tests/test_litellm/ mock-only rule and risking offline CI failures. Add mock_dns_public and mock_dns_failure fixtures that monkeypatch socket.getaddrinfo on the url_utils module. Greptile P2: move 'import httpx' from inside _extract_redirect_url to module-level imports per CLAUDE.md style guide.

Greptile P2: _get_admin_metadata used 'litellm_metadata or metadata', meaning a caller sending a non-empty litellm_metadata would shadow admin config the proxy had injected into data['metadata']. Admin exemptions would be silently ignored. Check both keys and prefer whichever contains admin fields. Add regression test covering the shadowing scenario.

Two litellm-level flags wired through litellm_settings YAML: - user_url_validation (bool, default True): master switch. When False, safe_get/async_safe_get bypass validation and call client.get directly. - user_url_allowed_hosts (List[str], default []): per-host allowlist. Entries are 'host' (matches any port) or 'host:port' (port-specific). Matched hosts skip the blocked-networks check but still resolve DNS and still rewrite HTTP to the validated IP, preserving rebinding protection within the permitted name. Also fix an existing Host header bug: IPv6 literals (e.g. 2001:db8::1) were emitted unbracketed, producing ambiguous values like '2001:db8::1:8080' per RFC 7230 5.4. Bracket them consistently in _format_host_header.

Expand the pre-call metadata strip to also remove user_api_key_metadata and user_api_key_team_metadata. The proxy writes these fields into data[_metadata_variable_name] with admin-authoritative values, but only into that one metadata key; the caller's value in the OTHER metadata key (metadata vs litellm_metadata) would otherwise persist and be picked up by _get_admin_metadata, letting a caller supply their own 'admin' config to disable guardrails, opt out of global policies, etc. VERIA-28 (High): Security Policy and Guardrail Bypass via Unsanitized Request Metadata. Add regression test at the proxy boundary verifying the strip, and extend the guardrail test to cover the post-strip admin-config path.

…icast and Azure Wire Server Replace the hand-maintained _BLOCKED_NETWORKS CIDR list with a default-deny check based on ipaddress.is_global (RFC 6890 semantics, implemented by Python's stdlib). Also reject multicast explicitly — is_global returns True for public multicast allocations, which are not legitimate HTTP targets. Only globally-routable cloud-fabric IPs need explicit exceptions; the canonical list contains one entry today: Azure Wire Server (168.63.129.16), an in-fabric service reachable from any Azure VM. Coverage delta picked up automatically via is_global: - Alibaba Cloud metadata (100.100.100.200, CGNAT) - Legacy Oracle metadata (192.0.0.192, IETF Protocol Assignments) - IPv4 documentation ranges (192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24) - IPv4 reserved/future-use (240.0.0.0/4) and broadcast - IPv6 documentation (2001:db8::/32) Also fix two issues Greptile flagged: - HTTP relative-redirect hops lost the original hostname because _extract_redirect_url joined the Location against the rewritten (IP-based) URL. Join against the pre-rewrite URL so the next hop's Host header keeps the original hostname. - Two unit tests performed real socket.getaddrinfo('localhost') calls. Monkeypatch them. Add coverage tests for every cloud-metadata IP from the canonical SSRF dictionary (AWS/GCP/Azure/Alibaba/Oracle/DO/OpenStack) plus the new multicast/reserved/documentation/broadcast ranges, and a regression test for redirect-hostname preservation.

…ent_tags VERIA-28 (High) follow-up: tag-based routing and tag budget enforcement read metadata.tags directly from the request, letting an attacker reach restricted tag-routed deployments or misattribute spend to a victim team's tag. Strip metadata.tags (and litellm_metadata.tags) at the pre-call boundary unless the caller's key or team metadata opts in with allow_client_tags=True. Default-deny: existing clients that need to pass routing tags must have the flag set explicitly on their key or team. Preserves the tag-routing feature for admins who trust their callers; closes the injection path for everyone else.

Two pre-existing tests codified the pre-fix behavior where any caller- supplied metadata.tags would flow through to spend logs and routing: - test_add_key_or_team_level_spend_logs_metadata_to_request exercised the request/key/team tag merge. Set allow_client_tags=True on the key metadata so the merge path is still tested under the new regime. - test_create_file_with_nested_litellm_metadata asserted that litellm_metadata[tags] form-data propagated to the handler. Drop the tag field; the test still proves nested form-parser correctness via spend_logs_metadata and environment.

Silent strip is the worst debug UX: admin's client sends routing tags, they disappear, admin can't figure out why. Emit a warning naming the metadata key the tags came from and telling the admin exactly which flag to set if this is intentional.

test_add_litellm_data_to_request_duplicate_tags tests the request/key tag merge when tags overlap. The merge requires caller-supplied tags to flow through — set allow_client_tags=True on the key so the merge path stays testable under the new default-deny regime.

Veria AI caught a bypass: metadata can arrive as a JSON string via multipart/form-data or extra_body, and the existing strip block ran before the string-to-dict parse. The isinstance(_user_meta, dict) guard returned False on the string, the strip was skipped, and then the parse turned the string into a dict — leaving user_api_key_metadata / user_api_key_team_metadata / _pipeline_managed_guardrails / tags intact in the parsed dict. Move the strip to run AFTER the parse and BEFORE the merge of litellm_metadata into data[_metadata_variable_name], closing the bypass for both raw-dict and string-encoded payloads. Regression test: test_add_litellm_data_to_request_strips_string_encoded_admin_injection.

…keys Per VERIA-28's secondary recommendation. The existing check only gated metadata.guardrails. User-supplied values for disable_global_guardrails (plural and the original singular typo variant) and opted_out_global_guardrails are already silently ignored by _get_admin_metadata at read time, but the silent-ignore makes diagnosis confusing and relies on one specific read site catching them. Reject at auth time with a 403 when any of: - guardrails list (existing) - disable_global_guardrails (new) - disable_global_guardrail (new — historical singular-key variant) - opted_out_global_guardrails (new) are present in metadata, litellm_metadata, or at the request root, and the caller's team lacks can_modify_guardrails. Defense in depth: the strip at the pre-call layer still runs; this check fails loudly one layer earlier so operators see an explicit 403 rather than a silent-ignore.

Close three variant bypasses adjacent to VERIA-28 found during post-fix variant audit: 1. _guardrail_modification_check had the same isinstance(dict) bypass Veria-AI just flagged on the pre-call strip. A caller sending `{"metadata": "{…}"}` as a JSON-encoded string (multipart/form-data or extra_body) skipped the guard, got parsed to dict downstream, and reached guardrail logic with bypass flags intact. Coerce strings via safe_json_loads before evaluating. 2. The allow_client_tags strip only covered body metadata.tags and litellm_metadata.tags — caller-supplied tags arriving via the x-litellm-tags header or root-level data["tags"] bypassed it. Gate add_request_tag_to_metadata's result on the same flag. 3. requester_metadata was deepcopied BEFORE the strip, so attacker injections (user_api_key_metadata shadows, disallowed tags, _pipeline_managed_guardrails) persisted in the snapshot. The PANW guardrail (and any future consumer) trusting requester_metadata would see forged values. Move the deepcopy to after the strip. Regression tests added for each.

…itellm_yj_apr17 # Conflicts: # litellm/__init__.py

yuneng-berri · 2026-04-18T00:38:53Z

@greptileai please re-review after merging latest litellm_internal_staging

…itellm_yj_apr17

greptile-apps · 2026-04-18T16:24:31Z

+
+    Raises:
+        SSRFError: If the URL scheme is invalid or the hostname resolves
+            to a private/internal IP address.
+    """
+    parsed = urlparse(url)
+
+    if parsed.scheme not in _ALLOWED_SCHEMES:
+        raise SSRFError(f"URL scheme '{parsed.scheme}' is not allowed")
+
+    hostname = parsed.hostname
+    if not hostname:
+        raise SSRFError("URL has no hostname")
+
+    port = parsed.port


Blocking DNS call inside async function stalls the event loop

validate_url() calls socket.getaddrinfo(), a blocking POSIX syscall, with no timeout. When async_safe_get() invokes it, it blocks the entire asyncio event loop for the duration of the DNS lookup. On a busy proxy that frequently processes image URLs (via async_convert_url_to_base64), this serialises all concurrent requests for every DNS resolution.

Use asyncio.get_event_loop().run_in_executor (or asyncio.to_thread on Python 3.9+) to run the blocking call off the event loop:

import asyncio async def _async_validate_url(url: str) -> tuple[str, str]: loop = asyncio.get_event_loop() # run_in_executor moves the blocking getaddrinfo call to a thread pool return await loop.run_in_executor(None, validate_url, url)

Then in async_safe_get, replace validate_url(url) with await _async_validate_url(url). The synchronous safe_get / validate_url path is unaffected.

…y zero bug) `should_create_missing_views()` had `and result[0]["reltuples"]` which is falsy when reltuples=0. On a fresh empty PostgreSQL table, CREATE INDEX sets reltuples=0, causing the guard to return False and skip view creation entirely. Views like MonthlyGlobalSpendPerKey are never created, and the /global/spend/logs endpoint returns 500. Fix: change to `and result[0]["reltuples"] is not None` so reltuples=0 (empty table) and reltuples=-1 (unanalyzed table) both correctly return True. Also harden test_vertex_ai.py to return None instead of crashing with JSONDecodeError when the spend-logs endpoint returns a non-JSON 500 response, and add unit tests covering all three reltuples branches (0, -1, positive).

…itellm_yj_apr17

stuxf added 30 commits April 16, 2026 21:06

fix(proxy): enforce organization boundaries in admin operations

91bfbe6

Validate org admin role against all requested organizations instead of returning on first match. Scope team list queries to the caller's permitted organizations when filtering by user_id.

test: add regression tests for cross-org admin escalation

815a2be

Verify that an org admin of org-A cannot operate on org-B, and that an admin of both orgs can operate on both.

refactor: extract admin metadata helper, hoist loop-invariant tag res…

3cd5796

…olution Extract _get_admin_metadata() in CustomGuardrail to deduplicate metadata lookup. Hoist tag resolution above the deployment loop in budget limiter. Update stale comment in tag routing.

test: update dynamic callback params test for turn_off_message_loggin…

413f898

…g removal Verify turn_off_message_logging is no longer extracted from request kwargs since it is now admin-only.

fix(proxy): add safe_get/async_safe_get with redirect validation

d15196b

Add safe_get() and async_safe_get() helpers that validate each redirect hop before following. For HTTPS, rely on TLS certificate binding instead of URL rewriting. Simplify call sites to use the new helpers.

fix: preserve caller headers across redirect hops in safe_get

037fb57

test: mock SSRF validation in openapi spec URL test

62ec396

fix: redirect loop was dead code, clean up imports

e2a0c96

Read Location header directly instead of response.next_request (which is None when follow_redirects=False). Resolve relative redirect URLs with httpx.URL.join(). Remove unused imports.

refactor: move url_utils to litellm_core_utils to avoid proxy dependency

1ba2be7

SDK core modules (image_handling, token_counter) should not import from litellm.proxy. Move url_utils.py to litellm_core_utils/ so bare SDK installs without proxy dependencies still work.

test: bypass SSRF validation in image handling tests

30c6556

chore: remove unused asyncio import

f5a9218

style: use 'is not None' for port check (handle port 0 explicitly)

aa2f05f

yuneng-berri temporarily deployed to integration-postgres April 18, 2026 00:20 — with GitHub Actions Inactive

yuneng-berri had a problem deploying to integration-postgres April 18, 2026 00:20 — with GitHub Actions Error

yuneng-berri temporarily deployed to integration-postgres April 18, 2026 00:20 — with GitHub Actions Inactive

Merge remote-tracking branch 'origin/litellm_internal_staging' into l…

11c3270

…itellm_yj_apr17 # Conflicts: # litellm/__init__.py

yuneng-berri temporarily deployed to integration-postgres April 18, 2026 00:37 — with GitHub Actions Inactive

yuneng-berri had a problem deploying to integration-postgres April 18, 2026 00:37 — with GitHub Actions Error

yuneng-berri temporarily deployed to integration-postgres April 18, 2026 00:37 — with GitHub Actions Inactive

Merge remote-tracking branch 'origin/litellm_internal_staging' into l…

ecf65f5

…itellm_yj_apr17

yuneng-berri temporarily deployed to integration-postgres April 18, 2026 16:17 — with GitHub Actions Inactive

greptile-apps Bot reviewed Apr 18, 2026

View reviewed changes

yuneng-berri temporarily deployed to integration-postgres April 18, 2026 18:00 — with GitHub Actions Inactive

yuneng-berri had a problem deploying to integration-postgres April 18, 2026 18:00 — with GitHub Actions Error

yuneng-berri temporarily deployed to integration-postgres April 18, 2026 18:00 — with GitHub Actions Inactive

Merge remote-tracking branch 'origin/litellm_internal_staging' into l…

f483f1e

…itellm_yj_apr17

yuneng-berri had a problem deploying to integration-postgres April 18, 2026 20:19 — with GitHub Actions Error

yuneng-berri temporarily deployed to integration-postgres April 18, 2026 20:19 — with GitHub Actions Inactive

yuneng-berri requested a review from ryan-crabbe-berri April 18, 2026 21:40

ryan-crabbe-berri approved these changes Apr 18, 2026

View reviewed changes

yuneng-berri merged commit e690519 into litellm_internal_staging Apr 18, 2026
97 of 99 checks passed

yuneng-berri deleted the litellm_yj_apr17 branch April 18, 2026 21:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Infra] Merge dev branch#25983

[Infra] Merge dev branch#25983
yuneng-berri merged 48 commits intolitellm_internal_stagingfrom
litellm_yj_apr17

yuneng-berri commented Apr 17, 2026

Uh oh!

yuneng-berri commented Apr 18, 2026

Uh oh!

greptile-apps Bot Apr 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

yuneng-berri commented Apr 17, 2026

Relevant issues

Pre-Submission checklist

Delays in PR merge?

CI (LiteLLM team)

Screenshots / Proof of Fix

Type

Changes

Uh oh!

yuneng-berri commented Apr 18, 2026

Uh oh!

greptile-apps Bot Apr 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants