Skip to content

[Infra] Merge dev branch#25983

Merged
yuneng-berri merged 48 commits intolitellm_internal_stagingfrom
litellm_yj_apr17
Apr 18, 2026
Merged

[Infra] Merge dev branch#25983
yuneng-berri merged 48 commits intolitellm_internal_stagingfrom
litellm_yj_apr17

Conversation

@yuneng-berri
Copy link
Copy Markdown
Collaborator

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Screenshots / Proof of Fix

Type

🚄 Infrastructure

Changes

stuxf added 30 commits April 16, 2026 21:06
Validate org admin role against all requested organizations instead
of returning on first match. Scope team list queries to the caller's
permitted organizations when filtering by user_id.
Verify that an org admin of org-A cannot operate on org-B, and that
an admin of both orgs can operate on both.
…g consistency

Read guardrail control flags (disable_global_guardrails, opted_out_global_guardrails)
from admin-configured key metadata instead of the request body. This ensures
callers cannot override admin security policies.

Fix tag-based routing to enforce strict tag checks regardless of whether the
request includes tags. Fix budget limiter to use the same dynamic metadata
key resolution as the tag router for consistent tag extraction.
…olution

Extract _get_admin_metadata() in CustomGuardrail to deduplicate metadata
lookup. Hoist tag resolution above the deployment loop in budget limiter.
Update stale comment in tag routing.
…ging from dynamic params

Include user_api_key_team_metadata alongside user_api_key_metadata in
_get_admin_metadata() so team-level guardrail settings are respected.
Key-level settings take precedence over team-level.

Remove turn_off_message_logging from _supported_callback_params so it
cannot be set via request metadata. Admin controls logging globally
or via key/team configuration.

Update tests to verify user-injected guardrail flags are ignored while
admin-configured flags are respected.
…g removal

Verify turn_off_message_logging is no longer extracted from request
kwargs since it is now admin-only.
…lied URLs

Add validate_url() utility that resolves DNS once, validates all IPs
against private network ranges, and rewrites the URL to connect to the
validated IP directly. Prevents DNS rebinding by pinning to the resolved
IP. Disable follow_redirects to prevent redirect-based SSRF bypasses.

Applied to all user-supplied URL entry points:
- Image URL fetching in chat completions
- Token counter image dimension fetching
- RAG file ingestion
- MCP OpenAPI spec loading
Add safe_get() and async_safe_get() helpers that validate each
redirect hop before following. For HTTPS, rely on TLS certificate
binding instead of URL rewriting. Simplify call sites to use the
new helpers.
…ests

Check URL scheme before calling safe_get in token counter to avoid
unnecessary DNS resolution on base64-encoded image data.

Add 14 unit tests for validate_url covering blocked networks, scheme
validation, URL rewriting, and DNS failure handling.
…sabled

_is_blocked_ip now returns True (blocked) for unparseable addresses
instead of False (allowed). HTTPS URLs are rewritten to validated IPs
when ssl_verify is disabled, closing the DNS rebinding window that
exists without TLS certificate binding.
Read Location header directly instead of response.next_request (which
is None when follow_redirects=False). Resolve relative redirect URLs
with httpx.URL.join(). Remove unused imports.
Pass follow_redirects through in HTTPHandler.get() — previously the
parameter was accepted but never forwarded to the underlying httpx
client, making sync redirect protection ineffective.

Include port in Host header when non-default (e.g. example.com:8080).

Fix redirect loop to read Location header directly instead of
response.next_request (which is None when follow_redirects=False).
SDK core modules (image_handling, token_counter) should not import
from litellm.proxy. Move url_utils.py to litellm_core_utils/ so
bare SDK installs without proxy dependencies still work.
Greptile P1: six tests in test_url_utils.py performed real DNS
lookups to example.com, violating the tests/test_litellm/ mock-only
rule and risking offline CI failures. Add mock_dns_public and
mock_dns_failure fixtures that monkeypatch socket.getaddrinfo on
the url_utils module.

Greptile P2: move 'import httpx' from inside _extract_redirect_url
to module-level imports per CLAUDE.md style guide.
Greptile P2: _get_admin_metadata used 'litellm_metadata or metadata',
meaning a caller sending a non-empty litellm_metadata would shadow
admin config the proxy had injected into data['metadata']. Admin
exemptions would be silently ignored.

Check both keys and prefer whichever contains admin fields. Add
regression test covering the shadowing scenario.
Two litellm-level flags wired through litellm_settings YAML:

- user_url_validation (bool, default True): master switch. When False,
  safe_get/async_safe_get bypass validation and call client.get
  directly.
- user_url_allowed_hosts (List[str], default []): per-host allowlist.
  Entries are 'host' (matches any port) or 'host:port' (port-specific).
  Matched hosts skip the blocked-networks check but still resolve DNS
  and still rewrite HTTP to the validated IP, preserving rebinding
  protection within the permitted name.

Also fix an existing Host header bug: IPv6 literals (e.g. 2001:db8::1)
were emitted unbracketed, producing ambiguous values like
'2001:db8::1:8080' per RFC 7230 5.4. Bracket them consistently in
_format_host_header.
Expand the pre-call metadata strip to also remove user_api_key_metadata
and user_api_key_team_metadata. The proxy writes these fields into
data[_metadata_variable_name] with admin-authoritative values, but only
into that one metadata key; the caller's value in the OTHER metadata
key (metadata vs litellm_metadata) would otherwise persist and be
picked up by _get_admin_metadata, letting a caller supply their own
'admin' config to disable guardrails, opt out of global policies, etc.

VERIA-28 (High): Security Policy and Guardrail Bypass via Unsanitized
Request Metadata.

Add regression test at the proxy boundary verifying the strip, and
extend the guardrail test to cover the post-strip admin-config path.
…icast and Azure Wire Server

Replace the hand-maintained _BLOCKED_NETWORKS CIDR list with a
default-deny check based on ipaddress.is_global (RFC 6890 semantics,
implemented by Python's stdlib). Also reject multicast explicitly —
is_global returns True for public multicast allocations, which are
not legitimate HTTP targets.

Only globally-routable cloud-fabric IPs need explicit exceptions; the
canonical list contains one entry today: Azure Wire Server
(168.63.129.16), an in-fabric service reachable from any Azure VM.

Coverage delta picked up automatically via is_global:
- Alibaba Cloud metadata (100.100.100.200, CGNAT)
- Legacy Oracle metadata (192.0.0.192, IETF Protocol Assignments)
- IPv4 documentation ranges (192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24)
- IPv4 reserved/future-use (240.0.0.0/4) and broadcast
- IPv6 documentation (2001:db8::/32)

Also fix two issues Greptile flagged:
- HTTP relative-redirect hops lost the original hostname because
  _extract_redirect_url joined the Location against the rewritten
  (IP-based) URL. Join against the pre-rewrite URL so the next hop's
  Host header keeps the original hostname.
- Two unit tests performed real socket.getaddrinfo('localhost')
  calls. Monkeypatch them.

Add coverage tests for every cloud-metadata IP from the canonical
SSRF dictionary (AWS/GCP/Azure/Alibaba/Oracle/DO/OpenStack) plus the
new multicast/reserved/documentation/broadcast ranges, and a
regression test for redirect-hostname preservation.
…ent_tags

VERIA-28 (High) follow-up: tag-based routing and tag budget enforcement
read metadata.tags directly from the request, letting an attacker reach
restricted tag-routed deployments or misattribute spend to a victim
team's tag.

Strip metadata.tags (and litellm_metadata.tags) at the pre-call boundary
unless the caller's key or team metadata opts in with
allow_client_tags=True. Default-deny: existing clients that need to pass
routing tags must have the flag set explicitly on their key or team.

Preserves the tag-routing feature for admins who trust their callers;
closes the injection path for everyone else.
Two pre-existing tests codified the pre-fix behavior where any caller-
supplied metadata.tags would flow through to spend logs and routing:

- test_add_key_or_team_level_spend_logs_metadata_to_request exercised
  the request/key/team tag merge. Set allow_client_tags=True on the key
  metadata so the merge path is still tested under the new regime.

- test_create_file_with_nested_litellm_metadata asserted that
  litellm_metadata[tags] form-data propagated to the handler. Drop the
  tag field; the test still proves nested form-parser correctness via
  spend_logs_metadata and environment.
Silent strip is the worst debug UX: admin's client sends routing tags,
they disappear, admin can't figure out why. Emit a warning naming the
metadata key the tags came from and telling the admin exactly which
flag to set if this is intentional.
test_add_litellm_data_to_request_duplicate_tags tests the request/key
tag merge when tags overlap. The merge requires caller-supplied tags to
flow through — set allow_client_tags=True on the key so the merge path
stays testable under the new default-deny regime.
Veria AI caught a bypass: metadata can arrive as a JSON string via
multipart/form-data or extra_body, and the existing strip block ran
before the string-to-dict parse. The isinstance(_user_meta, dict)
guard returned False on the string, the strip was skipped, and then
the parse turned the string into a dict — leaving user_api_key_metadata
/ user_api_key_team_metadata / _pipeline_managed_guardrails / tags
intact in the parsed dict.

Move the strip to run AFTER the parse and BEFORE the merge of
litellm_metadata into data[_metadata_variable_name], closing the bypass
for both raw-dict and string-encoded payloads.

Regression test: test_add_litellm_data_to_request_strips_string_encoded_admin_injection.
…keys

Per VERIA-28's secondary recommendation. The existing check only gated
metadata.guardrails. User-supplied values for disable_global_guardrails
(plural and the original singular typo variant) and opted_out_global_guardrails
are already silently ignored by _get_admin_metadata at read time, but the
silent-ignore makes diagnosis confusing and relies on one specific read
site catching them.

Reject at auth time with a 403 when any of:
- guardrails list (existing)
- disable_global_guardrails (new)
- disable_global_guardrail (new — historical singular-key variant)
- opted_out_global_guardrails (new)

are present in metadata, litellm_metadata, or at the request root, and the
caller's team lacks can_modify_guardrails. Defense in depth: the strip at
the pre-call layer still runs; this check fails loudly one layer earlier
so operators see an explicit 403 rather than a silent-ignore.
Close three variant bypasses adjacent to VERIA-28 found during post-fix
variant audit:

1. _guardrail_modification_check had the same isinstance(dict) bypass
   Veria-AI just flagged on the pre-call strip. A caller sending
   `{"metadata": "{…}"}` as a JSON-encoded string (multipart/form-data
   or extra_body) skipped the guard, got parsed to dict downstream, and
   reached guardrail logic with bypass flags intact. Coerce strings via
   safe_json_loads before evaluating.

2. The allow_client_tags strip only covered body metadata.tags and
   litellm_metadata.tags — caller-supplied tags arriving via the
   x-litellm-tags header or root-level data["tags"] bypassed it. Gate
   add_request_tag_to_metadata's result on the same flag.

3. requester_metadata was deepcopied BEFORE the strip, so attacker
   injections (user_api_key_metadata shadows, disallowed tags,
   _pipeline_managed_guardrails) persisted in the snapshot. The PANW
   guardrail (and any future consumer) trusting requester_metadata
   would see forged values. Move the deepcopy to after the strip.

Regression tests added for each.
@yuneng-berri yuneng-berri temporarily deployed to integration-postgres April 18, 2026 00:20 — with GitHub Actions Inactive
@yuneng-berri yuneng-berri temporarily deployed to integration-postgres April 18, 2026 00:20 — with GitHub Actions Inactive
@yuneng-berri yuneng-berri temporarily deployed to integration-postgres April 18, 2026 00:20 — with GitHub Actions Inactive
…itellm_yj_apr17

# Conflicts:
#	litellm/__init__.py
@yuneng-berri yuneng-berri temporarily deployed to integration-postgres April 18, 2026 00:37 — with GitHub Actions Inactive
@yuneng-berri yuneng-berri temporarily deployed to integration-postgres April 18, 2026 00:37 — with GitHub Actions Inactive
@yuneng-berri yuneng-berri temporarily deployed to integration-postgres April 18, 2026 00:37 — with GitHub Actions Inactive
@yuneng-berri
Copy link
Copy Markdown
Collaborator Author

@greptileai please re-review after merging latest litellm_internal_staging

@yuneng-berri yuneng-berri temporarily deployed to integration-postgres April 18, 2026 16:17 — with GitHub Actions Inactive
@yuneng-berri yuneng-berri temporarily deployed to integration-postgres April 18, 2026 16:17 — with GitHub Actions Inactive
@yuneng-berri yuneng-berri temporarily deployed to integration-postgres April 18, 2026 16:17 — with GitHub Actions Inactive
@yuneng-berri yuneng-berri temporarily deployed to integration-postgres April 18, 2026 16:17 — with GitHub Actions Inactive
Comment on lines +133 to +147

Raises:
SSRFError: If the URL scheme is invalid or the hostname resolves
to a private/internal IP address.
"""
parsed = urlparse(url)

if parsed.scheme not in _ALLOWED_SCHEMES:
raise SSRFError(f"URL scheme '{parsed.scheme}' is not allowed")

hostname = parsed.hostname
if not hostname:
raise SSRFError("URL has no hostname")

port = parsed.port
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Blocking DNS call inside async function stalls the event loop

validate_url() calls socket.getaddrinfo(), a blocking POSIX syscall, with no timeout. When async_safe_get() invokes it, it blocks the entire asyncio event loop for the duration of the DNS lookup. On a busy proxy that frequently processes image URLs (via async_convert_url_to_base64), this serialises all concurrent requests for every DNS resolution.

Use asyncio.get_event_loop().run_in_executor (or asyncio.to_thread on Python 3.9+) to run the blocking call off the event loop:

import asyncio

async def _async_validate_url(url: str) -> tuple[str, str]:
    loop = asyncio.get_event_loop()
    # run_in_executor moves the blocking getaddrinfo call to a thread pool
    return await loop.run_in_executor(None, validate_url, url)

Then in async_safe_get, replace validate_url(url) with await _async_validate_url(url). The synchronous safe_get / validate_url path is unaffected.

…y zero bug)

`should_create_missing_views()` had `and result[0]["reltuples"]` which is
falsy when reltuples=0. On a fresh empty PostgreSQL table, CREATE INDEX sets
reltuples=0, causing the guard to return False and skip view creation entirely.
Views like MonthlyGlobalSpendPerKey are never created, and the
/global/spend/logs endpoint returns 500.

Fix: change to `and result[0]["reltuples"] is not None` so reltuples=0
(empty table) and reltuples=-1 (unanalyzed table) both correctly return True.

Also harden test_vertex_ai.py to return None instead of crashing with
JSONDecodeError when the spend-logs endpoint returns a non-JSON 500 response,
and add unit tests covering all three reltuples branches (0, -1, positive).
@yuneng-berri yuneng-berri temporarily deployed to integration-postgres April 18, 2026 18:00 — with GitHub Actions Inactive
@yuneng-berri yuneng-berri temporarily deployed to integration-postgres April 18, 2026 18:00 — with GitHub Actions Inactive
@yuneng-berri yuneng-berri temporarily deployed to integration-postgres April 18, 2026 18:00 — with GitHub Actions Inactive
@yuneng-berri yuneng-berri merged commit e690519 into litellm_internal_staging Apr 18, 2026
97 of 99 checks passed
@yuneng-berri yuneng-berri deleted the litellm_yj_apr17 branch April 18, 2026 21:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants