fix(proxy): add URL validation for user-supplied URLs #25906

Merged
yuneng-berri merged 15 commits into BerriAI:litellm_yj_apr17 from stuxf:fix/ssrf-url-validation
Apr 17, 2026

Collaborator

@stuxf stuxf commented Apr 16, 2026

Relevant issues

Adds SSRF protection for user-supplied URLs across multiple endpoints.

Reopens #25837 (original base litellm_yj_apr15 was deleted); rebased onto litellm_internal_staging. No content changes.

Pre-Submission checklist

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

1. New shared utility: litellm/litellm_core_utils/url_utils.py

  • validate_url(url) — resolves DNS, validates all IPs against private network ranges (RFC1918, link-local, loopback, IMDS, carrier-grade NAT). For HTTP URLs, rewrites the URL to the validated IP to prevent DNS rebinding. For HTTPS, relies on TLS certificate binding.
  • safe_get(client, url) / async_safe_get(client, url) — fetch with SSRF protection on every redirect hop. Each redirect target is validated before the request is made. Caps at 10 redirects.
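
The resolve-then-check step described for validate_url can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: `resolve_and_check`, `BlockedURLError`, and the CIDR list are hypothetical names and values chosen for the example, and URL rewriting and redirect handling are left out.

```python
import ipaddress
import socket
from typing import List
from urllib.parse import urlsplit

# Illustrative CIDR list (assumption): the PR's actual list is not shown here.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918
        "127.0.0.0/8", "::1/128",                          # loopback
        "169.254.0.0/16", "fe80::/10",                     # link-local (covers 169.254.169.254)
        "100.64.0.0/10",                                   # carrier-grade NAT
    )
]


class BlockedURLError(ValueError):
    """Hypothetical stand-in for the PR's SSRFError."""


def is_blocked_ip(raw: str) -> bool:
    """Return True if the address is unparseable or inside a blocked network."""
    try:
        ip = ipaddress.ip_address(raw)
    except ValueError:
        return True  # fail closed on anything that is not a valid IP
    return any(ip in net for net in BLOCKED_NETWORKS)


def resolve_and_check(url: str) -> List[str]:
    """Resolve the URL's host and reject it if ANY resolved IP is blocked."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise BlockedURLError(f"scheme or host not allowed: {url!r}")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        raise BlockedURLError(f"DNS resolution failed: {parts.hostname!r}") from exc
    ips = [str(info[4][0]) for info in infos]
    if not ips or any(is_blocked_ip(ip) for ip in ips):
        raise BlockedURLError(f"blocked address for host {parts.hostname!r}")
    return ips
```

Checking every resolved address, rather than only the first, matters because a hostname can return a mix of public and private records.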

2. Applied to user-supplied URL entry points

  • Image URL fetching (image_handling.py) — convert_url_to_base64 and async_convert_url_to_base64 now use safe_get/async_safe_get
  • Token counter (token_counter.py) — image dimension fetching uses safe_get
  • RAG ingestion (base_ingestion.py) — file URL fetching uses async_safe_get
  • MCP OpenAPI spec loading (openapi_to_mcp_generator.py) — spec URL fetching uses async_safe_get

Protection against three SSRF vectors

  • Direct private IP — blocked by IP validation
  • DNS rebinding — blocked by resolve-and-rewrite (HTTP) or TLS certificate binding (HTTPS)
  • Redirect chain to private IP — blocked by per-hop validation in safe_get

stuxf added 11 commits April 16, 2026 21:07
…lied URLs

Add validate_url() utility that resolves DNS once, validates all IPs
against private network ranges, and rewrites the URL to connect to the
validated IP directly. Prevents DNS rebinding by pinning to the resolved
IP. Disable follow_redirects to prevent redirect-based SSRF bypasses.

Applied to all user-supplied URL entry points:
- Image URL fetching in chat completions
- Token counter image dimension fetching
- RAG file ingestion
- MCP OpenAPI spec loading
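
A rough sketch of the "rewrite the URL to the validated IP" idea from the commit message above, assuming the IP has already been resolved and validated. The helper name `pin_http_url_to_ip` is hypothetical, and the standard library's urllib.parse is used here rather than whatever the PR uses internally.

```python
import ipaddress
from typing import Dict, Tuple
from urllib.parse import urlsplit, urlunsplit


def pin_http_url_to_ip(url: str, validated_ip: str) -> Tuple[str, Dict[str, str]]:
    """Return (url_with_ip_as_host, headers_carrying_the_original_host)."""
    parts = urlsplit(url)
    addr = ipaddress.ip_address(validated_ip)
    # IPv6 literals must be bracketed inside a URL authority.
    ip_host = f"[{addr}]" if addr.version == 6 else str(addr)
    netloc = ip_host if parts.port is None else f"{ip_host}:{parts.port}"
    pinned = urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    # Keep the original host (and explicit port) for the Host header,
    # dropping any userinfo that preceded it.
    original_host = parts.netloc.rpartition("@")[2]
    return pinned, {"Host": original_host}
```

Because the request is sent to the already-validated address, a second DNS lookup at connection time cannot swap in a different, private IP.
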
Add safe_get() and async_safe_get() helpers that validate each
redirect hop before following. For HTTPS, rely on TLS certificate
binding instead of URL rewriting. Simplify call sites to use the
new helpers.
…ests

Check URL scheme before calling safe_get in token counter to avoid
unnecessary DNS resolution on base64-encoded image data.

Add 14 unit tests for validate_url covering blocked networks, scheme
validation, URL rewriting, and DNS failure handling.
…sabled

_is_blocked_ip now returns True (blocked) for unparseable addresses
instead of False (allowed). HTTPS URLs are rewritten to validated IPs
when ssl_verify is disabled, closing the DNS rebinding window that
exists without TLS certificate binding.
Read Location header directly instead of response.next_request (which
is None when follow_redirects=False). Resolve relative redirect URLs
with httpx.URL.join(). Remove unused imports.
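
The manual redirect loop described in these commits might look roughly like the sketch below. It is an illustration only: `fetch` and `validate` are hypothetical callables standing in for an HTTP client with redirect following disabled and for the URL validator, header keys are assumed to be lowercase, and urllib.parse.urljoin is used in place of httpx.URL.join so the example depends only on the standard library.

```python
from typing import Callable, Dict, Tuple
from urllib.parse import urljoin

MAX_REDIRECTS = 10
REDIRECT_STATUSES = frozenset({301, 302, 303, 307, 308})

# (status_code, headers) is a stand-in for an HTTP response object.
Response = Tuple[int, Dict[str, str]]


def get_with_checked_redirects(
    url: str,
    fetch: Callable[[str], Response],
    validate: Callable[[str], None],
) -> Tuple[str, int]:
    """Follow redirects manually, validating every hop before requesting it."""
    for _ in range(MAX_REDIRECTS + 1):
        validate(url)  # raises before any request is sent to a blocked target
        status, headers = fetch(url)  # fetch must NOT follow redirects itself
        location = headers.get("location")
        if status not in REDIRECT_STATUSES or location is None:
            return url, status
        # Relative Location values are resolved against the current URL.
        url = urljoin(url, location)
    raise RuntimeError(f"more than {MAX_REDIRECTS} redirects")
```

Reading the Location header directly keeps the loop independent of client conveniences that are only populated when the client follows redirects on its own.
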
Pass follow_redirects through in HTTPHandler.get() — previously the
parameter was accepted but never forwarded to the underlying httpx
client, making sync redirect protection ineffective.

Include port in Host header when non-default (e.g. example.com:8080).

Fix redirect loop to read Location header directly instead of
response.next_request (which is None when follow_redirects=False).
SDK core modules (image_handling, token_counter) should not import
from litellm.proxy. Move url_utils.py to litellm_core_utils/ so
bare SDK installs without proxy dependencies still work.

Contributor

greptile-apps Bot commented Apr 16, 2026

Greptile Summary

This PR adds SSRF protection to all user-supplied URL entry points (image fetching, token counting, RAG ingestion, MCP OpenAPI spec loading) via a new url_utils.py module. The implementation uses DNS resolve-and-rewrite for HTTP (preventing rebinding), hop-by-hop redirect validation, and TLS certificate binding for HTTPS. Admin escape hatches (user_url_validation toggle and user_url_allowed_hosts allowlist) are wired through litellm.__init__. The Azure Wire Server (168.63.129.16) is explicitly added to the cloud-metadata exception list, and the RFC 6890 ip.is_global check covers the remaining private ranges. Previous review concerns have been addressed.

Confidence Score: 5/5

Safe to merge — SSRF protection is correctly implemented with no active bugs in current callers.

All previously raised concerns (inline httpx import, Azure Wire Server gap, no opt-out for internal hosts) have been addressed. Remaining findings are P2: the async retry loop silently swallows SSRFError without logging (sync path does log), and the **kwargs pass-through creates a latent TypeError risk for future callers using HTTPHandler. Neither affects correctness today.

litellm/litellm_core_utils/prompt_templates/image_handling.py (silent SSRFError in async retry loop)

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/litellm_core_utils/url_utils.py | New SSRF utility: DNS resolve-and-rewrite for HTTP, TLS binding for HTTPS, RFC 6890 is_global check plus Azure Wire Server exception, per-hop redirect validation, admin allowlist and master switch. |
| litellm/litellm_core_utils/prompt_templates/image_handling.py | Replaces bare client.get() with safe_get/async_safe_get; async path silently swallows SSRFError across all 3 retries with no log entry, while sync path does log it. |
| litellm/litellm_core_utils/token_counter.py | get_image_dimensions now guards URL fetch behind startswith check and uses safe_get; SSRFError is silently swallowed and falls through to base64 decode (same behaviour as prior network failures). |
| litellm/rag/ingestion/base_ingestion.py | Single-line swap of client.get() to async_safe_get(); SSRFError propagates to caller unchanged, which is acceptable. |
| litellm/proxy/_experimental/mcp_server/openapi_to_mcp_generator.py | Single-line swap to async_safe_get() for HTTP spec loading; SSRF validation applied before spec fetch. |
| tests/test_litellm/litellm_core_utils/test_url_utils.py | Comprehensive mocked unit tests covering blocked IPs, DNS failure, allowlist, redirect hostname preservation, and master switch; no real network calls. |
| tests/test_litellm/litellm_core_utils/test_image_handling.py | Uses autouse fixture to bypass SSRF in image handling tests; existing size-limit and streaming tests preserved with no weakened assertions. |
| tests/mcp_tests/test_openapi_spec_path_url.py | Mocked async handler tests for URL and local-file spec loading; correctly monkeypatches async_safe_get to bypass DNS in test environment. |
| litellm/__init__.py | Adds user_url_validation (bool, default True) and user_url_allowed_hosts (List[str], default []) globals for admin opt-out. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User-supplied URL] --> B{user_url_validation\nenabled?}
    B -- No --> C[client.get with follow_redirects=True]
    B -- Yes --> D[validate_url]
    D --> E{Scheme\nhttp/https?}
    E -- No --> F[SSRFError: scheme not allowed]
    E -- Yes --> G[socket.getaddrinfo]
    G --> H{DNS\nresolution OK?}
    H -- No --> I[SSRFError: DNS failed]
    H -- Yes --> J{Host in\nallowlist?}
    J -- No --> K{All IPs\nglobal + not\ncloud-fabric?}
    K -- No --> L[SSRFError: blocked address]
    K -- Yes --> M{HTTPS +\nssl_verify=True?}
    J -- Yes --> M
    M -- Yes --> N[Return original URL\nTLS pins hostname]
    M -- No --> O[Rewrite URL to\nvalidated IP]
    N --> P[client.get\nfollow_redirects=False]
    O --> P
    P --> Q{Response\nis redirect?}
    Q -- No --> R[Return response]
    Q -- Yes --> S{Redirect\ncount < 10?}
    S -- No --> T[SSRFError: too many redirects]
    S -- Yes --> U[Extract Location header\nresolve vs original hostname]
    U --> D

Reviews (5): Last reviewed commit: "style: use 'is not None' for port check ..."

Greptile P1: six tests in test_url_utils.py performed real DNS
lookups to example.com, violating the tests/test_litellm/ mock-only
rule and risking offline CI failures. Add mock_dns_public and
mock_dns_failure fixtures that monkeypatch socket.getaddrinfo on
the url_utils module.

Greptile P2: move 'import httpx' from inside _extract_redirect_url
to module-level imports per CLAUDE.md style guide.
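
A minimal sketch of how DNS can be mocked so that tests never touch the network, in the spirit of the mock_dns_public and mock_dns_failure fixtures mentioned above. The fixtures themselves are not shown in this thread, so everything here is an assumption: unittest.mock is used in place of pytest's monkeypatch, `resolve` is a hypothetical stand-in for the code under test, and 93.184.216.34 is an arbitrary public address chosen for illustration.

```python
import socket
from typing import List
from unittest import mock


def fake_public_dns(host, port, *args, **kwargs):
    """Pretend every hostname resolves to one fixed public IPv4 address."""
    return [(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_TCP, "", ("93.184.216.34", port))]


def fake_dns_failure(host, port, *args, **kwargs):
    """Pretend the resolver cannot find the hostname."""
    raise socket.gaierror(socket.EAI_NONAME, "Name or service not known")


def resolve(host: str, port: int = 80) -> List[str]:
    """Hypothetical code under test: returns the resolved addresses."""
    return [info[4][0] for info in socket.getaddrinfo(host, port)]


def test_public_dns_is_mocked():
    with mock.patch("socket.getaddrinfo", side_effect=fake_public_dns):
        assert resolve("example.com") == ["93.184.216.34"]


def test_dns_failure_is_mocked():
    with mock.patch("socket.getaddrinfo", side_effect=fake_dns_failure):
        try:
            resolve("example.com")
        except socket.gaierror:
            return
        raise AssertionError("expected socket.gaierror")
```
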
Two litellm-level flags wired through litellm_settings YAML:

- user_url_validation (bool, default True): master switch. When False,
  safe_get/async_safe_get bypass validation and call client.get
  directly.
- user_url_allowed_hosts (List[str], default []): per-host allowlist.
  Entries are 'host' (matches any port) or 'host:port' (port-specific).
  Matched hosts skip the blocked-networks check but still resolve DNS
  and still rewrite HTTP to the validated IP, preserving rebinding
  protection within the permitted name.

Also fix an existing Host header bug: IPv6 literals (e.g. 2001:db8::1)
were emitted unbracketed, producing ambiguous values like
'2001:db8::1:8080'. Bracket them consistently in _format_host_header,
as RFC 7230 section 5.4 requires.
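
The allowlist matching rule and the Host header formatting described in this commit can be sketched as below. Both functions are hypothetical illustrations, not the PR's _format_host_header or its allowlist code; in particular, this sketch only handles DNS names and IPv4 literals as allowlist entries.

```python
import ipaddress
from typing import Iterable, Optional


def format_host_header(host: str, port: Optional[int], scheme: str) -> str:
    """Build a Host header value, bracketing IPv6 literals."""
    try:
        if ipaddress.ip_address(host).version == 6:
            host = f"[{host}]"
    except ValueError:
        pass  # a DNS name, not an IP literal
    default_port = 443 if scheme == "https" else 80
    if port is not None and port != default_port:
        return f"{host}:{port}"
    return host


def host_is_allowed(host: str, port: int, allowed: Iterable[str]) -> bool:
    """'host' entries match any port; 'host:port' entries match that port only."""
    host = host.lower()
    for entry in allowed:
        entry = entry.lower()
        if entry == host or entry == f"{host}:{port}":
            return True
    return False
```

With brackets, `[2001:db8::1]:8080` is unambiguous, whereas the unbracketed form cannot be told apart from a longer IPv6 address.
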
…icast and Azure Wire Server

Replace the hand-maintained _BLOCKED_NETWORKS CIDR list with a
default-deny check based on ipaddress.is_global (RFC 6890 semantics,
implemented by Python's stdlib). Also reject multicast explicitly —
is_global returns True for public multicast allocations, which are
not legitimate HTTP targets.

Only globally-routable cloud-fabric IPs need explicit exceptions; the
canonical list contains one entry today: Azure Wire Server
(168.63.129.16), an in-fabric service reachable from any Azure VM.

Coverage delta picked up automatically via is_global:
- Alibaba Cloud metadata (100.100.100.200, CGNAT)
- Legacy Oracle metadata (192.0.0.192, IETF Protocol Assignments)
- IPv4 documentation ranges (192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24)
- IPv4 reserved/future-use (240.0.0.0/4) and broadcast
- IPv6 documentation (2001:db8::/32)
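
The default-deny check described above can be approximated with the standard library as follows. This is a sketch, not the PR's _is_blocked_ip: the function and constant names are assumptions, and exactly which ranges is_global covers depends on the Python version in use.

```python
import ipaddress

# Globally routable addresses that are still in-fabric cloud services.
# The commit message names a single entry: Azure Wire Server.
CLOUD_FABRIC_EXCEPTIONS = frozenset({ipaddress.ip_address("168.63.129.16")})


def is_blocked_ip(raw: str) -> bool:
    """Default-deny: allow only global, non-multicast, non-fabric addresses."""
    try:
        ip = ipaddress.ip_address(raw)
    except ValueError:
        return True  # unparseable input is blocked, not allowed
    if ip.is_multicast:
        return True  # is_global alone does not exclude multicast
    if ip in CLOUD_FABRIC_EXCEPTIONS:
        return True
    return not ip.is_global
```

Inverting the check this way means a newly reserved range is blocked as soon as the standard library knows about it, without editing a hand-maintained CIDR list.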

Also fix two issues Greptile flagged:
- HTTP relative-redirect hops lost the original hostname because
  _extract_redirect_url joined the Location against the rewritten
  (IP-based) URL. Join against the pre-rewrite URL so the next hop's
  Host header keeps the original hostname.
- Two unit tests performed real socket.getaddrinfo('localhost')
  calls. Monkeypatch them.
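
The relative-redirect issue in the first item can be reproduced with the standard library alone. The URLs and IP below are made up for illustration, and urllib.parse.urljoin stands in for the join the PR performs.

```python
from urllib.parse import urljoin

original_url = "http://example.com/a/start"   # what the user supplied
pinned_url = "http://93.184.216.34/a/start"   # after rewriting to the validated IP
location = "/next"                            # relative Location header from the server

# Joining against the rewritten URL loses the hostname for the next hop.
wrong = urljoin(pinned_url, location)

# Joining against the pre-rewrite URL keeps it, so the next hop can be
# re-validated by name and sent with the original Host header.
right = urljoin(original_url, location)
```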

Add coverage tests for every cloud-metadata IP from the canonical
SSRF dictionary (AWS/GCP/Azure/Alibaba/Oracle/DO/OpenStack) plus the
new multicast/reserved/documentation/broadcast ranges, and a
regression test for redirect-hostname preservation.
@yuneng-berri yuneng-berri changed the base branch from litellm_internal_staging to litellm_yj_apr17 April 17, 2026 19:18
@yuneng-berri yuneng-berri merged commit 7c66edb into BerriAI:litellm_yj_apr17 Apr 17, 2026
42 of 44 checks passed