[Infra] Merge dev branch #25983
Conversation
Validate org admin role against all requested organizations instead of returning on first match. Scope team list queries to the caller's permitted organizations when filtering by user_id.
Verify that an org admin of org-A cannot operate on org-B, and that an admin of both orgs can operate on both.
…g consistency Read guardrail control flags (disable_global_guardrails, opted_out_global_guardrails) from admin-configured key metadata instead of the request body. This ensures callers cannot override admin security policies. Fix tag-based routing to enforce strict tag checks regardless of whether the request includes tags. Fix budget limiter to use the same dynamic metadata key resolution as the tag router for consistent tag extraction.
…olution Extract _get_admin_metadata() in CustomGuardrail to deduplicate metadata lookup. Hoist tag resolution above the deployment loop in budget limiter. Update stale comment in tag routing.
…ging from dynamic params Include user_api_key_team_metadata alongside user_api_key_metadata in _get_admin_metadata() so team-level guardrail settings are respected. Key-level settings take precedence over team-level. Remove turn_off_message_logging from _supported_callback_params so it cannot be set via request metadata. Admin controls logging globally or via key/team configuration. Update tests to verify user-injected guardrail flags are ignored while admin-configured flags are respected.
…g removal Verify turn_off_message_logging is no longer extracted from request kwargs since it is now admin-only.
…lied URLs Add validate_url() utility that resolves DNS once, validates all IPs against private network ranges, and rewrites the URL to connect to the validated IP directly. Prevents DNS rebinding by pinning to the resolved IP. Disable follow_redirects to prevent redirect-based SSRF bypasses. Applied to all user-supplied URL entry points: - Image URL fetching in chat completions - Token counter image dimension fetching - RAG file ingestion - MCP OpenAPI spec loading
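A minimal sketch of the DNS-pinning approach described above. The `validate_url` / `SSRFError` names follow the PR description, but the body here is illustrative, not the shipped implementation (the real version also handles the allowlist, IPv6 bracketing, and HTTPS/TLS-binding cases covered in later commits):

```python
import ipaddress
import socket
from urllib.parse import urlparse, urlunparse


class SSRFError(Exception):
    pass


def validate_url(url: str) -> tuple[str, str]:
    """Resolve the hostname once, reject non-global IPs, and return
    (rewritten_url, original_hostname) pinned to a validated IP."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise SSRFError(f"URL scheme '{parsed.scheme}' is not allowed")
    hostname = parsed.hostname
    if not hostname:
        raise SSRFError("URL has no hostname")
    try:
        infos = socket.getaddrinfo(
            hostname, parsed.port or 80, proto=socket.IPPROTO_TCP
        )
    except socket.gaierror as e:
        raise SSRFError(f"DNS resolution failed for {hostname}") from e
    ips = {info[4][0] for info in infos}
    for ip in ips:
        if not ipaddress.ip_address(ip).is_global:
            raise SSRFError(f"{hostname} resolves to blocked address {ip}")
    # Pin the request to one validated IP so a later re-resolution
    # (DNS rebinding) cannot steer the connection to an internal host.
    pinned = next(iter(ips))
    netloc = pinned if parsed.port is None else f"{pinned}:{parsed.port}"
    rewritten = urlunparse(parsed._replace(netloc=netloc))
    return rewritten, hostname
```

The caller keeps the returned original hostname for the Host header so the target server still sees the name it expects.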
Add safe_get() and async_safe_get() helpers that validate each redirect hop before following. For HTTPS, rely on TLS certificate binding instead of URL rewriting. Simplify call sites to use the new helpers.
…ests Check URL scheme before calling safe_get in token counter to avoid unnecessary DNS resolution on base64-encoded image data. Add 14 unit tests for validate_url covering blocked networks, scheme validation, URL rewriting, and DNS failure handling.
…sabled _is_blocked_ip now returns True (blocked) for unparseable addresses instead of False (allowed). HTTPS URLs are rewritten to validated IPs when ssl_verify is disabled, closing the DNS rebinding window that exists without TLS certificate binding.
Read Location header directly instead of response.next_request (which is None when follow_redirects=False). Resolve relative redirect URLs with httpx.URL.join(). Remove unused imports.
Pass follow_redirects through in HTTPHandler.get() — previously the parameter was accepted but never forwarded to the underlying httpx client, making sync redirect protection ineffective. Include port in Host header when non-default (e.g. example.com:8080). Fix redirect loop to read Location header directly instead of response.next_request (which is None when follow_redirects=False).
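A sketch of the redirect-extraction fix: with follow_redirects=False, httpx leaves `response.next_request` as None, so the Location header is read directly and relative targets are resolved against the current URL. The PR uses `httpx.URL.join()`; the stdlib `urljoin` shown here is the equivalent for illustration, and the function name is hypothetical:

```python
from urllib.parse import urljoin


def extract_redirect_url(status_code: int, headers: dict, base_url: str):
    """Return the absolute next-hop URL for a redirect response, or None.

    response.next_request is None when follow_redirects=False, so the
    Location header must be read and resolved manually.
    """
    if status_code not in (301, 302, 303, 307, 308):
        return None
    location = headers.get("location") or headers.get("Location")
    if not location:
        return None
    # Relative Location values (e.g. "/v2/resource") are resolved
    # against the URL of the response that issued the redirect.
    return urljoin(base_url, location)
```

Each hop returned by a helper like this is then re-validated before being followed, which is what makes the manual redirect loop safe.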
SDK core modules (image_handling, token_counter) should not import from litellm.proxy. Move url_utils.py to litellm_core_utils/ so bare SDK installs without proxy dependencies still work.
Greptile P1: six tests in test_url_utils.py performed real DNS lookups to example.com, violating the tests/test_litellm/ mock-only rule and risking offline CI failures. Add mock_dns_public and mock_dns_failure fixtures that monkeypatch socket.getaddrinfo on the url_utils module. Greptile P2: move 'import httpx' from inside _extract_redirect_url to module-level imports per CLAUDE.md style guide.
Greptile P2: _get_admin_metadata used 'litellm_metadata or metadata', meaning a caller sending a non-empty litellm_metadata would shadow admin config the proxy had injected into data['metadata']. Admin exemptions would be silently ignored. Check both keys and prefer whichever contains admin fields. Add regression test covering the shadowing scenario.
Two litellm-level flags wired through litellm_settings YAML: - user_url_validation (bool, default True): master switch. When False, safe_get/async_safe_get bypass validation and call client.get directly. - user_url_allowed_hosts (List[str], default []): per-host allowlist. Entries are 'host' (matches any port) or 'host:port' (port-specific). Matched hosts skip the blocked-networks check but still resolve DNS and still rewrite HTTP to the validated IP, preserving rebinding protection within the permitted name. Also fix an existing Host header bug: IPv6 literals (e.g. 2001:db8::1) were emitted unbracketed, producing ambiguous values like '2001:db8::1:8080' per RFC 7230 5.4. Bracket them consistently in _format_host_header.
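The Host-header fix described above can be sketched like this (function name follows the PR's `_format_host_header`; the body is illustrative):

```python
DEFAULT_PORTS = {"http": 80, "https": 443}


def format_host_header(scheme: str, host: str, port) -> str:
    """Build a Host header value: bracket IPv6 literals (RFC 7230 §5.4)
    and append the port only when it differs from the scheme default."""
    # A bare colon in the host means an unbracketed IPv6 literal;
    # emitting it raw makes '2001:db8::1:8080' ambiguous.
    if ":" in host and not host.startswith("["):
        host = f"[{host}]"
    if port is not None and port != DEFAULT_PORTS.get(scheme):
        return f"{host}:{port}"
    return host
```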
Expand the pre-call metadata strip to also remove user_api_key_metadata and user_api_key_team_metadata. The proxy writes these fields into data[_metadata_variable_name] with admin-authoritative values, but only into that one metadata key; the caller's value in the OTHER metadata key (metadata vs litellm_metadata) would otherwise persist and be picked up by _get_admin_metadata, letting a caller supply their own 'admin' config to disable guardrails, opt out of global policies, etc. VERIA-28 (High): Security Policy and Guardrail Bypass via Unsanitized Request Metadata. Add regression test at the proxy boundary verifying the strip, and extend the guardrail test to cover the post-strip admin-config path.
…icast and Azure Wire Server
Replace the hand-maintained _BLOCKED_NETWORKS CIDR list with a
default-deny check based on ipaddress.is_global (RFC 6890 semantics,
implemented by Python's stdlib). Also reject multicast explicitly —
is_global returns True for public multicast allocations, which are
not legitimate HTTP targets.
Only globally-routable cloud-fabric IPs need explicit exceptions; the
canonical list contains one entry today: Azure Wire Server
(168.63.129.16), an in-fabric service reachable from any Azure VM.
Coverage delta picked up automatically via is_global:
- Alibaba Cloud metadata (100.100.100.200, CGNAT)
- Legacy Oracle metadata (192.0.0.192, IETF Protocol Assignments)
- IPv4 documentation ranges (192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24)
- IPv4 reserved/future-use (240.0.0.0/4) and broadcast
- IPv6 documentation (2001:db8::/32)
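The default-deny check above reduces to a few lines of stdlib. A minimal sketch (the `is_blocked_ip` name and exception-set shape follow this commit's description):

```python
import ipaddress

# Globally-routable in-fabric services that must stay reachable;
# the canonical list holds one entry today: Azure Wire Server.
_ALLOWED_GLOBAL_EXCEPTIONS = {ipaddress.ip_address("168.63.129.16")}


def is_blocked_ip(ip_str: str) -> bool:
    """Default-deny: block anything not globally routable per
    ipaddress.is_global (RFC 6890 semantics), plus multicast, which
    is_global reports as True for public multicast allocations.
    Unparseable input is blocked rather than allowed."""
    try:
        ip = ipaddress.ip_address(ip_str)
    except ValueError:
        return True
    if ip in _ALLOWED_GLOBAL_EXCEPTIONS:
        return False
    if ip.is_multicast:
        return True
    return not ip.is_global
```

Because CGNAT (100.64.0.0/10), documentation ranges, reserved space, and broadcast are all non-global in the stdlib registry, the hand-maintained CIDR list disappears and new special-purpose allocations are covered automatically as Python tracks the IANA registries.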
Also fix two issues Greptile flagged:
- HTTP relative-redirect hops lost the original hostname because
_extract_redirect_url joined the Location against the rewritten
(IP-based) URL. Join against the pre-rewrite URL so the next hop's
Host header keeps the original hostname.
- Two unit tests performed real socket.getaddrinfo('localhost')
calls. Monkeypatch them.
Add coverage tests for every cloud-metadata IP from the canonical
SSRF dictionary (AWS/GCP/Azure/Alibaba/Oracle/DO/OpenStack) plus the
new multicast/reserved/documentation/broadcast ranges, and a
regression test for redirect-hostname preservation.
…ent_tags VERIA-28 (High) follow-up: tag-based routing and tag budget enforcement read metadata.tags directly from the request, letting an attacker reach restricted tag-routed deployments or misattribute spend to a victim team's tag. Strip metadata.tags (and litellm_metadata.tags) at the pre-call boundary unless the caller's key or team metadata opts in with allow_client_tags=True. Default-deny: existing clients that need to pass routing tags must have the flag set explicitly on their key or team. Preserves the tag-routing feature for admins who trust their callers; closes the injection path for everyone else.
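The strip can be sketched as below. The `allow_client_tags` flag and metadata keys come from this commit; the helper name and exact merge order are illustrative:

```python
def strip_client_tags(data: dict, key_metadata: dict, team_metadata: dict) -> dict:
    """Default-deny: drop caller-supplied routing tags from both
    metadata keys unless the caller's key or team metadata opts in
    with allow_client_tags=True."""
    allowed = bool(
        key_metadata.get("allow_client_tags")
        or team_metadata.get("allow_client_tags")
    )
    if allowed:
        return data
    for md_key in ("metadata", "litellm_metadata"):
        md = data.get(md_key)
        if isinstance(md, dict) and "tags" in md:
            # Removed rather than passed through: tags drive routing
            # and spend attribution, so they are admin-gated.
            md.pop("tags")
    return data
```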
Two pre-existing tests codified the pre-fix behavior where any caller-supplied metadata.tags would flow through to spend logs and routing: - test_add_key_or_team_level_spend_logs_metadata_to_request exercised the request/key/team tag merge. Set allow_client_tags=True on the key metadata so the merge path is still tested under the new regime. - test_create_file_with_nested_litellm_metadata asserted that litellm_metadata[tags] form-data propagated to the handler. Drop the tag field; the test still proves nested form-parser correctness via spend_logs_metadata and environment.
Silent strip is the worst debug UX: admin's client sends routing tags, they disappear, admin can't figure out why. Emit a warning naming the metadata key the tags came from and telling the admin exactly which flag to set if this is intentional.
test_add_litellm_data_to_request_duplicate_tags tests the request/key tag merge when tags overlap. The merge requires caller-supplied tags to flow through — set allow_client_tags=True on the key so the merge path stays testable under the new default-deny regime.
Veria AI caught a bypass: metadata can arrive as a JSON string via multipart/form-data or extra_body, and the existing strip block ran before the string-to-dict parse. The isinstance(_user_meta, dict) guard returned False on the string, the strip was skipped, and then the parse turned the string into a dict — leaving user_api_key_metadata / user_api_key_team_metadata / _pipeline_managed_guardrails / tags intact in the parsed dict. Move the strip to run AFTER the parse and BEFORE the merge of litellm_metadata into data[_metadata_variable_name], closing the bypass for both raw-dict and string-encoded payloads. Regression test: test_add_litellm_data_to_request_strips_string_encoded_admin_injection.
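The parse-then-strip ordering can be sketched as follows (field names follow this PR; the helper is illustrative, and the real code strips more than shown here):

```python
import json

_ADMIN_ONLY_FIELDS = (
    "user_api_key_metadata",
    "user_api_key_team_metadata",
    "_pipeline_managed_guardrails",
)


def coerce_and_strip(metadata):
    """Coerce string-encoded metadata to a dict FIRST, then strip.

    The old order ran the strip before the string-to-dict parse: the
    isinstance(dict) guard returned False on a JSON string, the strip
    was skipped, and the later parse left admin-only fields intact.
    """
    if isinstance(metadata, str):
        try:
            metadata = json.loads(metadata)
        except json.JSONDecodeError:
            return {}
    if not isinstance(metadata, dict):
        return {}
    for field in _ADMIN_ONLY_FIELDS:
        metadata.pop(field, None)
    return metadata
```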
…keys Per VERIA-28's secondary recommendation. The existing check only gated metadata.guardrails. User-supplied values for disable_global_guardrails (plural and the original singular typo variant) and opted_out_global_guardrails are already silently ignored by _get_admin_metadata at read time, but the silent-ignore makes diagnosis confusing and relies on one specific read site catching them. Reject at auth time with a 403 when any of: - guardrails list (existing) - disable_global_guardrails (new) - disable_global_guardrail (new — historical singular-key variant) - opted_out_global_guardrails (new) are present in metadata, litellm_metadata, or at the request root, and the caller's team lacks can_modify_guardrails. Defense in depth: the strip at the pre-call layer still runs; this check fails loudly one layer earlier so operators see an explicit 403 rather than a silent-ignore.
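A sketch of the fail-loud check (key names are taken from this commit; `PermissionError` stands in for the proxy's 403 response):

```python
_BLOCKED_GUARDRAIL_KEYS = (
    "guardrails",
    "disable_global_guardrails",
    "disable_global_guardrail",   # historical singular-key variant
    "opted_out_global_guardrails",
)


def guardrail_modification_check(data: dict, can_modify_guardrails: bool) -> None:
    """Reject at auth time rather than silently ignore at read time.

    Raises (a 403 in the proxy) when any guardrail-control key appears
    in metadata, litellm_metadata, or the request root and the caller's
    team lacks can_modify_guardrails.
    """
    if can_modify_guardrails:
        return
    scopes = [data, data.get("metadata") or {}, data.get("litellm_metadata") or {}]
    for scope in scopes:
        if not isinstance(scope, dict):
            continue
        for key in _BLOCKED_GUARDRAIL_KEYS:
            if key in scope:
                raise PermissionError(
                    f"guardrail control key '{key}' requires can_modify_guardrails"
                )
```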
Close three variant bypasses adjacent to VERIA-28 found during post-fix
variant audit:
1. _guardrail_modification_check had the same isinstance(dict) bypass
Veria-AI just flagged on the pre-call strip. A caller sending
`{"metadata": "{…}"}` as a JSON-encoded string (multipart/form-data
or extra_body) skipped the guard, got parsed to dict downstream, and
reached guardrail logic with bypass flags intact. Coerce strings via
safe_json_loads before evaluating.
2. The allow_client_tags strip only covered body metadata.tags and
litellm_metadata.tags — caller-supplied tags arriving via the
x-litellm-tags header or root-level data["tags"] bypassed it. Gate
add_request_tag_to_metadata's result on the same flag.
3. requester_metadata was deepcopied BEFORE the strip, so attacker
injections (user_api_key_metadata shadows, disallowed tags,
_pipeline_managed_guardrails) persisted in the snapshot. The PANW
guardrail (and any future consumer) trusting requester_metadata
would see forged values. Move the deepcopy to after the strip.
Regression tests added for each.
…itellm_yj_apr17
# Conflicts:
#	litellm/__init__.py
@greptileai please re-review after merging latest litellm_internal_staging
    Raises:
        SSRFError: If the URL scheme is invalid or the hostname resolves
        to a private/internal IP address.
    """
    parsed = urlparse(url)

    if parsed.scheme not in _ALLOWED_SCHEMES:
        raise SSRFError(f"URL scheme '{parsed.scheme}' is not allowed")

    hostname = parsed.hostname
    if not hostname:
        raise SSRFError("URL has no hostname")

    port = parsed.port
Blocking DNS call inside async function stalls the event loop
validate_url() calls socket.getaddrinfo(), a blocking POSIX syscall, with no timeout. When async_safe_get() invokes it, it blocks the entire asyncio event loop for the duration of the DNS lookup. On a busy proxy that frequently processes image URLs (via async_convert_url_to_base64), this serialises all concurrent requests for every DNS resolution.
Use asyncio.get_event_loop().run_in_executor (or asyncio.to_thread on Python 3.9+) to run the blocking call off the event loop:
    import asyncio

    async def _async_validate_url(url: str) -> tuple[str, str]:
        loop = asyncio.get_event_loop()
        # run_in_executor moves the blocking getaddrinfo call to a thread pool
        return await loop.run_in_executor(None, validate_url, url)

Then in async_safe_get, replace validate_url(url) with await _async_validate_url(url). The synchronous safe_get / validate_url path is unaffected.
…y zero bug) `should_create_missing_views()` had `and result[0]["reltuples"]` which is falsy when reltuples=0. On a fresh empty PostgreSQL table, CREATE INDEX sets reltuples=0, causing the guard to return False and skip view creation entirely. Views like MonthlyGlobalSpendPerKey are never created, and the /global/spend/logs endpoint returns 500. Fix: change to `and result[0]["reltuples"] is not None` so reltuples=0 (empty table) and reltuples=-1 (unanalyzed table) both correctly return True. Also harden test_vertex_ai.py to return None instead of crashing with JSONDecodeError when the spend-logs endpoint returns a non-JSON 500 response, and add unit tests covering all three reltuples branches (0, -1, positive).
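The truthiness bug reduces to one comparison. A minimal sketch, assuming `result` is the list of rows returned by the pg_class reltuples query (the real function has more context around this check):

```python
def should_create_missing_views(result) -> bool:
    """reltuples is 0 on a fresh empty table (set by CREATE INDEX) and
    -1 on a never-analyzed table; both must still create the views.
    Only a missing row or NULL value means we cannot decide, so test
    'is not None' rather than truthiness (which rejected 0)."""
    return bool(result) and result[0]["reltuples"] is not None
```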
Merged e690519 into litellm_internal_staging
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- Added at least 1 test in the tests/test_litellm/ directory (a hard requirement - see details)
- Ran make test-unit
- Asked @greptileai for a review and received a Confidence Score of at least 4/5 before requesting a maintainer review
Delays in PR merge?
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Screenshots / Proof of Fix
Type
🚄 Infrastructure
Changes