
merge main#25523

Merged
Sameerlite merged 217 commits into litellm_fix-refusal-status-streaming2 from main
Apr 10, 2026

Conversation

@Sameerlite
Collaborator

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement (see details).
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable, but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

joereyna and others added 30 commits March 31, 2026 16:44
Pin every dependency across all Docker builds so upgrades are intentional.
Verified by building all 3 production images and diffing pip freeze against
known-good v1.83.0-nightly baselines — zero version drift.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pin pyproject.toml deps from PyPI resolution of `pip install litellm[proxy]==1.83.0`
instead of Docker freeze versions. Docker builds (requirements.txt) and PyPI installs
(pyproject.toml) are independent dependency paths. Some packages pinned to 3.9-compatible
versions where latest requires >=3.10.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…on logging (#24592)

asyncio.create_task in CSW.__anext__ scheduled the deferred logging
callback as an independent task that raced with unified_guardrail's
end-of-stream block. For short-stream providers (Vertex AI, Azure,
Anthropic), the logging fired before guardrail_information was written,
causing post_call guardrail entries to be missing from
StandardLoggingPayload.

Move the deferred callback trigger from CSW.__anext__ to
ProxyLogging.async_post_call_streaming_iterator_hook (after the full
streaming pipeline completes). CSW now stores the assembled response
args; the outer consumer fires the callback after all guardrail
end-of-stream blocks finish. Also skip apply_guardrail guardrails in
_run_deferred_stream_guardrails to eliminate duplicate API calls.
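
The ordering fix described above can be sketched as a minimal asyncio pattern (class and function names here are illustrative, not LiteLLM's actual code): the stream wrapper stores the assembled response instead of firing the callback from `__anext__`, and the outer consumer triggers logging only after the end-of-stream guardrail work completes.

```python
import asyncio

# Minimal sketch of the ordering fix: instead of scheduling the logging
# callback as an independent task inside __anext__ (which can race with
# end-of-stream guardrail work), the wrapper stores the assembled
# response and the outer consumer fires the callback afterwards.
class StreamWrapper:
    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self.assembled = []  # stored for the deferred logging callback

    def __aiter__(self):
        return self

    async def __anext__(self):
        try:
            chunk = next(self._chunks)
        except StopIteration:
            raise StopAsyncIteration
        self.assembled.append(chunk)
        return chunk


async def consume(chunks):
    events = []
    wrapper = StreamWrapper(chunks)
    async for _ in wrapper:
        pass
    # end-of-stream guardrail runs first ...
    events.append("guardrail")
    # ... then the deferred logging callback, so guardrail_information
    # is already written by the time logging reads it
    events.append("logging")
    return events, wrapper.assembled


events, assembled = asyncio.run(consume(["a", "b"]))
print(events)     # ['guardrail', 'logging']
print(assembled)  # ['a', 'b']
```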
pytest-asyncio 1.x no longer provides an implicit event loop in sync
fixtures/tests. Make async-dependent fixtures and tests async, and
replace deprecated asyncio.get_event_loop() in tests. Switch
Dockerfile.build_from_pip from Alpine to Debian slim since
pyroscope-io 0.8.x has no musl wheels.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Re-add Codecov coverage reporting to GHA matrix workflow
hf-xet is Apache 2.0 licensed but PyPI metadata doesn't expose the
license string, so the automated checker can't determine it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolve conflicts between pinned versions and main's caret ranges,
keeping exact pins. Add pytest-cov==5.0.0 from main. Regenerate
poetry.lock.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After merging main, pyproject.toml had updated dependency versions but
requirements.txt still had the old pins. The Dockerfile builds litellm
from source (using pyproject.toml) then installs deps from
requirements.txt, so version mismatches cause pip resolution failures.

Updated 21 packages to match: openai, fastuuid, tiktoken,
importlib-metadata, tokenizers, click, jsonschema, fastapi, pyyaml,
uvicorn, boto3, mcp, orjson, polars, apscheduler, fastapi-sso, pyjwt,
python-multipart, azure-identity, rich, aiohttp.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test fix us.anthropic.claude-haiku-4-5-20251001-v1:0

* ignore mypy cache files

---------

Co-authored-by: Ishaan Jaffer <ishaanjaffer0324@gmail.com>
Co-authored-by: David Chen <clfhhc@gmail.com>
CircleCI had stale version pins (e.g. boto3==1.36.0, aioboto3==13.4.0) that
conflict with requirements.txt (boto3==1.42.80, aioboto3==15.5.0), causing
uv resolution failures. Updated all mismatched pins across config.yml and
.circleci/requirements.txt to match requirements.txt as the source of truth.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
boto3==1.42.80 requires botocore>=1.42.80 but aioboto3==15.5.0 (via
aiobotocore==2.25.1) requires botocore<1.40.62. No aioboto3 release
supports botocore 1.42.x yet. pip's lenient resolver handles this for
Docker builds, but uv's strict resolver rejects it in CI. Added
uv-overrides.txt to force botocore to match boto3 during uv installs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
aiobotocore[boto3] pins both boto3<1.40.62 and botocore<1.40.62.
The previous commit only overrode botocore. Added boto3 override too.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
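
An override file for this situation might look like the following (illustrative contents; uv's `--override` flag accepts a requirements-style file whose pins take precedence over transitive constraints):

```text
# uv-overrides.txt (illustrative) -- forces boto3/botocore past
# aiobotocore's <1.40.62 caps so uv's strict resolver accepts the set
botocore>=1.42.80
boto3>=1.42.80
```

Applied in CI with something like `uv pip install -r requirements.txt --override uv-overrides.txt`. Note this trades resolver strictness for the same leniency pip already applies in the Docker builds.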
Sameerlite and others added 24 commits April 8, 2026 21:27
…xed model names (#25334)

Fixes 'LLM Provider NOT provided' errors when models are configured with
custom_llm_provider but model names lack provider prefix (e.g., 'gpt-4.1-mini'
instead of 'azure/gpt-4.1-mini').

Changes:
- Router now passes deployment's custom_llm_provider to get_llm_provider()
- Fixes 6 code paths: file creation, file content, batch operations, vector store
- Adds regression tests for file creation and file content operations

Made-with: Cursor
* fix(vertex_ai): support pluggable (executable) credential_source for WIF auth (#24700)

The WIF credential dispatch in load_auth() only handled identity_pool and
aws credential types. When credential_source.executable was present (used
for Azure Managed Identity via Workload Identity Federation), it fell
through to identity_pool.Credentials which rejected it with MalformedError.

Add dispatch to google.auth.pluggable.Credentials for executable-type
credential sources, following the same pattern as the existing identity_pool
and aws helpers.

Fixes authentication for Azure Container Apps → GCP Vertex AI via WIF
with executable credential sources.
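
The dispatch pattern above can be sketched like this (a toy function mirroring the shape of the fix, not the actual load_auth() code): inspect the WIF credential_source and pick the matching google-auth credentials class.

```python
# Illustrative dispatch on the WIF credential_source shape. The class
# names returned are the real google-auth classes the commit mentions;
# the dispatch function itself is a sketch.
def pick_wif_credentials_class(credential_source: dict) -> str:
    if "executable" in credential_source:
        # executable-type sources (e.g. Azure Managed Identity via WIF)
        return "google.auth.pluggable.Credentials"
    if credential_source.get("environment_id", "").startswith("aws"):
        return "google.auth.aws.Credentials"
    return "google.auth.identity_pool.Credentials"


print(pick_wif_credentials_class({"executable": {"command": "/bin/get-token"}}))
# google.auth.pluggable.Credentials
```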

* feat(logging): add component and logger fields to JSON logs for 3rd p… (#24447)

* feat(logging): add component and logger fields to JSON logs for 3rd party filtering

* Let user-supplied extra fields win over auto-generated component/logger, tighten test assertions

* Feat - Add organization into the metrics metadata for org_id & org_alias (#24440)

* Add org_id and org_alias label names to Prometheus metric definitions

* Add user_api_key_org_alias to StandardLoggingUserAPIKeyMetadata

* Populate user_api_key_org_alias in pre-call metadata

* Pass org_id and org_alias into per-request Prometheus metric labels

* Add test for org labels on per-request Prometheus metrics

* chore: resolve test mockdata

* Address review: populate org_alias from DB view, add feature flag, use .get() for org metadata

* Add org labels to failure path and verify flag behavior in test

* Fix test: build flag-off enum_values without org fields

* Gate org labels behind feature flag in get_labels() instead of static metric lists

* Scope org label injection to metrics that carry team context, remove orphaned budget label defs, add test teardown

* Use explicit metric allowlist for org label injection instead of team heuristic

* Fix duplicate org label guard, move _org_label_metrics to class constant

* Reset custom_prometheus_metadata_labels after duplicate label assertion

* fix: emit org labels by default, remove flag, fix missing org_alias in all metadata paths

* fix: emit org labels by default, no opt-in flag required

* fix: write org_alias to metadata unconditionally in proxy_server.py

* fix: 429s from batch creation being converted to 500 (#24703)

* add us gov models (#24660)

* add us gov models

* added max tokens

* Litellm dev 04 02 2026 p1 (#25052)

* fix: replace hardcoded url

* fix: Anthropic web search cost not tracked for Chat Completions

The ModelResponse branch in response_object_includes_web_search_call()
only checked url_citation annotations and prompt_tokens_details, missing
Anthropic's server_tool_use.web_search_requests field. This caused
_handle_web_search_cost() to never fire for Anthropic Claude models.

Also routes vertex_ai/claude-* models to the Anthropic cost calculator
instead of the Gemini one, since Claude on Vertex uses the same
server_tool_use billing structure as the direct Anthropic API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
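
The detection logic described above can be sketched as follows (hypothetical helper and field layout, condensed from the signals the commit names): a response includes a web search call if url_citation annotations, prompt_tokens_details, or Anthropic's server_tool_use.web_search_requests indicate one.

```python
# Illustrative check for "response includes a web search call". The
# function name and dict layout are assumptions; the third branch is the
# Anthropic server_tool_use signal the fix adds.
def includes_web_search_call(response: dict) -> bool:
    if response.get("url_citations"):
        return True
    if response.get("prompt_tokens_details", {}).get("web_search_requests"):
        return True
    server_tool_use = response.get("server_tool_use") or {}
    return bool(server_tool_use.get("web_search_requests"))


print(includes_web_search_call({"server_tool_use": {"web_search_requests": 2}}))  # True
print(includes_web_search_call({}))  # False
```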

* fix(anthropic): pass logging_obj to client.post for litellm_overhead_time_ms (#24071)

When LITELLM_DETAILED_TIMING=true, litellm_overhead_time_ms was null for
Anthropic because the handler did not pass logging_obj to client.post(),
so track_llm_api_timing could not set llm_api_duration_ms. Pass
logging_obj=logging_obj at all four post() call sites (make_call,
make_sync_call, acompletion, completion). Add test to ensure make_call
passes logging_obj to client.post.

Made-with: Cursor

* sap - add additional parameters for grounding

- additional parameter for grounding added for the sap provider

* sap - fix models

* (sap) add filtering, masking, translation SAP GEN AI Hub modules

* (sap) add tests and docs for new SAP modules

* (sap) add support of multiple modules config

* (sap) code refactoring

* (sap) rename file

* test(): add safeguard tests

* (sap) update tests

* (sap) update docs, solve merge conflict in transformation.py

* (sap) linter fix

* (sap) Align embedding request transformation with current API

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) mock commit

* (sap) run black formatter

* (sap) add literals to models, add negative tests, fix test for tool transformation

* (sap) fix formating

* (sap) fix models

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) commit for rerun bot review

* (sap) minor improve

* (sap) fix after bot review

* (sap) lint fix

* docs(sap): update documentation

* fix(sap): change creds priority

* fix(sap): change creds priority

* fix(sap): fix sap creds unit test

* fix(sap): linter fix

* fix(sap): linter fix

* linter fix

* (sap) update logic of fetching creds, add additional tests

* (sap) clean up code

* (sap) fix after review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) allow supplying the service key via either variant

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) update test

* (sap) update service key resolve function

* (sap) run black formatter

* (sap) fix validate credentials, add negative tests for credential fetching

* (sap) fix validate credentials, add negative tests for credential fetching

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) lint fix

* (sap) lint fix

* feat: support service_tier in gemini

* chore: add a service_tier field mapping from openai to gemini

* fix: use x-gemini-service-tier header in response

* docs: add service_tier to gemini docs

* chore: add default/standard mapping, and some tests

* chore: tidying up some case insensitivity

* chore: remove unnecessary guard

* fix: remove redundant test file

* fix: handle 'auto' case-insensitively

* fix: return service_tier on final streamed chunk

* chore: black

* feat: enable supports_service_tier to gemini models
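
The mapping those commits describe can be sketched as a small case-insensitive table (the tier values below are illustrative, drawn from the 'auto', 'default', and 'standard' handling the commit titles mention):

```python
# Hypothetical OpenAI -> Gemini service_tier mapping, handled
# case-insensitively as the commits describe.
_SERVICE_TIER_MAP = {
    "auto": None,          # let the provider decide; drop the field
    "default": "standard",
    "standard": "standard",
}


def map_service_tier(value: str):
    # unknown tiers pass through unchanged
    return _SERVICE_TIER_MAP.get(value.lower(), value)


print(map_service_tier("AUTO"))     # None
print(map_service_tier("Default"))  # standard
```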

* Fix get_standard_logging_metadata tests

* Fix test_get_model_info_bedrock_models

* Fix test_get_model_info_bedrock_models

* Fix remaining tests

* Fix mypy issues

* Fix tests

* Fix merge conflicts

* Fix code qa

* Fix code qa

* Fix code qa

* Fix greptile review

---------

Co-authored-by: michelligabriele <gabriele.michelli@icloud.com>
Co-authored-by: Josh <36064836+J-Byron@users.noreply.github.com>
Co-authored-by: mubashir1osmani <mubashir.osmani777@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: milan-berri <milan@berri.ai>
Co-authored-by: Alperen Kömürcü <alperen.koemuercue@sap.com>
Co-authored-by: Vasilisa Parshikova <vasilisa.parshikova@sap.com>
Co-authored-by: Lin Xu <lin.xu03@sap.com>
Co-authored-by: Mark McDonald <macd@google.com>
Co-authored-by: Sameer Kankute <sameer@berri.ai>
Add Baseten Model API pricing entries for Nemotron, GLM, Kimi, GPT OSS, and DeepSeek models with validated model slugs. Include a focused regression test to assert provider and per-token pricing values.

Made-with: Cursor
Remove the silent try/catch from setSecureItem so OAuth hooks can
surface actionable "enable storage" guidance instead of a cryptic
"state lost" error after the round-trip. Add a local try/catch in
ChatUI where the storage write is non-critical.
…loyment best practices (#25439)

- New doc page covering all signed image variants, verification commands,
  CI/CD enforcement (K8s Sigstore Policy Controller, GCP Binary Authorization,
  AWS/EKS, GitHub Actions), digest pinning, and safe upgrade patterns
- Added to sidebar under Setup & Deployment
- Cross-linked from the existing deploy.md cosign section

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>
…_model_test

fix(test): mock headers in test_completion_fine_tuned_model
feat(mcp): add per-user OAuth token storage for interactive MCP flows
[Fix] UI: improve storage handling and Dockerfile consistency
…st overrides

Raise vitest testTimeout from 10s to 30s and drop per-test timeout overrides
across UI unit tests. Group CreateUserButton and TeamInfo tests under nested
describe blocks to make the most flaky suites easier to scan.
…et_model_query_param

fix(responses-ws): append ?model= to backend WebSocket URL
Remove leftover 10000ms per-test timeout in add_model_tab.test.tsx that was
missed in the initial sweep. The test now inherits the 30000ms global.
MCP_PER_USER_TOKEN_DEFAULT_TTL and MCP_PER_USER_TOKEN_EXPIRY_BUFFER_SECONDS
were added in #25441 but not documented, causing test_env_keys.py to fail.
…_env_vars

[Docs] Add missing MCP per-user token env vars to config_settings
[Test] UI - Unit tests: raise global vitest timeout and remove per-test overrides
Unify UI and API token authorization through the shared RBAC path
and backfill missing routes in role-based route lists.
refactor: consolidate route auth for UI and API tokens
@github-advanced-security
Contributor

You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool.

What Enabling Code Scanning Means:

  • The 'Security' tab will display more code scanning analysis results (e.g., for the default branch).
  • Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results.
  • You will be able to see the analysis results for the pull request's branch on this overview once the scans have completed and the checks have passed.

For more information about GitHub Code Scanning, check out the documentation.

Contributor

@github-advanced-security (AI) left a comment


CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

Comment thread Dockerfile
RUN apk add --no-cache bash gcc py3-pip python3 python3-dev openssl openssl-dev

RUN python -m pip install build
RUN python -m pip install build==1.4.2
Comment thread Dockerfile

# Install the built wheel using pip; again using a wildcard if it's the only file
RUN pip install *.whl /wheels/* --no-index --find-links=/wheels/ && rm -f *.whl && rm -rf /wheels
RUN pip install *.whl /wheels/* --no-index --find-links=/wheels/ --no-deps && rm -f *.whl && rm -rf /wheels
Comment thread docker/Dockerfile.alpine
Comment on lines +16 to +17
RUN pip install --upgrade pip==26.0.1 && \
pip install build==1.4.2
Comment thread docker/Dockerfile.alpine

# Install the built wheel using pip; again using a wildcard if it's the only file
RUN pip install *.whl /wheels/* --no-index --find-links=/wheels/ && rm -f *.whl && rm -rf /wheels
RUN pip install *.whl /wheels/* --no-index --find-links=/wheels/ --no-deps && rm -f *.whl && rm -rf /wheels

Comment thread docker/install_auto_router.sh Outdated
# --- Playwright ---
echo "=== Installing Playwright dependencies ==="
cd "$SCRIPT_DIR"
npm install --silent
# --- Rebuild UI from source ---
echo "=== Building UI from source ==="
cd "$DASHBOARD_DIR"
npm install --silent 2>/dev/null || true
# --- Playwright ---
echo "=== Installing Playwright dependencies ==="
cd "$DASHBOARD_DIR"
npm install --silent 2>/dev/null || true
@Sameerlite Sameerlite merged commit 89f05b5 into litellm_fix-refusal-status-streaming2 Apr 10, 2026
54 of 99 checks passed
@greptile-apps
Contributor

greptile-apps bot commented Apr 10, 2026

Too many files changed for review. (1039 files found, 100 file limit)

@CLAassistant

CLAassistant commented Apr 10, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
4 out of 6 committers have signed the CLA.

✅ csoni-cweave
✅ yuneng-berri
✅ joereyna
✅ ryan-crabbe-berri
❌ krrish-berri-2
❌ ishaan-berri
You have signed the CLA already but the status is still pending? Let us recheck it.
