Merged

25 commits
a881ac5
[Fix] UI: resolve CodeQL security alerts and Dockerfile.health_check …
yuneng-berri Apr 9, 2026
36bf337
fix(docker): add non-root USER and HEALTHCHECK to Dockerfile.custom_ui
yuneng-berri Apr 2, 2026
70a5c27
[Fix] Address review feedback on storage utility and Dockerfiles
yuneng-berri Apr 9, 2026
ac29118
Update docker/Dockerfile.custom_ui
yuneng-berri Apr 9, 2026
20ed120
[Fix] Let setSecureItem propagate storage errors to callers
yuneng-berri Apr 9, 2026
cd9c511
feat(proxy): add credential overrides per team/project via model_conf…
michelligabriele Apr 9, 2026
f6dde29
fix(responses-ws): append ?model= to backend WebSocket URL
joereyna Apr 9, 2026
3ac4333
fix(responses-ws): use urllib.parse to append model param, fix test m…
joereyna Apr 9, 2026
3a6db70
docs: add Docker Image Security Guide for cosign verification and dep…
krrish-berri-2 Apr 9, 2026
ce2add3
feat(mcp): add per-user OAuth token storage for interactive MCP flows
csoni-cweave Apr 9, 2026
1571f5e
fix(test): mock headers in test_completion_fine_tuned_model
joereyna Apr 9, 2026
afd46e7
format vertex test file
joereyna Apr 9, 2026
e7551a1
Merge pull request #25444 from joereyna/litellm_fix_vertex_fine_tuned…
yuneng-berri Apr 10, 2026
2c0d20b
Merge pull request #25441 from csoni-cweave/per-user-mcp-oauth-token
ishaan-berri Apr 10, 2026
ce75598
Merge pull request #25384 from BerriAI/litellm_/bold-pare
yuneng-berri Apr 10, 2026
3a316b9
[Test] UI - Unit tests: raise global vitest timeout and remove per-te…
yuneng-berri Apr 10, 2026
aa0fa10
Merge pull request #25437 from joereyna/litellm_fix_responses_websock…
yuneng-berri Apr 10, 2026
92dbd2c
address greptile review feedback (greploop iteration 1)
yuneng-berri Apr 10, 2026
ce0b57b
[Docs] Add missing MCP per-user token env vars to config_settings
yuneng-berri Apr 10, 2026
42e5583
Merge pull request #25471 from BerriAI/litellm_doc_mcp_per_user_token…
yuneng-berri Apr 10, 2026
c7f610c
Merge remote-tracking branch 'origin/main' into litellm_/compassionat…
yuneng-berri Apr 10, 2026
9e6d2d2
Merge pull request #25468 from BerriAI/litellm_/compassionate-shannon
yuneng-berri Apr 10, 2026
26e99f2
refactor: consolidate route auth for UI and API tokens
ryan-crabbe-berri Apr 10, 2026
3af7de4
retain ui_routes enum alias for JWT config backwards compatibility
ryan-crabbe-berri Apr 10, 2026
d0e347a
Merge pull request #25473 from BerriAI/litellm_auth_rbac_cleanup
yuneng-berri Apr 10, 2026
8 changes: 8 additions & 0 deletions docker/Dockerfile.custom_ui
@@ -71,8 +71,16 @@ WORKDIR /app
RUN sed -i 's/\r$//' docker/entrypoint.sh && chmod +x docker/entrypoint.sh
RUN sed -i 's/\r$//' docker/prod_entrypoint.sh && chmod +x docker/prod_entrypoint.sh

# Run as non-root user
RUN groupadd --gid 1000 appuser && useradd --uid 1000 --gid 1000 --no-create-home appuser \
&& chown -R appuser:appuser /app
USER appuser

# Expose the necessary port
EXPOSE 4000/tcp

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
CMD ["python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:4000/health')"]

# Override the CMD instruction with your desired command and arguments
CMD ["--port", "4000", "--config", "config.yaml", "--detailed_debug"]
8 changes: 4 additions & 4 deletions docker/Dockerfile.health_check
@@ -13,12 +13,12 @@ RUN pip install --no-cache-dir -r requirements.txt
RUN chmod +x /app/health_check_client.py

 # Run as non-root user
-RUN adduser --disabled-password --gecos "" --uid 1001 healthcheck
-USER healthcheck
+RUN groupadd --gid 1000 appuser && useradd --uid 1000 --gid 1000 --no-create-home appuser
+USER appuser
 
 # Health check
-HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
-  CMD python /app/health_check_client.py --help || exit 1
+HEALTHCHECK --interval=30s --timeout=5s --start-period=5s --retries=3 \
+  CMD ["python", "/app/health_check_client.py", "--help"]

# Set entrypoint
ENTRYPOINT ["python", "/app/health_check_client.py"]
2 changes: 2 additions & 0 deletions docs/my-website/docs/proxy/config_settings.md
@@ -602,6 +602,8 @@ router_settings:
| MCP_OAUTH2_TOKEN_CACHE_MAX_SIZE | Maximum number of entries in MCP OAuth2 token cache. Default is 200 |
| MCP_OAUTH2_TOKEN_CACHE_MIN_TTL | Minimum TTL in seconds for MCP OAuth2 token cache. Default is 10 |
| MCP_OAUTH2_TOKEN_EXPIRY_BUFFER_SECONDS | Seconds to subtract from token expiry when computing cache TTL. Default is 60 |
| MCP_PER_USER_TOKEN_DEFAULT_TTL | Default TTL in seconds for per-user MCP OAuth tokens stored in Redis. Default is 43200 (12 hours) |
| MCP_PER_USER_TOKEN_EXPIRY_BUFFER_SECONDS | Seconds to subtract from per-user MCP OAuth token expiry when computing Redis TTL. Default is 60 |
| DEFAULT_MOCK_RESPONSE_COMPLETION_TOKEN_COUNT | Default token count for mock response completions. Default is 20 |
| DEFAULT_MOCK_RESPONSE_PROMPT_TOKEN_COUNT | Default token count for mock response prompts. Default is 10 |
| DEFAULT_MODEL_CREATED_AT_TIME | Default creation timestamp for models. Default is 1677610602 |
274 changes: 274 additions & 0 deletions docs/my-website/docs/proxy/credential_routing.md
@@ -0,0 +1,274 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Per-Team/Project Credential Routing

Route the same model to different LLM provider endpoints (e.g. different Azure instances) based on which team or project makes the request.

## Overview

In multi-tenant deployments, different teams often need the same model name (e.g., `gpt-4`) to hit different provider endpoints: for example, separate Azure OpenAI instances per business unit for cost isolation, data residency, or rate limit separation.

**Credential routing** lets you configure this in team/project metadata using the existing [credentials table](./ui_credentials.md), without duplicating model definitions or creating separate model groups per team.

```
Hotel Team  → gpt-4 → https://hotel-eastus.openai.azure.com/
Flight Team → gpt-4 → https://flight-centralus.openai.azure.com/
```

### Precedence Chain

When a request comes in, the system walks this precedence chain (first match wins):

1. **Clientside credentials**: `api_base`/`api_key` passed in the request body ([docs](./clientside_auth.md))
2. **Project model-specific**: override for this exact model in the project's `model_config`
3. **Project default**: `defaultconfig` in the project's `model_config`
4. **Team model-specific**: override for this exact model in the team's `model_config`
5. **Team default**: `defaultconfig` in the team's `model_config`
6. **Deployment default**: the model's `litellm_params` as configured in `config.yaml`
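
The chain above can be sketched as a simple first-match-wins walk. Note this is an illustrative sketch only: `resolve_credential` and its argument names are hypothetical, not LiteLLM's actual internals.

```python
# Illustrative sketch only: resolve_credential and its argument names are
# hypothetical, not LiteLLM's internals. It demonstrates the first-match-wins
# walk over the six precedence levels listed above.
from typing import Optional


def resolve_credential(
    request_params: dict,
    project_model_config: dict,
    team_model_config: dict,
    model: str,
    provider: str,
) -> Optional[str]:
    # 1. Clientside credentials short-circuit everything else.
    if "api_base" in request_params or "api_key" in request_params:
        return "clientside"
    # 2-5. Project model-specific -> project default -> team model-specific -> team default.
    for config in (project_model_config, team_model_config):
        for key in (model, "defaultconfig"):
            entry = config.get(key, {}).get(provider, {})
            if "litellm_credentials" in entry:
                return entry["litellm_credentials"]
    # 6. None means: fall back to the deployment's own litellm_params.
    return None


team_cfg = {"defaultconfig": {"azure": {"litellm_credentials": "hotel-azure-eastus"}}}
print(resolve_credential({}, {}, team_cfg, "gpt-4", "azure"))  # hotel-azure-eastus
```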

## Quick Start

### Step 1: Create Credentials

Store your Azure endpoint credentials in the credentials table. You can do this via the [UI](./ui_credentials.md) or API:

```bash showLineNumbers
# Create credential for Hotel team's Azure endpoint
curl -X POST 'http://0.0.0.0:4000/credentials' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
"credential_name": "hotel-azure-eastus",
"credential_values": {
"api_base": "https://hotel-eastus.openai.azure.com/",
"api_key": "sk-azure-hotel-key-xxx"
}
}'
```

```bash showLineNumbers
# Create credential for Flight team's Azure endpoint
curl -X POST 'http://0.0.0.0:4000/credentials' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
"credential_name": "flight-azure-centralus",
"credential_values": {
"api_base": "https://flight-centralus.openai.azure.com/",
"api_key": "sk-azure-flight-key-xxx"
}
}'
```

### Step 2: Set `model_config` on Teams

Add a `model_config` key to the team's metadata referencing the credential by name:

```bash showLineNumbers
# Hotel team: default Azure endpoint for all models
curl -X PATCH 'http://0.0.0.0:4000/team/update' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
"team_id": "hotel-team-id",
"metadata": {
"model_config": {
"defaultconfig": {
"azure": {
"litellm_credentials": "hotel-azure-eastus"
}
}
}
}
}'
```

```bash showLineNumbers
# Flight team: default Azure endpoint for all models
curl -X PATCH 'http://0.0.0.0:4000/team/update' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
"team_id": "flight-team-id",
"metadata": {
"model_config": {
"defaultconfig": {
"azure": {
"litellm_credentials": "flight-azure-centralus"
}
}
}
}
}'
```

### Step 3: Make Requests

Requests are automatically routed to the correct Azure endpoint based on the API key's team:

```bash showLineNumbers
# Request using Hotel team's API key → routes to hotel-eastus.openai.azure.com
curl http://localhost:4000/v1/chat/completions \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-hotel-team-key' \
-d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'

# Request using Flight team's API key → routes to flight-centralus.openai.azure.com
curl http://localhost:4000/v1/chat/completions \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-flight-team-key' \
-d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'
```

## Per-Model Overrides

You can set different credentials for specific models while keeping a default for everything else:

```bash showLineNumbers
curl -X PATCH 'http://0.0.0.0:4000/team/update' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
"team_id": "hotel-team-id",
"metadata": {
"model_config": {
"defaultconfig": {
"azure": {
"litellm_credentials": "hotel-azure-eastus"
}
},
"gpt-4": {
"azure": {
"litellm_credentials": "hotel-azure-westus"
}
}
}
}
}'
```

With this config:
- `gpt-4` requests → `hotel-azure-westus` credential (model-specific)
- All other models → `hotel-azure-eastus` credential (default)

## Project-Level Overrides

Projects inherit their team's `model_config` but can override at the project level. Project overrides take precedence over team overrides.

```bash showLineNumbers
# Project overrides the team default for all models
curl -X PATCH 'http://0.0.0.0:4000/project/update' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
"project_id": "hotel-rec-app-id",
"metadata": {
"model_config": {
"defaultconfig": {
"azure": {
"litellm_credentials": "hotel-rec-azure"
}
},
"gpt-4-vision": {
"azure": {
"litellm_credentials": "hotel-rec-vision"
}
}
}
}
}'
```

### Full Example: Hotel Team with Two Projects

**Setup:**
- **Hotel Team**: default `hotel-azure-eastus`, GPT-4 override to `hotel-azure-westus`
- **Hotel Rec App** (project): default `hotel-rec-azure`, GPT-4-Vision override to `hotel-rec-vision`
- **Hotel Review App** (project): no overrides; inherits team config

**Resolution:**

| Request | Resolved Credential | Why |
|---|---|---|
| Hotel Rec App → `gpt-4` | `hotel-rec-azure` | Project default (no project model-specific match for gpt-4) |
| Hotel Rec App → `gpt-4-vision` | `hotel-rec-vision` | Project model-specific |
| Hotel Review App → `gpt-3.5` | `hotel-azure-eastus` | Team default (no project config) |
| Hotel Review App → `gpt-4` | `hotel-azure-westus` | Team model-specific |
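
This resolution can be reproduced with a short sketch. The `resolve` helper below is a hypothetical illustration, not LiteLLM code; it only encodes the project-before-team, model-specific-before-default lookup order described in this guide, with the exact configs from the setup above.

```python
# Hypothetical sketch (not LiteLLM internals) reproducing the resolution table above.
team = {
    "defaultconfig": {"azure": {"litellm_credentials": "hotel-azure-eastus"}},
    "gpt-4": {"azure": {"litellm_credentials": "hotel-azure-westus"}},
}
rec_app = {
    "defaultconfig": {"azure": {"litellm_credentials": "hotel-rec-azure"}},
    "gpt-4-vision": {"azure": {"litellm_credentials": "hotel-rec-vision"}},
}
review_app = {}  # no overrides: inherits team config


def resolve(project: dict, team: dict, model: str, provider: str = "azure") -> str:
    # Walk: project model-specific, project default, team model-specific, team default.
    for config in (project, team):
        for key in (model, "defaultconfig"):
            cred = config.get(key, {}).get(provider, {}).get("litellm_credentials")
            if cred:
                return cred
    return "<deployment default>"


print(resolve(rec_app, team, "gpt-4"))         # hotel-rec-azure
print(resolve(rec_app, team, "gpt-4-vision"))  # hotel-rec-vision
print(resolve(review_app, team, "gpt-3.5"))    # hotel-azure-eastus
print(resolve(review_app, team, "gpt-4"))      # hotel-azure-westus
```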

## `model_config` Schema

The `model_config` key is a JSON object in team/project `metadata`:

```json
{
"model_config": {
"defaultconfig": {
"<provider>": {
"litellm_credentials": "<credential-name>"
}
},
"<model-name>": {
"<provider>": {
"litellm_credentials": "<credential-name>"
}
}
}
}
```

| Field | Description |
|---|---|
| `defaultconfig` | Fallback credential for any model not explicitly listed |
| `<model-name>` | Model-specific override; must match the LiteLLM model group name |
| `<provider>` | Provider key (e.g. `azure`, `openai`, `bedrock`). When the model name includes a provider prefix (e.g. `azure/gpt-4`), the system prefers the matching provider key |
| `litellm_credentials` | Name of a credential in the [credentials table](./ui_credentials.md) |

### Credential Values

The referenced credential can contain any combination of:

| Key | Description |
|---|---|
| `api_base` | Provider endpoint URL |
| `api_key` | API key for the provider |
| `api_version` | API version (e.g. for Azure) |

Only keys present in the credential are applied. Keys already in the request (e.g. clientside `api_version`) are never overwritten.
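
That merge rule can be sketched as follows; `apply_credential` is a hypothetical name for illustration, not LiteLLM's implementation, and the API version strings are made up.

```python
# Illustrative sketch of the merge rule: credential keys fill gaps, but any
# key already present on the request (e.g. clientside api_version) wins.
def apply_credential(request_params: dict, credential_values: dict) -> dict:
    merged = dict(request_params)
    for key, value in credential_values.items():
        merged.setdefault(key, value)  # only fill keys the request did not set
    return merged


req = {"api_version": "2024-02-01"}  # clientside api_version (illustrative value)
cred = {
    "api_base": "https://hotel-eastus.openai.azure.com/",
    "api_key": "sk-azure-hotel-key-xxx",
    "api_version": "2023-05-15",
}
merged = apply_credential(req, cred)
print(merged["api_version"])  # 2024-02-01 -- the clientside value is kept
```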

## Enabling the Feature

This feature is **disabled by default** and must be explicitly enabled. To enable it:

<Tabs>

<TabItem value="config" label="config.yaml">

```yaml
litellm_settings:
enable_model_config_credential_overrides: true
```

</TabItem>

<TabItem value="env" label="Environment Variable">

```bash
export LITELLM_ENABLE_MODEL_CONFIG_CREDENTIAL_OVERRIDES=true
```

</TabItem>

</Tabs>

:::info
The feature flag must be enabled before `model_config` entries in team/project metadata take effect. Without it, credential routing is completely inert: no metadata is read and no credentials are resolved.
:::

## Related Documentation

- [Adding LLM Credentials](./ui_credentials.md): Create and manage reusable credentials
- [Project Management](./project_management.md): Project hierarchy and API
- [Team Budgets](./team_budgets.md): Team-level budget management
- [Clientside LLM Credentials](./clientside_auth.md): Passing credentials in the request body
- [Credential Usage Tracking](./credential_usage_tracking.md): Track spend by credential
2 changes: 1 addition & 1 deletion docs/my-website/docs/proxy/deploy.md
@@ -99,7 +99,7 @@ The following checks were performed on each of these signatures:
- The signatures were verified against the specified public key
```

-Learn more about LiteLLM's release signing in the [CI/CD v2 announcement](https://docs.litellm.ai/blog/ci-cd-v2-improvements#verify-docker-image-signatures).
+Learn more about LiteLLM's release signing in the [CI/CD v2 announcement](https://docs.litellm.ai/blog/ci-cd-v2-improvements#verify-docker-image-signatures). For a complete guide covering all image variants, CI/CD enforcement, and deployment best practices, see the [Docker Image Security Guide](./docker_image_security.md).

### Docker Run
