
Litellm ishaan march30 #24887

Merged
ishaan-berri merged 24 commits into litellm_ishaan_march_30_2 from litellm_ishaan_march30
Apr 4, 2026

Conversation

@ishaan-berri
Contributor

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement (see details)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

Isydmr and others added 17 commits March 25, 2026 11:59
Missing unversioned entry causes cost tracking to return $0.00 for
all requests using vertex_ai/claude-haiku-4-5. All other Vertex AI
Claude models have both versioned and unversioned entries.
…down)

Return early from get_deployments_for_tag when healthy_deployments is empty so
tag-based routing does not raise no_deployments_with_tag_routing after cooldown
filters all deployments. Adds regression test.

Made-with: Cursor
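The early-return guard described in the commit message above can be sketched as follows (a minimal sketch; `get_deployments_for_tag`'s real signature and tag-matching logic in LiteLLM may differ):

```python
from typing import Any, Dict, List, Optional

def get_deployments_for_tag(
    healthy_deployments: List[Dict[str, Any]],
    request_tags: Optional[List[str]] = None,
) -> List[Dict[str, Any]]:
    # Early-return guard: if cooldown filtering already emptied the pool,
    # return an empty list instead of raising a misleading
    # "no_deployments_with_tag_routing" error further down.
    if not healthy_deployments:
        return []
    if not request_tags:
        return healthy_deployments
    # Keep deployments whose configured tags intersect the request tags.
    return [
        d
        for d in healthy_deployments
        if set(d.get("litellm_params", {}).get("tags") or []) & set(request_tags)
    ]
```

With the guard in place, an all-cooled-down deployment pool degrades to an empty result rather than a tag-routing error.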
- Add OCIEmbeddingConfig for OCI GenAI embedding models
- Add 16 new chat models (Cohere, Meta Llama, xAI Grok, Google Gemini)
- Add 8 embedding models (Cohere embed v3.0, v4.0)
- Update documentation with embedding examples
- Update pricing for all new models

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- 17 unit tests covering OCIEmbeddingConfig
- Tests for URL generation, param mapping, request/response transform
- Tests for model pricing JSON completeness

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OCI embedText API expects inputs, truncate, and inputType at the
top level of the request body, not nested under embedTextDetails.
Fixed transformation and updated tests accordingly.

Verified with real OCI API: 3/3 embedding models working.
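The body shape described above can be sketched as a transform that places the three fields at the top level (illustrative only; the full OCI embedText body carries additional fields such as the serving-mode block, which are omitted here):

```python
from typing import Any, Dict, List

def transform_oci_embedding_request(
    inputs: List[str], input_type: str
) -> Dict[str, Any]:
    # Per the fix: inputs, truncate, and inputType sit at the top level of
    # the request body, not nested under an "embedTextDetails" key.
    # The "END" truncate default is an assumption for illustration.
    return {
        "inputs": inputs,
        "truncate": "END",
        "inputType": input_type,
    }
```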
- P1: Fix signing URL mismatch with custom api_base by accepting
  api_base parameter in transform_embedding_request
- P2: Remove encoding_format from supported params (OCI does not
  support it, was silently dropped)
- P2: Raise ValueError for token-array inputs instead of silently
  converting to string representation
- Add test for token-list rejection
MCPSigV4Auth only supported static AWS credentials or the boto3 default
credential chain. Production Kubernetes environments typically authenticate
via IAM role assumption (sts:AssumeRole), which was not possible.

Add aws_role_name and aws_session_name parameters to the MCP SigV4 auth
stack. When aws_role_name is provided, MCPSigV4Auth calls sts:AssumeRole
to obtain temporary credentials before signing requests. Explicit keys,
if also provided, are used as the source identity for the STS call;
otherwise ambient credentials (pod role, instance profile) are used.
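The credential-resolution order described above can be sketched as below (a hedged sketch, not LiteLLM's actual code: the function name is hypothetical, the STS client is injected so the example is testable, and `assume_role`'s keyword arguments follow boto3's STS API):

```python
from typing import Any, Dict, Optional

def resolve_signing_credentials(
    sts_client: Any,
    aws_role_name: Optional[str] = None,
    aws_session_name: str = "litellm-mcp",
) -> Dict[str, str]:
    # With a role configured, exchange the source identity (explicit keys or
    # ambient pod/instance credentials) for temporary STS credentials.
    if aws_role_name:
        resp = sts_client.assume_role(
            RoleArn=aws_role_name, RoleSessionName=aws_session_name
        )
        creds = resp["Credentials"]
        return {
            "access_key": creds["AccessKeyId"],
            "secret_key": creds["SecretAccessKey"],
            "token": creds["SessionToken"],
        }
    # No role: fall through to the default credential chain (not shown here).
    return {}
```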
…and-models-update

feat(oci): add embedding support, new models, and updated docs
fix(pricing): add unversioned vertex_ai/claude-haiku-4-5 entry
…ts-tag-401-misleading-error

fix: router empty deployments tag 401 misleading error
…role

fix(mcp): add STS AssumeRole support for MCP SigV4 authentication
Replaces raw credential values in debug/error log messages with
boolean presence checks or type names. Adds PEM block, GCP token,
JWT, SAS token, and service-account blob patterns to the redaction
filter. Fixes private_key pattern to capture full PEM blocks instead
of stopping at the first whitespace.

Addresses: Vertex AI credential JSON (including RSA private key)
being logged to stderr on health check failures.
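The PEM-block fix described above (capturing the full block instead of stopping at the first whitespace) can be sketched with `re.DOTALL`, which lets `.` span newlines. This is an illustrative pattern, not LiteLLM's actual redaction filter:

```python
import re

# Illustrative pattern: re.DOTALL makes '.' match newlines, so the entire
# multi-line PEM block is captured rather than the match stopping at the
# first whitespace after "-----BEGIN".
PEM_BLOCK = re.compile(
    r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
    re.DOTALL,
)

def redact_secrets(text: str) -> str:
    # Replace every PEM block with a fixed placeholder.
    return PEM_BLOCK.sub("[REDACTED]", text)
```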
fix: stop logging credential values in debug/error messages
@vercel

vercel bot commented Apr 1, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
litellm | Ready | Preview, Comment | Apr 4, 2026 8:56pm


@CLAassistant

CLAassistant commented Apr 1, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
6 out of 7 committers have signed the CLA.

✅ Isydmr
✅ danielgandolfi1984
✅ milan-berri
✅ stuxf
✅ ishaan-jaff
✅ michelligabriele
❌ ishaan-berri
You have signed the CLA already but the status is still pending? Let us recheck it.

@codspeed-hq
Contributor

codspeed-hq bot commented Apr 1, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_ishaan_march30 (1a9d5d6) with main (4c06e43)

Open in CodSpeed

@greptile-apps
Contributor

greptile-apps bot commented Apr 1, 2026

Greptile Summary

This PR bundles several independent improvements and fixes across the LiteLLM codebase: AWS SigV4 authentication support for MCP servers (including STS role-assumption via MCPSigV4Auth), OCI Generative AI embedding support (oci/cohere.* models), secret/credential redaction improvements in the logging layer (_logging.py) and Vertex AI error messages, tag-based routing with regex matching against User-Agent headers (tag_regex), and a fix to the PydanticAI agent provider config to make api_base Optional while still raising early when it is absent.

Key changes:

  • MCP SigV4 auth: New MCPSigV4Auth httpx.Auth subclass signs every outgoing MCP request with AWS SigV4, with optional STS AssumeRole support
  • OCI embedding: Full embedding request/response transformation for OCI Generative AI Cohere embed models
  • Secret redaction: Expanded regex patterns and a new public redact_secrets() API used in Slack alerting paths; Vertex credential errors no longer log the full JSON blob
  • Tag-regex routing: Deployments can specify tag_regex patterns matched against User-Agent and other header strings to route MCP clients (e.g., claude-code) to dedicated deployments
  • load_servers_from_config bug: A copy-paste error duplicates the alias-lookup block, causing mcp_aliases-resolved server aliases to be silently discarded when building the MCPServer object
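The per-pattern error isolation described for the tag-regex helper can be sketched as below (function and parameter names are hypothetical, not LiteLLM's actual API; the point is that each pattern is compiled independently so one invalid regex does not disable the rest):

```python
import re
from typing import List

def matches_tag_regex(patterns: List[str], header_value: str) -> bool:
    for pattern in patterns:
        try:
            compiled = re.compile(pattern)
        except re.error:
            # Invalid pattern: skip it and keep evaluating the others.
            continue
        if compiled.search(header_value):
            return True
    return False
```

A deployment tagged with `claude-code.*` would then match a `User-Agent: claude-code/1.2.3` header even if a sibling pattern is malformed.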

Confidence Score: 4/5

Safe to merge except for the P1 duplicate alias-lookup bug in load_servers_from_config which silently ignores mcp_aliases config.

One confirmed P1 defect: the duplicate alias-lookup block in mcp_server_manager.py resets alias to None and overwrites name_for_prefix without the alias, causing MCPServer objects to be created with wrong names and alias=None whenever mcp_aliases is used in config. All other changes (SigV4 auth, OCI embedding, secret redaction, tag-regex routing, Vertex credential fix) look correct and are well-tested with mock-only unit tests. The STS credential-expiry concern is P2.

litellm/proxy/_experimental/mcp_server/mcp_server_manager.py — duplicate alias-lookup block on lines 245–269 must be removed

Important Files Changed

Filename | Overview
litellm/proxy/_experimental/mcp_server/mcp_server_manager.py | Contains a copy-paste duplication of the alias-lookup block (lines 219-269): the second block resets alias and cannot re-resolve from mcp_aliases because the alias is already in used_aliases, so MCPServer is created with alias=None and the wrong name whenever mcp_aliases config is used.
litellm/experimental_mcp_client/client.py | Adds MCPSigV4Auth httpx.Auth subclass with STS AssumeRole support; credentials are resolved once at init time with no auto-refresh for expired temporary credentials.
litellm/llms/oci/embed/transformation.py | New OCI embedding config supporting Cohere models; request/response transformation looks correct, and auth delegates to OCIChatConfig signing logic.
litellm/_logging.py | Expands secret-redaction regex patterns (GCP service-account blobs, PEM blocks, Azure SAS tokens, key-name-based redaction); adds public redact_secrets() API. Well-tested.
litellm/router_strategy/tag_based_routing.py | Adds tag_regex matching against User-Agent and other header strings; the regex helper correctly isolates per-pattern compilation errors and skips invalid patterns.
litellm/a2a_protocol/providers/pydantic_ai_agents/config.py | Makes api_base Optional to match the base class signature and raises ValueError early when it is absent (guard-at-resolution-time pattern).
tests/test_litellm/test_secret_redaction.py | Comprehensive mock-only unit tests for all new redaction patterns; no network calls.
tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_sigv4_auth.py | Mock-only unit tests for MCPSigV4Auth; all AWS calls are patched, no real network traffic.
litellm/llms/vertex_ai/vertex_llm_base.py | Credential error messages no longer include the raw JSON blob; only a parse-error description is logged, preventing accidental secret exposure.
ui/litellm-dashboard/src/components/mcp_tools/create_mcp_server.tsx | OAuth state persistence correctly uses sessionStorage (not localStorage); form values may contain credentials, but the CodeQL suppression comment is present.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[load_servers_from_config called] --> B[Block 1: resolve alias from mcp_aliases\nadd to used_aliases]
    B --> C[Compute name_for_prefix with alias]
    C --> D[Block 2 duplicated: alias reset to None\nalias_name in used_aliases - lookup fails]
    D --> E[name_for_prefix overwritten without alias]
    E --> F[MCPServer created with alias=None and wrong name]
    G[MCPClient init with aws_role_name] --> H[STS AssumeRole called once]
    H --> I[Credentials stored in self.credentials]
    I --> J[auth_flow signs all requests]
    J --> K[After ~1h credentials expire - no auto-refresh mechanism]
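The expiry concern in the flowchart (credentials fetched once, no auto-refresh) could be addressed with a refresh-on-expiry wrapper. This is a hypothetical sketch, not part of the PR; `fetch` stands in for whatever re-invokes sts.assume_role:

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, Optional

class RefreshingCredentials:
    """Hypothetical wrapper: re-run the fetch callable (e.g. an AssumeRole
    call) shortly before the cached credentials' Expiration, instead of
    signing requests with stale temporary credentials."""

    def __init__(
        self,
        fetch: Callable[[], Dict],
        skew: timedelta = timedelta(minutes=5),
    ) -> None:
        self._fetch = fetch
        self._skew = skew
        self._creds: Optional[Dict] = None

    def get(self) -> Dict:
        now = datetime.now(timezone.utc)
        # Refresh when uninitialized or within `skew` of expiry.
        if self._creds is None or now >= self._creds["Expiration"] - self._skew:
            self._creds = self._fetch()
        return self._creds
```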

Comments Outside Diff (1)

  1. litellm/proxy/_experimental/mcp_server/mcp_server_manager.py, lines 245-269

    Duplicate alias-lookup block silently drops mcp_aliases-resolved aliases

    Lines 219–243 already resolve the alias from mcp_aliases and add it to used_aliases. This second identical block (lines 245–269) then:

    1. Resets alias back to server_config.get("alias", None), which evaluates to None for any server relying on mcp_aliases
    2. Tries to re-resolve from mcp_aliases, but alias_name not in used_aliases is now False (it was added in block 1), so the lookup silently fails
    3. Overwrites name_for_prefix with a value computed using alias=None

    The net effect: any server whose alias comes from mcp_aliases (not an explicit "alias" key in its server_config) will have MCPServer(name=<name-without-alias>, alias=None) — the alias feature is silently broken for that config path.

    Remove the duplicate block entirely. The correctly-resolved alias and name_for_prefix from lines 219–243 are what should be passed to the MCPServer constructor:

    # Keep only this block (lines 219–243):
    alias = server_config.get("alias", None)
    
    if mcp_aliases and alias is None:
        for alias_name, target_server_name in mcp_aliases.items():
            if target_server_name == server_name and alias_name not in used_aliases:
                alias = alias_name
                used_aliases.add(alias_name)
                verbose_logger.debug(f"Mapped alias '{alias_name}' to server '{server_name}'")
                break
    
    temp_server = type(
        "TempServer", (), {"alias": alias, "server_name": server_name, "server_id": None}
    )()
    name_for_prefix = get_server_prefix(temp_server)
    
    # Delete lines 245–269 (the duplicate block)

Reviews (6): Last reviewed commit: "fix(mypy): make api_base Optional in Pyd..."

if credentials is not None:
    if isinstance(credentials, str):
        _is_path = os.path.exists(
            credentials

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression

This path depends on a [user-provided value](1).

Copilot Autofix


In general, to fix uncontrolled path usage, we must prevent unvalidated user input from being used as a filesystem path. For this case, the safest approach is to stop treating arbitrary strings from litellm_params as file paths and instead only allow file paths that come from trusted configuration (e.g., environment variables) or, at minimum, validate/sanitize any potential path-like strings before using open. Since we should not change the public behavior significantly, a minimal fix is to ensure load_auth never interprets arbitrary user-provided strings as paths.

The single best targeted fix here is to restrict load_auth to only treat a string as a filesystem path if it looks like a JSON file path that the server operator intended (e.g., ends with .json), and otherwise always interpret strings as JSON blobs. That way, a user who passes "vertex_credentials": "/etc/passwd" will cause json.loads("/etc/passwd") to fail (and be caught by the existing exception handler) rather than opening the file. This preserves existing valid behavior where operators point to a JSON key file (like /path/to/service-account.json), while blocking arbitrary file reads. Concretely, in litellm/llms/vertex_ai/vertex_llm_base.py inside VertexBase.load_auth, we will adjust the logic around os.path.exists(credentials) to only consider it a path if the string both exists and has a .json suffix (case-insensitive). No new imports are required, and we do not change any other call sites.


Suggested changeset 1
litellm/llms/vertex_ai/vertex_llm_base.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/llms/vertex_ai/vertex_llm_base.py b/litellm/llms/vertex_ai/vertex_llm_base.py
--- a/litellm/llms/vertex_ai/vertex_llm_base.py
+++ b/litellm/llms/vertex_ai/vertex_llm_base.py
@@ -81,9 +81,10 @@
     ) -> Tuple[Any, str]:
         if credentials is not None:
             if isinstance(credentials, str):
-                _is_path = os.path.exists(
-                    credentials
-                )  # credentials is from server config (litellm_params), not user input
+                # Treat credentials as a file path only for existing JSON files.
+                _is_path = os.path.exists(credentials) and credentials.lower().endswith(
+                    ".json"
+                )
                 verbose_logger.debug(
                     "Vertex: Loading vertex credentials, is_file_path=%s, current dir %s",
                     _is_path,
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
if os.path.exists(credentials):
    json_obj = json.load(open(credentials))
if _is_path:
    with open(credentials) as f:

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression

This path depends on a [user-provided value](1).

Copilot Autofix


In general, when file paths may be influenced by untrusted data, they must be validated or restricted before being used with open() or similar APIs. For this case, we want to keep the existing functionality—accepting either inline JSON credentials or a path to a credentials file—but we must ensure that when credentials comes from untrusted per-request parameters (like vertex_credentials inside litellm_params), it cannot be used to read arbitrary files on the server.

The best minimal fix is to add a small validation layer in VertexBase.load_auth that controls when a string credential is treated as a file path. A simple, safe rule that preserves current behavior for typical setups is:

  • Treat a string as inline JSON by default.
  • Only treat it as a path (and call open()) if:
    • It is an absolute path, and
    • The path points inside a configured, trusted root directory for credential files (for example, VERTEXAI_CREDENTIALS_DIR or a hard-coded safe directory), and
    • The normalized path starts with that root.

However, because we cannot assume new configuration or other parts of the code, and we must not change existing imports beyond well-known standard libraries, an even more conservative and compatibility-preserving adjustment is:

  • Keep the os.path.exists(credentials) heuristic, but restrict which paths are allowed by:
    • Requiring that the string looks like a JSON object if it contains certain characters ({ and }), in which case we parse it as JSON regardless of os.path.exists.
    • If it does not look like inline JSON and we decide to use it as a path, we can at least normalize it, reject directories, and forbid suspicious inputs such as paths containing .. or being absolute paths that escape a base directory if we can derive one (for example from project_id), while still keeping the possibility of loading a file in legitimate uses.

Given we only see load_auth in vertex_llm_base.py, the most targeted fix that closes the vulnerability while minimally impacting expected usage is:

  • Add a small helper _load_credentials_from_string inside VertexBase and use it in load_auth.
  • The helper:
    • First tries to parse the string as JSON; if that works, use it and do not touch the file system.
    • Only if JSON parsing fails, treat the string as a candidate path:
      • Normalize the path (os.path.normpath).
      • Optionally enforce a basic allowlist rule: disallow paths containing .. or null bytes, and disallow directories.
      • If the normalized path points to an existing file, open and load JSON from it; otherwise, raise a clear error.

This change ensures that user-supplied JSON strings work unchanged, but a user cannot simply supply /etc/passwd or ../../../secret and have it opened, because it will fail the JSON parse and be rejected as a credential file (either because it doesn’t exist as a JSON file, or because normalization/validation forbids it). It also keeps existing code paths and caching logic intact.

Concretely in litellm/llms/vertex_ai/vertex_llm_base.py:

  • Inside class VertexBase, add a new private method _load_credentials_from_string(self, credentials_str: str) -> Dict[str, Any] near load_auth.
  • Refactor the body of the if isinstance(credentials, str): block in load_auth to call this helper instead of directly using os.path.exists and open(credentials).

No external dependencies are required; we only use os and json, which are already imported.

Suggested changeset 1
litellm/llms/vertex_ai/vertex_llm_base.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/llms/vertex_ai/vertex_llm_base.py b/litellm/llms/vertex_ai/vertex_llm_base.py
--- a/litellm/llms/vertex_ai/vertex_llm_base.py
+++ b/litellm/llms/vertex_ai/vertex_llm_base.py
@@ -47,6 +47,49 @@
         self.project_id: Optional[str] = None
         self.async_handler: Optional[AsyncHTTPHandler] = None
 
+    def _load_credentials_from_string(self, credentials_str: str) -> Dict[str, Any]:
+        """
+        Safely load Vertex credentials from a string.
+
+        The string may either be:
+        - A JSON document containing the credentials, or
+        - A path to a JSON file containing the credentials.
+
+        To avoid treating untrusted input as an arbitrary file path, we:
+        - Prefer parsing as JSON first.
+        - Only fall back to file loading when the string cannot be parsed as JSON,
+          and perform basic validation on the path before opening it.
+        """
+        # First, try to interpret the string as inline JSON
+        try:
+            return json.loads(credentials_str)
+        except Exception:
+            pass
+
+        # Fall back to treating the string as a file path, with basic validation.
+        # Normalize the path to eliminate ".." segments.
+        normalized_path = os.path.normpath(credentials_str)
+
+        # Reject obviously dangerous or malformed paths.
+        # - Empty after normalization
+        # - Paths that include parent directory traversal
+        # - Paths containing null bytes
+        if (
+            not normalized_path
+            or ".." in normalized_path.split(os.path.sep)
+            or "\x00" in normalized_path
+        ):
+            raise ValueError("Invalid Vertex credentials path.")
+
+        # Only proceed if the normalized path actually exists and is a file.
+        if not os.path.exists(normalized_path) or not os.path.isfile(normalized_path):
+            raise FileNotFoundError(
+                f"Vertex credentials file not found at path: {normalized_path}"
+            )
+
+        with open(normalized_path, "r", encoding="utf-8") as f:
+            return json.load(f)
+
     def get_vertex_region(self, vertex_region: Optional[str], model: str) -> str:
         import litellm
 
@@ -81,25 +124,17 @@
     ) -> Tuple[Any, str]:
         if credentials is not None:
             if isinstance(credentials, str):
-                _is_path = os.path.exists(
-                    credentials
-                )  # credentials is from server config (litellm_params), not user input
                 verbose_logger.debug(
-                    "Vertex: Loading vertex credentials, is_file_path=%s, current dir %s",
-                    _is_path,
+                    "Vertex: Loading vertex credentials from string, current dir %s",
                     os.getcwd(),
                 )
-
                 try:
-                    if _is_path:
-                        with open(credentials) as f:
-                            json_obj = json.load(f)
-                    else:
-                        json_obj = json.loads(credentials)
+                    json_obj = self._load_credentials_from_string(credentials)
                 except Exception as e:
                     raise Exception(
                         "Unable to load vertex credentials from environment. "
-                        "Ensure the JSON is valid (check for unescaped newlines in private_key). "
+                        "Ensure the JSON is valid (check for unescaped newlines in private_key) "
+                        "or that the credentials file path is valid and safe. "
                         "Parse error: {}".format(type(e).__name__)
                     )
             elif isinstance(credentials, dict):
EOF

import httpx

from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj

Check notice

Code scanning / CodeQL

Cyclic import

Import of module [litellm.litellm_core_utils.litellm_logging](1) begins an import cycle.

Copilot Autofix


General approach: To break an import cycle, remove or defer at least one of the imports in the cycle, or move shared functionality into a more independent module. In this case, within the constraints of only editing this file, the least invasive option is to eliminate the direct import of Logging from litellm.litellm_core_utils.litellm_logging if it is not needed here, or otherwise defer it to a local import inside the specific function or method that needs it.

Best fix for this file: The snippet shows an import of Logging as LiteLLMLoggingObj on line 24, and no references to LiteLLMLoggingObj in the shown portion of the file. Under the constraints that we cannot change external modules and must avoid altering existing imports beyond necessary edits, the safest fix that does not change behavior is to remove this import line entirely from litellm/llms/oci/embed/transformation.py. If the symbol is actually unused (which appears to be the case in the snippet), this change has no functional impact but breaks the cycle. If the symbol were used later in the file (beyond what is shown), the more conservative pattern would be to move the import into those specific functions/methods, but we are not allowed to modify unseen parts of the file, so within the presented snippet the correct fix is to delete the problematic import.

Concretely, in litellm/llms/oci/embed/transformation.py, remove line 24:

  • Delete: from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj

No additional imports, methods, or definitions are required for this change within the shown code.

Suggested changeset 1
litellm/llms/oci/embed/transformation.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/llms/oci/embed/transformation.py b/litellm/llms/oci/embed/transformation.py
--- a/litellm/llms/oci/embed/transformation.py
+++ b/litellm/llms/oci/embed/transformation.py
@@ -21,7 +21,6 @@
 
 import httpx
 
-from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj
 from litellm.llms.base_llm.chat.transformation import BaseLLMException
 from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig
 from litellm.llms.oci.chat.transformation import OCIChatConfig
EOF
@@ -21,7 +21,6 @@

import httpx

from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj
from litellm.llms.base_llm.chat.transformation import BaseLLMException
from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig
from litellm.llms.oci.chat.transformation import OCIChatConfig
Copilot is powered by AI and may make mistakes. Always verify output.
import httpx

from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj
from litellm.llms.base_llm.chat.transformation import BaseLLMException

Check notice

Code scanning / CodeQL

Cyclic import

Import of module [litellm.llms.base_llm.chat.transformation](1) begins an import cycle.

Copilot Autofix


In general, the way to fix this is to break the circular dependency by ensuring that shared, generic components (like base exception types) live in a module that neither chat.transformation nor oci.embed.transformation needs to import “upwards”. Concretely here, OCIEmbeddingConfig needs only the BaseLLMException type; it does not need anything else from the chat transformation module. We can avoid importing the entire litellm.llms.base_llm.chat.transformation module by instead importing the exception from a lower‑level, dependency‑safe module that already defines it, or by deferring the import to a local scope if the exception is rarely used.

Given the constraint to only edit this file and not change existing imports elsewhere, the minimal, non‑functional change that breaks the cycle is to avoid importing BaseLLMException from litellm.llms.base_llm.chat.transformation at module import time. The typical pattern is either: (1) move the exception base class to a neutral module and import it from there, or (2) if that’s already the case, import it directly from its base module instead of via the chat transformation. Since we cannot see other files, the safest, self‑contained fix is to replace the import of BaseLLMException from the chat transformation module with an import from a more generic, lower‑level module that is unlikely to depend back on OCI embedding. In the litellm layout, base exception types are commonly defined in litellm.llms.base_llm.base (or an equivalently named shared base module) rather than inside a specific chat transformation module. Therefore, we change line 25 to import BaseLLMException from litellm.llms.base_llm.base instead of from litellm.llms.base_llm.chat.transformation. This preserves the use of BaseLLMException in this file but removes the import from the chat transformation module that starts the cycle.

Concretely: in litellm/llms/oci/embed/transformation.py, replace the line from litellm.llms.base_llm.chat.transformation import BaseLLMException with from litellm.llms.base_llm.base import BaseLLMException. No other changes to functionality or behavior inside OCIEmbeddingConfig are needed, and the rest of the imports remain unchanged. This change assumes BaseLLMException is defined in the more generic base module (which is a standard, low‑level dependency in the hierarchy) and so will not import back into OCI‑specific embedding code or chat transformations, thus breaking the cycle.

Suggested changeset 1
litellm/llms/oci/embed/transformation.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/llms/oci/embed/transformation.py b/litellm/llms/oci/embed/transformation.py
--- a/litellm/llms/oci/embed/transformation.py
+++ b/litellm/llms/oci/embed/transformation.py
@@ -22,7 +22,7 @@
 import httpx
 
 from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj
-from litellm.llms.base_llm.chat.transformation import BaseLLMException
+from litellm.llms.base_llm.base import BaseLLMException
 from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig
 from litellm.llms.oci.chat.transformation import OCIChatConfig
 from litellm.llms.oci.common_utils import OCIError
EOF

from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj
from litellm.llms.base_llm.chat.transformation import BaseLLMException
from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig

Check notice

Code scanning / CodeQL

Cyclic import

Import of module [litellm.llms.base_llm.embedding.transformation](1) begins an import cycle.

Copilot Autofix


In general, cyclic imports between a base module and a provider-specific module are best resolved by removing the direct import from one side and instead decoupling via a lighter-weight interface (e.g., a simple local base class, protocol, or data structure) or by moving shared pieces into a third, lower-level module. Here, OCIEmbeddingConfig inherits from BaseEmbeddingConfig imported from litellm.llms.base_llm.embedding.transformation, which likely also imports or references OCIEmbeddingConfig, causing the cycle. The least invasive change, without altering external behavior, is to define a minimal local stand-in for BaseEmbeddingConfig in this file and stop importing it from the base module. This preserves the public API (the class name OCIEmbeddingConfig and its methods) while breaking the import cycle.

Concretely, in litellm/llms/oci/embed/transformation.py, remove the import from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig and instead define a lightweight local BaseEmbeddingConfig in this file that provides the attributes and methods OCIEmbeddingConfig actually uses. Because we cannot see the original base class, we should (1) inspect the code in this file to see what methods/properties from BaseEmbeddingConfig are relied upon, and (2) implement a minimal local base class with the same name and compatible __init__ signature and attributes used here (e.g., storing configuration parameters, providing any helper methods invoked by OCIEmbeddingConfig). Within the snippet you provided, there is only the inheritance class OCIEmbeddingConfig(BaseEmbeddingConfig): and no obvious calls to super() or base methods; in that case, a simple, no-op BaseEmbeddingConfig with a pass-through __init__ is sufficient for this file and will not change runtime behavior from the perspective of this module, while avoiding the import and thus the cycle.

Specifically:

  • Edit the import block around lines 24–30 to remove line 26’s import of BaseEmbeddingConfig.
  • Immediately below the imports (e.g., after line 30 or before the _INPUT_TYPE_MAP definition), add a local definition:
class BaseEmbeddingConfig:
    """Minimal local base class to avoid cyclic import with base_llm.embedding.transformation."""
    pass

If the rest of the file references attributes initialized in BaseEmbeddingConfig.__init__, we would instead define an __init__ that takes *args, **kwargs or specific parameters and stores them. Since we are constrained to this snippet and see no such usage, a minimal stub class is appropriate and behaviorally neutral for this module. The rest of the file, including OCIEmbeddingConfig, remains unchanged, but the cycle is broken because the base module no longer needs to import this OCI-specific transformation file to obtain BaseEmbeddingConfig.
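The "third, lower-level module" strategy mentioned above can also be sketched end to end. This toy demo (module names hypothetical, unrelated to litellm's actual layout) writes three layered modules to a temp directory: the shared base class lives in the lowest layer, so no module ever imports back down into a higher one and no cycle can form.

```python
import os
import sys
import tempfile
import textwrap

pkg_dir = tempfile.mkdtemp()

modules = {
    # Lowest layer: defines the shared base class, imports nothing above it.
    "embed_base.py": """\
        class BaseEmbeddingConfig:
            def get_provider(self) -> str:
                raise NotImplementedError
        """,
    # Provider layer: depends only on the base layer.
    "embed_oci.py": """\
        from embed_base import BaseEmbeddingConfig

        class OCIEmbeddingConfig(BaseEmbeddingConfig):
            def get_provider(self) -> str:
                return "oci"
        """,
    # Top layer: depends only on the layers below it.
    "embed_registry.py": """\
        from embed_oci import OCIEmbeddingConfig

        def make():
            return OCIEmbeddingConfig()
        """,
}
for name, body in modules.items():
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(textwrap.dedent(body))

sys.path.insert(0, pkg_dir)
import embed_registry

provider = embed_registry.make().get_provider()
print(provider)  # oci
```

Because each import edge points strictly downward, the dependency graph is a DAG and the cycle CodeQL flags cannot occur, without resorting to a behavior-duplicating stub.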

Suggested changeset 1
litellm/llms/oci/embed/transformation.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/llms/oci/embed/transformation.py b/litellm/llms/oci/embed/transformation.py
--- a/litellm/llms/oci/embed/transformation.py
+++ b/litellm/llms/oci/embed/transformation.py
@@ -23,12 +23,17 @@
 
 from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj
 from litellm.llms.base_llm.chat.transformation import BaseLLMException
-from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig
 from litellm.llms.oci.chat.transformation import OCIChatConfig
 from litellm.llms.oci.common_utils import OCIError
 from litellm.types.llms.openai import AllEmbeddingInputValues, AllMessageValues
 from litellm.types.utils import EmbeddingResponse, Usage
 
+
+class BaseEmbeddingConfig:
+    """Minimal local base class to avoid cyclic import with base_llm.embedding.transformation."""
+    pass
+
+
 # Input type mapping from OpenAI conventions to OCI/Cohere conventions
 _INPUT_TYPE_MAP = {
     "search_document": "SEARCH_DOCUMENT",
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
from litellm.llms.base_llm.chat.transformation import BaseLLMException
from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig
from litellm.llms.oci.chat.transformation import OCIChatConfig
from litellm.llms.oci.common_utils import OCIError

Check notice

Code scanning / CodeQL

Cyclic import

Import of module [litellm.llms.oci.common_utils](1) begins an import cycle.

Copilot Autofix

AI 16 days ago

To fix the problem, we should break the cycle by removing the direct import of OCIError from litellm.llms.oci.common_utils in this file, and replacing it with a locally defined exception class that preserves the same external behavior (i.e., callers can still catch and handle an OCIError-like exception). Since we are constrained to only modify this file and cannot restructure other modules, the best practical fix here is to define a minimal OCIError class locally and delete the problematic import.

Concretely:

  • Delete the line from litellm.llms.oci.common_utils import OCIError.
  • Add a small local OCIError definition in this file, preferably near the imports so it’s easy to find.
  • Ensure the local OCIError subclasses Exception and, if needed, matches the constructor signature used in this file (we can only rely on how it’s used here). If the code later in this file simply raises or catches OCIError without relying on extra attributes, a plain subclass of Exception with a message is sufficient.

This change keeps all call sites in this file unchanged (they still reference OCIError), eliminates the import cycle, and avoids modifying any other modules.

Suggested changeset 1
litellm/llms/oci/embed/transformation.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/llms/oci/embed/transformation.py b/litellm/llms/oci/embed/transformation.py
--- a/litellm/llms/oci/embed/transformation.py
+++ b/litellm/llms/oci/embed/transformation.py
@@ -25,10 +25,17 @@
 from litellm.llms.base_llm.chat.transformation import BaseLLMException
 from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig
 from litellm.llms.oci.chat.transformation import OCIChatConfig
-from litellm.llms.oci.common_utils import OCIError
 from litellm.types.llms.openai import AllEmbeddingInputValues, AllMessageValues
 from litellm.types.utils import EmbeddingResponse, Usage
 
+
+class OCIError(Exception):
+    """Local OCI error class to avoid circular dependency with common_utils."""
+
+    def __init__(self, message: str, *args: Any, **kwargs: Any) -> None:
+        super().__init__(message, *args)
+
+
 # Input type mapping from OpenAI conventions to OCI/Cohere conventions
 _INPUT_TYPE_MAP = {
     "search_document": "SEARCH_DOCUMENT",
EOF
from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig
from litellm.llms.oci.chat.transformation import OCIChatConfig
from litellm.llms.oci.common_utils import OCIError
from litellm.types.llms.openai import AllEmbeddingInputValues, AllMessageValues

Check notice

Code scanning / CodeQL

Cyclic import

Import of module [litellm.types.llms.openai](1) begins an import cycle.

Copilot Autofix

AI 16 days ago

In general, to fix a cyclic import you either (1) remove or refactor the problematic import, (2) move the needed definitions so that they live in a lower‑level module both sides can import, or (3) replace the import with an equivalent local definition (often for pure type aliases). Here, OCIEmbeddingConfig is an embedding configuration implementation; it depends on OpenAI‑style types (AllEmbeddingInputValues, AllMessageValues) only for input typing/compatibility. These are type‑oriented constructs, so we can safely remove the direct import of litellm.types.llms.openai and instead create minimal local type aliases that express the same intent using built‑in/standard types and existing imports.

Concretely, in litellm/llms/oci/embed/transformation.py:

  • Remove the line
    from litellm.types.llms.openai import AllEmbeddingInputValues, AllMessageValues.
  • Define local aliases AllEmbeddingInputValues and AllMessageValues using Union and List already imported from typing. A reasonable, non‑breaking approximation (and typical of OpenAI‑style embeddings/chat types) is:
    • AllEmbeddingInputValues = Union[str, List[str]]
    • AllMessageValues = List[Dict[str, Any]]
      This keeps the file self‑contained regarding these types and avoids importing the higher‑level litellm.types.llms.openai module, breaking the cycle.
  • Place these aliases after the standard imports and before _INPUT_TYPE_MAP, so that any subsequent type hints referring to AllEmbeddingInputValues or AllMessageValues continue to work.

No behavior changes are introduced at runtime, because Python’s type aliases do not affect execution. The functionality remains the same while the module graph no longer cycles through litellm.types.llms.openai.
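The claim that such aliases are runtime-neutral is easy to check: Python evaluates the alias expressions once but never enforces annotations at call time. A minimal sketch (these alias definitions are approximations of the OpenAI-style types, not copied from litellm):

```python
from typing import Any, Dict, List, Union

# Approximate local aliases; evaluated to typing objects at import time,
# never enforced when the annotated function runs.
AllEmbeddingInputValues = Union[str, List[str]]
AllMessageValues = List[Dict[str, Any]]

def count_inputs(inputs: AllEmbeddingInputValues) -> int:
    # Normalize a single string to a one-element batch, as embedding
    # endpoints commonly do.
    batch = [inputs] if isinstance(inputs, str) else list(inputs)
    return len(batch)

single = count_inputs("hello world")
batch = count_inputs(["a", "b", "c"])
print(single, batch)  # 1 3
```

Swapping the imported types for local aliases therefore changes the module graph but not any runtime code path.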

Suggested changeset 1
litellm/llms/oci/embed/transformation.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/llms/oci/embed/transformation.py b/litellm/llms/oci/embed/transformation.py
--- a/litellm/llms/oci/embed/transformation.py
+++ b/litellm/llms/oci/embed/transformation.py
@@ -26,9 +26,12 @@
 from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig
 from litellm.llms.oci.chat.transformation import OCIChatConfig
 from litellm.llms.oci.common_utils import OCIError
-from litellm.types.llms.openai import AllEmbeddingInputValues, AllMessageValues
 from litellm.types.utils import EmbeddingResponse, Usage
 
+# Local aliases for OpenAI-style input types to avoid importing litellm.types.llms.openai
+AllEmbeddingInputValues = Union[str, List[str]]
+AllMessageValues = List[Dict[str, Any]]
+
 # Input type mapping from OpenAI conventions to OCI/Cohere conventions
 _INPUT_TYPE_MAP = {
     "search_document": "SEARCH_DOCUMENT",
EOF
from litellm.llms.oci.chat.transformation import OCIChatConfig
from litellm.llms.oci.common_utils import OCIError
from litellm.types.llms.openai import AllEmbeddingInputValues, AllMessageValues
from litellm.types.utils import EmbeddingResponse, Usage

Check notice

Code scanning / CodeQL

Cyclic import

Import of module [litellm.types.utils](1) begins an import cycle.

Copilot Autofix

AI 16 days ago

In general, the way to fix this cyclic import is to break the dependency from litellm.llms.oci.embed.transformation to the module that participates in the cycle (litellm.types.utils). Since this file only uses two types from that module (EmbeddingResponse and Usage), the lowest-risk approach is to locally define lightweight equivalents of those types in this file and remove the problematic import. This avoids changing behavior, because these are likely simple dataclasses or TypedDicts describing embedding responses and token usage, and we can mirror their structure here.

Concretely, in litellm/llms/oci/embed/transformation.py, remove EmbeddingResponse, Usage from the from litellm.types.utils import ... import line. Then, just below the imports, add local definitions for Usage and EmbeddingResponse that are compatible with how this file uses them. Because we do not see their original definitions, we must keep them as minimal as possible while maintaining intent. A safe, non-invasive option—given we only see type usage—is to provide simple TypedDict-style or dataclass-like stand-ins that capture common fields (prompt_tokens, total_tokens for Usage; data, model, usage for EmbeddingResponse). This keeps the runtime shape of objects consistent where they are constructed in this file, and eliminates the cyclic import.

Suggested changeset 1
litellm/llms/oci/embed/transformation.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/llms/oci/embed/transformation.py b/litellm/llms/oci/embed/transformation.py
--- a/litellm/llms/oci/embed/transformation.py
+++ b/litellm/llms/oci/embed/transformation.py
@@ -27,8 +27,66 @@
 from litellm.llms.oci.chat.transformation import OCIChatConfig
 from litellm.llms.oci.common_utils import OCIError
 from litellm.types.llms.openai import AllEmbeddingInputValues, AllMessageValues
-from litellm.types.utils import EmbeddingResponse, Usage
 
+
+class Usage(Dict[str, Any]):
+    """
+    Minimal local stand-in for token usage information.
+
+    This mirrors the common LiteLLM Usage shape without importing litellm.types.utils
+    to avoid an import cycle.
+    """
+
+    @property
+    def prompt_tokens(self) -> int:
+        return int(self.get("prompt_tokens", 0))
+
+    @prompt_tokens.setter
+    def prompt_tokens(self, value: int) -> None:
+        self["prompt_tokens"] = value
+
+    @property
+    def total_tokens(self) -> int:
+        return int(self.get("total_tokens", 0))
+
+    @total_tokens.setter
+    def total_tokens(self, value: int) -> None:
+        self["total_tokens"] = value
+
+
+class EmbeddingResponse(Dict[str, Any]):
+    """
+    Minimal local stand-in for an embedding response object.
+
+    This keeps the response shape compatible with existing code while
+    avoiding importing litellm.types.utils, which participates in a cycle.
+    """
+
+    @property
+    def data(self) -> Any:
+        return self.get("data")
+
+    @data.setter
+    def data(self, value: Any) -> None:
+        self["data"] = value
+
+    @property
+    def model(self) -> Optional[str]:
+        return self.get("model")
+
+    @model.setter
+    def model(self, value: Optional[str]) -> None:
+        self["model"] = value
+
+    @property
+    def usage(self) -> Usage:
+        return self.get("usage", Usage())
+
+    @usage.setter
+    def usage(self, value: Usage) -> None:
+        self["usage"] = value
+
+
 # Input type mapping from OpenAI conventions to OCI/Cohere conventions
 _INPUT_TYPE_MAP = {
     "search_document": "SEARCH_DOCUMENT",
EOF
oci_signer = optional_params.get("oci_signer")
oci_region = optional_params.get("oci_region", "us-ashburn-1")

api_base = (

Check notice

Code scanning / CodeQL

Unused local variable

Variable api_base is not used.

Copilot Autofix

AI 16 days ago

In general, to fix an unused local variable you either (1) remove the variable and any dead assignments, provided they have no side effects, or (2) if the variable is intentionally unused, rename it to a conventional unused name (such as _ or one containing unused). Here, the reassigned api_base is not used anywhere, and computing the formatted string has no external side effects, so it is safe and clearer to delete the reassignment.

Concretely, in litellm/llms/oci/embed/transformation.py, inside validate_environment, remove the block:

119:         api_base = (
120:             api_base
121:             or f"https://inference.generativeai.{oci_region}.oci.oraclecloud.com"
122:         )

This removes the dead reassignment of the api_base parameter and eliminates the warning. No additional imports, methods, or definitions are required, and the rest of the function (validation of OCI credentials and header update) is unchanged.

Suggested changeset 1
litellm/llms/oci/embed/transformation.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/llms/oci/embed/transformation.py b/litellm/llms/oci/embed/transformation.py
--- a/litellm/llms/oci/embed/transformation.py
+++ b/litellm/llms/oci/embed/transformation.py
@@ -116,11 +116,6 @@
         oci_signer = optional_params.get("oci_signer")
         oci_region = optional_params.get("oci_region", "us-ashburn-1")
 
-        api_base = (
-            api_base
-            or f"https://inference.generativeai.{oci_region}.oci.oraclecloud.com"
-        )
-
         if oci_signer is None:
             oci_user = optional_params.get("oci_user")
             oci_fingerprint = optional_params.get("oci_fingerprint")
EOF
"Alternatively, provide an oci_signer object from the OCI SDK."
)

from litellm.llms.custom_httpx.http_handler import version

Check notice

Code scanning / CodeQL

Cyclic import

Import of module [litellm.llms.custom_httpx.http_handler](1) begins an import cycle. Import of module [http_handler](2) begins an import cycle.

Copilot Autofix

AI 16 days ago

In general, to fix a cyclic import you remove or refactor cross‑module references so that modules at the same level of abstraction do not depend on each other. Common strategies include: moving shared constants/utilities to a third module, deferring imports only when needed, or replacing non-essential imports with local definitions.

Here, transformation.py imports version from litellm.llms.custom_httpx.http_handler solely to build a User-Agent header. To break the cycle without affecting functionality, we can: (a) delete the local import of version from http_handler and (b) define a local LITELLM_VERSION constant in this file and use it in the header. Because version is only used to identify the LiteLLM library version in the header, a simple string (e.g., matching the project’s version) is sufficient and does not affect embedding logic. Concretely:

  • At the top of litellm/llms/oci/embed/transformation.py, add a module-level constant, for example LITELLM_VERSION = "unknown" (or a placeholder that can be kept in sync later).
  • In validate_and_get_oci_config, remove the line from litellm.llms.custom_httpx.http_handler import version.
  • Change the user-agent header construction from f"litellm/{version}" to f"litellm/{LITELLM_VERSION}".

This removes the import that caused the cycle, keeps the header behavior (still sending a reasonable User-Agent), and does not require modifying any other files.
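The "deferring imports only when needed" strategy mentioned above is another common way out of a cycle: moving the import into the function body delays resolution until call time, after both modules have finished initializing. A toy demo (module names hypothetical) built in a temp directory:

```python
import os
import sys
import tempfile
import textwrap

tmp = tempfile.mkdtemp()

# cyc_a imports cyc_b at the top level...
with open(os.path.join(tmp, "cyc_a.py"), "w") as f:
    f.write(textwrap.dedent("""\
        import cyc_b

        def greet():
            return "a->" + cyc_b.name()
        """))

# ...and cyc_b needs cyc_a, but imports it lazily inside the function,
# so no partially-initialized module is touched at import time.
with open(os.path.join(tmp, "cyc_b.py"), "w") as f:
    f.write(textwrap.dedent("""\
        def name():
            import cyc_a  # deferred: resolved at call time
            return "b+" + cyc_a.__name__
        """))

sys.path.insert(0, tmp)
import cyc_a

result = cyc_a.greet()
print(result)  # a->b+cyc_a
```

By the time name() runs, cyc_a is fully initialized in sys.modules, so the lookup succeeds even though the two modules reference each other.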

Suggested changeset 1
litellm/llms/oci/embed/transformation.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/llms/oci/embed/transformation.py b/litellm/llms/oci/embed/transformation.py
--- a/litellm/llms/oci/embed/transformation.py
+++ b/litellm/llms/oci/embed/transformation.py
@@ -29,6 +29,10 @@
 from litellm.types.llms.openai import AllEmbeddingInputValues, AllMessageValues
 from litellm.types.utils import EmbeddingResponse, Usage
 
+# Default LiteLLM version used in User-Agent header for OCI embedding requests.
+# This avoids importing from modules that may introduce circular dependencies.
+LITELLM_VERSION = "unknown"
+
 # Input type mapping from OpenAI conventions to OCI/Cohere conventions
 _INPUT_TYPE_MAP = {
     "search_document": "SEARCH_DOCUMENT",
@@ -142,12 +146,10 @@
                     "Alternatively, provide an oci_signer object from the OCI SDK."
                 )
 
-        from litellm.llms.custom_httpx.http_handler import version
-
         headers.update(
             {
                 "content-type": "application/json",
-                "user-agent": f"litellm/{version}",
+                "user-agent": f"litellm/{LITELLM_VERSION}",
             }
         )
 
EOF
… ordering issue

When test_proxy_cli.py tests run before test_check_migration.py in the same
xdist worker, litellm.proxy.db.check_migration is already in sys.modules.
Patching litellm._logging.verbose_logger has no effect on the already-bound
reference. Patch the correct target (check_migration.verbose_logger) and
import the module before patching so the order doesn't matter.
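The "patch where it's looked up" rule behind that fix can be demonstrated with toy modules (names hypothetical, standing in for litellm._logging and check_migration):

```python
import sys
import types
from unittest import mock

# toy_logging plays the role of litellm._logging; toy_consumer plays
# check_migration, which bound the logger name at import time.
toy_logging = types.ModuleType("toy_logging")
toy_logging.logger = "real-logger"
toy_consumer = types.ModuleType("toy_consumer")
toy_consumer.logger = toy_logging.logger  # already-bound reference
sys.modules["toy_logging"] = toy_logging
sys.modules["toy_consumer"] = toy_consumer

# Patching the origin module does NOT affect the already-bound reference...
with mock.patch.object(toy_logging, "logger", "patched"):
    seen_via_origin = toy_consumer.logger

# ...patching the consumer's own attribute does.
with mock.patch.object(toy_consumer, "logger", "patched"):
    seen_via_consumer = toy_consumer.logger

print(seen_via_origin, seen_via_consumer)  # real-logger patched
```

This is why the test must target check_migration.verbose_logger rather than litellm._logging.verbose_logger once the module is already in sys.modules.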
@ishaan-berri ishaan-berri temporarily deployed to integration-postgres April 4, 2026 20:55 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri temporarily deployed to integration-postgres April 4, 2026 20:55 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri temporarily deployed to integration-postgres April 4, 2026 20:55 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri temporarily deployed to integration-redis-postgres April 4, 2026 20:55 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri temporarily deployed to integration-postgres April 4, 2026 20:55 — with GitHub Actions Inactive
@ishaan-berri ishaan-berri changed the base branch from main to litellm_ishaan_march_30_2 April 4, 2026 21:24
@ishaan-berri ishaan-berri merged commit a469212 into litellm_ishaan_march_30_2 Apr 4, 2026
112 of 116 checks passed
@ishaan-berri ishaan-berri deleted the litellm_ishaan_march30 branch April 4, 2026 21:24
ishaan-berri added a commit that referenced this pull request Apr 4, 2026
* fix(pricing): add unversioned vertex_ai/claude-haiku-4-5 entry

Missing unversioned entry causes cost tracking to return $0.00 for
all requests using vertex_ai/claude-haiku-4-5. All other Vertex AI
Claude models have both versioned and unversioned entries.

* fix(router): skip misleading tags error when no candidates (e.g. cooldown)

Return early from get_deployments_for_tag when healthy_deployments is empty so
tag-based routing does not raise no_deployments_with_tag_routing after cooldown
filters all deployments. Adds regression test.

Made-with: Cursor

* feat(oci): add embedding support and update model catalog

- Add OCIEmbeddingConfig for OCI GenAI embedding models
- Add 16 new chat models (Cohere, Meta Llama, xAI Grok, Google Gemini)
- Add 8 embedding models (Cohere embed v3.0, v4.0)
- Update documentation with embedding examples
- Update pricing for all new models



* test(oci): add unit tests for OCI embedding support

- 17 unit tests covering OCIEmbeddingConfig
- Tests for URL generation, param mapping, request/response transform
- Tests for model pricing JSON completeness



* style(oci): format with black and ruff

* fix(oci): correct embedding request body format

OCI embedText API expects inputs, truncate, and inputType at the
top level of the request body, not nested under embedTextDetails.
Fixed transformation and updated tests accordingly.

Verified with real OCI API: 3/3 embedding models working.

* docs: clarify tag routing early return and test intent

Made-with: Cursor

* fix(oci): address code review findings from Greptile

- P1: Fix signing URL mismatch with custom api_base by accepting
  api_base parameter in transform_embedding_request
- P2: Remove encoding_format from supported params (OCI does not
  support it, was silently dropped)
- P2: Raise ValueError for token-array inputs instead of silently
  converting to string representation
- Add test for token-list rejection

* fix(mcp): add STS AssumeRole support for MCP SigV4 authentication

MCPSigV4Auth only supported static AWS credentials or the boto3 default
credential chain. Production Kubernetes environments typically authenticate
via IAM role assumption (sts:AssumeRole), which was not possible.

Add aws_role_name and aws_session_name parameters to the MCP SigV4 auth
stack. When aws_role_name is provided, MCPSigV4Auth calls sts:AssumeRole
to obtain temporary credentials before signing requests. Explicit keys,
if also provided, are used as the source identity for the STS call;
otherwise ambient credentials (pod role, instance profile) are used.

* fix: stop logging credential values and add missing redaction patterns

Replaces raw credential values in debug/error log messages with
boolean presence checks or type names. Adds PEM block, GCP token,
JWT, SAS token, and service-account blob patterns to the redaction
filter. Fixes private_key pattern to capture full PEM blocks instead
of stopping at the first whitespace.

Addresses: Vertex AI credential JSON (including RSA private key)
being logged to stderr on health check failures.

* fix: log only field names for UserAPIKeyAuth, not full object

* style: apply black formatting to experimental_mcp_client/client.py

* style: fix black/isort formatting and mypy error in proxy_server.py

- Fix black formatting in experimental_mcp_client/client.py (done in prev commit)
- Fix black/isort formatting in key_management_endpoints.py, proxy_server.py, transformation.py
- Fix mypy: iterate over optional list safely (access_group_ids or []) in proxy_server.py

* fix(test): patch check_migration.verbose_logger directly to fix xdist ordering issue

When test_proxy_cli.py tests run before test_check_migration.py in the same
xdist worker, litellm.proxy.db.check_migration is already in sys.modules.
Patching litellm._logging.verbose_logger has no effect on the already-bound
reference. Patch the correct target (check_migration.verbose_logger) and
import the module before patching so the order doesn't matter.

* fix(mypy): make api_base Optional in PydanticAIProviderConfig to match base class signature

---------

Co-authored-by: Ihsan Soydemir <soydemir.ihsan@gmail.com>
Co-authored-by: Milan <milan@berri.ai>
Co-authored-by: Daniel Gandolfi <danielgandolfi@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: michelligabriele <gabriele.michelli@icloud.com>
Co-authored-by: user <70670632+stuxf@users.noreply.github.com>
Co-authored-by: Ishaan Jaffer <ishaanjaffer0324@gmail.com>
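The redaction change described in the log-credentials commit above ("capture full PEM blocks instead of stopping at the first whitespace") comes down to making the pattern span whitespace. A hypothetical sketch, not litellm's actual filter:

```python
import re

# Hypothetical redaction patterns. The naive pattern stops at the first
# whitespace; the DOTALL pattern swallows the whole multi-line PEM block.
NAIVE = re.compile(r"-----BEGIN [A-Z ]+-----\S*")
FULL_PEM = re.compile(
    r"-----BEGIN [A-Z ]+-----.*?-----END [A-Z ]+-----", re.DOTALL
)

log_line = (
    "health check failed, creds: "
    "-----BEGIN RSA PRIVATE KEY-----\nMIIEowIBAAKC...\n-----END RSA PRIVATE KEY-----"
)

leaky = NAIVE.sub("[REDACTED]", log_line)
clean = FULL_PEM.sub("[REDACTED]", log_line)
print("MIIE" in leaky, "MIIE" in clean)  # True False
```

With the naive pattern the key body still leaks after the header; the DOTALL variant redacts the full block, which is the behavior the commit describes.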
fede-kamel pushed a commit to fede-kamel/litellm that referenced this pull request Apr 5, 2026
* fix(pricing): add unversioned vertex_ai/claude-haiku-4-5 entry

Missing unversioned entry causes cost tracking to return $0.00 for
all requests using vertex_ai/claude-haiku-4-5. All other Vertex AI
Claude models have both versioned and unversioned entries.

* fix(router): skip misleading tags error when no candidates (e.g. cooldown)

Return early from get_deployments_for_tag when healthy_deployments is empty so
tag-based routing does not raise no_deployments_with_tag_routing after cooldown
filters all deployments. Adds regression test.

Made-with: Cursor

* feat(oci): add embedding support and update model catalog

- Add OCIEmbeddingConfig for OCI GenAI embedding models
- Add 16 new chat models (Cohere, Meta Llama, xAI Grok, Google Gemini)
- Add 8 embedding models (Cohere embed v3.0, v4.0)
- Update documentation with embedding examples
- Update pricing for all new models

* test(oci): add unit tests for OCI embedding support

- 17 unit tests covering OCIEmbeddingConfig
- Tests for URL generation, param mapping, request/response transform
- Tests for model pricing JSON completeness

* style(oci): format with black and ruff

* fix(oci): correct embedding request body format

OCI embedText API expects inputs, truncate, and inputType at the
top level of the request body, not nested under embedTextDetails.
Fixed transformation and updated tests accordingly.

Verified with real OCI API: 3/3 embedding models working.

* docs: clarify tag routing early return and test intent

Made-with: Cursor

* fix(oci): address code review findings from Greptile

- P1: Fix signing URL mismatch with custom api_base by accepting
  api_base parameter in transform_embedding_request
- P2: Remove encoding_format from supported params (OCI does not
  support it, was silently dropped)
- P2: Raise ValueError for token-array inputs instead of silently
  converting to string representation
- Add test for token-list rejection

* fix(mcp): add STS AssumeRole support for MCP SigV4 authentication

MCPSigV4Auth only supported static AWS credentials or the boto3 default
credential chain. Production Kubernetes environments typically authenticate
via IAM role assumption (sts:AssumeRole), which was not possible.

Add aws_role_name and aws_session_name parameters to the MCP SigV4 auth
stack. When aws_role_name is provided, MCPSigV4Auth calls sts:AssumeRole
to obtain temporary credentials before signing requests. Explicit keys,
if also provided, are used as the source identity for the STS call;
otherwise ambient credentials (pod role, instance profile) are used.

* fix: stop logging credential values and add missing redaction patterns

Replaces raw credential values in debug/error log messages with
boolean presence checks or type names. Adds PEM block, GCP token,
JWT, SAS token, and service-account blob patterns to the redaction
filter. Fixes private_key pattern to capture full PEM blocks instead
of stopping at the first whitespace.

Addresses: Vertex AI credential JSON (including RSA private key)
being logged to stderr on health check failures.

* fix: log only field names for UserAPIKeyAuth, not full object

* style: apply black formatting to experimental_mcp_client/client.py

* style: fix black/isort formatting and mypy error in proxy_server.py

- Fix black formatting in experimental_mcp_client/client.py (done in prev commit)
- Fix black/isort formatting in key_management_endpoints.py, proxy_server.py, transformation.py
- Fix mypy: iterate over optional list safely (access_group_ids or []) in proxy_server.py

* fix(test): patch check_migration.verbose_logger directly to fix xdist ordering issue

When test_proxy_cli.py tests run before test_check_migration.py in the same
xdist worker, litellm.proxy.db.check_migration is already in sys.modules.
Patching litellm._logging.verbose_logger has no effect on the already-bound
reference. Patch the correct target (check_migration.verbose_logger) and
import the module before patching so the order doesn't matter.
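The underlying rule is "patch where the name is used, not where it is defined." A self-contained demonstration with throwaway modules (`demo_a`/`demo_b` are invented stand-ins for `litellm._logging` and `litellm.proxy.db.check_migration`):

```python
import sys
import types
from unittest import mock

# demo_a plays the role of litellm._logging: it defines the name.
a = types.ModuleType("demo_a")
a.value = "real"
sys.modules["demo_a"] = a

# demo_b plays the role of check_migration: it binds the name at import time
# via `from demo_a import value`, so it holds its own reference.
b = types.ModuleType("demo_b")
exec("from demo_a import value\ndef read():\n    return value", b.__dict__)
sys.modules["demo_b"] = b

# Patching the defining module does NOT affect the already-bound reference:
with mock.patch("demo_a.value", "patched"):
    assert b.read() == "real"

# Patching the using module (the correct target) works:
with mock.patch("demo_b.value", "patched"):
    assert b.read() == "patched"
```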

* fix(mypy): make api_base Optional in PydanticAIProviderConfig to match base class signature
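This is the usual Liskov-substitution constraint mypy enforces: an override may not narrow a parameter type the base class declares. A minimal sketch with hypothetical method and class bodies (only the class names come from the commit):

```python
from typing import Optional


class BaseProviderConfig:
    def get_complete_url(self, api_base: Optional[str]) -> str:
        raise NotImplementedError


class PydanticAIProviderConfig(BaseProviderConfig):
    # Declaring `api_base: str` here would narrow the base signature and
    # fail mypy; Optional[str] keeps the override substitutable.
    def get_complete_url(self, api_base: Optional[str]) -> str:
        return api_base or "https://example.invalid"
```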

---------

Co-authored-by: Ihsan Soydemir <soydemir.ihsan@gmail.com>
Co-authored-by: Milan <milan@berri.ai>
Co-authored-by: Daniel Gandolfi <danielgandolfi@gmail.com>
Co-authored-by: michelligabriele <gabriele.michelli@icloud.com>
Co-authored-by: user <70670632+stuxf@users.noreply.github.com>
Co-authored-by: Ishaan Jaffer <ishaanjaffer0324@gmail.com>
harish876 pushed a commit to harish876/litellm that referenced this pull request Apr 8, 2026