Skip to content

feat(oci): add embedding support, new models, and updated docs#24808

Merged
ishaan-berri merged 5 commits intoBerriAI:litellm_ishaan_march30from
danielgandolfi1984:feat/oci-embedding-and-models-update
Mar 30, 2026
Merged

feat(oci): add embedding support, new models, and updated docs#24808
ishaan-berri merged 5 commits intoBerriAI:litellm_ishaan_march30from
danielgandolfi1984:feat/oci-embedding-and-models-update

Conversation

@danielgandolfi1984
Copy link
Copy Markdown
Contributor

Summary

Comprehensive update to the OCI (Oracle Cloud Infrastructure) Generative AI provider:

  • New: Embedding supportOCIEmbeddingConfig for OCI GenAI embedding models, reusing the existing OCIChatConfig signing logic
  • New: 16 chat models — Cohere Command-A/R variants, Meta Llama 3.1-3.3, xAI Grok 4.x/code, Google Gemini 2.5 via OCI
  • New: 8 embedding models — Cohere embed-english/multilingual v3.0 (regular + light + image), embed-v4.0
  • Updated documentation with embedding examples, dedicated endpoints, all auth methods
  • Updated pricing for all 24 new models

New Files

  • litellm/llms/oci/embed/__init__.py
  • litellm/llms/oci/embed/transformation.pyOCIEmbeddingConfig
  • tests/test_litellm/llms/oci/embed/__init__.py
  • tests/test_litellm/llms/oci/embed/test_oci_embedding.py — 18 unit tests

Modified Files

  • litellm/__init__.py — Import OCIEmbeddingConfig
  • litellm/utils.py — Register in ProviderConfigManager
  • litellm/main.py — Dispatch embedding calls to OCI handler
  • litellm/llms/oci/chat/transformation.py — Updated docstring for get_vendor_from_model()
  • model_prices_and_context_window.json — 24 new model entries
  • litellm/model_prices_and_context_window_backup.json — Backup updated
  • docs/my-website/docs/providers/oci.md — Embedding docs, new models list, dedicated endpoints

Embedding Usage

from litellm import embedding

response = embedding(
    model="oci/cohere.embed-english-v3.0",
    input=["Hello world", "Goodbye world"],
    oci_signer=signer,
    oci_region="us-chicago-1",
    oci_compartment_id="<compartment_id>",
)

Integration Test Results (us-chicago-1)

Tested with real OCI credentials against new models:

Type Passed Details
Chat (new models) 4/16 grok-code-fast-1, command-r-08-2024, command-r-plus-08-2024, gemini-2.5-flash-lite
Chat (existing models) 5/5 command-a-03-2025, llama-3.3-70b, llama-4-scout, grok-3-mini-fast, gemini-2.5-flash-lite
Embedding (new) 5/8 embed-english-v3.0, embed-english-light-v3.0, embed-multilingual-v3.0, embed-multilingual-light-v3.0, embed-v4.0

Note: 404 failures are region-availability issues (models not yet deployed in us-chicago-1), not code bugs. Gemini pro/flash have response parsing differences that may need a follow-up.

Test plan

  • 18 unit tests for OCIEmbeddingConfig (URL, params, request/response transform, pricing JSON)
  • Existing OCI chat tests unaffected
  • Integration tested with real OCI GenAI API
  • Formatted with black, linted with ruff
  • No credentials or secrets in commits

@vercel
Copy link
Copy Markdown

vercel Bot commented Mar 30, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
litellm Ready Ready Preview, Comment Mar 30, 2026 7:48pm

Request Review

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Mar 30, 2026

CLA assistant check
All committers have signed the CLA.

@codspeed-hq
Copy link
Copy Markdown
Contributor

codspeed-hq Bot commented Mar 30, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing danielgandolfi1984:feat/oci-embedding-and-models-update (c6f8595) with main (5cec43c)

Open in CodSpeed

danielgandolfi1984 and others added 4 commits March 30, 2026 19:35
- Add OCIEmbeddingConfig for OCI GenAI embedding models
- Add 16 new chat models (Cohere, Meta Llama, xAI Grok, Google Gemini)
- Add 8 embedding models (Cohere embed v3.0, v4.0)
- Update documentation with embedding examples
- Update pricing for all new models

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- 17 unit tests covering OCIEmbeddingConfig
- Tests for URL generation, param mapping, request/response transform
- Tests for model pricing JSON completeness

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OCI embedText API expects inputs, truncate, and inputType at the
top level of the request body, not nested under embedTextDetails.
Fixed transformation and updated tests accordingly.

Verified with real OCI API: 3/3 embedding models working.
@danielgandolfi1984 danielgandolfi1984 force-pushed the feat/oci-embedding-and-models-update branch from cc28505 to 7840f24 Compare March 30, 2026 19:36

import httpx

from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj

Check notice

Code scanning / CodeQL

Cyclic import Note

Import of module
litellm.litellm_core_utils.litellm_logging
begins an import cycle.
import httpx

from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj
from litellm.llms.base_llm.chat.transformation import BaseLLMException

Check notice

Code scanning / CodeQL

Cyclic import Note

Import of module
litellm.llms.base_llm.chat.transformation
begins an import cycle.

from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj
from litellm.llms.base_llm.chat.transformation import BaseLLMException
from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig

Check notice

Code scanning / CodeQL

Cyclic import Note

Import of module
litellm.llms.base_llm.embedding.transformation
begins an import cycle.
from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj
from litellm.llms.base_llm.chat.transformation import BaseLLMException
from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig
from litellm.llms.oci.chat.transformation import OCIChatConfig

Check notice

Code scanning / CodeQL

Cyclic import Note

Import of module
litellm.llms.oci.chat.transformation
begins an import cycle.
from litellm.llms.base_llm.chat.transformation import BaseLLMException
from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig
from litellm.llms.oci.chat.transformation import OCIChatConfig
from litellm.llms.oci.common_utils import OCIError

Check notice

Code scanning / CodeQL

Cyclic import Note

Import of module
litellm.llms.oci.common_utils
begins an import cycle.
from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig
from litellm.llms.oci.chat.transformation import OCIChatConfig
from litellm.llms.oci.common_utils import OCIError
from litellm.types.llms.openai import AllEmbeddingInputValues, AllMessageValues

Check notice

Code scanning / CodeQL

Cyclic import Note

Import of module
litellm.types.llms.openai
begins an import cycle.
from litellm.llms.oci.chat.transformation import OCIChatConfig
from litellm.llms.oci.common_utils import OCIError
from litellm.types.llms.openai import AllEmbeddingInputValues, AllMessageValues
from litellm.types.utils import EmbeddingResponse, Usage

Check notice

Code scanning / CodeQL

Cyclic import Note

Import of module
litellm.types.utils
begins an import cycle.
oci_signer = optional_params.get("oci_signer")
oci_region = optional_params.get("oci_region", "us-ashburn-1")

api_base = (

Check notice

Code scanning / CodeQL

Unused local variable Note

Variable api_base is not used.
"Alternatively, provide an oci_signer object from the OCI SDK."
)

from litellm.llms.custom_httpx.http_handler import version

Check notice

Code scanning / CodeQL

Cyclic import Note

Import of module
litellm.llms.custom_httpx.http_handler
begins an import cycle.
Import of module
http_handler
begins an import cycle.
Comment thread litellm/utils.py
elif litellm.LlmProviders.PERPLEXITY == provider:
return litellm.PerplexityEmbeddingConfig()
elif litellm.LlmProviders.OCI == provider:
from litellm.llms.oci.embed.transformation import OCIEmbeddingConfig

Check notice

Code scanning / CodeQL

Cyclic import Note

Import of module
litellm.llms.oci.embed.transformation
begins an import cycle.
Import of module
embed.transformation
begins an import cycle.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 30, 2026

Greptile Summary

This PR adds OCI Generative AI embedding support (OCIEmbeddingConfig) plus 24 new model entries (8 embedding, 16 chat) to the pricing JSON. The integration follows established patterns — the new config class reuses OCIChatConfig's signing logic via delegation, registers in ProviderConfigManager, dispatches through base_llm_http_handler.embedding() in main.py, and is covered by 18 mock-only unit tests.

Key changes:

  • litellm/llms/oci/embed/transformation.py: New OCIEmbeddingConfig implementing BaseEmbeddingConfig — URL construction, request/response transformation, credential validation, and signing delegation to OCIChatConfig. Token-array inputs now raise ValueError instead of silently mangling them.
  • litellm/main.py: OCI dispatch block in embedding(), consistent with other base_llm_http_handler-backed providers.
  • litellm/utils.py: ProviderConfigManager.get_provider_embedding_config extended to return OCIEmbeddingConfig for the OCI provider.
  • model_prices_and_context_window.json: 8 embedding models with accurate output_vector_size / mode: embedding, 16 chat models with correct serving metadata.
  • One notable issue: dimensions is declared in get_supported_openai_params and mapped through map_openai_params, but is never written into the request_data dict that is sent to OCI. Callers who pass dimensions will have it silently ignored. Since OCI's embedText API does not accept a dimensions field, this parameter should either be removed from the supported list or clearly documented as a no-op placeholder for future use.

Confidence Score: 4/5

Safe to merge for default OCI usage; a custom api_base (noted in prior threads) and a silent dimensions drop are the remaining rough edges.

The core embedding flow — URL resolution, request signing, request/response transformation, and error handling — is correct and well-tested for the default (no custom api_base) path. The only new finding in this review is P2: dimensions is advertised as a supported param but silently dropped before it reaches the OCI request body. The previously-noted signing-URL mismatch for custom api_base (prior threads) remains unaddressed in the current code. Together these keep the score at 4/5 rather than 5.

litellm/llms/oci/embed/transformation.py — dimensions handling and the api_base signing alignment

Important Files Changed

Filename Overview
litellm/llms/oci/embed/transformation.py New OCIEmbeddingConfig with correct signing delegation, request/response transforms, and token-array error handling; dimensions is declared supported but silently dropped from the request body
litellm/main.py Adds OCI embedding dispatch block using base_llm_http_handler; consistent with other provider patterns in the same file
litellm/utils.py Registers OCIEmbeddingConfig in ProviderConfigManager via a lazy import; clean integration alongside existing providers
tests/test_litellm/llms/oci/embed/test_oci_embedding.py 18 well-structured unit tests with mocked signing; no real network calls; covers URL construction, request/response transforms, error cases, and pricing JSON validation
model_prices_and_context_window.json Adds 24 new entries (8 embedding models with correct mode/vector size, 16 chat models); also adds supports_vision to oci/meta.llama-3.2-11b-vision-instruct
litellm/llms/oci/chat/transformation.py Minor docstring improvement to get_vendor_from_model; no logic changes

Sequence Diagram

sequenceDiagram
    participant Caller
    participant litellm_main as litellm.embedding()
    participant handler as base_llm_http_handler
    participant cfg as OCIEmbeddingConfig
    participant chat_cfg as OCIChatConfig (signing)
    participant OCI as OCI GenAI API

    Caller->>litellm_main: embedding(model="oci/...", input=[...], oci_signer=...)
    litellm_main->>litellm_main: get_llm_provider() strips "oci/" prefix
    litellm_main->>handler: embedding(model, input, optional_params, ...)
    handler->>cfg: get_provider_embedding_config()
    handler->>cfg: validate_environment(headers, optional_params)
    cfg-->>handler: headers with content-type / user-agent
    handler->>cfg: get_complete_url(api_base, oci_region)
    cfg-->>handler: https://inference.generativeai.{region}.oci.oraclecloud.com/.../embedText
    handler->>cfg: transform_embedding_request(model, input, optional_params, headers)
    cfg->>cfg: build request_data (compartmentId, servingMode, inputs, truncate)
    cfg->>chat_cfg: sign_request(headers, optional_params, request_data, url)
    chat_cfg-->>cfg: signed_headers
    cfg-->>handler: request_data dict
    handler->>OCI: POST /embedText with signed headers + request_data
    OCI-->>handler: {embeddings, modelId, inputTextTokenCounts}
    handler->>cfg: transform_embedding_response(raw_response, model_response)
    cfg-->>handler: EmbeddingResponse (data, usage)
    handler-->>litellm_main: EmbeddingResponse
    litellm_main-->>Caller: EmbeddingResponse
Loading

Reviews (2): Last reviewed commit: "fix(oci): address code review findings f..." | Re-trigger Greptile

Comment thread litellm/llms/oci/embed/transformation.py Outdated
Comment thread litellm/llms/oci/embed/transformation.py
Comment thread litellm/llms/oci/embed/transformation.py
- P1: Fix signing URL mismatch with custom api_base by accepting
  api_base parameter in transform_embedding_request
- P2: Remove encoding_format from supported params (OCI does not
  support it, was silently dropped)
- P2: Raise ValueError for token-array inputs instead of silently
  converting to string representation
- Add test for token-list rejection
@danielgandolfi1984
Copy link
Copy Markdown
Contributor Author

@ishaan-jaff @krrishdholakia This PR is ready for review. It adds OCI embedding support, 24 new models, docs, and 18 unit tests. All relevant CI checks pass — the failing checks (lint, proxy-core, proxy-auth, proxy-infra, CodeQL) are pre-existing on main.

@danielgandolfi1984
Copy link
Copy Markdown
Contributor Author

@ishaan-jaff @krrishdholakia This PR is ready for review.

What it does: Adds OCI GenAI embedding support (OCIEmbeddingConfig), 24 new model entries (16 chat + 8 embedding), updated documentation, and 18 unit tests. All code review findings from Greptile have been addressed in the latest commit.

CI status: All relevant checks pass (49/49 successful including all Required checks). The failing checks (lint, proxy-core, proxy-auth, proxy-infra, CodeQL, zizmor) are pre-existing on main and unrelated to this PR.

Happy to address any feedback. Thanks!

Copy link
Copy Markdown
Contributor

@ishaan-berri ishaan-berri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ishaan-berri ishaan-berri changed the base branch from main to litellm_ishaan_march30 March 30, 2026 20:09
@ishaan-berri ishaan-berri merged commit ace883e into BerriAI:litellm_ishaan_march30 Mar 30, 2026
51 of 59 checks passed
@danielgandolfi1984
Copy link
Copy Markdown
Contributor Author

Hi there @ishaan-berri . Quick follow up fix on #24816

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants