feat(oci): add embedding support, new models, and updated docs #24808
Conversation
- Add `OCIEmbeddingConfig` for OCI GenAI embedding models
- Add 16 new chat models (Cohere, Meta Llama, xAI Grok, Google Gemini)
- Add 8 embedding models (Cohere embed v3.0, v4.0)
- Update documentation with embedding examples
- Update pricing for all new models

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- 17 unit tests covering `OCIEmbeddingConfig`
- Tests for URL generation, param mapping, request/response transform
- Tests for model pricing JSON completeness

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The OCI embedText API expects `inputs`, `truncate`, and `inputType` at the top level of the request body, not nested under `embedTextDetails`. Fixed the transformation and updated the tests accordingly. Verified against the real OCI API: 3/3 embedding models working.
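To make the fix concrete, here is a minimal sketch of the corrected body shape. The `servingMode`, `truncate`, and `inputType` values are illustrative defaults, not necessarily what the PR sends:

```python
def build_embed_text_body(model: str, inputs: list, compartment_id: str) -> dict:
    """Build an OCI embedText request body with `inputs`, `truncate`, and
    `inputType` at the top level (not nested under `embedTextDetails`)."""
    return {
        "compartmentId": compartment_id,
        "servingMode": {"servingType": "ON_DEMAND", "modelId": model},
        "inputs": inputs,                # top level, per the fix above
        "truncate": "END",               # illustrative default
        "inputType": "SEARCH_DOCUMENT",  # illustrative default
    }
```

Note there is no `embedTextDetails` wrapper anywhere in the body.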
Force-pushed from cc28505 to 7840f24
Check notice (Code scanning / CodeQL): Cyclic import, raised once per `litellm` import in the new embed module:

```python
import httpx

from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj
from litellm.llms.base_llm.chat.transformation import BaseLLMException
from litellm.llms.base_llm.embedding.transformation import BaseEmbeddingConfig
from litellm.llms.oci.chat.transformation import OCIChatConfig
from litellm.llms.oci.common_utils import OCIError
from litellm.types.llms.openai import AllEmbeddingInputValues, AllMessageValues
from litellm.types.utils import EmbeddingResponse, Usage
```
Check notice (Code scanning / CodeQL): Unused local variable, on `api_base`:

```python
oci_signer = optional_params.get("oci_signer")
oci_region = optional_params.get("oci_region", "us-ashburn-1")

api_base = (
```
Check notice (Code scanning / CodeQL): Cyclic import:

```python
    "Alternatively, provide an oci_signer object from the OCI SDK."
)

from litellm.llms.custom_httpx.http_handler import version
```
Check notice (Code scanning / CodeQL): Cyclic import:

```python
elif litellm.LlmProviders.PERPLEXITY == provider:
    return litellm.PerplexityEmbeddingConfig()
elif litellm.LlmProviders.OCI == provider:
    from litellm.llms.oci.embed.transformation import OCIEmbeddingConfig
```
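The deferred import in the diff above is the usual way to break such cycles: the import runs at call time rather than at module load time. A minimal sketch of the pattern (the function name mirrors the diff, but this is not the actual litellm code):

```python
def get_provider_embedding_config(provider: str):
    """Dispatch to a provider's embedding config. The provider import is
    deferred into the function body so module load order no longer forms
    a cycle (the situation CodeQL's note is pointing at)."""
    if provider == "oci":
        # Deferred import: resolved when the function is called,
        # not when this module is first imported.
        from litellm.llms.oci.embed.transformation import OCIEmbeddingConfig
        return OCIEmbeddingConfig()
    raise ValueError(f"unsupported provider: {provider}")
```

CodeQL still flags the cycle as a note because the modules do reference each other, but the deferral keeps import time safe.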
Greptile Summary

This PR adds OCI Generative AI embedding support via `OCIEmbeddingConfig`.
Confidence Score: 4/5

Safe to merge for default OCI usage; a custom api_base (noted in prior threads) and a silent `dimensions` drop are the remaining rough edges. The core embedding flow (URL resolution, request signing, request/response transformation, and error handling) is correct and well-tested for the default path with no custom api_base. The only new finding in this review is a P2 in `litellm/llms/oci/embed/transformation.py`: `dimensions` handling and api_base signing alignment.
| Filename | Overview |
|---|---|
| litellm/llms/oci/embed/transformation.py | New OCIEmbeddingConfig with correct signing delegation, request/response transforms, and token-array error handling; dimensions is declared supported but silently dropped from the request body |
| litellm/main.py | Adds OCI embedding dispatch block using base_llm_http_handler; consistent with other provider patterns in the same file |
| litellm/utils.py | Registers OCIEmbeddingConfig in ProviderConfigManager via a lazy import; clean integration alongside existing providers |
| tests/test_litellm/llms/oci/embed/test_oci_embedding.py | 18 well-structured unit tests with mocked signing; no real network calls; covers URL construction, request/response transforms, error cases, and pricing JSON validation |
| model_prices_and_context_window.json | Adds 24 new entries (8 embedding models with correct mode/vector size, 16 chat models); also adds supports_vision to oci/meta.llama-3.2-11b-vision-instruct |
| litellm/llms/oci/chat/transformation.py | Minor docstring improvement to get_vendor_from_model; no logic changes |
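One way to surface the silent `dimensions` drop noted for transformation.py is to reject unsupported params in the OpenAI-param mapping step unless the caller opts into dropping them. A sketch with an illustrative supported-param set (not the PR's actual mapping):

```python
SUPPORTED_OCI_EMBEDDING_PARAMS = {"truncate", "input_type"}  # illustrative set

def map_openai_params(non_default_params: dict, drop_params: bool = False) -> dict:
    """Map OpenAI-style embedding params onto OCI ones, raising on anything
    unsupported (e.g. `dimensions`) instead of silently dropping it."""
    mapped = {}
    for key, value in non_default_params.items():
        if key in SUPPORTED_OCI_EMBEDDING_PARAMS:
            mapped[key] = value
        elif not drop_params:
            raise ValueError(f"{key} is not supported by the OCI embedText API")
    return mapped
```

With `drop_params=True` the behavior matches the current silent drop; the default makes the limitation visible to callers.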
Sequence Diagram
```mermaid
sequenceDiagram
    participant Caller
    participant litellm_main as litellm.embedding()
    participant handler as base_llm_http_handler
    participant cfg as OCIEmbeddingConfig
    participant chat_cfg as OCIChatConfig (signing)
    participant OCI as OCI GenAI API
    Caller->>litellm_main: embedding(model="oci/...", input=[...], oci_signer=...)
    litellm_main->>litellm_main: get_llm_provider() strips "oci/" prefix
    litellm_main->>handler: embedding(model, input, optional_params, ...)
    handler->>cfg: get_provider_embedding_config()
    handler->>cfg: validate_environment(headers, optional_params)
    cfg-->>handler: headers with content-type / user-agent
    handler->>cfg: get_complete_url(api_base, oci_region)
    cfg-->>handler: https://inference.generativeai.{region}.oci.oraclecloud.com/.../embedText
    handler->>cfg: transform_embedding_request(model, input, optional_params, headers)
    cfg->>cfg: build request_data (compartmentId, servingMode, inputs, truncate)
    cfg->>chat_cfg: sign_request(headers, optional_params, request_data, url)
    chat_cfg-->>cfg: signed_headers
    cfg-->>handler: request_data dict
    handler->>OCI: POST /embedText with signed headers + request_data
    OCI-->>handler: {embeddings, modelId, inputTextTokenCounts}
    handler->>cfg: transform_embedding_response(raw_response, model_response)
    cfg-->>handler: EmbeddingResponse (data, usage)
    handler-->>litellm_main: EmbeddingResponse
    litellm_main-->>Caller: EmbeddingResponse
```
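The response-shaping step at the end of the diagram can be sketched roughly as follows, with plain dicts standing in for litellm's `EmbeddingResponse`/`Usage` types:

```python
def transform_embedding_response(raw: dict, model: str) -> dict:
    """Shape an OCI embedText response ({embeddings, modelId,
    inputTextTokenCounts}) into an OpenAI-style embedding response."""
    data = [
        {"object": "embedding", "index": i, "embedding": vec}
        for i, vec in enumerate(raw["embeddings"])
    ]
    # OCI reports per-input token counts; OpenAI usage wants a total.
    prompt_tokens = sum(raw.get("inputTextTokenCounts", []))
    return {
        "object": "list",
        "data": data,
        "model": raw.get("modelId", model),
        "usage": {"prompt_tokens": prompt_tokens, "total_tokens": prompt_tokens},
    }
```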
Reviews (2). Last reviewed commit: "fix(oci): address code review findings f..."
- P1: Fix signing URL mismatch with custom api_base by accepting an api_base parameter in transform_embedding_request
- P2: Remove encoding_format from supported params (OCI does not support it; it was silently dropped)
- P2: Raise ValueError for token-array inputs instead of silently converting them to a string representation
- Add test for token-list rejection
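The token-array rejection could look roughly like this (helper name and exact message are hypothetical):

```python
def normalize_embedding_input(inputs):
    """Accept a string or a list of strings; reject token arrays (lists of
    ints) rather than silently stringifying them, per the P2 fix above."""
    if isinstance(inputs, str):
        return [inputs]
    if isinstance(inputs, list) and all(isinstance(i, str) for i in inputs):
        return inputs
    raise ValueError("OCI embedText accepts only string inputs, not token arrays")
```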
@ishaan-jaff @krrishdholakia This PR is ready for review. It adds OCI embedding support, 24 new models, docs, and 18 unit tests. All relevant CI checks pass; the failing checks (lint, proxy-core, proxy-auth, proxy-infra, CodeQL) are pre-existing on main.
@ishaan-jaff @krrishdholakia This PR is ready for review.

What it does: Adds OCI GenAI embedding support.

CI status: All relevant checks pass (49/49 successful, including all Required checks). The failing checks (lint, proxy-core, proxy-auth, proxy-infra, CodeQL, zizmor) are pre-existing on main and unrelated to this PR.

Happy to address any feedback. Thanks!
Merged ace883e into BerriAI:litellm_ishaan_march30
Hi there @ishaan-berri. Quick follow-up fix in #24816
Summary
Comprehensive update to the OCI (Oracle Cloud Infrastructure) Generative AI provider:
- `OCIEmbeddingConfig` for OCI GenAI embedding models, reusing the existing `OCIChatConfig` signing logic

New Files
- `litellm/llms/oci/embed/__init__.py`
- `litellm/llms/oci/embed/transformation.py` — `OCIEmbeddingConfig`
- `tests/test_litellm/llms/oci/embed/__init__.py`
- `tests/test_litellm/llms/oci/embed/test_oci_embedding.py` — 18 unit tests

Modified Files
- `litellm/__init__.py` — Import `OCIEmbeddingConfig`
- `litellm/utils.py` — Register in `ProviderConfigManager`
- `litellm/main.py` — Dispatch embedding calls to OCI handler
- `litellm/llms/oci/chat/transformation.py` — Updated docstring for `get_vendor_from_model()`
- `model_prices_and_context_window.json` — 24 new model entries
- `litellm/model_prices_and_context_window_backup.json` — Backup updated
- `docs/my-website/docs/providers/oci.md` — Embedding docs, new models list, dedicated endpoints

Embedding Usage
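A minimal call sketch. It requires `pip install litellm` plus valid OCI credentials, so it is wrapped in a function rather than run on import; `oci_region` appears in the diff above, while `oci_compartment_id` and the OCID value are assumptions for illustration:

```python
def embed_example():
    """Embed a string via the OCI provider added in this PR."""
    import litellm

    return litellm.embedding(
        model="oci/cohere.embed-english-v3.0",  # "oci/" prefix routes to this provider
        input=["hello world"],
        oci_region="us-chicago-1",              # optional; defaults to us-ashburn-1
        oci_compartment_id="ocid1.compartment.oc1..example",  # placeholder OCID
    )
```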
Integration Test Results (us-chicago-1)
Tested with real OCI credentials against new models:
- `grok-code-fast-1`, `command-r-08-2024`, `command-r-plus-08-2024`, `gemini-2.5-flash-lite`
- `command-a-03-2025`, `llama-3.3-70b`, `llama-4-scout`, `grok-3-mini-fast`, `gemini-2.5-flash-lite`
- `embed-english-v3.0`, `embed-english-light-v3.0`, `embed-multilingual-v3.0`, `embed-multilingual-light-v3.0`, `embed-v4.0`

Note: 404 failures are region-availability issues (models not yet deployed in us-chicago-1), not code bugs. Gemini pro/flash have response parsing differences that may need a follow-up.
Test plan
- Unit tests for `OCIEmbeddingConfig` (URL, params, request/response transform, pricing JSON)
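The pricing-JSON completeness check could be sketched like this; the required key names (`mode`, `input_cost_per_token`, `output_vector_size`) are assumptions about the pricing schema, not a verified list:

```python
def missing_pricing_keys(prices: dict) -> list:
    """Return names of OCI embedding entries that are missing the keys an
    embedding entry needs (a sketch of the completeness check above)."""
    required = {"mode", "input_cost_per_token", "output_vector_size"}
    return [
        name
        for name, entry in prices.items()
        if name.startswith("oci/")
        and entry.get("mode") == "embedding"
        and not required <= set(entry)
    ]
```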