feat(ocr/azure-di): support Mistral-style pages param via analyze query string #25929
Greptile Summary
This PR adds support for the pages param on Azure Document Intelligence OCR requests.
Confidence Score: 5/5
Safe to merge: the feature is correctly implemented end-to-end with no blocking issues. All prior review concerns (the inline import and the all() vs any() bool guard) have been addressed.
No files require special attention.
| Filename | Overview |
|---|---|
| litellm/llms/azure_ai/ocr/document_intelligence/transformation.py | Adds pages param support: get_supported_ocr_params now returns ["pages"], map_ocr_params and _normalize_pages_param handle Mistral→Azure translation, and get_complete_url appends the query string. Prior review concerns (inline import, all() vs any() bool guard) have been addressed in prior commits. |
| tests/ocr_tests/test_ocr_azure_document_intelligence.py | Adds TestAzureDocumentIntelligencePagesParam with 12 pure unit tests covering int-list conversion, dedup/sort, empty-list passthrough, native Azure string passthrough, list-of-string tokens, validation errors, URL construction, body exclusion, and end-to-end shape. No real network calls. |
Reviews (3): Last reviewed commit: "Merge branch 'litellm_internal_staging' ..."
Use any() instead of all() for bool check so lists like [True, 1, 2] raise ValueError; bool is a subclass of int so all(int) alone was insufficient. Made-with: Cursor
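The bool guard described in this commit can be illustrated with a minimal sketch. The helper names `contains_bool` and `validate_page_indices` are hypothetical and are not taken from the PR; the sketch only shows why an `all(isinstance(v, int) ...)` check alone lets `True` through, and why the bool check has to use `any()`.

```python
from typing import List


def contains_bool(values: list) -> bool:
    """Return True if at least one element is a bool.

    bool is a subclass of int, so isinstance(True, int) is True and an
    int-only check cannot reject booleans on its own.
    """
    return any(isinstance(v, bool) for v in values)


def validate_page_indices(values: list) -> List[int]:
    """Hypothetical validator: accept only real ints, reject bools."""
    # any() catches mixed lists such as [True, 1, 2]; an all()-based bool
    # check would only fire when *every* element is a bool.
    if contains_bool(values) or not all(isinstance(v, int) for v in values):
        raise ValueError(f"pages must be a list of integers, got {values!r}")
    return list(values)


if __name__ == "__main__":
    mixed = [True, 1, 2]
    # The int-only check passes, which is the gap the commit closes.
    print(all(isinstance(v, int) for v in mixed))
    # An all()-based bool check misses the mixed list.
    print(all(isinstance(v, bool) for v in mixed))
    try:
        validate_page_indices(mixed)
    except ValueError as exc:
        print("rejected:", exc)
```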
Remove inline import in get_complete_url; quote is stdlib with no circular import risk per project style. Made-with: Cursor
@greptile review with the two new commits that resolved the raised P1 and P2
Merged commit d042b44 into litellm_internal_staging
Relevant issues
AzureDocumentIntelligenceOCRConfig.get_supported_ocr_params returned [], so LiteLLM dropped pages from OCR requests to azure/doc-intel. transform_ocr_request explicitly ignored it and get_complete_url never appended Azure DI's pages query param. Result: callers couldn't limit page ranges on Azure DI through /v1/ocr, despite Azure natively supporting it via ?pages=1-3,5,7-9.
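To make the missing query parameter concrete, here is a rough sketch of how a pages value could be appended to an analyze URL. The function `build_analyze_url`, the example endpoint, the model ID, and the API version are assumptions for illustration; this is not the PR's `get_complete_url`, and the choice to percent-encode the value with `urllib.parse.quote` is likewise an assumption.

```python
from typing import Optional
from urllib.parse import quote


def build_analyze_url(
    endpoint: str,
    model_id: str,
    api_version: str,
    pages: Optional[str] = None,
) -> str:
    """Hypothetical helper: build an analyze URL with an optional pages filter."""
    base = f"{endpoint.rstrip('/')}/documentintelligence/documentModels/{model_id}:analyze"
    url = f"{base}?api-version={api_version}"
    if pages:
        # safe="" percent-encodes commas; hyphens and digits are left as-is.
        url += f"&pages={quote(pages, safe='')}"
    return url


if __name__ == "__main__":
    # Placeholder endpoint, model ID, and API version (assumed values).
    print(
        build_analyze_url(
            "https://example.cognitiveservices.azure.com/",
            "prebuilt-layout",
            "2024-11-30",
            pages="1-3,5,7-9",
        )
    )
```

Keeping pages in the query string rather than the JSON body mirrors what the description above says Azure expects; whether the value should be percent-encoded or sent raw is a detail to confirm against the service documentation.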
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
Added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement - see details
Passes all unit tests on make test-unit
Requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review
Delays in PR merge?
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Screenshots / Proof of Fix
Type
🐛 Bug Fix
✅ Test
Changes
Declare pages as a supported param.
Add map_ocr_params + _normalize_pages_param to translate Mistral-style list[int] (0-based) → Azure's 1-based comma/range string, with passthrough for native strings ("3-9") and list[str] tokens; validate and raise on bad input.
Append &pages=... to the analyze URL in get_complete_url; keep pages out of the JSON body.
Add unit tests in tests/ocr_tests/test_ocr_azure_document_intelligence.py (no Azure creds needed) covering param mapping, URL construction, body exclusion, and end-to-end shape.
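The translation described in the changes above can be sketched roughly as follows. This is not the PR's `_normalize_pages_param`; the function name `normalize_pages` is hypothetical, and several behaviours are assumptions made for the sketch: consecutive pages are collapsed into ranges, an empty list or empty string yields `None`, and negative indices are rejected.

```python
from typing import List, Optional, Union

PagesInput = Union[None, str, List[int], List[str]]


def normalize_pages(pages: PagesInput) -> Optional[str]:
    """Hypothetical sketch: convert Mistral-style pages to an Azure-style string."""
    if pages is None:
        return None
    if isinstance(pages, str):
        # Native Azure strings such as "3-9" pass through unchanged.
        return pages.strip() or None
    if not isinstance(pages, list):
        raise ValueError(f"unsupported pages value: {pages!r}")
    if not pages:
        return None  # assumption: empty list means "no page filter"
    if all(isinstance(p, str) for p in pages):
        # list[str] tokens are assumed to be 1-based already; just join them.
        return ",".join(p.strip() for p in pages)
    # bool is a subclass of int, so reject it explicitly with any().
    if any(isinstance(p, bool) for p in pages) or not all(
        isinstance(p, int) for p in pages
    ):
        raise ValueError(f"pages must be all ints or all strings, got {pages!r}")
    if any(p < 0 for p in pages):
        raise ValueError("page indices must be non-negative (0-based)")

    # Shift 0-based to 1-based, dedupe, and sort.
    one_based = sorted({p + 1 for p in pages})

    # Collapse consecutive runs into "start-end" ranges.
    parts: List[str] = []
    start = prev = one_based[0]
    for n in one_based[1:]:
        if n == prev + 1:
            prev = n
            continue
        parts.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = n
    parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


if __name__ == "__main__":
    print(normalize_pages([0, 1, 2, 4, 6, 7, 8]))  # 1-3,5,7-9
    print(normalize_pages("3-9"))                  # 3-9
```

Collapsing runs keeps the query string short for long page lists; a plain comma-joined list of 1-based numbers would be an equally valid reading of "comma/range string".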