feat(ocr/azure-di): support Mistral-style pages param via analyze query string by shivamrawat1 · Pull Request #25929 · BerriAI/litellm

shivamrawat1 · 2026-04-17T02:08:37Z

Relevant issues

AzureDocumentIntelligenceOCRConfig.get_supported_ocr_params returned [], so LiteLLM dropped pages from OCR requests to azure/doc-intel. In addition, transform_ocr_request explicitly ignored pages, and get_complete_url never appended Azure DI's pages query param. As a result, callers couldn't limit page ranges on Azure DI through /v1/ocr, even though Azure natively supports it via ?pages=1-3,5,7-9.

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have added testing in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement)
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

50-55 passing tests: main is stable with minor issues.

45-49 passing tests: acceptable but needs attention

<= 40 passing tests: unstable; be careful with your merges and assess the risk.

Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:

Screenshots / Proof of Fix

Type

🐛 Bug Fix
✅ Test

Changes

Declare pages as a supported param.
Add map_ocr_params + _normalize_pages_param to translate Mistral-style list[int] (0-based) → Azure's 1-based comma/range string, with passthrough for native strings ("3-9") and list[str] tokens; validate and raise on bad input.
Append &pages=... to the analyze URL in get_complete_url; keep pages out of the JSON body.
Add unit tests in tests/ocr_tests/test_ocr_azure_document_intelligence.py (no Azure creds needed) covering param mapping, URL construction, body exclusion, and end-to-end shape.
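The page translation described in the list above can be illustrated with a minimal sketch. This is not the PR's actual `_normalize_pages_param`; the function name `normalize_pages`, the error messages, and the choice to return `None` for an empty list are assumptions made for illustration. It also emits a plain comma-separated list rather than collapsing consecutive pages into ranges, since the PR text does not say which the real helper does.

```python
from typing import List, Optional, Union


def normalize_pages(
    pages: Union[None, str, List[int], List[str]]
) -> Optional[str]:
    """Hypothetical stand-in for the PR's _normalize_pages_param."""
    if pages is None:
        return None
    if isinstance(pages, str):
        # Native Azure syntax such as "3-9" passes through unchanged.
        return pages.strip() or None
    if not isinstance(pages, list):
        raise ValueError("pages must be a string or a list")
    if not pages:
        # Assumption: an empty list means "no page filter".
        return None
    # bool is a subclass of int, so reject it before the int check.
    if any(isinstance(p, bool) for p in pages):
        raise ValueError("pages must not contain booleans")
    if all(isinstance(p, int) for p in pages):
        if any(p < 0 for p in pages):
            raise ValueError("page indices must be non-negative")
        # Mistral-style indices are 0-based; Azure expects 1-based.
        return ",".join(str(p + 1) for p in sorted(set(pages)))
    if all(isinstance(p, str) for p in pages):
        tokens = [p.strip() for p in pages]
        if not all(tokens):
            raise ValueError("page tokens must be non-empty")
        # Assumption: string tokens are already in Azure's 1-based form.
        return ",".join(tokens)
    raise ValueError("pages must be all ints or all strings")
```

Under these assumptions, `normalize_pages([0, 1, 2])` yields `"1,2,3"`, while `normalize_pages("3-9")` is returned as-is.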

vercel · 2026-04-17T02:08:42Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
litellm	Ready	Preview, Comment	Apr 17, 2026 2:30am

greptile-apps · 2026-04-17T02:11:04Z

Greptile Summary

This PR adds support for the pages parameter in Azure Document Intelligence OCR requests by (1) declaring it as a supported param, (2) adding map_ocr_params + _normalize_pages_param to translate Mistral-style 0-based list[int] into Azure's 1-based comma/range query string, and (3) appending &pages=... to the analyze URL in get_complete_url while keeping it out of the JSON body. The implementation is end-to-end wired correctly through litellm/ocr/main.py and is accompanied by comprehensive unit tests that don't require Azure credentials.

Confidence Score: 5/5

Safe to merge — the feature is correctly implemented end-to-end with no blocking issues.

All prior review concerns (inline import of urllib.parse, all() vs any() bool guard) were resolved in prior commits. The pages translation logic is correct, URL encoding uses safe=',-' appropriately, the param is correctly excluded from the request body, and the pipeline in ocr/main.py already calls both map_ocr_params and get_complete_url. Remaining findings are P2 style observations only.

No files require special attention.
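As a rough illustration of the URL handling the summary describes, the sketch below appends an already-normalized pages string to an analyze URL using `urllib.parse.quote` with `safe=',-'`. The helper name `append_pages_query` and the example endpoint are hypothetical; the real logic lives inside `get_complete_url` and may differ in detail.

```python
from typing import Optional
from urllib.parse import quote


def append_pages_query(analyze_url: str, pages: Optional[str]) -> str:
    """Hypothetical helper: add Azure DI's pages query param to a URL."""
    if not pages:
        # Nothing to append when no page filter was requested.
        return analyze_url
    # Use "&" when the URL already carries a query string, else "?".
    separator = "&" if "?" in analyze_url else "?"
    # Keep "," and "-" literal so ranges like "1-3,5" stay readable.
    return f"{analyze_url}{separator}pages={quote(pages, safe=',-')}"


# Placeholder endpoint and api-version, used only for illustration.
base = (
    "https://example.cognitiveservices.azure.com/documentintelligence/"
    "documentModels/prebuilt-layout:analyze?api-version=2024-11-30"
)
url = append_pages_query(base, "1-3,5,7-9")
```

Because the value travels in the query string, the JSON body needs no `pages` key at all, which matches the body-exclusion behavior the review mentions.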

Important Files Changed

Filename	Overview
litellm/llms/azure_ai/ocr/document_intelligence/transformation.py	Adds `pages` param support: `get_supported_ocr_params` now returns `["pages"]`, `map_ocr_params` and `_normalize_pages_param` handle Mistral→Azure translation, and `get_complete_url` appends the query string. Prior review concerns (inline import, `all()` vs `any()` bool guard) have been addressed in prior commits.
tests/ocr_tests/test_ocr_azure_document_intelligence.py	Adds `TestAzureDocumentIntelligencePagesParam` with 12 pure unit tests covering int-list conversion, dedup/sort, empty-list passthrough, native Azure string passthrough, list-of-string tokens, validation errors, URL construction, body exclusion, and end-to-end shape. No real network calls.

Reviews (3): Last reviewed commit: "Merge branch 'litellm_internal_staging' ..."

shivamrawat1 · 2026-04-17T02:29:02Z

@greptile review again with the two new commits that resolve the raised P1 and P2 findings

add support for pages param

b6d5728

vercel Bot deployed to Preview April 17, 2026 02:08 View deployment

greptile-apps Bot reviewed Apr 17, 2026

Comment thread litellm/llms/azure_ai/ocr/document_intelligence/transformation.py Outdated

Comment thread litellm/llms/azure_ai/ocr/document_intelligence/transformation.py

fix(azure-ocr): reject mixed bool+int in pages list validation

089ca5f

Use any() instead of all() for bool check so lists like [True, 1, 2] raise ValueError; bool is a subclass of int so all(int) alone was insufficient. Made-with: Cursor
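The difference between the two guards can be seen in a small sketch. The exact shape of the original check is not shown in this PR conversation, so both functions below are hypothetical reconstructions based only on the commit message.

```python
from typing import Any, List


def accepts_with_all_guard(values: List[Any]) -> bool:
    """Assumed pre-fix check: rejects only when every item is a bool."""
    return all(isinstance(v, int) for v in values) and not all(
        isinstance(v, bool) for v in values
    )


def accepts_with_any_guard(values: List[Any]) -> bool:
    """Assumed post-fix check: rejects when any item is a bool."""
    return all(isinstance(v, int) for v in values) and not any(
        isinstance(v, bool) for v in values
    )


mixed = [True, 1, 2]
# isinstance(True, int) is True, so the int check alone lets bools in.
all_guard_result = accepts_with_all_guard(mixed)
any_guard_result = accepts_with_any_guard(mixed)
```

With the `all()` guard the mixed list is accepted, because not every element is a bool; with the `any()` guard a single bool is enough to reject it.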

shivamrawat1 had a problem deploying to integration-postgres April 17, 2026 02:27 — with GitHub Actions Error

shivamrawat1 had a problem deploying to integration-postgres April 17, 2026 02:28 — with GitHub Actions Error

refactor(azure-ocr): move urllib.parse.quote to module imports

a84c276

Remove inline import in get_complete_url; quote is stdlib with no circular import risk per project style. Made-with: Cursor

shivamrawat1 temporarily deployed to integration-postgres April 17, 2026 02:29 — with GitHub Actions Inactive

vercel Bot deployed to Preview April 17, 2026 02:30 View deployment

Merge branch 'litellm_internal_staging' into litellm_pages_support_for_ocr

37765e6

shivamrawat1 temporarily deployed to integration-postgres April 18, 2026 02:29 — with GitHub Actions Inactive

shivamrawat1 had a problem deploying to integration-postgres April 18, 2026 02:29 — with GitHub Actions Error

ishaan-berri approved these changes Apr 18, 2026

ishaan-berri merged commit d042b44 into litellm_internal_staging Apr 18, 2026
94 of 98 checks passed

ishaan-berri deleted the litellm_pages_support_for_ocr branch April 18, 2026 17:27

ishaan-berri merged 4 commits into litellm_internal_staging from litellm_pages_support_for_ocr

2 participants
