[docs][data][llm] Batch inference docs reorg + update to reflect per-stage config refactor #59214
Conversation
Code Review
This pull request significantly improves the documentation for ray.data.llm by updating it to reflect a recent refactoring of per-stage configurations. The changes introduce clearer, stage-based parameters and add valuable new sections explaining the processor architecture and advanced configuration options. The code examples are correctly updated to use the new API. Overall, this is a high-quality documentation update that will greatly benefit users. I've included a couple of minor suggestions to further enhance the clarity and completeness of the documentation.
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Force-pushed from a3199de to 80cef64
- Restructure into: Getting Started, Common Use Cases, Troubleshooting, Advanced Config
- Remove redundant 'Perform batch inference' section (duplicated quickstart)
- Promote GPU OOM / model caching to Troubleshooting section
- Consolidate advanced topics (parallelism, per-stage config, LoRA, Serve)
- Simplify VLM and embeddings examples
- Update to new stage config API (prepare_image_stage, etc.)
- Add PIL, RunAI to Vale vocabulary

Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Force-pushed from b4df643 to bcaeef2
Signed-off-by: Nikhil G <nrghosh@users.noreply.github.com>
.. code-block:: text

    Input Dataset
          |
          v
    +------------------+
    |    Preprocess    |   (your custom function)
    +------------------+
          |
          v
    +------------------+
    |   PrepareImage   |   (optional, for VLMs)
Can you make this into a simple diagram that doesn't take up that much space?
Or just a bullet list. The problem is just that this takes up a lot of screen real estate.
.. code-block:: text

    --bucket-uri gs://my-bucket/path/to/model

- For a complete embedding configuration example, see:
+ Then reference the remote path in your config:
I would link out to the RunAI streamer explicitly for further reading.
What's the difference between this and the model loading section below?
They could be listed together, but the difference is that one focuses on a commonly encountered error (HF rate limits) and the other introduces a new optimized solution (the RunAI streamer), so they address different things. Renaming for clarity and including a direct link to the RunAI streamer docs.
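To make the distinction concrete, here is a hedged sketch of what the RunAI streamer path might look like in a processor config. The bucket path is a placeholder from the doc's example, and `load_format` is vLLM's engine argument for the RunAI Model Streamer; exact field names may differ across Ray versions, so treat this as illustrative rather than authoritative.

```python
# Illustrative sketch (not the exact doc example): stream model weights from
# object storage with the RunAI Model Streamer instead of downloading from HF.
from ray.data.llm import vLLMEngineProcessorConfig

config = vLLMEngineProcessorConfig(
    # Remote path previously uploaded with --bucket-uri (placeholder path).
    model_source="gs://my-bucket/path/to/model",
    engine_kwargs={
        # vLLM load format that streams weights directly from the bucket.
        "load_format": "runai_streamer",
    },
)
```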
Horizontal scaling
~~~~~~~~~~~~~~~~~~

Besides cross-node parallelism, you can horizontally scale the LLM stage to multiple replicas using the ``concurrency`` parameter:

.. literalinclude:: doc_code/working-with-llms/basic_llm_example.py
   :language: python
   :start-after: __concurrent_config_example_start__
   :end-before: __concurrent_config_example_end__
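The ``literalinclude`` above pulls the real example from the repo; as a standalone sketch, setting ``concurrency`` on the processor config might look like the following (model name and sizes are illustrative, not from the linked file):

```python
# Sketch of horizontal scaling: ``concurrency`` controls how many replicas of
# the LLM stage run in parallel, each with its own engine instance.
from ray.data.llm import vLLMEngineProcessorConfig

config = vLLMEngineProcessorConfig(
    model_source="unsloth/Llama-3.1-8B-Instruct",  # illustrative model
    batch_size=64,    # rows per batch sent to each replica
    concurrency=4,    # four LLM-stage replicas processing batches in parallel
)
```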
This belongs more in Common than in Advanced.
Common is more for detailing use cases. I'll add it to 'Getting started' instead, since it applies to all of them.
)

.. _faqs:

Available fields for all stages: ``enabled``, ``batch_size``, ``concurrency``, ``runtime_env``, ``num_cpus``, ``memory``.
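As a hedged sketch of the per-stage config surface this PR documents: the commit message mentions ``prepare_image_stage`` as one of the new stage config parameters, so overriding the listed fields for a single stage could plausibly look like this. The exact parameter names and value types should be confirmed against the API reference; everything below is an assumption built from the field list above.

```python
# Hypothetical per-stage override sketch: tune only the image-preparation
# stage of a VLM processor, using the fields listed in the doc
# (enabled, batch_size, concurrency, runtime_env, num_cpus, memory).
from ray.data.llm import vLLMEngineProcessorConfig

config = vLLMEngineProcessorConfig(
    model_source="Qwen/Qwen2.5-VL-3B-Instruct",  # illustrative VLM
    prepare_image_stage={       # assumed shape of the per-stage config block
        "enabled": True,        # run the image-preparation stage
        "batch_size": 32,       # smaller batches for image decoding
        "concurrency": 2,       # two parallel workers for this stage only
        "num_cpus": 1,          # CPUs reserved per worker
    },
)
```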
Link out to the documentation; don't reference the fields here.
Adding the stage configs to the API reference, and replacing the inline field list with a direct doc reference.
richardliaw left a comment:

Overall, this change makes sense.
This pull request has been automatically marked as stale because it has not had recent activity. You can always ask for help on our discussion forum or Ray's public Slack channel. If you'd like to keep this open, just leave any comment, and the stale label will be removed.
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
- Move horizontal scaling section
- Add explicit RunAI streamer link

Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
- Highlight agentic and multi-turn
- Add stage configs to API reference
- Link to API reference instead of listing fields

Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Force-pushed from 9b6251a to 0513ec4
=================

- The :ref:`ray.data.llm <llm-ref>` module integrates with key large language model (LLM) inference engines and deployed models to enable LLM batch inference.
+ The :ref:`ray.data.llm <llm-ref>` module integrates with LLM inference engines (vLLM, SGLang) to enable scalable batch inference on Ray Data datasets.
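For context on the module the new intro describes, a minimal end-to-end sketch of batch inference with ``ray.data.llm`` might look like the following. The model name and prompt are illustrative; the ``preprocess``/``postprocess`` hooks and ``build_llm_processor`` follow the module's public API, but field names should be verified against the version of Ray in use.

```python
# Minimal batch-inference sketch with ray.data.llm (illustrative values).
import ray
from ray.data.llm import build_llm_processor, vLLMEngineProcessorConfig

config = vLLMEngineProcessorConfig(
    model_source="unsloth/Llama-3.1-8B-Instruct",  # illustrative model
)

processor = build_llm_processor(
    config,
    # Map each input row to the chat-style request the engine expects.
    preprocess=lambda row: dict(
        messages=[{"role": "user", "content": row["prompt"]}],
        sampling_params={"temperature": 0.3, "max_tokens": 64},
    ),
    # Keep only the generated text in the output rows.
    postprocess=lambda row: dict(answer=row["generated_text"]),
)

ds = ray.data.from_items([{"prompt": "What is Ray?"}])
ds = processor(ds)  # runs the staged pipeline lazily over the dataset
```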
Thanks for updating the doc! It looks much nicer now :)
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Force-pushed from 0513ec4 to 403f9b3
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
…stage config refactor (ray-project#59214) Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: Nikhil G <nrghosh@users.noreply.github.com> Signed-off-by: jeffery4011 <jefferyshen1015@gmail.com>
…stage config refactor (ray-project#59214) Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: Nikhil G <nrghosh@users.noreply.github.com>
…stage config refactor (ray-project#59214) Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: Nikhil G <nrghosh@users.noreply.github.com> Signed-off-by: peterxcli <peterxcli@gmail.com>







Update / streamline batch inference documentation
New structure for Batch Inference Docs (proposed)