[data][llm] Add pooling parameter (#59534)
Conversation
Code Review
This pull request adds support for pooling_params in vLLM embedding tasks, allowing users to specify parameters like token truncation and embedding normalization. The changes include updating the vLLM engine stage to handle these parameters and adding corresponding tests to validate the new functionality. My review includes a couple of suggestions for improving maintainability and aligning the implementation with the documented behavior.
@pytest.mark.parametrize(
    "pooling_params",
    [
        {"truncate_prompt_tokens": -1},
Can we also test None and {}?
Added a test case for the empty dict. None is not a possible value because we default to an empty dict when pooling_params is not provided.
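The defaulting rule described here can be sketched in isolation (`resolve_pooling_params` is a hypothetical helper for illustration, not the actual Ray code):

```python
def resolve_pooling_params(row: dict) -> dict:
    # Hypothetical helper illustrating the defaulting rule: if the row
    # does not carry pooling_params (or carries a falsy value), fall back
    # to an empty dict, so the engine stage never sees None.
    return row.get("pooling_params") or {}

# A missing key and an explicit empty dict both resolve to {}.
print(resolve_pooling_params({}))
print(resolve_pooling_params({"pooling_params": {"normalize": True}}))
```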
for key, expected_value in pooling_params.items():
    assert hasattr(request.params, key)
    actual_value = getattr(request.params, key)
    assert actual_value == expected_value
Critical: can we test some other property?
We basically want to test whether truncation or normalization is actually applied. It is not sufficient to check that request.params values match what was sent in.
Idea: we can test that, for an input x, the output differs between truncation=None and truncation=2; similarly, we can compare normalize=False vs. normalize=True.
Validating the difference of outputs is a good idea.
sampling_params do not apply to encode. encode is deterministic.
Added a couple more test cases:
- Compare truncation=None vs. truncation=3
- Compare normalize=False vs. normalize=True
- Validate that truncation is effective on long prompts
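The normalize comparison can be motivated with a minimal, vLLM-free sketch of what L2 normalization does to an embedding vector (pure Python; no Ray or vLLM imports):

```python
import math

def l2_normalize(vec):
    # L2-normalize a vector, which is what normalize=True asks the
    # pooling layer to do to the output embedding.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

raw = [3.0, 4.0]
normalized = l2_normalize(raw)
# A non-unit vector changes under normalization, so comparing the
# normalize=False and normalize=True outputs is a meaningful test.
print(normalized)  # [0.6, 0.8]
```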
In the latest revision I introduced a small fix: PoolingParams' truncate_prompt_tokens is not respected by AsyncLLMEngine.encode(). I filed vllm-project/vllm#31012 in vLLM and have vllm-project/vllm#31013 open to address it. As a temporary workaround, prompt truncation is handled through the truncate_prompt_tokens argument passed to AsyncLLMEngine.encode. From a Ray user's perspective, any value provided via pooling_params is honored as expected.
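The workaround relies on vLLM's documented left-truncation semantics for truncate_prompt_tokens: a positive k keeps only the last k prompt tokens, while None disables truncation. A minimal sketch of that rule (standalone illustration, not the actual Ray or vLLM code; exact handling of sentinel values like -1 follows vLLM's own rules):

```python
def truncate_prompt(token_ids, truncate_prompt_tokens):
    # Keep only the last k tokens when a positive budget is given
    # (left truncation); otherwise return the prompt unchanged.
    k = truncate_prompt_tokens
    if k is None or k <= 0:
        return token_ids
    return token_ids[-k:]

print(truncate_prompt([101, 7592, 2088, 102], 2))  # [2088, 102]
print(truncate_prompt([101, 7592, 2088, 102], None))
```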
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Signed-off-by: jasonwrwang <jasonwrwang@tencent.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>
Description
The preprocessor forwards only `sampling_params` to the engine today. For `task_type="embed"`, however, we should also allow forwarding `pooling_params`, enabling features such as truncating the input prompt to a fixed token budget via `truncate_prompt_tokens` or normalizing the output embedding via `normalize`. See https://docs.vllm.ai/en/latest/api/vllm/#vllm.PoolingParams for a comprehensive list of supported attributes.
Related issues
Resolves #57805
Additional information
Updated `test_vllm_engine_stage` to validate that the pooling parameters are received by the engine.
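For illustration, a batch row carrying the new parameter might look like the following (the exact field names here are an assumption for illustration; the pooling_params keys mirror attributes of vllm.PoolingParams):

```python
# Hypothetical input row for an embedding task; the pooling_params dict
# is forwarded to the engine alongside the prompt.
row = {
    "prompt": "Ray Data is a scalable data processing library.",
    "pooling_params": {
        "truncate_prompt_tokens": 128,  # keep only the last 128 prompt tokens
        "normalize": True,              # L2-normalize the returned embedding
    },
}
print(sorted(row["pooling_params"]))
```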