[data][llm] Support configuring HttpRequestUDF resources by yuchen-ecnu · Pull Request #60313 · ray-project/ray

yuchen-ecnu · 2026-01-20T05:33:04Z

Description

This PR introduces resources configuration options in HttpRequestProcessorConfig to provide more flexible resource control for HttpRequestUDF.

Related issues

Closes #60311

Additional information

It has been tested in python/ray/llm/tests/batch/cpu/stages/test_http_request_stage.py.
Users can use this feature with the following code:

config = HttpRequestProcessorConfig(
    url="http://xxxx/v1/chat/completions",
    headers={"Authorization": "Bearer fake_key"},
    qps=2,
    max_retries=3,
    concurrency=(1, 100),
    http_request_stage=HttpRequestStageConfig(
        num_cpus=0.5,
        memory=100000,
    ),
)
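
As a rough sizing aid for the configuration above, the sketch below estimates the peak logical resources such a stage could reserve. It assumes that concurrency=(1, 100) describes a worker pool that scales between 1 and 100 workers, and that each worker reserves num_cpus CPUs and memory bytes. The helper name peak_reservation is hypothetical and is not part of Ray.

```python
from typing import Tuple, Union


def peak_reservation(
    concurrency: Union[int, Tuple[int, int]],
    num_cpus: float,
    memory: int,
) -> Tuple[float, int]:
    """Return (CPUs, bytes) reserved when the pool is at its largest size."""
    # A tuple is read as (min_workers, max_workers); an int as a fixed pool size.
    max_workers = concurrency[1] if isinstance(concurrency, tuple) else concurrency
    return max_workers * num_cpus, max_workers * memory


cpus, mem_bytes = peak_reservation(concurrency=(1, 100), num_cpus=0.5, memory=100000)
print(f"peak CPUs: {cpus}, peak memory: {mem_bytes} bytes")
```

Under these assumptions, halving num_cpus halves the CPU reservation at full scale, which is the motivation for fractional values on I/O-bound HTTP workers.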

gemini-code-assist

Code Review

This pull request introduces a resources configuration option in HttpRequestProcessorConfig to allow more flexible resource control for the HttpRequestUDF. The implementation correctly adds the new field and passes the specified resources to the underlying Ray Data map_batches operator. The changes are accompanied by a test case that verifies the new functionality. The code is clear, correct, and a good addition to provide more control over resource allocation.

jeffreywang-anyscale · 2026-01-22T17:58:17Z

Hey @yuchen-ecnu thanks for your contribution! Could we follow how other stages (e.g. ChatTemplate, Tokenizer) configure their num_cpus (reference) and not introduce the resources attribute? That is, configure the resources at the stage rather than the processor granularity. Please fix DCO as well (git commit -s -m "your commit message" or git commit -s --amend), thank you! cc: @nrghosh

nrghosh

Thanks for the detailed issue writeup / PR!

This is a valid use case and concern, and we do support fractional CPUs, so this is a good request.

The main gap to address, as you point out, is that HttpRequestProcessorConfig doesn't expose that for the HTTP stage. For this PR (as Jeff pointed out), we would want to align with the existing pattern by adding explicit fields, like

class HttpRequestProcessorConfig(ProcessorConfig):
    num_cpus: Optional[float] = Field(
        default=None,
        description="CPUs per HttpRequestUDF worker. Defaults to 1 if None. "
        "For I/O-bound workloads, use fractional values (e.g., 0.1).",
    )
    memory: Optional[float] = Field(default=None, ...)

This way we keep the API consistent with other stages + leverage the existing extract_resource_kwargs() utility.
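
To illustrate the idea behind optional resource fields, here is a minimal sketch of turning them into keyword arguments while leaving unset values to the defaults. StageResources and resource_kwargs are hypothetical names used for illustration only. This is not the implementation of extract_resource_kwargs(), whose actual behavior may differ.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class StageResources:
    """Hypothetical stand-in for the explicit resource fields on a config."""

    num_cpus: Optional[float] = None
    memory: Optional[float] = None


def resource_kwargs(cfg: StageResources) -> Dict[str, Any]:
    """Keep only the resources the user set, so unset ones fall back to defaults."""
    candidates = {"num_cpus": cfg.num_cpus, "memory": cfg.memory}
    return {name: value for name, value in candidates.items() if value is not None}


# Only num_cpus is set, so memory is omitted from the resulting kwargs.
print(resource_kwargs(StageResources(num_cpus=0.1)))
```

Dropping None values instead of passing them through keeps the downstream call free to apply its own default, which matches the "Defaults to 1 if None" wording in the field description.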

Also:

- lint / sign (from Jeff)
- update test to match above pattern ^

…stProcessorConfig Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com>

yuchen-ecnu · 2026-01-27T02:54:26Z

Hi @jeffreywang-anyscale @nrghosh, thanks for the detailed comments!

You’re right that we should keep consistency with the existing pattern. I’ve replaced the generic config resource with explicit fields num_cpus and memory, and updated the tests accordingly.
The DCO issue has also been addressed in the latest commit.

Please let me know if there’s anything else I should improve.

jeffreywang-anyscale · 2026-01-27T17:13:53Z

python/ray/llm/_internal/batch/processor/http_request_proc.py

+    num_cpus: Optional[float] = Field(
+        default=None,
+        description="Number of CPUs per HttpRequestUDF worker. Defaults to 1 if None. "
+        "For I/O-bound workloads, use fractional values (e.g., 0.1).",
+    )
+    memory: Optional[float] = Field(
+        default=None,
+        description="Heap memory in bytes to reserve for each HttpRequestUDF worker.",
+    )


The direction makes sense, but let's do this at the stage layer (not the processor layer) to be consistent with how other stages configure num_cpus and memory, even though there is only one stage in the HttpRequestProcessor. We can take advantage of build_cpu_stage_map_kwargs (ray/python/ray/llm/_internal/batch/processor/utils.py, line 44 in d406677) and will not need to invoke extract_resource_kwargs directly. You can follow this example.

Hi @jeffreywang-anyscale, many thanks for the example. It really helped me understand.

I’ve moved the configs to the stage layer using the new HttpRequestStageConfig and added a guard check to prevent users from disabling the only stage in HttpRequestProcessor.

Please let me know if there are any further improvements I should make.
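
The guard described above can be sketched roughly as follows. The sketch assumes the stage option accepts a bool, a dict, or a config object, and that the config carries an enabled flag. StageConfigSketch and resolve_stage are hypothetical names, and the real HttpRequestStageConfig may be structured differently.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StageConfigSketch:
    """Hypothetical stage config; field names are assumptions."""

    enabled: bool = True
    num_cpus: Optional[float] = None
    memory: Optional[float] = None


def resolve_stage(value: Any) -> StageConfigSketch:
    """Normalize bool | dict | config into a config, rejecting a disabled stage."""
    if isinstance(value, bool):
        cfg = StageConfigSketch(enabled=value)
    elif isinstance(value, dict):
        cfg = StageConfigSketch(**value)
    elif isinstance(value, StageConfigSketch):
        cfg = value
    else:
        raise TypeError(f"unsupported stage config type: {type(value).__name__}")
    # The processor has a single stage, so disabling it would leave nothing to run.
    if not cfg.enabled:
        raise ValueError("the HTTP request stage is the only stage and cannot be disabled")
    return cfg


print(resolve_stage({"num_cpus": 0.5, "memory": 100000}))
```

Failing fast at configuration time surfaces the mistake before any workers are scheduled.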

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

python/ray/llm/tests/batch/cpu/processor/test_http_request_proc.py

jeffreywang-anyscale

Couple of documentation NITs. Thank you again for the contribution, @yuchen-ecnu!

jeffreywang-anyscale · 2026-01-28T16:53:42Z

python/ray/llm/_internal/batch/processor/http_request_proc.py

    )
+    http_request_stage: Any = Field(
+        default=True,
+        description="Chat templating stage config (bool | dict | HttpRequestStageConfig).",


My mistake, updated.

python/ray/llm/tests/batch/cpu/processor/test_http_request_proc.py

Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com>

yuchen-ecnu · 2026-01-29T02:09:46Z

Couple of documentation NITs. Thank you again for the contribution, @yuchen-ecnu!

Hi @jeffreywang-anyscale, many thanks for your kind and patient review. The related docs have been updated.

doc/source/data/working-with-llms.rst

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>

jeffreywang-anyscale · 2026-01-29T07:01:20Z

There are some transient failures with Anyscale:

Traceback (most recent call last):
  File "/workdir/release/ray_release/job_manager/anyscale_job_manager.py", line 85, in _run_job
    job_response = self._sdk.create_job(job_request)
  File "/usr/local/lib/python3.10/site-packages/anyscale/sdk/anyscale_client/sdk.py", line 552, in create_job
    return super().create_job(create_production_job, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/anyscale/sdk/anyscale_client/api/default_api.py", line 884, in create_job
    return self.create_job_with_http_info(create_production_job, **kwargs)  # noqa: E501
  File "/usr/local/lib/python3.10/site-packages/anyscale/sdk/anyscale_client/api/default_api.py", line 963, in create_job_with_http_info
    return self.api_client.call_api(
  File "/usr/local/lib/python3.10/site-packages/anyscale/sdk/anyscale_client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
  File "/usr/local/lib/python3.10/site-packages/anyscale/sdk/anyscale_client/api_client.py", line 180, in __call_api
    response_data = self.request(
  File "/usr/local/lib/python3.10/site-packages/anyscale/sdk/anyscale_client/api_client.py", line 391, in request
    return self.rest_client.POST(url,
  File "/usr/local/lib/python3.10/site-packages/anyscale/sdk/anyscale_client/rest.py", line 270, in POST
    return self.request("POST", url,
  File "/usr/local/lib/python3.10/site-packages/anyscale/sdk/anyscale_client/rest.py", line 229, in request
    raise ApiException(http_resp=r)
anyscale_client.exceptions.ApiException: (500)
Reason: Internal Server Error

Retry didn't succeed. Will retry the failed release tests again tomorrow.

out of office

bveeramani

Stamp

…#60313) Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com> Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com> Co-authored-by: Jeffrey Wang <jeffreywang@anyscale.com>

…#60313) Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com> Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com> Co-authored-by: Jeffrey Wang <jeffreywang@anyscale.com> Signed-off-by: 400Ping <jiekaichang@apache.org>

xiyuecangxin · 2026-02-10T02:50:43Z

LGTM!

…#60313) Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com> Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com> Co-authored-by: Jeffrey Wang <jeffreywang@anyscale.com> Signed-off-by: Adel Nour <ans9868@nyu.edu>

…#60313) Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com> Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com> Co-authored-by: Jeffrey Wang <jeffreywang@anyscale.com> Signed-off-by: peterxcli <peterxcli@gmail.com>

yuchen-ecnu requested a review from a team as a code owner January 20, 2026 05:33

gemini-code-assist bot reviewed Jan 20, 2026

ray-gardener bot added the data (Ray Data-related issues), llm, and community-contribution (Contributed by the community) labels Jan 20, 2026

nrghosh previously requested changes Jan 22, 2026

[data][llm] Support configuring HttpRequestUDF cpu & mem in HttpReque…

fff4742

…stProcessorConfig Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com>

yuchen-ecnu force-pushed the RAY-60311 branch from 8611290 to fff4742 Compare January 27, 2026 02:48

jeffreywang-anyscale requested changes Jan 27, 2026

cursor bot reviewed Jan 28, 2026

python/ray/llm/tests/batch/cpu/processor/test_http_request_proc.py (resolved)

yuchen-ecnu force-pushed the RAY-60311 branch from 5170186 to aec7d65 Compare January 28, 2026 06:31

jeffreywang-anyscale requested changes Jan 28, 2026

jeffreywang-anyscale added the go (add ONLY when ready to merge, run all tests) label Jan 28, 2026

resolve comments

3d9e1cd

Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com>

yuchen-ecnu force-pushed the RAY-60311 branch from aec7d65 to 3d9e1cd Compare January 29, 2026 02:05

yuchen-ecnu requested a review from a team as a code owner January 29, 2026 02:05

jeffreywang-anyscale reviewed Jan 29, 2026

doc/source/data/working-with-llms.rst (outdated, resolved)

Update doc/source/data/working-with-llms.rst

0dca570

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>

jeffreywang-anyscale approved these changes Jan 29, 2026

kouroshHakha approved these changes Jan 29, 2026

kouroshHakha enabled auto-merge (squash) January 29, 2026 18:57

bveeramani approved these changes Jan 29, 2026

kouroshHakha merged commit 599b347 into ray-project:master Jan 29, 2026
7 checks passed
