[data][llm] Support configuring HttpRequestUDF resources#60313
kouroshHakha merged 3 commits into ray-project:master
Code Review
This pull request introduces a resources configuration option in HttpRequestProcessorConfig to allow more flexible resource control for the HttpRequestUDF. The implementation correctly adds the new field and passes the specified resources to the underlying Ray Data map_batches operator. The changes are accompanied by a test case that verifies the new functionality. The code is clear, correct, and a good addition to provide more control over resource allocation.
Hey @yuchen-ecnu, thanks for your contribution! Could we follow how other stages (e.g. ChatTemplate, Tokenizer) configure their num_cpus (reference) and not introduce the
Thanks for the detailed issue writeup / PR!
This is a valid use case, and we do support fractional CPUs, so this is a good request.
The main gap, as you point out, is that HttpRequestProcessorConfig doesn't expose this for the HTTP stage. For this PR (as Jeff pointed out), we would want to align with the existing pattern by adding explicit fields, like
class HttpRequestProcessorConfig(ProcessorConfig):
    num_cpus: Optional[float] = Field(
        default=None,
        description="CPUs per HttpRequestUDF worker. Defaults to 1 if None. "
        "For I/O-bound workloads, use fractional values (e.g., 0.1).",
    )
    memory: Optional[float] = Field(default=None, ...)
This way we keep the API consistent with other stages + leverage the existing extract_resource_kwargs() utility.
Also
- lint / sign (from Jeff)
- update test to match above pattern ^
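As a rough illustration of the pattern suggested above, explicit `num_cpus` and `memory` fields can be turned into keyword arguments for a map_batches-style call. This is a minimal sketch under stated assumptions: `HttpStageResources` and `sketch_resource_kwargs` are hypothetical names invented here, and the function is not Ray's `extract_resource_kwargs()`, whose actual signature is not shown in this thread.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class HttpStageResources:
    """Hypothetical stand-in for the explicit resource fields (not a Ray class)."""

    num_cpus: Optional[float] = None  # None means "fall back to 1 CPU"
    memory: Optional[float] = None  # bytes; None means "no memory reservation"


def sketch_resource_kwargs(cfg: HttpStageResources) -> Dict[str, Any]:
    """Illustrative helper, NOT Ray's extract_resource_kwargs().

    Builds keyword arguments that could be forwarded to a map_batches-style
    call, leaving out memory when the user did not set it.
    """
    # Mirror the field description: num_cpus defaults to 1 when it is None.
    kwargs: Dict[str, Any] = {
        "num_cpus": 1 if cfg.num_cpus is None else cfg.num_cpus
    }
    if cfg.memory is not None:
        kwargs["memory"] = cfg.memory
    return kwargs


# An I/O-bound HTTP worker can ask for a fraction of a CPU.
io_bound = sketch_resource_kwargs(HttpStageResources(num_cpus=0.1))
```

Keeping unset values out of the keyword arguments means the downstream operator applies its own defaults rather than receiving an explicit `None`.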
…stProcessorConfig Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com>
8611290 to fff4742
Hi @jeffreywang-anyscale @nrghosh , thanks for the detailed comments! You’re right that we should keep consistency with the existing pattern. I’ve replaced the generic config Please let me know if there’s anything else I should improve. |
    num_cpus: Optional[float] = Field(
        default=None,
        description="Number of CPUs per HttpRequestUDF worker. Defaults to 1 if None. "
        "For I/O-bound workloads, use fractional values (e.g., 0.1).",
    )
    memory: Optional[float] = Field(
        default=None,
        description="Heap memory in bytes to reserve for each HttpRequestUDF worker.",
    )
The direction makes sense, but let's do this at the stage layer (not the processor layer) to be consistent with how other stages configure num_cpus and memory, even though HttpRequestProcessor has only one stage. We can take advantage of extract_resource_kwargs directly. You can follow this example.
Hi @jeffreywang-anyscale, many thanks for the example; it really helped me understand.
I’ve moved the configs to the stage layer using the new HttpRequestStageConfig and added a guard check to prevent users from disabling the only stage in HttpRequestProcessor.
Please let me know if there are any further improvements I should make.
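The stage-layer approach described above, including the guard against disabling the only stage, might look roughly like the following. This is a sketch under assumptions, not the merged implementation: `HttpRequestStageSketch` and `resolve_http_stage` are hypothetical names, and the `enabled` field is assumed here for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class HttpRequestStageSketch:
    """Hypothetical stand-in for HttpRequestStageConfig (fields are assumed)."""

    enabled: bool = True
    num_cpus: Optional[float] = None
    memory: Optional[float] = None


def resolve_http_stage(value: Any) -> HttpRequestStageSketch:
    """Normalize a bool | dict | stage-config value into one stage object."""
    if isinstance(value, bool):
        stage = HttpRequestStageSketch(enabled=value)
    elif isinstance(value, dict):
        stage = HttpRequestStageSketch(**value)
    elif isinstance(value, HttpRequestStageSketch):
        stage = value
    else:
        raise TypeError(f"unsupported stage config type: {type(value).__name__}")

    # Guard: the processor has a single stage, so disabling it leaves nothing to run.
    if not stage.enabled:
        raise ValueError(
            "http_request_stage is the only stage in this processor "
            "and cannot be disabled"
        )
    return stage


# The dict form lets callers set resources without importing the config class.
stage = resolve_http_stage({"num_cpus": 0.1, "memory": 256 * 1024**2})
```

Accepting a plain `bool` keeps the default (`True`) terse, while the dict and object forms carry the per-stage resource settings.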
5170186 to aec7d65
jeffreywang-anyscale left a comment
Couple of documentation NITs. Thank you again for the contribution, @yuchen-ecnu!
    )
    http_request_stage: Any = Field(
        default=True,
        description="Chat templating stage config (bool | dict | HttpRequestStageConfig).",
My mistake, updated.
Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com>
aec7d65 to 3d9e1cd
Hi @jeffreywang-anyscale, many thanks for your kind and patient review. The related docs have been updated.
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
There are some transient failures with Anyscale, and the retry didn't succeed. Will retry the failed release tests again tomorrow.
…#60313) Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com> Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com> Co-authored-by: Jeffrey Wang <jeffreywang@anyscale.com>
…#60313) Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com> Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com> Co-authored-by: Jeffrey Wang <jeffreywang@anyscale.com> Signed-off-by: 400Ping <jiekaichang@apache.org>
LGTM!
…#60313) Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com> Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com> Co-authored-by: Jeffrey Wang <jeffreywang@anyscale.com> Signed-off-by: Adel Nour <ans9868@nyu.edu>
…#60313) Signed-off-by: Yu Chen <yuchen.ecnu@gmail.com> Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com> Co-authored-by: Jeffrey Wang <jeffreywang@anyscale.com> Signed-off-by: peterxcli <peterxcli@gmail.com>
Description
This PR introduces resources configuration options in HttpRequestProcessorConfig to provide more flexible resource control for HttpRequestUDF.
Related issues
Closes #60311
Additional information
It has been tested in python/ray/llm/tests/batch/cpu/stages/test_http_request_stage.py.
Users can use this feature with the following code: