[Data][LLM] Remove DataContext overrides in Ray Data LLM Processor by nrghosh · Pull Request #60142 · ray-project/ray

nrghosh · 2026-01-14T19:07:36Z

Summary

Remove two DataContext workarounds in Ray Data LLM that are no longer needed:

Remove the wait_for_min_actors_s = 600 override (issue #53124, "[Data] GPU resource leakage after ray.data.llm pipeline is terminated", is closed)
Remove the _enable_actor_pool_on_exit_hook = True override (issue #53169, "[Core] Make sure Actor's __del__ method invoked on Actor's destruction", is closed)

Behavior Change

Previously, Ray Data LLM hardcoded wait_for_min_actors_s = 600, which caused blocking behavior:

concurrency=N: blocked until all N actors were ready
concurrency=(1, N): blocked until 1 actor was ready

After this change, wait_for_min_actors_s stays at its default (-1), so:

No blocking occurs regardless of concurrency config
Processing starts as soon as any actor is ready
Users can still set wait_for_min_actors_s manually if they want blocking/timeout behavior
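
The before/after behavior described above can be sketched as a small, self-contained model. This is not Ray's implementation: `pool_bounds` and `actors_needed_to_start` are hypothetical helper names, the mapping of `concurrency` to pool bounds is inferred from this PR description, and treating any non-positive `wait_for_min_actors_s` as "do not block" is an assumption made for illustration.

```python
from typing import Tuple, Union

# Hypothetical alias: an int (fixed pool) or a (min, max) tuple (autoscaling pool).
Concurrency = Union[int, Tuple[int, int]]


def pool_bounds(concurrency: Concurrency) -> Tuple[int, int]:
    """Map a concurrency setting to (min_size, max_size) of the actor pool."""
    if isinstance(concurrency, int):
        # A single int is modeled as a fixed-size pool: min == max.
        min_size = max_size = concurrency
    else:
        min_size, max_size = concurrency
    if min_size < 1 or max_size < min_size:
        raise ValueError(f"invalid concurrency: {concurrency!r}")
    return min_size, max_size


def actors_needed_to_start(concurrency: Concurrency, wait_for_min_actors_s: int) -> int:
    """Number of ready actors required before processing begins (simplified model)."""
    min_size, _ = pool_bounds(concurrency)
    # Assumption: a positive timeout means "block until min_size actors are ready";
    # otherwise processing starts as soon as any single actor is ready.
    return min_size if wait_for_min_actors_s > 0 else 1


# Old behavior (hardcoded 600) versus new behavior (default -1).
print(actors_needed_to_start(4, 600))       # fixed pool of 4, blocking
print(actors_needed_to_start((1, 4), 600))  # autoscaling pool, blocking on min
print(actors_needed_to_start(4, -1))        # non-blocking default
```

Under this model, only the fixed-size configuration changes observably: it goes from waiting on all N actors to starting with the first one.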

Reproduction / Proof

See: https://gist.github.com/nrghosh/68d63040e92b82987c67e4dee6c8f40f

Test Plan

Added tests verifying Processor does not override wait_for_min_actors_s
Added tests verifying concurrency config correctly maps to ActorPoolStrategy
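
The "does not override" check in the test plan can be illustrated with a snapshot-and-compare pattern. This is a minimal sketch and not the tests added in this PR: `StandInContext`, `changed_fields`, `legacy_setup`, and `current_setup` are hypothetical names, the stand-in class only mimics the two settings named above, and the `False` default for `_enable_actor_pool_on_exit_hook` is an assumption.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable, Dict, Tuple


@dataclass
class StandInContext:
    """Hypothetical stand-in for a DataContext with only the two relevant fields."""
    wait_for_min_actors_s: int = -1  # default stated in the PR description
    _enable_actor_pool_on_exit_hook: bool = False  # assumed default


def changed_fields(
    ctx: StandInContext, action: Callable[[StandInContext], None]
) -> Dict[str, Tuple[Any, Any]]:
    """Run `action` on `ctx` and return {field: (before, after)} for modified fields."""
    before = {f.name: getattr(ctx, f.name) for f in fields(ctx)}
    action(ctx)
    return {
        name: (old, getattr(ctx, name))
        for name, old in before.items()
        if getattr(ctx, name) != old
    }


def legacy_setup(ctx: StandInContext) -> None:
    # Models the two overrides that this PR removes.
    ctx.wait_for_min_actors_s = 600
    ctx._enable_actor_pool_on_exit_hook = True


def current_setup(ctx: StandInContext) -> None:
    # Models the behavior after this PR: the context is left untouched.
    pass


# A user-set value should survive processor setup unchanged.
user_ctx = StandInContext(wait_for_min_actors_s=30)
assert changed_fields(user_ctx, current_setup) == {}
assert user_ctx.wait_for_min_actors_s == 30
```

Comparing a full snapshot rather than a single attribute also catches any other setting that setup code might modify as a side effect.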

Issue ray-project#53124 (GPU resource leakage) is now fixed. Remove the workaround that set wait_for_min_actors_s=600. This restores default Ray Data behavior where processing starts as soon as any actor is ready, rather than blocking for min_actors.
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>

Verify that:
- Processor does not override wait_for_min_actors_s (default preserved)
- User-set wait_for_min_actors_s values are preserved
- Concurrency config correctly maps to ActorPoolStrategy min/max size
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>

Resource cleanup issues are now fixed (Issue ray-project#53169), including actor __del__ invocation.
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>

gemini-code-assist

Code Review

This pull request correctly removes two workarounds in the Ray Data LLM Processor that are no longer necessary due to upstream fixes. The removal of the hardcoded wait_for_min_actors_s and _enable_actor_pool_on_exit_hook overrides simplifies the code and restores the default, non-blocking behavior. The newly added tests are comprehensive, effectively verifying that the processor no longer overrides these DataContext settings and that the concurrency configuration is correctly passed through to the ActorPoolStrategy. The changes are clean and well-tested.

…ay-project#60142) Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: jeffery4011 <jefferyshen1015@gmail.com>

nrghosh added 3 commits January 14, 2026 10:41

Remove _enable_actor_pool_on_exit_hook override

c6c7969

Resource cleanup issues are now fixed (Issue ray-project#53169), including actor __del__ invocation.
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>

gemini-code-assist bot reviewed Jan 14, 2026

fix test lint

3e4927c

Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>

nrghosh added the data (Ray Data-related issues), llm, and go (add ONLY when ready to merge, run all tests) labels Jan 14, 2026

nrghosh requested review from kouroshHakha and richardliaw January 14, 2026 20:39

nrghosh marked this pull request as ready for review January 14, 2026 20:52

nrghosh requested a review from a team as a code owner January 14, 2026 20:52

kouroshHakha approved these changes Jan 14, 2026

kouroshHakha changed the title ~~Remove DataContext overrides in Ray Data LLM Processor~~ [Data][LLM] Remove DataContext overrides in Ray Data LLM Processor Jan 14, 2026

kouroshHakha merged commit bef2442 into ray-project:master Jan 14, 2026
7 checks passed

ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Feb 3, 2026

[Data][LLM] Remove DataContext overrides in Ray Data LLM Processor (r…

29bd8e9

…ay-project#60142) Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>

peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026

[Data][LLM] Remove DataContext overrides in Ray Data LLM Processor (r…

be232a9

…ay-project#60142) Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: peterxcli <peterxcli@gmail.com>

peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026

[Data][LLM] Remove DataContext overrides in Ray Data LLM Processor (r…

01fe63d

…ay-project#60142) Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: peterxcli <peterxcli@gmail.com>

kouroshHakha merged 4 commits into ray-project:master from nrghosh:nrghosh/ray-data-llm-actor-pool-cleanup
