[Data] Add max_task_concurrency, min_scheduling_resources, and per_task_resource_allocation#56381
[Data] Add max_task_concurrency, min_scheduling_resources, and per_task_resource_allocation#56381bveeramani merged 3 commits intomasterfrom
max_task_concurrency, min_scheduling_resources, and per_task_resource_allocation#56381Conversation
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
There was a problem hiding this comment.
Code Review
This pull request introduces three new methods to the PhysicalOperator interface: per_task_resource_allocation, max_task_concurrency, and min_scheduling_resources. These methods aim to clarify resource usage and scheduling semantics. The changes are well-implemented in the base class and in ActorPoolMapOperator and TaskPoolMapOperator. My review includes a suggestion to refactor a small portion of the code in ActorPoolMapOperator for better conciseness and maintainability by leveraging an existing helper method.
python/ray/data/_internal/execution/operators/actor_pool_map_operator.py
Outdated
Show resolved
Hide resolved
| res[logical_op_id] = logical_op._get_args() | ||
| return res | ||
|
|
||
| # TODO(@balaji): Disambiguate this with `incremental_resource_usage`. |
iamjustinhsu
left a comment
There was a problem hiding this comment.
looks fine, remember to do the TODO pls. Also, for scaling down by max_concurrency, idk if this would be clearer, but maybe you could do something like this?
res = ExecutionResources(
cpu=per_actor_resource_usage.cpu,
gpu=per_actor_resource_usage.gpu,
memory=per_actor_resource_usage.memory,
object_store_memory=per_actor_resource_usage.object_store_memory,
)
return res.scale(1 / max_concurrency)
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
…er_task_resource_allocation` (ray-project#56381) <!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? <!-- Please give a short summary of the change and the problem this solves. --> Adds three new methods to the PhysicalOperator interface to clarify resource usage and scheduling semantics: * `per_task_resource_allocation` – logical resources required per task (task-level granularity). * `max_task_concurrency` – maximum number of tasks allowed to run concurrently, if the operator enforces one. * `min_scheduling_resources` – minimum resource bundle required to schedule a worker (e.g., task vs. actor). These methods provide a clearer contract for how operators declare resource requirements, making scheduling behavior easier to reason about and extend. ## Related issue number <!-- For example: "Closes ray-project#1234" --> ## Checks - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
…er_task_resource_allocation` (ray-project#56381) <!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? <!-- Please give a short summary of the change and the problem this solves. --> Adds three new methods to the PhysicalOperator interface to clarify resource usage and scheduling semantics: * `per_task_resource_allocation` – logical resources required per task (task-level granularity). * `max_task_concurrency` – maximum number of tasks allowed to run concurrently, if the operator enforces one. * `min_scheduling_resources` – minimum resource bundle required to schedule a worker (e.g., task vs. actor). These methods provide a clearer contract for how operators declare resource requirements, making scheduling behavior easier to reason about and extend. ## Related issue number <!-- For example: "Closes ray-project#1234" --> ## Checks - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu> Signed-off-by: zac <zac@anyscale.com>
…er_task_resource_allocation` (ray-project#56381) <!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? <!-- Please give a short summary of the change and the problem this solves. --> Adds three new methods to the PhysicalOperator interface to clarify resource usage and scheduling semantics: * `per_task_resource_allocation` – logical resources required per task (task-level granularity). * `max_task_concurrency` – maximum number of tasks allowed to run concurrently, if the operator enforces one. * `min_scheduling_resources` – minimum resource bundle required to schedule a worker (e.g., task vs. actor). These methods provide a clearer contract for how operators declare resource requirements, making scheduling behavior easier to reason about and extend. ## Related issue number <!-- For example: "Closes ray-project#1234" --> ## Checks - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu> Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
…er_task_resource_allocation` (ray-project#56381) <!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? <!-- Please give a short summary of the change and the problem this solves. --> Adds three new methods to the PhysicalOperator interface to clarify resource usage and scheduling semantics: * `per_task_resource_allocation` – logical resources required per task (task-level granularity). * `max_task_concurrency` – maximum number of tasks allowed to run concurrently, if the operator enforces one. * `min_scheduling_resources` – minimum resource bundle required to schedule a worker (e.g., task vs. actor). These methods provide a clearer contract for how operators declare resource requirements, making scheduling behavior easier to reason about and extend. ## Related issue number <!-- For example: "Closes ray-project#1234" --> ## Checks - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
…er_task_resource_allocation` (ray-project#56381) <!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? <!-- Please give a short summary of the change and the problem this solves. --> Adds three new methods to the PhysicalOperator interface to clarify resource usage and scheduling semantics: * `per_task_resource_allocation` – logical resources required per task (task-level granularity). * `max_task_concurrency` – maximum number of tasks allowed to run concurrently, if the operator enforces one. * `min_scheduling_resources` – minimum resource bundle required to schedule a worker (e.g., task vs. actor). These methods provide a clearer contract for how operators declare resource requirements, making scheduling behavior easier to reason about and extend. ## Related issue number <!-- For example: "Closes ray-project#1234" --> ## Checks - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Why are these changes needed?
Adds three new methods to the PhysicalOperator interface to clarify resource usage and scheduling semantics:
per_task_resource_allocation– logical resources required per task (task-level granularity).max_task_concurrency– maximum number of tasks allowed to run concurrently, if the operator enforces one.min_scheduling_resources– minimum resource bundle required to schedule a worker (e.g., task vs. actor).These methods provide a clearer contract for how operators declare resource requirements, making scheduling behavior easier to reason about and extend.
Related issue number
Checks
git commit -s) in this PR.scripts/format.shto lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/under thecorresponding
.rstfile.