[Data] Disable Limit Pushdown rule in new execution plan optimizer #36831
Merged
raulchen merged 1 commit into ray-project:master on Jun 27, 2023
Signed-off-by: Scott Lee <sjl@anyscale.com>
raulchen approved these changes on Jun 27, 2023
arvind-chandra pushed a commit to lmco/ray that referenced this pull request on Aug 31, 2023
bveeramani added a commit that referenced this pull request on Feb 4, 2026

This PR updates the operator fusion rule to fuse `MapBatches` even if they modify the row counts. The intention of this PR is to preserve the historical operator fusion behavior and avoid introducing regressions. For more details, see the timeline below.

---

### Timeline of Changes

| Date | Event | Description |
| :--- | :--- | :--- |
| **June 8, 2023** | **Limit pushdown added** | Added limit pushdown and a property to `MapBatches` incorrectly stating it doesn't modify row counts. (#35950) |
| **June 27, 2023** | **Limit pushdown disabled** | Rule disabled because it incorrectly pushed limits past UDFs that modified row counts. (#36831) |
| **April 28, 2025** | **Fusion restricted** | Added logic to stop fusing operators that modify row counts when the downstream has a batch size. `MapBatches` stayed fused only because of its incorrect property. (#52570) |
| **July 8, 2025** | **Limit pushdown re-enabled with special case** | Re-enabled with a special case to prevent pushing limits past `MapBatches`. (#39486) |
| **Oct 24, 2025** | **Special case removed** | Special case removed, re-introducing the bug where limits are pushed past `MapBatches`. (#57880) |
| **Feb 2, 2026** | **Property fix** | Updated `MapBatches` to correctly report it modifies rows by default. This fixed the pushdown bug but broke fusion logic. (#60448) |
| **Feb 4, 2026** | (This PR) | Add a special case to preserve the historical `MapBatches` fusion behavior. |

---

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
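The timeline above turns on a single per-operator property that optimizer rules consult. The following is a minimal sketch of that idea over a toy plan, not Ray's actual rule: the `Op` class, the `can_modify_num_rows` field, and `push_limit_down` are hypothetical names chosen for illustration. It shows how a wrongly reported property lets a limit move past `MapBatches`, and how the corrected property blocks the move.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Op:
    """Toy logical operator (hypothetical, for illustration only)."""
    name: str
    # Hypothetical flag: True if the operator may emit a different
    # number of rows than it receives.
    can_modify_num_rows: bool


def push_limit_down(plan: List[Op]) -> List[Op]:
    """Move a trailing Limit upstream past operators that keep row counts.

    Assumes plan[0] is the data source, which the limit never passes.
    """
    plan = list(plan)
    i = len(plan) - 1
    if i < 0 or plan[i].name != "Limit":
        return plan
    # Swap the Limit with its upstream neighbour while that neighbour
    # reports that it preserves the number of rows.
    while i > 1 and not plan[i - 1].can_modify_num_rows:
        plan[i - 1], plan[i] = plan[i], plan[i - 1]
        i -= 1
    return plan


def names(plan: List[Op]) -> List[str]:
    return [op.name for op in plan]


read = Op("Read", can_modify_num_rows=False)
limit = Op("Limit", can_modify_num_rows=True)

# MapBatches wrongly reporting that it never changes row counts.
wrong = [read, Op("MapBatches", can_modify_num_rows=False), limit]
# MapBatches correctly reporting that it may change row counts.
fixed = [read, Op("MapBatches", can_modify_num_rows=True), limit]

print(names(push_limit_down(wrong)))  # Limit moves ahead of MapBatches
print(names(push_limit_down(fixed)))  # plan is left unchanged
```

Because one flag can feed several rules, correcting it for one rule can change the behavior of another, which is the kind of interaction the timeline describes between pushdown and fusion.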
Why are these changes needed?
The Limit Pushdown optimization rule, originally implemented in #35950, assumes that `Map` and `MapBatches` operators do not change the number of rows between input and output. Nothing currently enforces this condition, so if a UDF violates the row count invariant, applying Limit Pushdown can produce incorrect output.

Adding the row check was expected to be a relatively small change in #36295, but it spawned additional discussion about whether we should enforce the invariant in the first place (particularly around filtering with `map_batches`, current users' typical use cases, etc.).

We will delay implementing the row count invariant check until Ray 2.7. For the Ray 2.6 release, we disable the Limit Pushdown rule and will re-enable it once the row count invariant discussion is resolved.
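To make the failure mode concrete, here is a small plain-Python sketch (it does not use Ray; `map_batches`, `limit`, `add_one`, and `keep_even` are stand-in helpers written for this illustration). It compares applying the limit after the map with applying it before, first for a UDF that keeps the row count and then for one that filters rows.

```python
from typing import Callable, List

Batch = List[int]


def map_batches(rows: List[int], fn: Callable[[Batch], Batch], batch_size: int) -> List[int]:
    """Apply fn to consecutive batches and concatenate the results."""
    out: List[int] = []
    for start in range(0, len(rows), batch_size):
        out.extend(fn(rows[start:start + batch_size]))
    return out


def limit(rows: List[int], n: int) -> List[int]:
    """Keep only the first n rows."""
    return rows[:n]


def add_one(batch: Batch) -> Batch:
    # Row-preserving UDF: one output row per input row.
    return [x + 1 for x in batch]


def keep_even(batch: Batch) -> Batch:
    # Filtering UDF: may return fewer rows than it receives.
    return [x for x in batch if x % 2 == 0]


rows = list(range(10))

# Row-preserving UDF: limiting before or after the map gives the same rows.
original = limit(map_batches(rows, add_one, batch_size=4), 3)
pushed = map_batches(limit(rows, 3), add_one, batch_size=4)
assert original == pushed

# Filtering UDF: pushing the limit upstream changes the result.
original_f = limit(map_batches(rows, keep_even, batch_size=4), 3)
pushed_f = map_batches(limit(rows, 3), keep_even, batch_size=4)
print(original_f)  # [0, 2, 4]
print(pushed_f)    # [0, 2]
```

With the filtering UDF the pushed-down plan returns two rows instead of the three the user asked for, which is why the rule is only safe when the row count invariant holds.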
Related issue number
Closes #36821
Checks
- I've signed off every commit (`git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- If I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.