[data] explain optimized #58074
…54857) Signed-off-by: EkinKarabulut <ekarabulut@nvidia.com> Signed-off-by: EkinKarabulut <82878945+EkinKarabulut@users.noreply.github.com> Signed-off-by: Rueian <rueiancsie@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Co-authored-by: fscnick <6858627+fscnick@users.noreply.github.com> Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com> Co-authored-by: Rueian <rueiancsie@gmail.com> Signed-off-by: iamjustinhsu <jhsu@anyscale.com>
…/explain-optimized
Signed-off-by: iamjustinhsu <jhsu@anyscale.com>
Makes sense to me. But is the unoptimized plan needed? 😂
@my-vegetable-has-exploded I think it's nice to have, to see how the plan is being transformed.
"+- Map(<lambda>)\n"
" +- ReadRange\n"
"-------- Physical Plan --------\n"
"Filter[Filter(<lambda>)]\n"
@richardliaw should this be verbose mode?
Hi @iamjustinhsu, maybe we can pick up #57798 here?
Do you mean to combine the PRs? I think we should keep them separate because they serve different purposes, although resolving the merge conflicts will be a bit messy.
alexeykudinkin
left a comment
LGTM, minor comments
python/ray/data/_internal/plan.py
Outdated
convert_fns: List[Callable[[Plan], Plan]] = [
    lambda x: x,
    LogicalOptimizer().optimize,
    create_planner().plan,
    PhysicalOptimizer().optimize,
]
Instead, abstract a base method from `get_optimized_plan` that returns all 4 (so that the function we use here is exactly the same one we use when executing).
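A minimal sketch of this suggestion, using toy stand-ins rather than Ray's real classes: one helper returns the ordered conversion steps, and both the execution path and the `explain` path consume that same list. `get_plan_conversion_fns` and `get_optimized_plan` are names that appear in this thread; their bodies, `all_plan_stages`, and the string-based "plans" are assumptions made purely for illustration.

```python
from typing import Callable, List

# Toy stand-in: a "plan" is just a string here, not a Ray plan object.
Plan = str


def optimize_logical(plan: Plan) -> Plan:
    return f"logical_opt({plan})"


def to_physical(plan: Plan) -> Plan:
    return f"physical({plan})"


def optimize_physical(plan: Plan) -> Plan:
    return f"physical_opt({plan})"


def get_plan_conversion_fns() -> List[Callable[[Plan], Plan]]:
    """Single source of truth for the ordered conversion steps (hypothetical body)."""
    return [optimize_logical, to_physical, optimize_physical]


def get_optimized_plan(logical_plan: Plan) -> Plan:
    """Execution path: run every step and keep only the final plan."""
    plan = logical_plan
    for fn in get_plan_conversion_fns():
        plan = fn(plan)
    return plan


def all_plan_stages(logical_plan: Plan) -> List[Plan]:
    """Explain path: run the same steps, but keep every intermediate plan."""
    stages = [logical_plan]
    for fn in get_plan_conversion_fns():
        stages.append(fn(stages[-1]))
    return stages
```

Because both paths iterate over the same list, the last stage that `explain` would print cannot drift from the plan that actually runs.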
Signed-off-by: iamjustinhsu <jhsu@anyscale.com>
Signed-off-by: iamjustinhsu <jhsu@anyscale.com>
…/explain-optimized
convert_fns = [lambda x: x] + get_plan_conversion_fns()
titles: List[str] = [
    "Logical Plan",
    "Logical Plan (Optimized)",
    "Physical Plan",
    "Physical Plan (Optimized)",
]
- convert_fns = [lambda x: x] + get_plan_conversion_fns()
- titles: List[str] = [
-     "Logical Plan",
-     "Logical Plan (Optimized)",
-     "Physical Plan",
-     "Physical Plan (Optimized)",
- ]
+ titles, plan_transform_fn = zip(*[
+     ("Logical Plan", None),
+     ("Logical Plan (Optimized)", optimize_logical),
+     ("Physical Plan", plan),
+     ("Physical Plan (Optimized)", optimize_physical),
+ ])
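For readers less familiar with the `zip(*pairs)` idiom in this suggestion: it splits a list of `(title, fn)` pairs into two parallel tuples, so each title is declared next to the step that produces it. The sketch below assumes placeholder transform functions (the real ones are not shown in this thread), treats `None` as "no transform", and uses a hypothetical `render_sections` helper.

```python
from typing import Callable, Optional, Tuple


# Placeholder transforms: each one just wraps the plan string it receives.
def optimize_logical(p: str) -> str:
    return f"optimize_logical({p})"


def plan(p: str) -> str:
    return f"plan({p})"


def optimize_physical(p: str) -> str:
    return f"optimize_physical({p})"


titles: Tuple[str, ...]
plan_transform_fns: Tuple[Optional[Callable[[str], str]], ...]
titles, plan_transform_fns = zip(*[
    ("Logical Plan", None),
    ("Logical Plan (Optimized)", optimize_logical),
    ("Physical Plan", plan),
    ("Physical Plan (Optimized)", optimize_physical),
])


def render_sections(current: str) -> str:
    """Apply each transform in order and emit one titled section per stage."""
    sections = []
    for title, fn in zip(titles, plan_transform_fns):
        if fn is not None:
            current = fn(current)
        sections.append(f"-------- {title} --------\n{current}")
    return "\n".join(sections)
```

Keeping title and transform in one pair means adding or reordering a stage touches a single line, instead of two lists that must stay in sync.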
## Description
This PR adds more information to the `explain` API. Previously,
`explain` showed only the unoptimized logical plan and the optimized physical plan.
To make the `explain` output clearer, I introduce 4 types of plans:
- Logical Plan
- Logical Plan (Optimized)
- Physical Plan
- Physical Plan (Optimized)
Example Output
```python
>>> import ray
>>> ray.data.range(1000).select_columns("id").explain()
-------- Logical Plan --------
Project[Project]
+- Read[ReadRange]
-------- Logical Plan (Optimized) --------
Project[Project]
+- Read[ReadRange]
-------- Physical Plan --------
TaskPoolMapOperator[Project]
+- TaskPoolMapOperator[ReadRange]
   +- InputDataBuffer[Input]
-------- Physical Plan (Optimized) --------
TaskPoolMapOperator[ReadRange->Project]
+- InputDataBuffer[Input]
```
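The last section of the example has one operator fewer because `ReadRange` and `Project` appear as a single `ReadRange->Project` operator. The toy sketch below shows the general idea of fusing adjacent map operators; it is an illustration under simplified assumptions, not Ray's actual optimizer rule, and `Op` and `fuse_adjacent_maps` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Op:
    kind: str  # e.g. "TaskPoolMapOperator" or "InputDataBuffer"
    name: str  # e.g. "ReadRange"

    def __str__(self) -> str:
        return f"{self.kind}[{self.name}]"


def fuse_adjacent_maps(ops: List[Op]) -> List[Op]:
    """Merge runs of adjacent map operators; `ops` is ordered upstream first."""
    fused: List[Op] = []
    for op in ops:
        prev = fused[-1] if fused else None
        if prev is not None and prev.kind == op.kind == "TaskPoolMapOperator":
            # Replace the previous map operator with a combined one.
            fused[-1] = Op(op.kind, f"{prev.name}->{op.name}")
        else:
            fused.append(op)
    return fused


physical = [
    Op("InputDataBuffer", "Input"),
    Op("TaskPoolMapOperator", "ReadRange"),
    Op("TaskPoolMapOperator", "Project"),
]
optimized = fuse_adjacent_maps(physical)
print([str(op) for op in optimized])
# ['InputDataBuffer[Input]', 'TaskPoolMapOperator[ReadRange->Project]']
```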
## Related issues
None
## Additional information
None
---------
Signed-off-by: EkinKarabulut <ekarabulut@nvidia.com>
Signed-off-by: EkinKarabulut <82878945+EkinKarabulut@users.noreply.github.com>
Signed-off-by: Rueian <rueiancsie@gmail.com>
Signed-off-by: iamjustinhsu <jhsu@anyscale.com>
Co-authored-by: EkinKarabulut <82878945+EkinKarabulut@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Co-authored-by: fscnick <6858627+fscnick@users.noreply.github.com>
Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com>
Co-authored-by: Rueian <rueiancsie@gmail.com>
Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>