[Data] Progress Reporting Refactor 3#59880
Conversation
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
There was a problem hiding this comment.
Code Review
This pull request is a good refactoring of the progress reporting logic. It successfully unifies the interfaces for progress managers and centralizes the logic for selecting which manager to use into a new get_progress_manager function. The introduction of Noop progress managers for disabled cases is a clean and effective approach.
I've identified a couple of critical issues where the new verbose_progress parameter is not being passed to the progress manager constructors, which will lead to a TypeError. Additionally, I've pointed out some incorrect type hints in the tqdm progress manager that should be corrected for type safety and consistency.
After addressing these points, this will be a solid improvement to the codebase.
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
| ... | ||
|
|
||
|
|
||
| class NoopSubProgressBar(BaseProgressBar): |
There was a problem hiding this comment.
Nit: Should the Noop implementations go into its own module like the other concrete subclasses?
There was a problem hiding this comment.
decision was made because 1) its short, like no logic 2) other progress managers also use the NoopSubProgressBar, so its easier to import the base. Open to moving it out if needed
There was a problem hiding this comment.
SG. Not strongly opinionated
|
|
||
| contains_sub_progress_bars = isinstance(op, SubProgressBarMixin) | ||
| sub_progress_bar_enabled = show_op_progress and ( | ||
| contains_sub_progress_bars or verbose_progress | ||
| ) | ||
|
|
||
| # create operator progress bar | ||
| uid = uuid.uuid4() | ||
| if sub_progress_bar_enabled: |
There was a problem hiding this comment.
This diff introduces a bug.
import time
import ray
ray.data.DataContext.get_current().execution_options.verbose_progress = False
def sleep(row):
time.sleep(1)
return row
ray.data.range(100).map(sleep).materialize()On master, the progress bar shows the operator-level progress bars (expected), but on this PR, the operator-level progress bars are hidden.
There was a problem hiding this comment.
Also, as I read the code, I felt confused about where we create the operator-level progress bars and where we create the operator subprogress bars. I think I felt confused because this diff treats operator-level progress bars as subprogress bars, but that doesn't match my understanding of the concepts
There was a problem hiding this comment.
operator-level progress bars and operator subprogress bars are inherently different because the former is updated in streaming_executor while the latter is updated all over the place and passed around via TaskContext. I think you get confused with the tqdm implementation because the class TqdmSubProgressBar itself is reused for all tqdm-related progress bars.
for verbose_progress, as explained in later comment, this is fixing a feature that I accidentally broke before, so its more of a bug fix on current master branch vs. me creating a breaking behavior. Again, I am very sorry about this.
There was a problem hiding this comment.
I see. TqdmSubProgressBar is both operator-level progress bar, as well as operator sub-progress bars.
Would it make sense to just call the class TqdmProgressBar then? The "Sub" suggested to me that it's related to the sub-progress bars
| self._max_name_length = state.get("max_name_length", 100) | ||
|
|
||
|
|
||
| class RichExecutionProgressManager(BaseExecutionProgressManager): |
There was a problem hiding this comment.
I think there's a similar bug here where we don't show operator progress if verbose isn't enabled
import time
import ray
ray.data.DataContext.get_current().execution_options.verbose_progress = False
ray.data.DataContext.get_current().enable_rich_progress_bars = True
ray.data.DataContext.get_current().use_ray_tqdm = False
def sleep(row):
time.sleep(1)
return row
ray.data.range(100).map(sleep, memory=1).materialize()There was a problem hiding this comment.
ok for this, we also discussed this offline on slack, but this was actually a bug that I missed out on a previous PR. This is the intended behavior that I said was going to reintroduce.
There was a problem hiding this comment.
Oh, that's right. I think we should report operator-level progress even if verbose_progress=False, but I guess that's orthogonal to this PR...
There was a problem hiding this comment.
think this is more of a product decision. If we want to ignore verbose_progress, we should just remove it to confuse extra handles. Relevant handles already include enable_operator_progress_bars and enable_progress_bars, so verbose_progress does seem like a duplicate. Willing to do a deeper dive into this if this direction seems positive. CC'ing @richardliaw @gvspraveen
There was a problem hiding this comment.
Yeah. The current state is kind of a mess.
Open to proposals if you have ideas for how to clean this up
| _update_with_conditional_rate(progress, tid, metrics) | ||
| self._total.update(self._total_task_id, description=desc, **kwargs) | ||
| self.refresh() | ||
| time.sleep(0.02) |
There was a problem hiding this comment.
Out-of-scope for this PR, but I think this 0.02 magic number is hard to understand. Could you remind me what happens if we don't have it?
There was a problem hiding this comment.
added comment explaining. TLDR: this delay is needed between rich Live refresh and stop so that the latest data is correctly reflected and rendered (before Live module is stopped)
There was a problem hiding this comment.
Is there are more robust method to achieve this?
There was a problem hiding this comment.
I honestly tried and failed on multiple attempts. Delay to print output does seem necessary. Will keep this on the back of my head though.
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
|
|
||
| # create operator progress bar | ||
| uid = uuid.uuid4() | ||
| if sub_progress_bar_enabled: |
There was a problem hiding this comment.
Operator progress bars hidden when verbose_progress is disabled
High Severity
The condition sub_progress_bar_enabled = show_op_progress and (contains_sub_progress_bars or verbose_progress) incorrectly gates operator-level progress bar creation. When verbose_progress=False and an operator doesn't implement SubProgressBarMixin, operator-level progress bars are hidden even though show_op_progress is True. Previously, operator-level progress bars were shown whenever show_op_progress was True, regardless of these additional conditions. This causes a regression where standard operator progress bars disappear for most operators.
Additional Locations (1)
There was a problem hiding this comment.
no this is intended behavior...
bveeramani
left a comment
There was a problem hiding this comment.
LGTM as an incremental improvement
| ... | ||
|
|
||
|
|
||
| class NoopSubProgressBar(BaseProgressBar): |
There was a problem hiding this comment.
SG. Not strongly opinionated
| _update_with_conditional_rate(progress, tid, metrics) | ||
| self._total.update(self._total_task_id, description=desc, **kwargs) | ||
| self.refresh() | ||
| time.sleep(0.02) |
There was a problem hiding this comment.
Is there are more robust method to achieve this?
| self._max_name_length = state.get("max_name_length", 100) | ||
|
|
||
|
|
||
| class RichExecutionProgressManager(BaseExecutionProgressManager): |
There was a problem hiding this comment.
Oh, that's right. I think we should report operator-level progress even if verbose_progress=False, but I guess that's orthogonal to this PR...
|
|
||
| contains_sub_progress_bars = isinstance(op, SubProgressBarMixin) | ||
| sub_progress_bar_enabled = show_op_progress and ( | ||
| contains_sub_progress_bars or verbose_progress | ||
| ) | ||
|
|
||
| # create operator progress bar | ||
| uid = uuid.uuid4() | ||
| if sub_progress_bar_enabled: |
There was a problem hiding this comment.
I see. TqdmSubProgressBar is both operator-level progress bar, as well as operator sub-progress bars.
Would it make sense to just call the class TqdmProgressBar then? The "Sub" suggested to me that it's related to the sub-progress bars
## Description Third PR for isolating progress managers / making them easier to work with. - Fully unify interfaces for all progress managers - Introduce the `Noop` progress --> basically to use for no-op situations - Separate out function for determining which progress manager to use - Add in `verbose_progress` setting for `ExecutionOptions`. This was missing from previous versions, mb ## Related issues N/A ## Additional information N/A --------- Signed-off-by: Daniel Shin <kyuseung1016@gmail.com> Signed-off-by: jasonwrwang <jasonwrwang@tencent.com>
## Description Third PR for isolating progress managers / making them easier to work with. - Fully unify interfaces for all progress managers - Introduce the `Noop` progress --> basically to use for no-op situations - Separate out function for determining which progress manager to use - Add in `verbose_progress` setting for `ExecutionOptions`. This was missing from previous versions, mb ## Related issues N/A ## Additional information N/A --------- Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
## Description Third PR for isolating progress managers / making them easier to work with. - Fully unify interfaces for all progress managers - Introduce the `Noop` progress --> basically to use for no-op situations - Separate out function for determining which progress manager to use - Add in `verbose_progress` setting for `ExecutionOptions`. This was missing from previous versions, mb ## Related issues N/A ## Additional information N/A --------- Signed-off-by: Daniel Shin <kyuseung1016@gmail.com> Signed-off-by: peterxcli <peterxcli@gmail.com>
## Description Third PR for isolating progress managers / making them easier to work with. - Fully unify interfaces for all progress managers - Introduce the `Noop` progress --> basically to use for no-op situations - Separate out function for determining which progress manager to use - Add in `verbose_progress` setting for `ExecutionOptions`. This was missing from previous versions, mb ## Related issues N/A ## Additional information N/A --------- Signed-off-by: Daniel Shin <kyuseung1016@gmail.com> Signed-off-by: peterxcli <peterxcli@gmail.com>
Description
Third PR for isolating progress managers / making them easier to work with.
Noopprogress --> basically to use for no-op situationsverbose_progresssetting forExecutionOptions. This was missing from previous versions, mbRelated issues
N/A
Additional information
N/A