Skip to content

[Data] Progress Reporting Refactor 3#59880

Merged
bveeramani merged 16 commits intoray-project:masterfrom
kyuds:progress-refactor-3
Jan 12, 2026
Merged

[Data] Progress Reporting Refactor 3#59880
bveeramani merged 16 commits intoray-project:masterfrom
kyuds:progress-refactor-3

Conversation

@kyuds
Copy link
Member

@kyuds kyuds commented Jan 6, 2026

Description

Third PR for isolating progress managers / making them easier to work with.

  • Fully unify interfaces for all progress managers
  • Introduce the Noop progress --> basically to use for no-op situations
  • Separate out function for determining which progress manager to use
  • Add in verbose_progress setting for ExecutionOptions. This was missing from previous versions, mb

Related issues

N/A

Additional information

N/A

Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
@kyuds kyuds requested a review from a team as a code owner January 6, 2026 07:16
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request is a good refactoring of the progress reporting logic. It successfully unifies the interfaces for progress managers and centralizes the logic for selecting which manager to use into a new get_progress_manager function. The introduction of Noop progress managers for disabled cases is a clean and effective approach.

I've identified a couple of critical issues where the new verbose_progress parameter is not being passed to the progress manager constructors, which will lead to a TypeError. Additionally, I've pointed out some incorrect type hints in the tqdm progress manager that should be corrected for type safety and consistency.

After addressing these points, this will be a solid improvement to the codebase.

kyuds added 3 commits January 5, 2026 21:19
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
kyuds added 2 commits January 5, 2026 21:44
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
@kyuds kyuds added the go add ONLY when ready to merge, run all tests label Jan 6, 2026
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
@ray-gardener ray-gardener bot added data Ray Data-related issues community-contribution Contributed by the community labels Jan 6, 2026
...


class NoopSubProgressBar(BaseProgressBar):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Should the Noop implementations go into its own module like the other concrete subclasses?

Copy link
Member Author

@kyuds kyuds Jan 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decision was made because 1) its short, like no logic 2) other progress managers also use the NoopSubProgressBar, so its easier to import the base. Open to moving it out if needed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG. Not strongly opinionated

Comment on lines +85 to +93

contains_sub_progress_bars = isinstance(op, SubProgressBarMixin)
sub_progress_bar_enabled = show_op_progress and (
contains_sub_progress_bars or verbose_progress
)

# create operator progress bar
uid = uuid.uuid4()
if sub_progress_bar_enabled:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff introduces a bug.

import time

import ray

ray.data.DataContext.get_current().execution_options.verbose_progress = False

def sleep(row):
    time.sleep(1)
    return row


ray.data.range(100).map(sleep).materialize()

On master, the progress bar shows the operator-level progress bars (expected), but on this PR, the operator-level progress bars are hidden.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, as I read the code, I felt confused about where we create the operator-level progress bars and where we create the operator subprogress bars. I think I felt confused because this diff treats operator-level progress bars as subprogress bars, but that doesn't match my understanding of the concepts

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

operator-level progress bars and operator subprogress bars are inherently different because the former is updated in streaming_executor while the latter is updated all over the place and passed around via TaskContext. I think you get confused with the tqdm implementation because the class TqdmSubProgressBar itself is reused for all tqdm-related progress bars.

for verbose_progress, as explained in later comment, this is fixing a feature that I accidentally broke before, so its more of a bug fix on current master branch vs. me creating a breaking behavior. Again, I am very sorry about this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. TqdmSubProgressBar is both operator-level progress bar, as well as operator sub-progress bars.

Would it make sense to just call the class TqdmProgressBar then? The "Sub" suggested to me that it's related to the sub-progress bars

self._max_name_length = state.get("max_name_length", 100)


class RichExecutionProgressManager(BaseExecutionProgressManager):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a similar bug here where we don't show operator progress if verbose isn't enabled

import time

import ray

ray.data.DataContext.get_current().execution_options.verbose_progress = False
ray.data.DataContext.get_current().enable_rich_progress_bars = True
ray.data.DataContext.get_current().use_ray_tqdm = False

def sleep(row):
    time.sleep(1)
    return row


ray.data.range(100).map(sleep, memory=1).materialize()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok for this, we also discussed this offline on slack, but this was actually a bug that I missed out on a previous PR. This is the intended behavior that I said was going to reintroduce.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's right. I think we should report operator-level progress even if verbose_progress=False, but I guess that's orthogonal to this PR...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think this is more of a product decision. If we want to ignore verbose_progress, we should just remove it to confuse extra handles. Relevant handles already include enable_operator_progress_bars and enable_progress_bars, so verbose_progress does seem like a duplicate. Willing to do a deeper dive into this if this direction seems positive. CC'ing @richardliaw @gvspraveen

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. The current state is kind of a mess.

Open to proposals if you have ideas for how to clean this up

_update_with_conditional_rate(progress, tid, metrics)
self._total.update(self._total_task_id, description=desc, **kwargs)
self.refresh()
time.sleep(0.02)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out-of-scope for this PR, but I think this 0.02 magic number is hard to understand. Could you remind me what happens if we don't have it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comment explaining. TLDR: this delay is needed between rich Live refresh and stop so that the latest data is correctly reflected and rendered (before Live module is stopped)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there are more robust method to achieve this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly tried and failed on multiple attempts. Delay to print output does seem necessary. Will keep this on the back of my head though.

kyuds added 4 commits January 9, 2026 17:27
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
@kyuds kyuds requested a review from bveeramani January 10, 2026 03:36

# create operator progress bar
uid = uuid.uuid4()
if sub_progress_bar_enabled:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Operator progress bars hidden when verbose_progress is disabled

High Severity

The condition sub_progress_bar_enabled = show_op_progress and (contains_sub_progress_bars or verbose_progress) incorrectly gates operator-level progress bar creation. When verbose_progress=False and an operator doesn't implement SubProgressBarMixin, operator-level progress bars are hidden even though show_op_progress is True. Previously, operator-level progress bars were shown whenever show_op_progress was True, regardless of these additional conditions. This causes a regression where standard operator progress bars disappear for most operators.

Additional Locations (1)

Fix in Cursor Fix in Web

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no this is intended behavior...

Copy link
Member

@bveeramani bveeramani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as an incremental improvement

...


class NoopSubProgressBar(BaseProgressBar):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG. Not strongly opinionated

_update_with_conditional_rate(progress, tid, metrics)
self._total.update(self._total_task_id, description=desc, **kwargs)
self.refresh()
time.sleep(0.02)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there are more robust method to achieve this?

self._max_name_length = state.get("max_name_length", 100)


class RichExecutionProgressManager(BaseExecutionProgressManager):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's right. I think we should report operator-level progress even if verbose_progress=False, but I guess that's orthogonal to this PR...

Comment on lines +85 to +93

contains_sub_progress_bars = isinstance(op, SubProgressBarMixin)
sub_progress_bar_enabled = show_op_progress and (
contains_sub_progress_bars or verbose_progress
)

# create operator progress bar
uid = uuid.uuid4()
if sub_progress_bar_enabled:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. TqdmSubProgressBar is both operator-level progress bar, as well as operator sub-progress bars.

Would it make sense to just call the class TqdmProgressBar then? The "Sub" suggested to me that it's related to the sub-progress bars

@bveeramani bveeramani merged commit 1d22032 into ray-project:master Jan 12, 2026
6 checks passed
@kyuds kyuds deleted the progress-refactor-3 branch January 12, 2026 09:10
AYou0207 pushed a commit to AYou0207/ray that referenced this pull request Jan 13, 2026
## Description
Third PR for isolating progress managers / making them easier to work
with.

- Fully unify interfaces for all progress managers
- Introduce the `Noop` progress --> basically to use for no-op
situations
- Separate out function for determining which progress manager to use
- Add in `verbose_progress` setting for `ExecutionOptions`. This was
missing from previous versions, mb

## Related issues
N/A

## Additional information
N/A

---------

Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: jasonwrwang <jasonwrwang@tencent.com>
ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Feb 3, 2026
## Description
Third PR for isolating progress managers / making them easier to work
with.

- Fully unify interfaces for all progress managers
- Introduce the `Noop` progress --> basically to use for no-op
situations
- Separate out function for determining which progress manager to use
- Add in `verbose_progress` setting for `ExecutionOptions`. This was
missing from previous versions, mb

## Related issues
N/A

## Additional information
N/A

---------

Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
## Description
Third PR for isolating progress managers / making them easier to work
with.

- Fully unify interfaces for all progress managers
- Introduce the `Noop` progress --> basically to use for no-op
situations
- Separate out function for determining which progress manager to use
- Add in `verbose_progress` setting for `ExecutionOptions`. This was
missing from previous versions, mb

## Related issues
N/A

## Additional information
N/A

---------

Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
## Description
Third PR for isolating progress managers / making them easier to work
with.

- Fully unify interfaces for all progress managers
- Introduce the `Noop` progress --> basically to use for no-op
situations
- Separate out function for determining which progress manager to use
- Add in `verbose_progress` setting for `ExecutionOptions`. This was
missing from previous versions, mb

## Related issues
N/A

## Additional information
N/A

---------

Signed-off-by: Daniel Shin <kyuseung1016@gmail.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution Contributed by the community data Ray Data-related issues go add ONLY when ready to merge, run all tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants