[Data] - Tabular form for Streaming Executor Op Metrics #59774

Merged
raulchen merged 5 commits into ray-project:master from goutamvenkat-anyscale:goutam/op_metrics_table
Jan 5, 2026
Conversation

@goutamvenkat-anyscale
Contributor

Description

Instead of rendering operator metrics as a large JSON blob, render them in the log as a table for better readability.

Related issues

Link related issues: "Fixes #1234", "Closes #1234", or "Related to #1234".

Additional information

Optional: Add implementation details, API changes, usage examples, screenshots, etc.

Signed-off-by: Goutam <goutam@anyscale.com>
goutamvenkat-anyscale requested a review from a team as a code owner on December 30, 2025 22:50
goutamvenkat-anyscale added the "data" (Ray Data-related issues) and "go" (add ONLY when ready to merge, run all tests) labels on Dec 30, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request improves the readability of operator metrics in logs by rendering them in a tabular format instead of a JSON blob. The implementation is clean and includes a comprehensive test case. I've suggested a minor refactoring to the new formatting function to improve its clarity and efficiency.

Signed-off-by: Goutam <goutam@anyscale.com>
Signed-off-by: Goutam <goutam@anyscale.com>
@goutamvenkat-anyscale
Contributor Author

/gemini summary

@gemini-code-assist
Contributor

Summary of Changes

This pull request makes Streaming Executor operator metrics easier to inspect by replacing the raw JSON log output with a categorized table. The goal is to make debugging and monitoring more efficient, so that users can quickly read the state and performance of each operator.

Highlights

  • Improved Metrics Logging: Operator metrics are now logged in a tabular format for enhanced readability, replacing the previous raw JSON blob output.
  • New Utility Function: A new private utility function, _format_metrics_table, has been introduced to categorize, sort, and format metric dictionaries into a visually organized pivot table.
  • Updated Logging Calls: Existing logging statements for completed operator metrics and general operator metrics have been updated to utilize the new tabular formatting function.
  • New Test Case: A dedicated test case, test_format_metrics_table, has been added to ensure the correct functionality and output of the new metrics formatting utility.
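
The categorize, sort, and format steps described in the highlights can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: the function name format_metrics_table_sketch, the rule that a metric's category is the text before its first underscore, the "(no metrics)" placeholder, and the sample metric names are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def format_metrics_table_sketch(metrics: Dict[str, Any]) -> str:
    """Render a metrics dict as a Category / Metric / Value table.

    Assumption: the category is the part of the key before the first "_".
    The real _format_metrics_table may categorize differently.
    """
    if not metrics:
        return "(no metrics)"  # hypothetical placeholder for the empty case

    # Group (key, value) pairs under their assumed category.
    grouped: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for key, value in metrics.items():
        grouped[key.split("_", 1)[0]].append((key, str(value)))

    # Sort categories and keys; print the category only on its first row.
    rows: List[Tuple[str, str, str]] = []
    for category in sorted(grouped):
        for i, (key, value) in enumerate(sorted(grouped[category])):
            rows.append((category if i == 0 else "", key, value))

    header = ("Category", "Metric", "Value")
    widths = [max(len(r[c]) for r in [header, *rows]) for c in range(3)]

    def border(left: str, mid: str, right: str) -> str:
        return left + mid.join("─" * (w + 2) for w in widths) + right

    def line(cells: Tuple[str, str, str]) -> str:
        body = "│".join(f" {c.ljust(w)} " for c, w in zip(cells, widths))
        return f"│{body}│"

    lines = [border("┌", "┬", "┐"), line(header), border("├", "┼", "┤")]
    lines.extend(line(r) for r in rows)
    lines.append(border("└", "┴", "┘"))
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative metric names only.
    print(format_metrics_table_sketch(
        {"num_tasks_running": 2, "num_tasks_finished": 10, "obj_store_mem_used": 1024}
    ))
```

Computing the column widths from the header and all rows keeps every border and row the same length, which is what makes the output read as a table in a plain-text log.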

Changelog
  • python/ray/data/_internal/execution/streaming_executor.py
    • Modified _scheduling_loop_step to use _format_metrics_table for logging completed operator metrics.
    • Added _format_metrics_table function to format metric dictionaries into a categorized pivot table.
    • Updated _log_op_metrics to also leverage _format_metrics_table for logging operator metrics.
  • python/ray/data/tests/test_streaming_executor.py
    • Imported _format_metrics_table for testing purposes.
    • Added test_format_metrics_table to validate the output of the new metrics formatting function, including an empty dictionary scenario.
Activity
  • The author, goutamvenkat-anyscale, requested a summary of the pull request.
  • A bot provided a review comment suggesting a refactoring of the _format_metrics_table function for improved clarity and conciseness, including a code example.


lines.append(
    f"└{'─' * (cat_width + 2)}┴{'─' * (key_width + 2)}┴{'─' * (val_width + 2)}┘"
)
Contributor

Use "from ray._private.thirdparty.tabulate.tabulate import tabulate" for tabulation instead of building the table borders by hand.

Signed-off-by: Goutam <goutam@anyscale.com>
Signed-off-by: Goutam <goutam@anyscale.com>
raulchen merged commit 815bf7f into ray-project:master on Jan 5, 2026
6 checks passed
AYou0207 pushed a commit to AYou0207/ray that referenced this pull request Jan 13, 2026
…59774)

## Description
Instead of rendering a large json blob for Operator metrics, render the
log in a tabular form for better readability.

## Related issues
> Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to
ray-project#1234".

## Additional information
> Optional: Add implementation details, API changes, usage examples,
screenshots, etc.

---------

Signed-off-by: Goutam <goutam@anyscale.com>
Signed-off-by: jasonwrwang <jasonwrwang@tencent.com>
lee1258561 pushed a commit to pinterest/ray that referenced this pull request Feb 3, 2026
ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Feb 3, 2026
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
Signed-off-by: peterxcli <peterxcli@gmail.com>