[rllib] Decreases log quantity for learning tests #59005

Merged
ArturNiederfahrenhorst merged 7 commits into ray-project:master from
pseudo-rnd-thoughts:reduce-learning-log-freq
Jan 9, 2026

@pseudo-rnd-thoughts
Member

@pseudo-rnd-thoughts pseudo-rnd-thoughts commented Nov 26, 2025

Description

Our test logs are often extremely long. This PR aims to shorten them with three changes:

  1. By default, the CLIReporter in run_rllib_example_script_experiment reports an algorithm's training results at most once every 5 seconds. This PR adds a --tune-max-report-freq argument that stays at 5 seconds for end users and is set to 30 seconds in tests.
  2. Change the verbosity of the Tune results from 2 to 1 when testing.
  3. Remove the "WARNING impala_learner.py:576 -- No old learner state to remove from the queue." warnings.
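
For readers unfamiliar with the reporter setting, here is a minimal sketch of how a flag like this could be wired to Tune's CLIReporter. It is not the code from this PR: the helper names build_parser and make_reporter are hypothetical, and the int type and help text are assumptions. Only the flag name, the default of 5, and the test value of 30 come from the description above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing only the flag discussed in this PR."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--tune-max-report-freq",
        type=int,  # assumption: the value is a whole number of seconds
        default=5,  # default kept at 5 seconds for end users
        help="Minimum number of seconds between CLIReporter progress tables.",
    )
    return parser


def make_reporter(args: argparse.Namespace):
    """Build a CLIReporter that prints at most once per configured interval."""
    # Imported lazily so the argument parsing above runs without Ray installed.
    from ray.tune import CLIReporter

    return CLIReporter(max_report_frequency=args.tune_max_report_freq)


# Tests would pass the larger value to cut down on log output.
test_args = build_parser().parse_args(["--tune-max-report-freq=30"])
default_args = build_parser().parse_args([])
```

The reporter returned by make_reporter would then be handed to Tune as the progress reporter of the run configuration. The verbosity change in item 2 is a separate setting and is not shown here.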

Signed-off-by: Mark Towers <mark@anyscale.com>
@pseudo-rnd-thoughts pseudo-rnd-thoughts added the rllib RLlib related issues label Nov 26, 2025
@pseudo-rnd-thoughts
Member Author

Having reviewed the Buildkite logs, I now believe the problem isn't the Tune CLIReporter frequency, as I originally thought, but the printing of the results after each training iteration.

@github-actions github-actions bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Dec 11, 2025
@github-actions github-actions bot closed this Dec 26, 2025
@pseudo-rnd-thoughts pseudo-rnd-thoughts added unstale A PR that has been marked unstale. It will not get marked stale again if this label is on it. and removed stale The issue is stale. It will be closed within 7 days unless there are further conversation labels Dec 29, 2025
Mark Towers added 2 commits January 7, 2026 19:14
# Conflicts:
#	rllib/utils/test_utils.py
Signed-off-by: Mark Towers <mark@anyscale.com>
@pseudo-rnd-thoughts pseudo-rnd-thoughts marked this pull request as ready for review January 7, 2026 19:39
@pseudo-rnd-thoughts pseudo-rnd-thoughts requested a review from a team as a code owner January 7, 2026 19:39
Signed-off-by: Mark Towers <mark@anyscale.com>
@pseudo-rnd-thoughts pseudo-rnd-thoughts changed the title [rllib] Increase log frequency for learning tests [rllib] Decreases log quantity for learning tests Jan 7, 2026
],
# Include the offline data files.
data = [
"--tune-max-report-freq=30",
Argument incorrectly placed in Bazel data instead of args

High Severity

The --tune-max-report-freq=30 argument is added to the data attribute instead of the args attribute in multiple py_test rules. In Bazel, data specifies data files needed at runtime while args specifies command-line arguments. This affects 10 test definitions: learning_tests_cartpole_bc, learning_tests_cartpole_bc_gpu, learning_tests_cartpole_bc_with_offline_evaluation, learning_tests_cartpole_bc_with_offline_evaluation_gpu, learning_tests_pendulum_cql, learning_tests_pendulum_cql_gpu, learning_tests_pendulum_iql, learning_tests_pendulum_iql_gpu, learning_tests_cartpole_marwil, and learning_tests_cartpole_marwil_gpu. The argument won't be passed to the scripts and Bazel may fail trying to find a file with that name.
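
To make the distinction concrete, here is a small sketch of the wrong and the corrected placement. BUILD files use Starlark, which shares Python's call syntax, so the fragment defines a stand-in py_test to run as plain Python. The target names, source file, and data path are hypothetical; only the flag itself comes from this PR. The misplaced_flags helper is an illustrative check, not part of Bazel.

```python
# Stand-in for Bazel's py_test rule so this fragment runs as plain Python.
# In a real BUILD file the rule is provided by Bazel, not defined locally.
RULES = {}


def py_test(name, **attrs):
    """Record the attributes of a (pretend) test target."""
    RULES[name] = attrs


def misplaced_flags(attrs):
    """Return entries in `data` that look like command-line flags."""
    return [entry for entry in attrs.get("data", []) if entry.startswith("--")]


# Placement flagged in the review: the flag sits among the runtime data files.
py_test(
    name = "example_wrong",  # hypothetical target name
    srcs = ["example.py"],  # hypothetical source file
    data = [
        "--tune-max-report-freq=30",
        "hypothetical/offline_data",  # hypothetical data path
    ],
)

# Corrected placement: flags go in `args`, files stay in `data`.
py_test(
    name = "example_fixed",  # hypothetical target name
    srcs = ["example.py"],
    args = ["--tune-max-report-freq=30"],
    data = ["hypothetical/offline_data"],
)
```

Running misplaced_flags over each recorded rule reports the flag for example_wrong and nothing for example_fixed.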


Signed-off-by: Mark Towers <mark@anyscale.com>
Contributor


nit: we can create the CLIReporter regardless of the value of args.num_agents (L658).

Member Author


Done

Contributor

@kamil-kaczmarek kamil-kaczmarek left a comment


LGTM! Just left a small nit to improve.

Mark Towers added 2 commits January 8, 2026 09:44
Signed-off-by: Mark Towers <mark@anyscale.com>
Signed-off-by: Mark Towers <mark@anyscale.com>
@pseudo-rnd-thoughts pseudo-rnd-thoughts added the go add ONLY when ready to merge, run all tests label Jan 8, 2026
@ArturNiederfahrenhorst ArturNiederfahrenhorst merged commit bbb55ac into ray-project:master Jan 9, 2026
7 checks passed
AYou0207 pushed a commit to AYou0207/ray that referenced this pull request Jan 13, 2026
lee1258561 pushed a commit to pinterest/ray that referenced this pull request Feb 3, 2026
ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Feb 3, 2026
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026