[rllib] Hyperparameter Optimisation Example #60182

Merged
simonsays1980 merged 22 commits into ray-project:master from pseudo-rnd-thoughts:hpo-example
Jan 23, 2026
@pseudo-rnd-thoughts
Member

@pseudo-rnd-thoughts pseudo-rnd-thoughts commented Jan 15, 2026

Description

This PR adds an HPO example to RLlib using HyperOpt and APPO + CartPole.

Do we want this tested in nightly or premerge?

Mark Towers added 3 commits January 15, 2026 17:41
Signed-off-by: Mark Towers <mark@anyscale.com>
Signed-off-by: Mark Towers <mark@anyscale.com>
Signed-off-by: Mark Towers <mark@anyscale.com>
@pseudo-rnd-thoughts pseudo-rnd-thoughts marked this pull request as ready for review January 15, 2026 19:10
@pseudo-rnd-thoughts pseudo-rnd-thoughts requested a review from a team as a code owner January 15, 2026 19:10
@pseudo-rnd-thoughts pseudo-rnd-thoughts added the alpha (Alpha release features), rllib (RLlib related issues), and rllib-docs-or-examples (Issues related to RLlib documentation or rllib/examples) labels and removed the alpha (Alpha release features) label Jan 15, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new example script for hyperparameter optimization using HyperOpt with APPO on the CartPole-v1 environment. The script is well-structured and provides a clear demonstration of using Ray Tune for HPO in RLlib. I've included a few minor suggestions to correct typos in the documentation and improve a hyperparameter range for better clarity and convention.

kamil-kaczmarek and others added 2 commits January 15, 2026 14:33
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
kamil-kaczmarek and others added 2 commits January 15, 2026 14:55
Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
@kamil-kaczmarek
Contributor

kamil-kaczmarek commented Jan 16, 2026

Excellent example, showing RLlib and Ray Tune together!

I'd test this as a premerge, like this:

py_test(
    name = "learning_tests_ray_tune_cartpole_hyperopt_multi_gpu",
    size = "large",
    srcs = ["examples/ray_tune/cartpole_hyperopt.py"],
    args = [
        ...
    ],
    main = "examples/ray_tune/cartpole_hyperopt.py",
    tags = [
        "exclusive",
        "learning_tests",
        "multi_gpu",
        "team:rllib",
    ],
)

With the multi_gpu tag we have 4 GPUs and 48 CPUs available, so we can run this HPO job on a single-node cluster.
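
As a rough sketch of the sizing argument above: with 4 GPUs and 48 CPUs on the node, the number of trials that fit at once is bounded by whichever resource runs out first. The helper name `max_concurrent_trials` and the per-trial request of 1 GPU and 12 CPUs are assumptions for illustration; the thread does not state what each trial requests.

```python
def max_concurrent_trials(node_gpus, node_cpus, gpus_per_trial, cpus_per_trial):
    """Return how many trials fit on one node at the same time.

    Hypothetical helper: the per-trial requests are assumptions, not values
    taken from the example script.
    """
    if gpus_per_trial < 0 or cpus_per_trial < 0:
        raise ValueError("per-trial resource requests must be non-negative")
    limits = []
    if gpus_per_trial > 0:
        limits.append(int(node_gpus // gpus_per_trial))
    if cpus_per_trial > 0:
        limits.append(int(node_cpus // cpus_per_trial))
    if not limits:
        raise ValueError("a trial must request at least one resource")
    # The scarcest resource decides how many trials can run side by side.
    return min(limits)


# Node size quoted for the multi_gpu tag: 4 GPUs and 48 CPUs.
# Assumed request per trial: 1 GPU and 12 CPUs.
print(max_concurrent_trials(4, 48, gpus_per_trial=1, cpus_per_trial=12))  # 4
```

Under these assumed requests the GPU count and the CPU count both allow 4 trials, so neither resource sits idle.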

Contributor

@kamil-kaczmarek kamil-kaczmarek left a comment

Great example! Added a few comments and a suggested testing strategy.

Signed-off-by: Mark Towers <mark@anyscale.com>
cursor[bot]

This comment was marked as outdated.

Signed-off-by: Mark Towers <mark@anyscale.com>
cursor[bot]

This comment was marked as outdated.

srcs = ["examples/ray_tune/appo_hyperparameter_tune.py"],
args = [
    "--num-samples=6",
    "--max-concurrent-trials=2",
Contributor

We can run 4; this node has 4 GPUs.

Member Author

Updated to use 4 GPUs and 12 samples for 3 iterations
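
One possible reading of "12 samples for 3 iterations" is that 12 trials run 4 at a time finish in 3 back-to-back waves. That reading is an assumption (it also treats all trials as taking roughly the same time), and `num_waves` is a hypothetical helper, not part of the example script:

```python
import math


def num_waves(num_samples, max_concurrent):
    """Number of back-to-back batches needed to run every trial."""
    if num_samples < 0 or max_concurrent <= 0:
        raise ValueError("num_samples must be >= 0 and max_concurrent must be > 0")
    return math.ceil(num_samples / max_concurrent)


print(num_waves(12, 4))  # 3 (updated setting: 12 samples, 4 concurrent trials)
print(num_waves(6, 2))   # 3 (earlier setting: 6 samples, 2 concurrent trials)
```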

Signed-off-by: Mark Towers <mark@anyscale.com>
@pseudo-rnd-thoughts pseudo-rnd-thoughts added the go (add ONLY when ready to merge, run all tests) label Jan 20, 2026
typo

Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
cursor[bot]

This comment was marked as outdated.

Contributor

@kamil-kaczmarek kamil-kaczmarek left a comment

Last details:

  • We need to bring back deps = [":conftest"], in the BUILD.bazel.
  • The py_test target has num-samples=12, while the Python script has num-samples=16. I would use the same value in both places, to simplify. Either 12 or 16 looks good as long as we use the same number.
  • I think you wanted to parametrize to run 3 iterations, but it's not in the code yet.
  • In the previous PRs we agreed to list args in the BUILD.bazel alphabetically. Can you fix that here?
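
The two mechanical points in this list (the same num-samples in both places, args listed alphabetically) can be checked with a few lines. This is a minimal sketch: the helpers are hypothetical, and the argument lists are illustrative values built from the numbers mentioned in the thread, not the contents of the actual BUILD.bazel or script.

```python
def flag_name(arg):
    """Return the flag part of a '--name=value' argument."""
    return arg.split("=", 1)[0]


def flag_value(args, name):
    """Return the value of the first '--name=value' entry, or None if absent."""
    for arg in args:
        if flag_name(arg) == name and "=" in arg:
            return arg.split("=", 1)[1]
    return None


def is_alphabetical(args):
    """True when the flags appear in alphabetical order."""
    names = [flag_name(a) for a in args]
    return names == sorted(names)


# Illustrative values only; the script's concurrency default is an assumption.
build_args = ["--max-concurrent-trials=4", "--num-samples=12"]
script_defaults = ["--max-concurrent-trials=4", "--num-samples=16"]

print(is_alphabetical(build_args))  # True
same_samples = flag_value(build_args, "--num-samples") == flag_value(
    script_defaults, "--num-samples"
)
print(same_samples)  # False (12 vs 16)
```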

@pseudo-rnd-thoughts
Member Author

pseudo-rnd-thoughts commented Jan 21, 2026

> We need to bring back deps = [":conftest"], in the BUILD.bazel.

None of the examples use pytest, so the conftest isn't actually used, which is why I removed it. Why do you think we need it?

> The py_test target has num-samples=12, while the Python script has num-samples=16. I would use the same value in both places, to simplify. Either 12 or 16 looks good as long as we use the same number.

The Python script doesn't have any performance testing, so all we are currently testing is that the script runs. To me, the difference in samples doesn't matter and just wastes premerge compute.

> I think you wanted to parametrize to run 3 iterations, but it's not in the code yet.

What do you mean?

> In the previous PRs we agreed to list args in the BUILD.bazel alphabetically. Can you fix that here?

Updated

@kamil-kaczmarek Checking premerge, this example is failing due to hyperopt not being installed.

pseudo-rnd-thoughts and others added 2 commits January 21, 2026 12:01
Signed-off-by: Mark Towers <mark@anyscale.com>
@kamil-kaczmarek
Contributor

> > We need to bring back deps = [":conftest"], in the BUILD.bazel.
>
> None of the examples use pytest, so the conftest isn't actually used, which is why I removed it. Why do you think we need it?

Ah, right! I thought you had removed it from all py_tests.

> > The py_test target has num-samples=12, while the Python script has num-samples=16. I would use the same value in both places, to simplify. Either 12 or 16 looks good as long as we use the same number.
>
> The Python script doesn't have any performance testing, so all we are currently testing is that the script runs. To me, the difference in samples doesn't matter and just wastes premerge compute.

My comment is about coherence/simplicity. How about 12 in both places?

> > I think you wanted to parametrize to run 3 iterations, but it's not in the code yet.
>
> What do you mean?

I referred to your comment; you mentioned: "Updated to use 4 GPUs and 12 samples for 3 iterations".

> > In the previous PRs we agreed to list args in the BUILD.bazel alphabetically. Can you fix that here?
>
> Updated

thx!

> @kamil-kaczmarek Checking premerge, this example is failing due to hyperopt not being installed.

How about swapping HyperOpt with BasicVariantGenerator and still wrapping in ConcurrencyLimiter? I'd prefer not to add any extra dependencies to the testing image.
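
A related option, sketched here purely as an illustration, is to check whether the hyperopt package is importable before choosing a search algorithm. The snippet below only picks a label and does not construct any Ray Tune objects; `pick_search_alg` and the returned labels are hypothetical, and the thread does not say whether the merged example does anything like this.

```python
import importlib.util


def pick_search_alg(is_installed=None):
    """Return "hyperopt" when the hyperopt package is importable, else "basic_variant".

    `is_installed` is an optional callable(name) -> bool, which makes the
    choice testable without depending on the real environment.
    """
    if is_installed is None:
        def is_installed(name):
            # find_spec returns None when a top-level package cannot be found.
            return importlib.util.find_spec(name) is not None
    return "hyperopt" if is_installed("hyperopt") else "basic_variant"


print(pick_search_alg(lambda name: False))  # basic_variant
print(pick_search_alg(lambda name: True))   # hyperopt
```

In a real script the two labels would presumably map to the HyperOpt searcher and to BasicVariantGenerator, each limited to the desired number of concurrent trials.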

Signed-off-by: Mark Towers <mark@anyscale.com>
Mark Towers and others added 2 commits January 22, 2026 16:48
Signed-off-by: Mark Towers <mark@anyscale.com>
Contributor

@kamil-kaczmarek kamil-kaczmarek left a comment

LGTM!


@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

@simonsays1980 simonsays1980 merged commit c31b898 into ray-project:master Jan 23, 2026
6 checks passed
xinyuangui2 pushed a commit to xinyuangui2/ray that referenced this pull request Jan 26, 2026
## Description
This PR adds a HPO example to RLlib using HyperOpt and APPO + Cartpole.

Do we want this tested in nightly or premerge?

---------

Signed-off-by: Mark Towers <mark@anyscale.com>
Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
Co-authored-by: Mark Towers <mark@anyscale.com>
Co-authored-by: Kamil Kaczmarek <kamil@anyscale.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
jinbum-kim pushed a commit to jinbum-kim/ray that referenced this pull request Jan 29, 2026
Signed-off-by: jinbum-kim <jinbum9958@gmail.com>
400Ping pushed a commit to 400Ping/ray that referenced this pull request Feb 1, 2026
Signed-off-by: 400Ping <jiekaichang@apache.org>
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
Signed-off-by: peterxcli <peterxcli@gmail.com>
Labels

go (add ONLY when ready to merge, run all tests), rllib (RLlib related issues), rllib-docs-or-examples (Issues related to RLlib documentation or rllib/examples)

3 participants