[rllib] Hyperparameter Optimisation Example by pseudo-rnd-thoughts · Pull Request #60182 · ray-project/ray

pseudo-rnd-thoughts · 2026-01-15T19:10:49Z

Description

This PR adds an HPO example to RLlib using HyperOpt and APPO + CartPole.

Do we want this tested in nightly or premerge?

Signed-off-by: Mark Towers <mark@anyscale.com>

rllib/examples/ray_tune/cartpole_hyperopt.py

gemini-code-assist

Code Review

This pull request introduces a new example script for hyperparameter optimization using HyperOpt with APPO on the CartPole-v1 environment. The script is well-structured and provides a clear demonstration of using Ray Tune for HPO in RLlib. I've included a few minor suggestions to correct typos in the documentation and improve a hyperparameter range for better clarity and convention.


kamil-kaczmarek · 2026-01-16T00:50:56Z

Excellent example, showing RLlib and Ray Tune together!

I'd test this as a premerge, like this:

py_test(
    name = "learning_tests_ray_tune_cartpole_hyperopt_multi_gpu",
    size = "large",
    srcs = ["examples/ray_tune/cartpole_hyperopt.py"],
    args = [
        ...
    ],
    main = "examples/ray_tune/cartpole_hyperopt.py",
    tags = [
        "exclusive",
        "learning_tests",
        "multi_gpu",
        "team:rllib",
    ],
)

With the multi_gpu tag we have 4 GPUs and 48 CPUs available, so we can run this HPO job on a single-node cluster.
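As a rough illustration of the capacity reasoning above, the sketch below computes how many trials fit on one node at the same time. The node size (4 GPUs, 48 CPUs) comes from the comment; the per-trial request of 1 GPU and 8 CPUs is a hypothetical figure chosen for illustration, not a value taken from the example script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpus: int
    gpus: int


def max_concurrent_trials(node: Resources, per_trial: Resources) -> int:
    """Return how many trials fit on the node at once."""
    limits = []
    if per_trial.cpus > 0:
        limits.append(node.cpus // per_trial.cpus)
    if per_trial.gpus > 0:
        limits.append(node.gpus // per_trial.gpus)
    if not limits:
        raise ValueError("per_trial must request at least one CPU or GPU")
    # The scarcest resource decides the concurrency.
    return min(limits)


# Node size quoted above for the multi_gpu tag.
node = Resources(cpus=48, gpus=4)
# Hypothetical per-trial request; the real script may ask for something else.
per_trial = Resources(cpus=8, gpus=1)

print(max_concurrent_trials(node, per_trial))  # 4
```

With these assumed numbers the GPUs are the limiting resource, so four trials can run side by side.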

kamil-kaczmarek

Great example! Added a few comments and a suggested testing strategy.

kamil-kaczmarek · 2026-01-16T16:49:42Z

rllib/BUILD.bazel

+    srcs = ["examples/ray_tune/appo_hyperparameter_tune.py"],
+    args = [
+        "--num-samples=6",
+        "--max-concurrent-trials=2",


We can run 4 concurrent trials here; this node has 4 GPUs.

Updated to use 4 GPUs and 12 samples for 3 iterations

kamil-kaczmarek

Last details:

- need to bring back deps = [":conftest"], in the BUILD.bazel
- The py_test target has num-samples=12, while the Python script has num-samples=16. I would use the same value in both places, to simplify. Either 12 or 16 looks good as long as we use the same number.
- I think you wanted to parametrize to run 3 iterations, but it's not in the code yet.
- In the previous PRs we agreed to list args in the BUILD.bazel alphabetically. Can you fix that here?
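The alphabetical-ordering convention for `args` can be checked mechanically. The helper below is a hypothetical sketch (it is not part of the Ray repository) that reports adjacent entries that are out of order, using the two flags visible in the review diff as input.

```python
from typing import List, Tuple


def unsorted_args(args: List[str]) -> List[Tuple[str, str]]:
    """Return adjacent (previous, current) pairs that break alphabetical order."""
    return [(a, b) for a, b in zip(args, args[1:]) if a > b]


# The two flags shown in the review diff, in the order they appeared there.
args = ["--num-samples=6", "--max-concurrent-trials=2"]

print(unsorted_args(args))          # [('--num-samples=6', '--max-concurrent-trials=2')]
print(unsorted_args(sorted(args)))  # []
```

An empty result means the list already follows the convention.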

pseudo-rnd-thoughts · 2026-01-21T10:11:55Z

> need to bring back deps = [":conftest"], in the BUILD.bazel

None of the examples use pytest, so the conftest isn't actually used, which is why I removed it. Why do you think we need it?

> The py_test target has num-samples=12, while the Python script has num-samples=16. I would use the same value in both places, to simplify. Either 12 or 16 looks good as long as we use the same number.

The Python script doesn't have any performance testing, so all we are currently testing is that the script runs. To me, the difference in samples doesn't matter, and running more samples would just waste premerge compute.

> I think you wanted to parametrize to run 3 iterations, but it's not in the code yet.

What do you mean?

> In the previous PRs we agreed to list args in the BUILD.bazel alphabetically. Can you fix that here?

Updated

@kamil-kaczmarek Checking premerge, this example is failing because hyperopt is not installed.

kamil-kaczmarek · 2026-01-21T22:35:02Z

>> need to bring back deps = [":conftest"], in the BUILD.bazel
>
> None of the examples use pytest, so the conftest isn't actually used, which is why I removed it. Why do you think we need it?

Ah, right! I thought you had removed it from all py_tests.

>> The py_test target has num-samples=12, while the Python script has num-samples=16. I would use the same value in both places, to simplify. Either 12 or 16 looks good as long as we use the same number.
>
> The Python script doesn't have any performance testing, so all we are currently testing is that the script runs. To me, the difference in samples doesn't matter, and running more samples would just waste premerge compute.

My comment is about coherence and simplicity. How about 12 in both places?

>> I think you wanted to parametrize to run 3 iterations, but it's not in the code yet.
>
> What do you mean?

I was referring to your comment, where you mentioned: "Updated to use 4 GPUs and 12 samples for 3 iterations".

>> In the previous PRs we agreed to list args in the BUILD.bazel alphabetically. Can you fix that here?
>
> Updated

thx!

> @kamil-kaczmarek Checking premerge, this example is failing because hyperopt is not installed.

How about swapping HyperOpt with BasicVariantGenerator and still wrapping in ConcurrencyLimiter? I'd prefer not to add any extra dependencies to the testing image.
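To make the suggestion concrete, here is a minimal, dependency-free sketch of the idea: a random-search generator wrapped in a limiter that caps the number of in-flight trials. The classes `RandomSearcher` and `Limiter` are hypothetical stand-ins written for this illustration so that it runs without Ray installed; they are not Ray Tune's `BasicVariantGenerator` or `ConcurrencyLimiter`, and the search space is invented.

```python
import random
from typing import Callable, Dict, Optional, Set


class RandomSearcher:
    """Stand-in for a dependency-free random search (not Ray's class)."""

    def __init__(self, space: Dict[str, Callable[[random.Random], float]], seed: int = 0):
        self._space = space
        self._rng = random.Random(seed)

    def suggest(self, trial_id: str) -> Optional[dict]:
        # Draw each hyperparameter independently from its sampler.
        return {name: sample(self._rng) for name, sample in self._space.items()}

    def on_trial_complete(self, trial_id: str) -> None:
        # Random search does not learn from results, so there is nothing to update.
        pass


class Limiter:
    """Caps the number of in-flight suggestions from the wrapped searcher."""

    def __init__(self, searcher: RandomSearcher, max_concurrent: int):
        self._searcher = searcher
        self._max_concurrent = max_concurrent
        self._live: Set[str] = set()

    def suggest(self, trial_id: str) -> Optional[dict]:
        if len(self._live) >= self._max_concurrent:
            return None  # The caller should retry after a trial finishes.
        config = self._searcher.suggest(trial_id)
        if config is not None:
            self._live.add(trial_id)
        return config

    def on_trial_complete(self, trial_id: str) -> None:
        self._live.discard(trial_id)
        self._searcher.on_trial_complete(trial_id)


# Hypothetical search space; the parameter names and ranges are illustrative.
space = {
    "lr": lambda rng: 10 ** rng.uniform(-5, -2),
    "entropy_coeff": lambda rng: rng.uniform(0.0, 0.02),
}
limited = Limiter(RandomSearcher(space, seed=42), max_concurrent=2)

first = limited.suggest("t1")
second = limited.suggest("t2")
blocked = limited.suggest("t3")  # None: two trials are already in flight
limited.on_trial_complete("t1")
third = limited.suggest("t3")    # A slot is free again
print(blocked is None, third is not None)  # True True
```

The design point is that the limiter is independent of the search strategy, so swapping the searcher does not change how concurrency is capped.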

kamil-kaczmarek

LGTM!

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

rllib/examples/ray_tune/appo_hyperparameter_tune.py

## Description

This PR adds an HPO example to RLlib using HyperOpt and APPO + CartPole.

Do we want this tested in nightly or premerge?

---------

Signed-off-by: Mark Towers <mark@anyscale.com>
Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
Co-authored-by: Mark Towers <mark@anyscale.com>
Co-authored-by: Kamil Kaczmarek <kamil@anyscale.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Mark Towers added 3 commits January 15, 2026 17:41

[rllib] Add CartPole HPO example

3a160cf

Signed-off-by: Mark Towers <mark@anyscale.com>

Update documentation

1eb253e

Signed-off-by: Mark Towers <mark@anyscale.com>

Update documentation

f508a0f

Signed-off-by: Mark Towers <mark@anyscale.com>

pseudo-rnd-thoughts marked this pull request as ready for review January 15, 2026 19:10

pseudo-rnd-thoughts requested a review from a team as a code owner January 15, 2026 19:10

pseudo-rnd-thoughts added the alpha, rllib, and rllib-docs-or-examples labels and removed the alpha label Jan 15, 2026

cursor bot reviewed Jan 15, 2026

rllib/examples/ray_tune/cartpole_hyperopt.py (outdated, resolved)

gemini-code-assist bot reviewed Jan 15, 2026

rllib/examples/ray_tune/cartpole_hyperopt.py (outdated, resolved)

rllib/examples/ray_tune/cartpole_hyperopt.py (outdated, resolved)

rllib/examples/ray_tune/cartpole_hyperopt.py (outdated, resolved)

kamil-kaczmarek assigned pseudo-rnd-thoughts and kamil-kaczmarek Jan 15, 2026

kamil-kaczmarek and others added 2 commits January 15, 2026 14:33

Apply suggestion from @gemini-code-assist[bot]

1b2f025

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>

Apply suggestion from @gemini-code-assist[bot]

58aa5c2

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>

kamil-kaczmarek reviewed Jan 15, 2026

rllib/examples/ray_tune/cartpole_hyperopt.py (outdated, resolved)

kamil-kaczmarek and others added 2 commits January 15, 2026 14:55

lint

111215e

Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>

typo

1fea315

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>

kamil-kaczmarek reviewed Jan 16, 2026

rllib/examples/ray_tune/cartpole_hyperopt.py (outdated, resolved)

Apply suggestion from @kamil-kaczmarek

04c04fa

Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>

kamil-kaczmarek reviewed Jan 16, 2026

rllib/examples/ray_tune/appo_hyperparameter_tune.py (outdated, resolved)

cursor bot reviewed Jan 16, 2026

rllib/examples/ray_tune/appo_hyperparameter_tune.py (resolved)

kamil-kaczmarek reviewed Jan 16, 2026

rllib/examples/ray_tune/cartpole_hyperopt.py (outdated, resolved)

Apply suggestion from @kamil-kaczmarek

f8f92f9

Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>

kamil-kaczmarek reviewed Jan 16, 2026

rllib/examples/ray_tune/cartpole_hyperopt.py (outdated, resolved)

Apply suggestion from @kamil-kaczmarek

6f4fe08

Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>

kamil-kaczmarek reviewed Jan 16, 2026

rllib/examples/ray_tune/appo_hyperparameter_tune.py (outdated, resolved)

kamil-kaczmarek reviewed Jan 16, 2026

rllib/examples/ray_tune/cartpole_hyperopt.py (outdated, resolved)

kamil-kaczmarek requested changes Jan 16, 2026

Kamil code review

b8e5ed1

Signed-off-by: Mark Towers <mark@anyscale.com>

Remove GPU example as it already exists

4c2e9e0

Signed-off-by: Mark Towers <mark@anyscale.com>

kamil-kaczmarek reviewed Jan 16, 2026

rllib/BUILD.bazel (resolved)

kamil-kaczmarek reviewed Jan 16, 2026

Minimise Anyscale references

da29e78

Signed-off-by: Mark Towers <mark@anyscale.com>

pseudo-rnd-thoughts added the go add ONLY when ready to merge, run all tests label Jan 20, 2026

kamil-kaczmarek reviewed Jan 20, 2026

rllib/examples/ray_tune/appo_hyperparameter_tune.py (outdated, resolved)

Apply suggestion from @kamil-kaczmarek

726508c

typo Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>

kamil-kaczmarek requested changes Jan 20, 2026

pseudo-rnd-thoughts and others added 2 commits January 21, 2026 12:01

Merge branch 'master' into hpo-example

8df1ea9

Ordered Arguments

c24196b

Signed-off-by: Mark Towers <mark@anyscale.com>

Update implementation

2300bb0

Signed-off-by: Mark Towers <mark@anyscale.com>

kamil-kaczmarek reviewed Jan 22, 2026

rllib/examples/ray_tune/appo_hyperparameter_tune.py (outdated, resolved)

Mark Towers and others added 2 commits January 22, 2026 16:48

Revert storage path

468484e

Signed-off-by: Mark Towers <mark@anyscale.com>

Merge branch 'master' into hpo-example

8864838

kamil-kaczmarek approved these changes Jan 22, 2026

kamil-kaczmarek and others added 2 commits January 22, 2026 20:39

Merge branch 'master' into hpo-example

c994fc1

Increase size and reduce number of samples

09590e4

Signed-off-by: Mark Towers <mark@anyscale.com>

cursor bot reviewed Jan 23, 2026

rllib/examples/ray_tune/appo_hyperparameter_tune.py (resolved)

simonsays1980 merged commit c31b898 into ray-project:master Jan 23, 2026
6 checks passed
