[rllib] Hyperparameter Optimisation Example #60182
simonsays1980 merged 22 commits into ray-project:master from
Conversation
Signed-off-by: Mark Towers <mark@anyscale.com>
Code Review
This pull request introduces a new example script for hyperparameter optimization using HyperOpt with APPO on the CartPole-v1 environment. The script is well-structured and provides a clear demonstration of using Ray Tune for HPO in RLlib. I've included a few minor suggestions to correct typos in the documentation and improve a hyperparameter range for better clarity and convention.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
Excellent example, showing RLlib and Ray Tune together! I'd test this as a premerge, like this:

py_test(
    name = "learning_tests_ray_tune_cartpole_hyperopt_multi_gpu",
    size = "large",
    srcs = ["examples/ray_tune/cartpole_hyperopt.py"],
    args = [
        ...
    ],
    main = "examples/ray_tune/cartpole_hyperopt.py",
    tags = [
        "exclusive",
        "learning_tests",
        "multi_gpu",
        "team:rllib",
    ],
)

With
kamil-kaczmarek
left a comment
Great example! Added a few comments and suggested a testing strategy.
Signed-off-by: Mark Towers <mark@anyscale.com>
rllib/BUILD.bazel (outdated)

    srcs = ["examples/ray_tune/appo_hyperparameter_tune.py"],
    args = [
        "--num-samples=6",
        "--max-concurrent-trials=2",
This can run 4; this node has 4 GPUs.
Updated to use 4 GPUs and 12 samples for 3 iterations.
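As a side note on trial scheduling, the arithmetic behind these numbers can be checked quickly. Assuming one GPU per trial (an assumption, not stated in the thread), the concurrency cap equals the GPU count and the samples run in sequential "waves" of trials:

```python
import math

# Assumption: each trial reserves one GPU, so 4 GPUs allow 4 concurrent trials.
num_gpus = 4
num_samples = 12
max_concurrent = num_gpus

# 12 samples at a concurrency of 4 run in 3 sequential waves of trials.
waves = math.ceil(num_samples / max_concurrent)
print(waves)  # → 3
```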
Signed-off-by: Mark Towers <mark@anyscale.com>
typo
Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
kamil-kaczmarek
left a comment
Last details:
- Need to bring back deps = [":conftest"], in the BUILD.bazel.
- The py_test has num-samples=12, while the python script has num-samples=16. I would parametrize the same value in both places, to simplify. Either 12 or 16 looks good as long as we use the same number.
- I think you wanted to parametrize it to run 3 iterations, but that's not in the code yet.
- In the previous PRs we agreed to list args in the BUILD.bazel alphabetically. Can you fix that here?
None of the examples use pytest therefore the conftest isn't actually used which is why I removed it. Why do you think we need it?
The python script doesn't have any performance testing; therefore, all we are testing is that the script currently runs. To me, the difference in sample counts doesn't matter and just wastes premerge compute.
What do you mean?
Updated @kamil-kaczmarek. Checking premerge, this example is failing because hyperopt isn't installed.
Signed-off-by: Mark Towers <mark@anyscale.com>
Ach right! Thought that you removed it from all py_tests.
My comment is about coherence/simplicity. How about 12 in both places?
I referred to your comment; you mentioned: "Updated to use 4 GPUs and 12 samples for 3 iterations".
thx!
How about swapping HyperOpt with BasicVariantGenerator and still wrapping in ConcurrencyLimiter? I'd prefer not to add any extra dependencies to the testing image.
Signed-off-by: Mark Towers <mark@anyscale.com>
Signed-off-by: Mark Towers <mark@anyscale.com>
Signed-off-by: Mark Towers <mark@anyscale.com>
## Description

This PR adds a HPO example to RLlib using HyperOpt and APPO + Cartpole.

Do we want this tested in nightly or premerge?

---
Signed-off-by: Mark Towers <mark@anyscale.com>
Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
Co-authored-by: Mark Towers <mark@anyscale.com>
Co-authored-by: Kamil Kaczmarek <kamil@anyscale.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>