[Core] Add get_session_name() to RuntimeContext #59469
Conversation
Code Review
This pull request introduces a new public API, get_session_name(), to the RuntimeContext. This is a valuable addition, as it provides a standardized way to access the Ray session name, which is particularly useful for monitoring and filtering Prometheus metrics across different clusters. The implementation is clean, and the accompanying tests are well-written, covering both driver and remote task contexts. My only suggestion is a minor documentation improvement to enhance clarity.
Signed-off-by: Marwan Sarieddine <sarieddine.marwan@gmail.com>
0775366 to a66d3cd
python/ray/runtime_context.py
Outdated
    assert (
        ray.is_initialized()
    ), "Session name is not available because Ray has not been initialized."
This should raise an exception rather than use an assertion.
Assertions are typically used for invariant checking: they should fail only if an unexpected internal system state has occurred.
    - assert (
    -     ray.is_initialized()
    - ), "Session name is not available because Ray has not been initialized."
    + if not ray.is_initialized():
    +     raise RuntimeError("Session name is not available because Ray has not been initialized.")
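To illustrate the reviewer's point, here is a minimal sketch contrasting the two styles. `FakeRuntimeContext`, its `initialized` flag, and `describe_failure` are hypothetical stand-ins so the example runs without Ray; this is not the actual `RuntimeContext` implementation.

```python
class FakeRuntimeContext:
    """Hypothetical stand-in for RuntimeContext (assumption: not Ray's real class)."""

    def __init__(self, initialized: bool, session_name: str = "session_example"):
        self._initialized = initialized
        self._session_name = session_name

    def get_session_name_with_assert(self) -> str:
        # Assert statements are compiled out when Python runs with -O,
        # so this precondition check can silently disappear.
        assert (
            self._initialized
        ), "Session name is not available because Ray has not been initialized."
        return self._session_name

    def get_session_name(self) -> str:
        # An explicit raise is kept regardless of interpreter flags and gives
        # callers a specific exception type to catch.
        if not self._initialized:
            raise RuntimeError(
                "Session name is not available because Ray has not been initialized."
            )
        return self._session_name


def describe_failure(ctx: FakeRuntimeContext) -> str:
    """Return the session name, or a readable error string if unavailable."""
    try:
        return ctx.get_session_name()
    except RuntimeError as exc:
        return f"error: {exc}"
```

With the explicit check, calling code can handle the uninitialized case with `except RuntimeError` instead of catching `AssertionError`, which is usually reserved for internal bugs.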
Makes sense - FWIW I copied this assertion from the already implemented get_worker_id and get_node_id
I know it is technically out of scope for this PR but I will add a commit fixing the get_worker_id, get_node_id and get_job_id methods to raise exceptions instead of relying on asserts.
Please review the last two commits to resolve 🙏
Thanks @marwan116
Currently, there's no public API to retrieve the Ray session name. Users need to use private APIs like `ray._private.worker.global_worker.node.session_name` or query the dashboard REST API. This makes it difficult to filter Prometheus metrics by cluster when multiple clusters run the same application name, since Ray metrics use the `SessionName` label (which contains the session_name value). --------- Signed-off-by: Marwan Sarieddine <sarieddine.marwan@gmail.com>
Currently, there's no public API to retrieve the Ray session name. Users need to use private APIs like `ray._private.worker.global_worker.node.session_name` or query the dashboard REST API. This makes it difficult to filter Prometheus metrics by cluster when multiple clusters run the same application name, since Ray metrics use the `SessionName` label (which contains the session_name value). --------- Signed-off-by: Marwan Sarieddine <sarieddine.marwan@gmail.com> Signed-off-by: jasonwrwang <jasonwrwang@tencent.com>
Currently, there's no public API to retrieve the Ray session name. Users need to use private APIs like `ray._private.worker.global_worker.node.session_name` or query the dashboard REST API. This makes it difficult to filter Prometheus metrics by cluster when multiple clusters run the same application name, since Ray metrics use the `SessionName` label (which contains the session_name value). --------- Signed-off-by: Marwan Sarieddine <sarieddine.marwan@gmail.com> Signed-off-by: peterxcli <peterxcli@gmail.com>
Description
Currently, there's no public API to retrieve the Ray session name. Users need to use private APIs like `ray._private.worker.global_worker.node.session_name` or query the dashboard REST API. This makes it difficult to filter Prometheus metrics by cluster when multiple clusters run the same application name, since Ray metrics use the `SessionName` label (which contains the session_name value).
Checks
`scripts/format.sh` to lint the code
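The Prometheus filtering use case from the description could look roughly like the sketch below. The helper `build_session_selector`, the metric name, and the session name value are all illustrative assumptions. In a live cluster the session name would presumably come from `ray.get_runtime_context().get_session_name()`, the API this PR adds.

```python
def build_session_selector(metric_name: str, session_name: str) -> str:
    """Build a PromQL selector restricted to one Ray session (hypothetical helper)."""
    # Escape backslashes and double quotes so the value stays a valid
    # PromQL string literal.
    escaped = session_name.replace("\\", "\\\\").replace('"', '\\"')
    return f'{metric_name}{{SessionName="{escaped}"}}'


# Placeholder values (assumptions): with Ray running, the session name would be
# read from ray.get_runtime_context().get_session_name() instead.
session_name = "session_example"
query = build_session_selector("example_metric", session_name)
print(query)
```

The resulting string can be pasted into a Prometheus or Grafana query to separate clusters that run the same application name.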