Respect multi-GPU outputs in nvidia-smi #15460

charliermarsh · 2025-08-22T17:27:50Z

Summary

An earlier revision of this PR also included NVIDIA_VISIBLE_DEVICES masking; it has since been removed for simplicity.
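For context, here is a minimal sketch of what handling multi-GPU output can look like. It is not the code from this PR. It assumes the driver version is queried with something like `nvidia-smi --query-gpu=driver_version --format=csv,noheader`, which prints one line per GPU; the function name `parse_driver_version` and the sample version string are hypothetical.

```python
from typing import Optional


def parse_driver_version(output: str) -> Optional[str]:
    """Return a driver version from (possibly multi-line) nvidia-smi output.

    Assumption: the output holds one version string per GPU, one per line.
    The first non-empty line is used, on the expectation that every GPU on
    the machine reports the same driver version.
    """
    for line in output.splitlines():
        version = line.strip()
        if version:
            return version
    return None


# Hypothetical output from a machine with two GPUs.
sample = "572.60\n572.60\n"
print(parse_driver_version(sample))  # prints 572.60
```

Taking the first non-empty line avoids failing when more than one line is present, while still returning `None` for empty output.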

Closes #14647.

geofft

This seems fine, but some nitpicks:

- I don't actually think there's a point to us parsing NVIDIA_VISIBLE_DEVICES here. This variable is specifically used by nvidia-container-toolkit to determine which devices ought to be exposed inside the container. It doesn't seem to be something that's used in non-container tools at all. I think the reference to it in #14647 was just mentioning the lack of ability to use this as a workaround for being unable to parse multiple lines, but if we handle multiple lines I'm not sure we need the workaround. It doesn't really matter what we do here since the driver version ought to be the same for all lines of output, but if we extend this code to be about compute capabilities etc., I think we should put a tad more thought into whether we want this to be the interface, since I think we would be novel in using this environment variable in a non-container tool (e.g. I don't think that nvidia-variant-provider uses it).
- Somewhat weirdly, the parsing for the variable in nvidia-container-toolkit appears to allow all/none/void to be individual elements in the comma-separated list, as opposed to having to be the entire string, e.g., NVIDIA_VISIBLE_DEVICES=2,all is accepted and interpreted as all. See https://github.com/NVIDIA/nvidia-container-toolkit/blob/v1.17.8/internal/config/image/cuda_image.go#L123-L154 which splits on commas and passes a list to https://github.com/NVIDIA/nvidia-container-toolkit/blob/v1.17.8/internal/config/image/devices.go which loops through the list looking for these special keywords.
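The keyword behaviour described above can be sketched roughly as follows. This is an illustration in Python, not a port of the nvidia-container-toolkit code: the function name `parse_visible_devices` is hypothetical, and the rule that the first keyword found in the list wins is an assumption. The only case taken from the comment itself is that `2,all` is interpreted as `all`.

```python
from typing import List, Union

# Special values that may appear as individual list elements.
KEYWORDS = ("all", "none", "void")


def parse_visible_devices(value: str) -> Union[str, List[str]]:
    """Interpret a NVIDIA_VISIBLE_DEVICES-style string.

    Splits on commas, then scans the elements for a special keyword.
    Assumption: the first keyword encountered decides the result.
    If no keyword is present, the device identifiers are returned.
    """
    items = [item.strip() for item in value.split(",")]
    for item in items:
        if item in KEYWORDS:
            return item
    return [item for item in items if item]


print(parse_visible_devices("2,all"))  # prints all
print(parse_visible_devices("0,1"))    # prints ['0', '1']
```

A parser that required the keyword to be the entire string would instead treat `2,all` as two device identifiers, which is the difference the comment points out.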

charliermarsh · 2025-11-02T21:05:48Z

Okay, sounds good. I removed the NVIDIA_VISIBLE_DEVICES parsing for now.

This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [astral-sh/uv](https://github.com/astral-sh/uv) | patch | `0.9.7` -> `0.9.8` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot). **Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>astral-sh/uv (astral-sh/uv)</summary>

### [`v0.9.8`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#098)

[Compare Source](astral-sh/uv@0.9.7...0.9.8)

Released on 2025-11-07.

##### Enhancements

- Accept multiple packages in `uv export` ([#16603](astral-sh/uv#16603))
- Accept multiple packages in `uv sync` ([#16543](astral-sh/uv#16543))
- Add a `uv cache size` command ([#16032](astral-sh/uv#16032))
- Add prerelease guidance for build-system resolution failures ([#16550](astral-sh/uv#16550))
- Allow Python requests to include `+gil` to require a GIL-enabled interpreter ([#16537](astral-sh/uv#16537))
- Avoid pluralizing 'retry' for single value ([#16535](astral-sh/uv#16535))
- Enable first-class dependency exclusions ([#16528](astral-sh/uv#16528))
- Fix inclusive constraints on available package versions in resolver errors ([#16629](astral-sh/uv#16629))
- Improve `uv init` error for invalid directory names ([#16554](astral-sh/uv#16554))
- Show help on `uv build -h` ([#16632](astral-sh/uv#16632))
- Include the Python variant suffix in "Using Python ..." messages ([#16536](astral-sh/uv#16536))
- Log most recently modified file for cache-keys ([#16338](astral-sh/uv#16338))
- Update Docker builds to use nightly Rust toolchain with musl v1.2.5 ([#16584](astral-sh/uv#16584))
- Add GitHub attestations for uv release artifacts ([#11357](astral-sh/uv#11357))

##### Configuration

- Expose `UV_NO_GROUP` as an environment variable ([#16529](astral-sh/uv#16529))
- Add `UV_NO_SOURCES` as an environment variable ([#15883](astral-sh/uv#15883))

##### Bug fixes

- Allow `--check` and `--locked` to be used together in `uv lock` ([#16538](astral-sh/uv#16538))
- Allow for unnormalized names in the METADATA file ([#16547](astral-sh/uv#16547)) ([#16548](astral-sh/uv#16548))
- Fix missing value\_type for `default-groups` in schema ([#16575](astral-sh/uv#16575))
- Respect multi-GPU outputs in `nvidia-smi` ([#15460](astral-sh/uv#15460))
- Fix DNS lookup errors in Docker containers ([#8450](astral-sh/uv#8450))

##### Documentation

- Fix typo in uv tool list doc ([#16625](astral-sh/uv#16625))
- Note `uv pip list` name normalization in docs ([#13210](astral-sh/uv#13210))

##### Other changes

- Update Rust toolchain to 1.91 and MSRV to 1.89 ([#16531](astral-sh/uv#16531))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

- [ ] If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

charliermarsh requested a review from geofft August 22, 2025 17:27

charliermarsh added the bug Something isn't working label Aug 22, 2025

charliermarsh marked this pull request as ready for review August 22, 2025 17:28

Respect multi-GPU outputs in nvidia-smi

eda8349

charliermarsh force-pushed the charlie/multi branch from d4f5a2d to eda8349 Compare August 22, 2025 17:28

charliermarsh temporarily deployed to uv-test-registries August 22, 2025 17:35 — with GitHub Actions Inactive

geofft approved these changes Oct 9, 2025

Merge branch 'main' into charlie/multi

8b53b59

charliermarsh enabled auto-merge (squash) November 2, 2025 21:07

charliermarsh disabled auto-merge November 2, 2025 21:07

charliermarsh had a problem deploying to uv-test-registries November 2, 2025 21:07 — with GitHub Actions Error

charliermarsh enabled auto-merge (squash) November 2, 2025 21:07

Remove visible devices

3f8adaf

charliermarsh force-pushed the charlie/multi branch from 06aa57f to 3f8adaf Compare November 2, 2025 21:10

charliermarsh temporarily deployed to uv-test-registries November 2, 2025 21:13 — with GitHub Actions Inactive

charliermarsh merged commit 6da135a into main Nov 2, 2025
99 checks passed

charliermarsh deleted the charlie/multi branch November 2, 2025 21:21

BrewTestBot mentioned this pull request Nov 7, 2025

uv 0.9.8 Homebrew/homebrew-core#253592

Merged
