Replace check_cuda.sh with simple in-line commands (New)#1750

Merged
fernando79513 merged 3 commits into main from CHECKBOX-1693-simplify-check_cuda
Mar 10, 2025

@motjuste
Contributor

@motjuste motjuste commented Feb 25, 2025

Description

This is the next piece of the original PR #1724. I was drafting #1727 but then got the idea for a much simpler solution, partly also from the review of #1725. While not strictly required, it is best to merge this after #1743.

The main changes in this PR include:

  • Rename the unit nvidia_gpu_addon/enable to microk8s_nvidia_gpu_addon/enable to reflect that this unit will attempt enabling the GPU addon in microk8s. We expect to add a different unit for enabling the GPU addon in other K8s soon.
  • Extract a simple, re-usable check_nvidia_gpu_rollout.sh, in anticipation that it will soon be useful in other K8s.
  • In-line the commands and checks from check_cuda.sh
  • Remove the now obsolete check_cuda.sh.
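For readers unfamiliar with the idea, a re-usable rollout check could look roughly like the sketch below. This is not the contents of check_nvidia_gpu_rollout.sh from this PR. The function name `wait_for_daemonsets` and the `KUBECTL` override variable are hypothetical, introduced only for illustration. The only real CLI call assumed is `kubectl rollout status` with its `--timeout` flag.

```shell
#!/bin/sh
# Hypothetical sketch of a re-usable rollout check; NOT the script from this PR.
# KUBECTL is an illustrative override so the same function can be used with
# different K8s flavours, e.g. KUBECTL="microk8s kubectl".

# wait_for_daemonsets NAMESPACE TIMEOUT DAEMONSET...
# Returns 0 only if every named DaemonSet finishes rolling out within TIMEOUT.
wait_for_daemonsets() {
    wfd_namespace="$1"
    wfd_timeout="$2"
    shift 2
    for wfd_ds in "$@"; do
        echo "Waiting for daemonset/${wfd_ds} in namespace ${wfd_namespace}"
        # KUBECTL is left unquoted on purpose so it may hold "microk8s kubectl".
        if ! ${KUBECTL:-kubectl} -n "${wfd_namespace}" rollout status \
                "daemonset/${wfd_ds}" --timeout="${wfd_timeout}"; then
            echo "Rollout of daemonset/${wfd_ds} did not complete" >&2
            return 1
        fi
    done
    echo "All daemonsets rolled out"
}
```

A unit's command could then call something like `KUBECTL="microk8s kubectl" wait_for_daemonsets <namespace> 300s <daemonset>...`, with the namespace and DaemonSet names taken from whatever the GPU addon actually deploys; they are deliberately left as placeholders here.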

Resolved issues

Documentation

No changes to the Checkbox documentation.

Tests

No new tests were added. I have tested this branch manually on a machine from Testflinger. I need the changes to job-def.yaml from #1743 to get a fully automated Testflinger job result.

We also rename this unit to include the word microk8s in anticipation of
a future request to enable the addon in a different k8s environment.
The now obsolete check_cuda.sh can now also be removed.
This will be re-usable when enabling NVIDIA GPU in other K8s.
@motjuste motjuste marked this pull request as ready for review February 25, 2025 12:56
@motjuste motjuste requested a review from a team as a code owner February 25, 2025 12:56
@fernando79513 fernando79513 self-assigned this Feb 27, 2025
Collaborator

@fernando79513 fernando79513 left a comment

Good job here, LGTM!
We can merge this as soon as the check_dss PR is merged

@fernando79513 fernando79513 merged commit df70356 into main Mar 10, 2025
10 checks passed
@fernando79513 fernando79513 deleted the CHECKBOX-1693-simplify-check_cuda branch March 10, 2025 15:28
stanley31huang pushed a commit that referenced this pull request Mar 28, 2025
* Inline script to enable gpu addon in microk8s

We also rename this unit to include the word microk8s in anticipation of
a future request to enable the addon in a different k8s environment

* Inline verifying nvidia gpu validations

The now obsolete check_cuda.sh can now also be removed

* Factor out checking NVIDIA GPU rollout

This will be re-usable when enabling NVIDIA GPU in other K8s
mreed8855 pushed a commit that referenced this pull request Jul 31, 2025
* Inline script to enable gpu addon in microk8s

We also rename this unit to include the word microk8s in anticipation of
a future request to enable the addon in a different k8s environment

* Inline verifying nvidia gpu validations

The now obsolete check_cuda.sh can now also be removed

* Factor out checking NVIDIA GPU rollout

This will be re-usable when enabling NVIDIA GPU in other K8s
2 participants