Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files

Coverage Diff

| | main | #1542 | +/- |
|---|---|---|---|
| Coverage | 48.00% | 91.13% | +43.12% |
| Files | 371 | 3 | -368 |
| Lines | 39837 | 327 | -39510 |
| Branches | 6733 | 38 | -6695 |
| Hits | 19125 | 298 | -18827 |
| Misses | 19994 | 28 | -19966 |
| Partials | 718 | 1 | -717 |

Flags with carried forward coverage won't be shown.
Ok, the next step should be to get the repository setup part of the
I'm wondering if the compiling should actually be part of |
19c5320 to aa87cdc
These files are created by the system being tested. They shouldn't be pushed to GitHub.
Now gpu-setup is merely installing dependencies for the provider. Ideally, these dependencies will get pulled in by the snap or deb package, and the system will be able to compile the tests. TODO: With AMD tests being added, this Makefile will need to be skipped if the system being used does not have NVIDIA GPUs.
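One possible way to approach the TODO about skipping the Makefile on systems without NVIDIA GPUs is to gate the build on the PCI vendor ID. The sketch below is an assumption, not something this PR implements: the function name `has_nvidia_gpu` is hypothetical, and it expects the output of `lspci -nn` on stdin, where NVIDIA devices carry the vendor ID `10de`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): reads `lspci -nn` output on
# stdin and succeeds only if a display-class device has NVIDIA's PCI
# vendor ID (10de).
has_nvidia_gpu() {
    grep -Ei 'vga compatible controller|3d controller|display controller' \
        | grep -qi '\[10de:'
}

# Possible use as a gate around the build (assumes pciutils is installed):
#   if lspci -nn | has_nvidia_gpu; then
#       make
#   else
#       echo "No NVIDIA GPU found, skipping CUDA build"
#   fi
```

Filtering on the device class first keeps an NVIDIA audio function from counting as a GPU.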
I'm curious to see if the packaging will succeed with what we have in the Ubuntu repositories. I'm skeptical, though.
The projects are downloaded to `/opt/` but the binaries are symlinked to `/usr/local/sbin`. Need to test whether the symlinked binaries can find the necessary data files.
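A quick way to start on that test is to confirm each symlink resolves and to see which directory its target really lives in. This is only a sketch: `check_symlinks` is a hypothetical helper, and it relies on `readlink -f` as provided by GNU coreutils. Whether a binary then finds its data files depends on how it looks them up (relative to its real path or to the current directory), which this check cannot tell you.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): for every symlink in a bin
# directory, print the link name and the real directory of its target.
# Returns non-zero if any link is dangling.
check_symlinks() {
    bin_dir=$1
    status=0
    for link in "$bin_dir"/*; do
        # Skip regular files and an unmatched glob
        [ -L "$link" ] || continue
        if target=$(readlink -f "$link") && [ -e "$target" ]; then
            echo "$(basename "$link") -> $(dirname "$target")"
        else
            echo "$(basename "$link") -> BROKEN" >&2
            status=1
        fi
    done
    return "$status"
}

# Possible use: check_symlinks /usr/local/sbin
```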
The preinst and postinst solution did not work. preinst does not run before the dependency handling occurs.
This should be installed already, but might as well.
These are no longer going to be living inside the data directory.
44f8e49 to d308544
(not yet owned by us, but should work soon)
This is suggested because servers with NVIDIA GPUs should be able to use this utility to install the appropriate driver.
We will figure these out in a follow-up PR.
Hook25
left a comment
I have one command change (pretty minor) and one major request: please create a sister PR in this repo and update the line I'm linking, else landing this will break our canary autopromotion script!
Hook25
left a comment
+1, fantastic job, really well done
* Add CUDA repos to snap repos
* Add gitignore in gpgpu/data
  These files are created by the system being tested. They shouldn't be pushed to GitHub.
* Extract compilation from gpu-setup.
  Now gpu-setup is merely installing dependencies for the provider. Ideally, these dependencies will get pulled in by the snap or deb package, and the system will be able to compile the tests. TODO: With AMD tests being added, this Makefile will need to be skipped if the system being used does not have NVIDIA GPUs.
* Generalize gpgpu/data gitignore
* Add installation of CUDA as postinst
* Try adding Ubuntu-repo CUDA
  I'm curious to see if the packaging will succeed with what we have in the Ubuntu repositories. I'm skeptical, though.
* Fix typo
* Install gpu-burn and cuda-samples in postinst
  The projects are downloaded to `/opt/` but the binaries are symlinked to `/usr/local/sbin`. Need to test whether the symlinked binaries can find the necessary data files.
* Postinst is a bash script
* GPGPU provider only works with amd64 for now
* Re-use REPO_URL variable
* Exit postinst on failure
* Fix typos in postinst
* Avoid failures when temporary files don't exist
* Try without cleaning up temp files
* Is set -e the problem?
* Split `postinst` into `preinst` and `postinst`
* Add AMD repos to snaps
* Setup repos in postinst.
  The preinst and postinst solution did not work. preinst does not run before the dependency handling occurs.
* Add rocm-validation-suite to snap dependencies
* Test using snap GPGPU tools package
* Fix typo in snapcraft.yaml
* Slim down checkbox-gpgpu-tools snap
* Install snaps in postinst
* Fix requires snaps in gpgpu jobs
* Add gpu-burn and rvs snaps
* Update checkbox-gpgpu-tools snapcraft.yaml
* Fix jobs.pxu
* Add necessary plugs to checkbox-gpgpu-tools snap
* Add snapd to depends
  This should be installed already, but might as well.
* Delete data gitignore
  These are no longer going to be living inside the data directory.
* There are stable releases of gpu-burn and rvs
* Match spacing in jobs.pxu
* Moving cuda-samples snap code to own repo
* Snaps should have correct permissions now
* Use (expected) cuda-samples snap
* Add cuda-samples snap
  (not yet owned by us, but should work soon)
* snapd should already be installed
* Install bin_dir
* Suggest ubuntu-drivers-common
  This is suggested because servers with NVIDIA GPUs should be able to use this utility to install the appropriate driver.
* Remove GPGPU tools from stage-snaps
  We will figure these out in a follow-up PR.
* tee output
* Shellcheck fixes
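The commit list above mentions installing snaps in `postinst`, exiting on failure, and tee-ing the output. A rough sketch of how such a script could be shaped is shown below. It is not the script from this PR: the snap names in `SNAPS`, the `LOG_FILE` path, and the `install_snaps` function are all placeholders chosen for illustration. The one fixed point is that dpkg invokes `postinst` with `configure` as its first argument once the package has been unpacked.

```shell
#!/bin/sh
# Sketch of a Debian postinst that installs tool snaps.
# ASSUMPTIONS: snap names and log path are placeholders, not from the PR.
SNAPS="gpu-burn rvs cuda-samples"
LOG_FILE="${LOG_FILE:-/var/log/gpgpu-postinst.log}"

install_snaps() {
    for name in $SNAPS; do
        # Capture the output first so the exit status of `snap install`
        # is not hidden behind the pipe to tee.
        if output=$(snap install "$name" 2>&1); then
            echo "$output" | tee -a "$LOG_FILE"
        else
            echo "$output" | tee -a "$LOG_FILE" >&2
            return 1
        fi
    done
}

# dpkg runs postinst with "configure" after unpacking the package.
case "${1:-}" in
    configure) install_snaps || exit 1 ;;
esac
```

Capturing the output before piping it to `tee` is one way to keep a failed install from being masked; in a bash script, `set -o pipefail` would be an alternative.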
Description
Currently, the GPGPU provider depends on the user running a setup script after installing the provider. This script sets up the NVIDIA/AMD repositories to get the required packages from upstream (i.e. CUDA toolkit or ROCm). This has worked so far, but it is negatively affecting the way we test the provider, especially with automated setups. It also adds some pain points in the installation and setup of the tests.
This PR removes the setup scripts. The tools being used have been made into snaps, so that they are installed automatically and work out of the box. The tools are installed in a `postinst` script in the deb package. This ensures that the tools are available as simple binaries.
Resolved issues
Resolves CHECKBOX-1519
Documentation
I have an open PR to update the Appendix G - Setting Up and Testing a GPGPU documentation. It can be merged after this PR lands.
Tests
Tested on Luma:
Tested on Romano: