Replace check_intel.sh with an appropriate Python script (New) #1728
Closed
The enabling is done by the `enable_intel.sh` script, also added in this commit. It is copied almost verbatim from `check_intel.sh`, which in turn was an almost verbatim implementation of the instructions in the DSS documentation. The `enable_intel.sh` script was too involved to convert to a Python script.
We can now also specify the plugin version; v0.30.0 is used, based on the DSS docs.
The rollout of the daemonsets is verified while enabling the Intel GPU plugin in intel_gpu_plugin/install, and this is done more reliably than in the shell script.
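As a rough illustration of this kind of rollout check (not the code in this PR), the sketch below waits on each daemonset with `kubectl rollout status`. The function names, the daemonset and namespace values, and the plain `kubectl` binary are assumptions for illustration; the actual script may invoke a different command.

```python
import subprocess
from typing import Callable, List, Sequence


def rollout_status_cmd(daemonset: str, namespace: str, timeout_s: int) -> List[str]:
    """Build a kubectl command that blocks until the daemonset rollout finishes."""
    return [
        "kubectl",  # assumption: the real script may call a different kubectl binary
        "rollout",
        "status",
        f"daemonset/{daemonset}",
        "--namespace",
        namespace,
        f"--timeout={timeout_s}s",
    ]


def verify_rollouts(
    daemonsets: Sequence[str],
    namespace: str,
    timeout_s: int = 300,
    run: Callable[[List[str]], object] = subprocess.check_call,
) -> None:
    """Check every daemonset in turn; `run` raises if a rollout fails or times out."""
    for name in daemonsets:
        run(rollout_status_cmd(name, namespace, timeout_s))
```

With the default `run`, a failed or timed-out rollout surfaces as `subprocess.CalledProcessError`, which a test job can treat as a failure. Passing a different `run` callable makes the logic easy to exercise without a cluster.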
The exact count is going to be wrong because of an issue where the Intel GPU plugin starts counting NVIDIA GPUs too! We just test that there are enough, i.e. at least the SLOTS_PER_GPU that we requested during intel_gpu_plugin/install.
This test was already wrong, and it is not going to be maintainable due to issues with the gpu.intel.com label. See the associated bash fragment being removed in this commit.
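A minimal sketch of such a "greater than or equal" check, assuming the slots show up as an allocatable node resource in the output of `kubectl get nodes -o json`. The function name, the resource name `gpu.intel.com/i915`, and the choice to accept any single node with enough slots are assumptions for illustration, not necessarily what the script in this PR does.

```python
import json


def has_enough_gpu_slots(
    nodes_json: str,
    slots_per_gpu: int,
    resource: str = "gpu.intel.com/i915",  # assumed resource name
) -> bool:
    """Return True if any node advertises at least `slots_per_gpu` slots.

    The comparison is deliberately >= rather than ==, since the exact
    count cannot be relied on.
    """
    nodes = json.loads(nodes_json).get("items", [])
    for node in nodes:
        allocatable = node.get("status", {}).get("allocatable", {})
        # Kubernetes reports resource quantities as strings, e.g. "10".
        if int(allocatable.get(resource, 0)) >= slots_per_gpu:
            return True
    return False
```

Because the function takes the JSON text as input, it can be tested against canned `kubectl` output without a running cluster.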
Description
This is the next piece of the original PR #1724 that requires #1725.
The main changes in this PR include replacing the existing `check_intel.sh` and its usages with more or less the same functionality implemented as a Python script. The new Python script still uses an `enable_intel.sh` script to perform the multi-step, bash-heavy procedure to enable the Intel GPU plugin, but then verifies the success of the relevant rollout in a manner similar to `check_cuda_with_mk8s.py` in #1727.

Some test jobs have actually been removed. They were originally implemented by PE with only Intel GPUs in mind, but now that NVIDIA GPUs are also relevant to these Checkbox tests, the bugs in those tests made them useless. In particular, because of the way they were implemented (checking labels attached to the cluster node), those tests would start counting NVIDIA GPUs as Intel GPUs too, and no suitable alternatives could be found. Finally, it can also be argued that verifying these exact quantities is not relevant to testing DSS; it is enough to test that the commands documented by DSS to enable Intel GPU work as a normal user would see them work.
Resolved issues
Documentation
No changes to the Checkbox documentation.
Tests
#1725 needs to be merged before this PR to enable running the tests in the CI.