Use launcher and cert-tools in Testflinger job for testing DSS (New) #1947

Merged
motjuste merged 32 commits into main from CHECKBOX-1905-use-cert-tools-in-dss-tf-job
Jun 17, 2025

@motjuste motjuste commented Jun 3, 2025

Description

This PR will be best merged after #1946.

The main goal is to migrate away from a custom implementation of the Testflinger job for testing DSS and instead use cert-tools in a (roughly) standard-looking job (job-def.yaml).
The job spec in tools/lab_dispatch was the inspiration for the new implementation.

The most important benefit of the new implementation is that the Testflinger agent will be the one controlling test execution, not the device itself, enabling us to do more advanced testing in the future. We also benefit from more stability using the scriptlets from cert-tools.

A default launcher for the provider has been added (checkbox-dss.conf). It is the same one that was used in the validate-with-gpu script in the Snap.

The validate-with-gpu script has been removed from the Snap, and the documentation now recommends using the launcher directly. The script would not have been useful in future advanced testing scenarios anyway (e.g. reboot testing).
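
For readers unfamiliar with Checkbox launchers, here is a rough sketch of what such a file generally looks like and how it can be inspected with Python's configparser. The section and key names follow the usual Checkbox launcher layout, but the test plan ID is a placeholder and none of the values are copied from checkbox-dss.conf, so treat it as an illustration only.

```python
import configparser

# Hypothetical launcher text. The real checkbox-dss.conf may differ;
# the test plan ID below is a made-up placeholder.
LAUNCHER_TEXT = """
[launcher]
launcher_version = 1
stock_reports = text, submission_files

[test plan]
unit = com.example::hypothetical-dss-test-plan
forced = yes

[test selection]
forced = yes

[ui]
type = silent
"""


def read_launcher(text: str) -> dict:
    """Parse launcher text and return a few settings of interest."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "test_plan": parser.get("test plan", "unit"),
        "forced": parser.getboolean("test plan", "forced"),
        "ui_type": parser.get("ui", "type"),
    }


if __name__ == "__main__":
    print(read_launcher(LAUNCHER_TEXT))
```

Keeping these settings in a launcher file rather than in a wrapper script means the same file can be handed to Checkbox from any controller, which fits the goal of letting the Testflinger agent drive the run.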

Resolved issues

Final part of CHECKBOX-1905.

Documentation

Only updates to the README of the provider.

Tests

  • Trigger and report runs once network issues in TEL are resolved.

See this run.

motjuste added 27 commits June 3, 2025 12:24
I had seen this in the generic_source.yaml, but wanted to see if I could
get away without it.  Unfortunately, some devices do seem to need this,
and we need to find out whether allowing degraded SSH to let the tests
proceed actually has a bad impact or not.
@motjuste motjuste marked this pull request as ready for review June 4, 2025 13:08
@motjuste motjuste requested a review from a team as a code owner June 4, 2025 13:08
@motjuste motjuste requested a review from fernando79513 June 6, 2025 09:19
@motjuste motjuste removed the request for review from a team June 6, 2025 13:58
@fernando79513 fernando79513 self-assigned this Jun 6, 2025
Base automatically changed from CHECKBOX-1905-use-built-snap-in-dss-workflow to main June 6, 2025 15:35

codecov bot commented Jun 6, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 50.46%. Comparing base (e6ff355) to head (b701e89).
⚠️ Report is 110 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1947   +/-   ##
=======================================
  Coverage   50.46%   50.46%           
=======================================
  Files         384      384           
  Lines       41111    41111           
  Branches     7531     7531           
=======================================
  Hits        20745    20745           
  Misses      19620    19620           
  Partials      746      746           
Flag Coverage Δ
provider-dss 100.00% <ø> (ø)

motjuste commented Jun 9, 2025

Had a full successful run since merging from main with changes from #1946.

@fernando79513 fernando79513 left a comment

Good job here!
I've just left one small comment, but the rest looks good.

@fernando79513 fernando79513 left a comment

LGTM +1!

@motjuste motjuste merged commit ae3ba6f into main Jun 17, 2025
23 of 25 checks passed
@motjuste motjuste deleted the CHECKBOX-1905-use-cert-tools-in-dss-tf-job branch June 17, 2025 11:26
mreed8855 pushed a commit that referenced this pull request Jul 30, 2025
…1947)

* Temporarily reduce test matrix while developing

* Trigger snap build and download artifact

* Add step to find the downloaded snap

* Refactor step to build job.yaml from template

* Accept built snap in testflinger job-def.yaml

* Use attached pre-built snap in testflinger job

* Temp disable submitting job to testflinger

* Remove unused BRANCH env variable

* Set path to download artifact in working directory

* Make DSS and microk8s channels inputs

* Add summary to each queue in matrix

* Update image url for dell-precision-5680 queue

* Re-enable all queues in the matrix

* Add the default launcher for checkbox-dss

* Use launcher in testflinger job

* Remove validate-with-gpu from checkbox-dss snap

* Fix some misspellings in snapcraft.yaml

* Migrate testflinger job to use cert tools

* Re-enable submitting job to testflinger

* Fix path to built job.yaml

* Remove awaiting rollout of outdated daemonset

* Allow SSH to be degraded

I had seen this in the generic_source.yaml, but wanted to see if I could
get away without it.  Unfortunately, some devices do seem to need this,
and we need to find out whether allowing degraded SSH to let the tests
proceed actually has a bad impact or not.

* Wait for snap changes after each snap install

* Add running snap refresh

* Increase sleep time before checking rollout

* Switch from using sed to envsubst to fill template

* Fix typo in calling envsubst

* Switch to ENV_ from REPLACE_ for template params
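
The last three commits concern filling the job template. As an illustration only (the workflow itself calls the envsubst command, not Python), the sketch below mimics envsubst limited to ENV_-prefixed variables using string.Template. The template text, the variable names ENV_QUEUE and ENV_SNAP_PATH, and the hypothetical_snap_path key are all assumptions made for the example.

```python
from string import Template

# Hypothetical template; only job_queue is a known Testflinger key here.
TEMPLATE = """\
job_queue: $ENV_QUEUE
hypothetical_snap_path: ${ENV_SNAP_PATH}
shell_var: $HOME
"""


def fill_template(text: str, env: dict, prefix: str = "ENV_") -> str:
    """Substitute only the variables whose names start with `prefix`."""
    allowed = {k: v for k, v in env.items() if k.startswith(prefix)}
    # safe_substitute leaves placeholders without a value (e.g. $HOME)
    # untouched instead of raising KeyError.
    return Template(text).safe_substitute(allowed)


if __name__ == "__main__":
    env = {
        "ENV_QUEUE": "example-queue",
        "ENV_SNAP_PATH": "./checkbox-dss.snap",
        "HOME": "/root",  # deliberately ignored: no ENV_ prefix
    }
    print(fill_template(TEMPLATE, env))
```

One plausible benefit of a dedicated prefix is that ordinary shell variables inside the job's test commands survive templating unchanged and are expanded later, wherever the commands actually run.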
mreed8855 pushed a commit that referenced this pull request Jul 31, 2025