End-to-end CUDA container, remove peacock, bump python to 3.13 #28114
loganharbour merged 49 commits into idaholab:next
Conversation
force-pushed d7505dc to 9a80bb6
force-pushed f44e95f to 627db04
Job Documentation, step Docs: sync website on 552c9ab wanted to post the following: View the site here. This comment will be updated on new commits.
Job Coverage, step Generate coverage on 1c9fc96 wanted to post the following: Framework coverage; Modules coverage (Contact, Porous flow, Solid mechanics); full coverage reports; warnings.
I'm pretty interested in this. With our recent PETSc update, we should be much closer to being able to run a clean test harness with a GPU-aware MPI/PETSc.

Sounds good. I'll revive this today.
force-pushed 627db04 to 017dc51
@lindsayad can you try when you have a moment?
All tests are failing with this message.

What about if you run with the …? We shouldn't need to bind mount in the nvidia drivers here, but I think we're missing a flag with BLAS (or unsetting a variable) so that it doesn't try to run GPU code.
force-pushed 017dc51 to dda7285
force-pushed 6ae7031 to 6a87d91
force-pushed 1e3e71a to 8ccd004
Is there a draft CIVET recipe for executing in this CUDA container? It would be nice to see what progress we are making in CI.

https://civet.inl.gov/job/2675477/ This guy. It's just libtorch stuff at this point.
I would expect to see a lot more failures than I do.

What would you expect to see? This is just libtorch MOOSE.
Oh, I didn't understand that basically all tests are getting excluded due to …
Co-authored-by: Casey Icenhour <cticenho@ncsu.edu>
Solvers held by MFEM user objects make calls to GetDevicePtr. Consequently, we have to make sure that the memory manager, which is destroyed in the Device destructor held by the MFEMExecutioner, is not destroyed before these user objects make those calls.
- Remove peacock
- Bump python to 3.13
- Apptainer clang bump to 19
- Apptainer min gcc bump to 19
- Full stack cuda build from MPI on
- Manual apptainer libtorch build
Job Precheck, step Versioner verify on 552c9ab wanted to post the following: Versioner templates: found 14 templates, 0 failed. Versioner influential files: found 58 influential files, 20 changed, 2 added, 0 removed. Versioner versions: found 9 packages, 9 changed, 0 failed.
Modules parallel failure is unrelated. The libtorch CUDA recipe failure is related, but that recipe will be removed.
Due to: idaholab#28114, bumping several minimums. Closes idaholab#30753
Due to: idaholab#28114, bump several package version strings. Turns out that when you use yaml.safe_load(file), it performs type detection and sets that type, thus dropping 3.10 to 3.1. Therefore, convert all entries in package_config.yml to strings, as that's all we need them to be for documentation. Closes idaholab#30753
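The yaml.safe_load type-detection pitfall described in that commit message is easy to reproduce directly with PyYAML. A minimal sketch (illustrative only, not the actual Versioner code):

```python
import yaml

# An unquoted 3.10 matches YAML's float resolver, so the trailing
# zero is lost: the value comes back as the float 3.1.
loose = yaml.safe_load("python: 3.10")
print(loose)   # {'python': 3.1}

# Quoting the value (i.e. storing it as a string) preserves "3.10",
# which is the fix applied to package_config.yml.
strict = yaml.safe_load("python: '3.10'")
print(strict)  # {'python': '3.10'}
```

Since the version strings are only used for documentation, forcing every entry to a YAML string is sufficient; no numeric comparison is ever done on them.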
I'm guessing this did not matter because we pointed at the cuda dir, so torch figured out we wanted cuda anyway. Refs introduction of these cmake arguments in idaholab#28114
Closes #29374 (adds full cuda build)
Closes #28161 (removes moose-peacock from moose-dev)
Closes #30382 (removes moose-peacock)
Closes #30586 (removes extra vtk build that comes from moose-peacock; confirmed by @hugary1995)