Conversation
|
This how it looks like when there are circular dependencies in Ginkgo: The pipeline step |
thoasm
left a comment
There was a problem hiding this comment.
Just something that I saw when scanning over the changes.
It is very neat that these small changes might resolve the circular dependencies.
|
@tcojean I am not sure if it related to this, but I remember resolving similar problems when I fixed a linker error with test cases in #199, where I brute forced the linker to accept linking against all ginkgo libraries again. It also might have been an issue with circular dependencies, so maybe we can remove these additional link targets with this PR. |
|
Works fine on MacOS |
|
@hartwiganzt just to confirm, with shared libraries, i.e. |
|
Confirm. |
aa17b61 to
cfffeac
Compare
|
I also added to the PR a reduction of the amount of CI jobs (six less) and added CI jobs for static builds. I do this by shuffling around (manually) the parameters between the different builds so that we try to test as much as possible overall without having one job for each single parameter change. This is also now more balanced compared to before where we had a lot of debug tests compared to the rest. |
bfceb57 to
1a45c36
Compare
|
it also works fine on my MacOS, with default settings but |
|
@yhmtsai thanks for the input Mike. About the benchmarks, this is related to #270 and #274. Indeed, the CI fails because there is still one circular dependency left in jacobi. Weirdly enough, it is only triggered in some particular case (clang 9.2, gcc 7.x, release build, ...), but it is there. I will fix this and this should be able to be merged afterwards. |
|
The CI will hopefully confirm, but the block jacobi circular dependency should now be fixed. Here is a summary of what I did: Fix the jacobi circular dependency using some code duplication.
|
+ GMRES used dense matrices and uses the preconditionner apply. These calls are moved to the core module + The static variables for the CudaExecutor were placed in the core module. They are moved to the Cuda module.
For this, we vary more than one parameter at once. Here is the new overview: + core + gcc + debug + static + core + clang + release + shared + omp + gcc + release + shared + omp + clang + debug + static + all + gcc + cuda 9.0 + debug + shared + all + clang + cuda 9.0 + release + static + all + gcc + cuda 9.1 + debug + static + all + clang + cuda 9.1 + release + shared + all + gcc + cuda 9.2 + release + shared + all + clang + cuda 9.2 + debug + static + all + gcc + cuda 10.0 + release + static + all + clang + cuda 10.0 + debug + shared
+ All the `apply` kernels and the related kernel selector is duplicated with an alpha call + The advanced version scale by alpha inside the kernel + Use a new `multiply_vec_add` instead of `multiply_vec` which preserves the value in `x` by storing with addition.
34df22b to
1ceb97f
Compare
|
I re-requested reviews because there is the whole jacobi fix part which is entirely new. |
thoasm
left a comment
There was a problem hiding this comment.
Thank you for fixing this!
I have some comments, including a possible solution to the code duplication, but that is totally optional and up to you.
But I would strongly suggest moving the creation of {before,after}_preconditioner.
thoasm
left a comment
There was a problem hiding this comment.
Just the documentation for the change to multiply_vec is missing, everything else looks very good!
+ For GMRES, only initialize the temporary `before_preconditioner` and `after_preconditioner` once. + For Jacobi, use only one `multiply_vec` function where the operation is parametrized by a lambda.
d4d8b69 to
0ddff67
Compare
This PR fixes detected circular dependencies and improves the CI.
The approach used here is to keep the current system and enforce that no kernel module ever use any function with implementation in the core module. That was the case for two things in particular, the static variables of the
CudaExecutorand the GMRESsolve_xfunction. ThejacobiCUDAadvanced_applyis borderline, I am somehow not able to make it throw an error. In fact, the CI is able to find some problem with the Jacobi (CUDA 9.2, GCC 7, Release build only), so it also requires fixing.I really need the confirmation of @hartwiganzt here (or any mac user): does this build with
-DBUILD_SHARED_LIBS=ON?Changes
moved to the core module
are moved to the Cuda module.
applykernels and the related kernel selector are duplicated with analpha parameter in the calls
multiply_vec_addinstead ofmultiply_vecwhich preserves thevalue in
xby storing with addition.Closes #203.
Closes #270.
TODO