Skip to content

Refactor kernels to use dense device view#1978

Open
upsj wants to merge 20 commits intodevelopfrom
refactor_device_accessor
Open

Refactor kernels to use dense device view#1978
upsj wants to merge 20 commits intodevelopfrom
refactor_device_accessor

Conversation

@upsj
Copy link
Member

@upsj upsj commented Jan 23, 2026

Using https://github.com/upsj/ginkgo-refactor to refactor headers and source files

Still WIP, since some of the replacements were incorrect

Running this requires compiling LLVM with PR llvm/llvm-project#177442
and compiling ginkgo-refactor based on it and then running

# for core source files
clang-tidy --load=./build/libRefactorDenseToView.so --checks=-*,gko-refactor-core-dense-to-view --fix file.cpp
# for kernel source files *_kernels.cpp
clang-tidy --load=./build/libRefactorDenseToView.so --checks=-*,gko-refactor-kernels-dense-to-view --fix file.cpp
# for kernel header files *_kernels.hpp
clang-tidy --load=./build/libRefactorDenseToView.so --checks=-*,gko-refactor-kernel-decls-dense-to-view --fix file.hpp

@upsj upsj self-assigned this Jan 23, 2026
@ginkgo-bot ginkgo-bot added mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:openmp This is related to the OpenMP module. mod:reference This is related to the reference module. type:solver This is related to the solvers type:preconditioner This is related to the preconditioners type:matrix-format This is related to the Matrix formats mod:hip This is related to the HIP module. type:stopping-criteria This is related to the stopping criteria labels Jan 23, 2026
@upsj upsj mentioned this pull request Jan 23, 2026
@upsj
Copy link
Member Author

upsj commented Jan 24, 2026

Issues found so far:

  • some header files get mangled (these are few enough we can fix by hand, are those race conditions in parallel replacements? should be ignored by clang-tidy)
  • incorrect application of replacements to some kernel_launch headers
  • Dense::get_stride() replacement failing
  • need to add/replace kernel_launch mappings for device views
  • "missing" omp.h or hip_runtime.h headers preventing clang-tidy from running
  • dpcpp support (attempt compiling against icpx LLVM?)

@upsj
Copy link
Member Author

upsj commented Jan 26, 2026

The get_stride replacement was simply missing. Creating the compile_commands.json with clang as the host compiler should fix the missing header issues. I'll try to compile against dpcpp later. Avoiding replacements in header files and member functions should avoid the kernel_launch issues, also we could skip functions that don't have the executor as first parameter (those are kernels)

@MarcelKoch
Copy link
Member

Maybe it would make sense to add the clang formatting plugin to this PR? That could make it easier to talk about issues with the plugin. We could later decide if the plugin is also distributed in ginkgo (e.g. in the scripts dir) or not.

@upsj
Copy link
Member Author

upsj commented Jan 26, 2026

Not sure I follow, the clang-format step is just a call of clang-format afterwards. Since this is a one-time refactoring, I would keep it separate for now

@upsj upsj force-pushed the device_accessor branch from 2ef7c49 to a2b2791 Compare March 2, 2026 17:44
@upsj
Copy link
Member Author

upsj commented Mar 3, 2026

The automated refactoring needed some significant fixes (I'll use this as a running list until I push it)

  • Some macros used names like _type different from the template parameters they represented, which caused incorrect replacements. I fixed them manually
  • The restriction to functions that have an executor as the first argument was too strict, it excluded functions like solve_qy in cb_gmres_kernels.cpp
  • I needed to replace matrix::Dense by matrix::view::dense also in kernel_launch mapping functions, but I left the accessor in place, since it is a smaller type than the view, as it doesn't store the size. We can discuss this separately
  • Some replacements in macros didn't work properly and needed to be fixed manually
  • Since template deduction can deal correctly with the implicit conversion from Dense* to const Dense*, but the same isn't true for dense views, I needed to add a as_const function which casts explicitly.
  • The same is true at the call site, when the core code passes in a mutable pointer, but the kernel takes a const argument
  • Some pointers inside kernels could also be nullptr, I replaced them by std::optional

@upsj upsj mentioned this pull request Mar 4, 2026
@upsj upsj force-pushed the refactor_device_accessor branch from d7f0043 to a54e019 Compare March 5, 2026 00:05
@upsj upsj marked this pull request as ready for review March 5, 2026 00:06
@upsj upsj force-pushed the device_accessor branch from a2b2791 to 93bed02 Compare March 5, 2026 11:20
Base automatically changed from device_accessor to develop March 5, 2026 14:48
upsj added 11 commits March 5, 2026 15:51
some of the files couldn't successfully be compiled by clang-tidy
this was mostly compilation errors and issues with single-parameter ->at(i)
moving to value types means we no longer have an empty trans_b/x value
mostly turning get_device_view into get_const_device_view to make template argument deduction happy
This should give shorter error messages from only one of the branches if we call an incorrect function
this is due to the ->get_const_device_view() replacements not working properly, so I needed to add them manually
- assertions weren't compiled before
- MPI was missing from refactor compile_commands.json
@upsj upsj force-pushed the refactor_device_accessor branch from f68bd79 to d9dc141 Compare March 5, 2026 15:10
@pratikvn
Copy link
Member

For the most part, I think it looks fine to me. As the changes are quite widespread, it makes sense to try to get the CI with tests to pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

1:ST:run-full-test mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:hip This is related to the HIP module. mod:openmp This is related to the OpenMP module. mod:reference This is related to the reference module. type:matrix-format This is related to the Matrix formats type:preconditioner This is related to the preconditioners type:solver This is related to the solvers type:stopping-criteria This is related to the stopping criteria

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants