Use KernelAbstractions.jl #41

Merged
kylebeggs merged 5 commits into main from kernel-abstractions
Jul 28, 2025
@kylebeggs
Member

This PR simply changes the construction of the operator to use KernelAbstractions.jl rather than plain threads, so in the future when we add support for GPU-based neighbor lists we can build the operator on the GPU.

Copilot AI left a comment

Pull Request Overview

This PR replaces the manual threading approach in _build_weights with a batched GPU/CPU kernel via KernelAbstractions.jl, adds a helper for vector-dimension detection, updates autoselect_k to accept any AbstractVector, and brings in KernelAbstractions at the module level.

  • Swap out Threads.@threads stencil construction for a @kernel-based batch processor.
  • Introduce _get_vector_dim to compute point dimensionality generically and update autoselect_k.
  • Add using KernelAbstractions in the main module and clean up old tests.
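
As a rough illustration of the first bullet, the sketch below shows a generic KernelAbstractions.jl kernel that fills sparse-matrix triplets (row, column, value) with one work item per stencil. It is not the PR's build_stencils_kernel: the names fill_stencils_kernel! and build_triplets, the k × n neighbor-matrix layout, and the uniform placeholder weight 1/k are all assumptions made for this example.

```julia
using KernelAbstractions

# Hypothetical kernel: work item i writes the k triplets of stencil i.
# neighbors[idx, i] is the idx-th neighbor of point i (assumed layout).
@kernel function fill_stencils_kernel!(rows, cols, vals, neighbors, k)
    i = @index(Global, Linear)
    for idx in 1:k
        p = (i - 1) * k + idx          # position of this entry in the flat triplet arrays
        rows[p] = i
        cols[p] = neighbors[idx, i]
        vals[p] = 1 / k                # placeholder weight, not an RBF weight
    end
end

# Hypothetical driver: allocates the triplets on the same backend as `neighbors`
# and launches one work item per point.
function build_triplets(neighbors::AbstractMatrix{<:Integer})
    k, n = size(neighbors)
    backend = KernelAbstractions.get_backend(neighbors)   # CPU() for a plain Array
    rows = KernelAbstractions.zeros(backend, Int, n * k)
    cols = KernelAbstractions.zeros(backend, Int, n * k)
    vals = KernelAbstractions.zeros(backend, Float64, n * k)
    kernel! = fill_stencils_kernel!(backend, 64)          # workgroup size of 64 is arbitrary
    kernel!(rows, cols, vals, neighbors, k; ndrange=n)
    KernelAbstractions.synchronize(backend)
    return rows, cols, vals
end

# Three points with two neighbors each; column i holds the stencil of point i.
neighbors = [1 2 3; 2 3 1]
rows, cols, vals = build_triplets(neighbors)
```

Because the backend is taken from the input array, the same driver would in principle run on a GPU if `neighbors` lived in device memory. The triplets can then be assembled into a sparse matrix, for example with `sparse(rows, cols, vals, n, n)` from SparseArrays.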

Reviewed Changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated no comments.

File | Description
test/solve.jl | Removed legacy _calculate_matrix_entry! tests after pruning the old threaded implementation.
src/utils.jl | Added _get_vector_dim methods and widened autoselect_k to AbstractVector.
src/solve.jl | Replaced threaded loops with a batched build_stencils_kernel; defaults to CPU but supports GPU.
src/RadialBasisFunctions.jl | Added using KernelAbstractions to enable kernel dispatch.
CLAUDE.md | New guidance file for automated code assistants.
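
To make the src/utils.jl entry more concrete, here is a minimal sketch of how a point-dimension helper could dispatch on the container type. The name point_dim and both methods are hypothetical stand-ins, not the package's _get_vector_dim, whose actual methods are not shown in this conversation.

```julia
# Hypothetical stand-in for a point-dimension helper.
# Tuples carry their length in the type, so the dimension is known at compile time.
point_dim(::AbstractVector{NTuple{N,T}}) where {N,T} = N

# For vectors of vectors the dimension is read from the first point at runtime.
point_dim(data::AbstractVector{<:AbstractVector}) = length(first(data))

d_tuples = point_dim([(0.0, 1.0), (2.0, 3.0)])   # 2
d_vectors = point_dim([[0.0, 1.0, 2.0]])         # 3
```

Accepting any AbstractVector of points, rather than one concrete type, is what lets a function such as autoselect_k work with different point containers.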
Comments suppressed due to low confidence (4)

test/solve.jl:46

  • The legacy test set for _calculate_matrix_entry! was removed, leaving no direct coverage for individual matrix-entry logic. Consider adding new tests that validate the outputs of the build_stencils_kernel or the resulting sparse matrix entries to maintain coverage.
end

src/solve.jl:50

  • The variable nmon is not defined or passed into the kernel; this will error. You should pass nmon as an argument to build_stencils_kernel or declare it as a compile-time constant.
        n = k + nmon

src/solve.jl:51

  • The type parameter TD is referenced inside the kernel but never passed in or defined within its scope. You need to supply TD (e.g., element type) as an argument or use a constant within the kernel.
        A = Symmetric(zeros(TD, n, n), :U)

src/solve.jl:39

  • The J array (column indices) is passed into the kernel but never assigned or updated, so it remains at its initial state. Add logic inside the kernel to populate J[(i-1)*k + idx] alongside I and V.
    @kernel function build_stencils_kernel(

@codecov
codecov bot commented Jul 4, 2025

Codecov Report

❌ Patch coverage is 96.55172% with 1 line in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
src/solve.jl | 95.65% | 1 Missing ⚠️

Files with missing lines | Coverage Δ
src/RadialBasisFunctions.jl | 100.00% <ø> (ø)
src/utils.jl | 78.57% <100.00%> (+3.57%) ⬆️
src/solve.jl | 98.87% <95.65%> (+0.11%) ⬆️
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@kylebeggs kylebeggs changed the title from "Kernel abstractions" to "Use KernelAbstractions.jl" on Jul 4, 2025
@github-actions
Contributor

github-actions bot commented Jul 4, 2025

Benchmark Results

Benchmark | main | c11a1fe... | main / c11a1fe...
Directional | 2.4 ± 0.089 ms | 2.43 ± 0.11 ms | 0.986 ± 0.057
Directional (per point) | 2.33 ± 0.12 ms | 2.42 ± 0.084 ms | 0.964 ± 0.06
Gradient | 7.89 ± 0.32 ms | 7.48 ± 0.29 ms | 1.05 ± 0.058
MonomialBasis/dim=1/deg=0 | 0.0475 ± 0.012 μs | 0.0473 ± 0.011 μs | 1.01 ± 0.34
MonomialBasis/dim=1/deg=1 | 0.0752 ± 0.013 μs | 0.0757 ± 0.014 μs | 0.993 ± 0.25
MonomialBasis/dim=1/deg=2 | 0.0835 ± 0.018 μs | 0.0861 ± 0.019 μs | 0.97 ± 0.3
MonomialBasis/dim=2/deg=0 | 0.0364 ± 0.0098 μs | 0.0376 ± 0.0091 μs | 0.969 ± 0.35
MonomialBasis/dim=2/deg=1 | 0.0353 ± 0.012 μs | 0.0351 ± 0.012 μs | 1 ± 0.47
MonomialBasis/dim=2/deg=2 | 0.0407 ± 0.011 μs | 0.0419 ± 0.012 μs | 0.972 ± 0.39
MonomialBasis/dim=3/deg=0 | 0.0344 ± 0.011 μs | 0.0367 ± 0.012 μs | 0.938 ± 0.43
MonomialBasis/dim=3/deg=1 | 0.0405 ± 0.012 μs | 0.0417 ± 0.012 μs | 0.972 ± 0.4
MonomialBasis/dim=3/deg=2 | 0.0494 ± 0.011 μs | 0.0502 ± 0.012 μs | 0.984 ± 0.32
Partial | 2.35 ± 0.12 ms | 2.39 ± 0.11 ms | 0.986 ± 0.069
RBF/Gaussian, exp(-(ε*r)²) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 0/0/∂ | 10.7 ± 0.1 ns | 10.2 ± 0.19 ns | 1.05 ± 0.022
RBF/Gaussian, exp(-(ε*r)²) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 0/0/∂² | 10.6 ± 0.1 ns | 10.5 ± 0.18 ns | 1.01 ± 0.02
RBF/Gaussian, exp(-(ε*r)²) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 0/0/∇ | 17.3 ± 0.13 ns | 17.3 ± 0.06 ns | 1 ± 0.0083
RBF/Gaussian, exp(-(ε*r)²) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 0/0/∇² | 18.6 ± 0.069 ns | 18.3 ± 0.099 ns | 1.02 ± 0.0067
RBF/Gaussian, exp(-(ε*r)²) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 1/1/∂ | 10.7 ± 0.08 ns | 10.1 ± 0.09 ns | 1.05 ± 0.012
RBF/Gaussian, exp(-(ε*r)²) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 1/1/∂² | 10.6 ± 0.061 ns | 10.5 ± 0.18 ns | 1.01 ± 0.018
RBF/Gaussian, exp(-(ε*r)²) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 1/1/∇ | 17.3 ± 0.14 ns | 17.3 ± 0.06 ns | 1 ± 0.0088
RBF/Gaussian, exp(-(ε*r)²) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 1/1/∇² | 18.6 ± 0.061 ns | 18.3 ± 0.1 ns | 1.02 ± 0.0065
RBF/Gaussian, exp(-(ε*r)²) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 2/2/∂ | 10.6 ± 0.08 ns | 10.1 ± 0.09 ns | 1.05 ± 0.012
RBF/Gaussian, exp(-(ε*r)²) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 2/2/∂² | 10.6 ± 0.1 ns | 11.1 ± 0.06 ns | 0.957 ± 0.01
RBF/Gaussian, exp(-(ε*r)²) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 2/2/∇ | 17.3 ± 0.14 ns | 17.3 ± 0.06 ns | 1 ± 0.0088
RBF/Gaussian, exp(-(ε*r)²) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 2/2/∇² | 18.6 ± 0.061 ns | 18.3 ± 0.09 ns | 1.02 ± 0.0061
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 0/0/∂ | 6.8 ± 0.14 ns | 6.32 ± 0.011 ns | 1.08 ± 0.022
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 0/0/∂² | 14.2 ± 0.069 ns | 14 ± 0.04 ns | 1.02 ± 0.0057
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 0/0/∇ | 8.59 ± 0.14 ns | 8.65 ± 0.031 ns | 0.993 ± 0.017
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 0/0/∇² | 16.1 ± 0.1 ns | 16.3 ± 0.061 ns | 0.986 ± 0.0072
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 1/1/∂ | 6.74 ± 0.14 ns | 6.33 ± 0.01 ns | 1.06 ± 0.022
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 1/1/∂² | 14.1 ± 0.081 ns | 13.9 ± 0.071 ns | 1.01 ± 0.0078
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 1/1/∇ | 8.59 ± 0.14 ns | 8.66 ± 0.06 ns | 0.992 ± 0.018
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 1/1/∇² | 16.1 ± 0.089 ns | 16.3 ± 0.061 ns | 0.986 ± 0.0066
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 2/2/∂ | 6.74 ± 0.15 ns | 6.32 ± 0.011 ns | 1.07 ± 0.024
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 2/2/∂² | 14.1 ± 0.08 ns | 14 ± 0.041 ns | 1 ± 0.0064
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 2/2/∇ | 8.63 ± 0.09 ns | 8.66 ± 0.059 ns | 0.997 ± 0.012
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1) ├─Shape factor: ε = 1 └─Polynomial augmentation: degree 2/2/∇² | 16.1 ± 0.09 ns | 16.3 ± 0.061 ns | 0.986 ± 0.0067
RBF/Polyharmonic spline (r³) └─Polynomial augmentation: degree 0/0/∂ | 3.41 ± 0.02 ns | 3.73 ± 0.01 ns | 0.914 ± 0.0059
RBF/Polyharmonic spline (r³) └─Polynomial augmentation: degree 0/0/∂² | 4.7 ± 0.01 ns | 4.7 ± 0.01 ns | 1 ± 0.003
RBF/Polyharmonic spline (r³) └─Polynomial augmentation: degree 0/0/∇ | 5.44 ± 0.03 ns | 5.49 ± 0.05 ns | 0.991 ± 0.011
RBF/Polyharmonic spline (r³) └─Polynomial augmentation: degree 0/0/∇² | 3.11 ± 0.001 ns | 3.42 ± 0.01 ns | 0.909 ± 0.0027
RBF/Polyharmonic spline (r³) └─Polynomial augmentation: degree 1/1/∂ | 3.41 ± 0.021 ns | 3.73 ± 0.01 ns | 0.914 ± 0.0061
RBF/Polyharmonic spline (r³) └─Polynomial augmentation: degree 1/1/∂² | 4.7 ± 0.01 ns | 4.7 ± 0.01 ns | 1 ± 0.003
RBF/Polyharmonic spline (r³) └─Polynomial augmentation: degree 1/1/∇ | 5.44 ± 0.03 ns | 5.5 ± 0.05 ns | 0.989 ± 0.011
RBF/Polyharmonic spline (r³) └─Polynomial augmentation: degree 1/1/∇² | 3.11 ± 0.001 ns | 3.42 ± 0.01 ns | 0.909 ± 0.0027
RBF/Polyharmonic spline (r³) └─Polynomial augmentation: degree 2/2/∂ | 3.41 ± 0.02 ns | 3.73 ± 0.01 ns | 0.914 ± 0.0059
RBF/Polyharmonic spline (r³) └─Polynomial augmentation: degree 2/2/∂² | 4.7 ± 0.01 ns | 4.7 ± 0.01 ns | 1 ± 0.003
RBF/Polyharmonic spline (r³) └─Polynomial augmentation: degree 2/2/∇ | 5.44 ± 0.03 ns | 5.49 ± 0.041 ns | 0.991 ± 0.0092
RBF/Polyharmonic spline (r³) └─Polynomial augmentation: degree 2/2/∇² | 3.11 ± 0.001 ns | 3.42 ± 0.01 ns | 0.909 ± 0.0027
RBF/Polyharmonic spline (r¹) └─Polynomial augmentation: degree 0/0/∂ | 4.28 ± 0.5 ns | 4.27 ± 0.01 ns | 1 ± 0.12
RBF/Polyharmonic spline (r¹) └─Polynomial augmentation: degree 0/0/∂² | 5.82 ± 0.01 ns | 5.78 ± 0.01 ns | 1.01 ± 0.0025
RBF/Polyharmonic spline (r¹) └─Polynomial augmentation: degree 0/0/∇ | 6.32 ± 0.049 ns | 6.25 ± 0.01 ns | 1.01 ± 0.008
RBF/Polyharmonic spline (r¹) └─Polynomial augmentation: degree 0/0/∇² | 4.27 ± 0.01 ns | 4.5 ± 0.01 ns | 0.949 ± 0.0031
RBF/Polyharmonic spline (r¹) └─Polynomial augmentation: degree 1/1/∂ | 4.28 ± 0.47 ns | 4.27 ± 0.01 ns | 1 ± 0.11
RBF/Polyharmonic spline (r¹) └─Polynomial augmentation: degree 1/1/∂² | 5.82 ± 0.01 ns | 6.1 ± 0.021 ns | 0.954 ± 0.0037
RBF/Polyharmonic spline (r¹) └─Polynomial augmentation: degree 1/1/∇ | 6.3 ± 0.04 ns | 6.25 ± 0.019 ns | 1.01 ± 0.0071
RBF/Polyharmonic spline (r¹) └─Polynomial augmentation: degree 1/1/∇² | 4.27 ± 0.01 ns | 4.5 ± 0.01 ns | 0.949 ± 0.0031
RBF/Polyharmonic spline (r¹) └─Polynomial augmentation: degree 2/2/∂ | 4.28 ± 1 ns | 4.27 ± 0.01 ns | 1 ± 0.24
RBF/Polyharmonic spline (r¹) └─Polynomial augmentation: degree 2/2/∂² | 5.82 ± 0.01 ns | 5.78 ± 0.02 ns | 1.01 ± 0.0039
RBF/Polyharmonic spline (r¹) └─Polynomial augmentation: degree 2/2/∇ | 6.3 ± 0.05 ns | 6.52 ± 0.03 ns | 0.966 ± 0.0089
RBF/Polyharmonic spline (r¹) └─Polynomial augmentation: degree 2/2/∇² | 4.27 ± 0.01 ns | 4.5 ± 0.01 ns | 0.949 ± 0.0031
RBF/Polyharmonic spline (r⁵) └─Polynomial augmentation: degree 0/0/∂ | 4.65 ± 0.001 ns | 5.26 ± 0.01 ns | 0.884 ± 0.0017
RBF/Polyharmonic spline (r⁵) └─Polynomial augmentation: degree 0/0/∂² | 5.26 ± 0.01 ns | 4.96 ± 0.001 ns | 1.06 ± 0.002
RBF/Polyharmonic spline (r⁵) └─Polynomial augmentation: degree 0/0/∇ | 6.05 ± 0.03 ns | 6.03 ± 0.06 ns | 1 ± 0.011
RBF/Polyharmonic spline (r⁵) └─Polynomial augmentation: degree 0/0/∇² | 3.11 ± 0.009 ns | 3.73 ± 0.01 ns | 0.833 ± 0.0033
RBF/Polyharmonic spline (r⁵) └─Polynomial augmentation: degree 1/1/∂ | 4.65 ± 0.001 ns | 5.26 ± 0.01 ns | 0.884 ± 0.0017
RBF/Polyharmonic spline (r⁵) └─Polynomial augmentation: degree 1/1/∂² | 5.26 ± 0.01 ns | 4.96 ± 0.001 ns | 1.06 ± 0.002
RBF/Polyharmonic spline (r⁵) └─Polynomial augmentation: degree 1/1/∇ | 6.05 ± 0.03 ns | 6.02 ± 0.051 ns | 1 ± 0.0099
RBF/Polyharmonic spline (r⁵) └─Polynomial augmentation: degree 1/1/∇² | 3.11 ± 0.01 ns | 3.73 ± 0.01 ns | 0.833 ± 0.0035
RBF/Polyharmonic spline (r⁵) └─Polynomial augmentation: degree 2/2/∂ | 4.65 ± 0.001 ns | 5.26 ± 0.01 ns | 0.884 ± 0.0017
RBF/Polyharmonic spline (r⁵) └─Polynomial augmentation: degree 2/2/∂² | 5.26 ± 0.01 ns | 4.96 ± 0.001 ns | 1.06 ± 0.002
RBF/Polyharmonic spline (r⁵) └─Polynomial augmentation: degree 2/2/∇ | 6.05 ± 0.03 ns | 6.02 ± 0.051 ns | 1 ± 0.0099
RBF/Polyharmonic spline (r⁵) └─Polynomial augmentation: degree 2/2/∇² | 3.11 ± 0.01 ns | 3.73 ± 0.01 ns | 0.833 ± 0.0035
RBF/Polyharmonic spline (r⁷) └─Polynomial augmentation: degree 0/0/∂ | 10.3 ± 0.04 ns | 10.3 ± 0.02 ns | 0.995 ± 0.0043
RBF/Polyharmonic spline (r⁷) └─Polynomial augmentation: degree 0/0/∂² | 4.96 ± 0.01 ns | 4.96 ± 0.01 ns | 1 ± 0.0029
RBF/Polyharmonic spline (r⁷) └─Polynomial augmentation: degree 0/0/∇ | 12.6 ± 0.11 ns | 12.5 ± 0.14 ns | 1.01 ± 0.014
RBF/Polyharmonic spline (r⁷) └─Polynomial augmentation: degree 0/0/∇² | 8.21 ± 0.08 ns | 8.19 ± 0.059 ns | 1 ± 0.012
RBF/Polyharmonic spline (r⁷) └─Polynomial augmentation: degree 1/1/∂ | 10.3 ± 0.031 ns | 10.3 ± 0.02 ns | 0.995 ± 0.0036
RBF/Polyharmonic spline (r⁷) └─Polynomial augmentation: degree 1/1/∂² | 4.96 ± 0.01 ns | 4.96 ± 0.001 ns | 1 ± 0.002
RBF/Polyharmonic spline (r⁷) └─Polynomial augmentation: degree 1/1/∇ | 12.6 ± 0.11 ns | 12.5 ± 0.15 ns | 1.01 ± 0.015
RBF/Polyharmonic spline (r⁷) └─Polynomial augmentation: degree 1/1/∇² | 8.21 ± 0.1 ns | 8.19 ± 0.06 ns | 1 ± 0.014
RBF/Polyharmonic spline (r⁷) └─Polynomial augmentation: degree 2/2/∂ | 10.3 ± 0.03 ns | 10.3 ± 0.02 ns | 0.995 ± 0.0035
RBF/Polyharmonic spline (r⁷) └─Polynomial augmentation: degree 2/2/∂² | 4.96 ± 0.01 ns | 4.96 ± 0.01 ns | 1 ± 0.0029
RBF/Polyharmonic spline (r⁷) └─Polynomial augmentation: degree 2/2/∇ | 12.6 ± 0.1 ns | 13.1 ± 0.15 ns | 0.963 ± 0.013
RBF/Polyharmonic spline (r⁷) └─Polynomial augmentation: degree 2/2/∇² | 8.23 ± 0.081 ns | 8.19 ± 0.059 ns | 1 ± 0.012
time_to_load | 0.522 ± 0.0083 s | 0.583 ± 0.0028 s | 0.895 ± 0.015

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

@kylebeggs
Member Author

@Davide-Miotti could you give this a quick look? It is not too big of a PR.

Member

@Davide-Miotti Davide-Miotti left a comment

Nice job! Both CLAUDE.md and the integration of KernelAbstractions.jl are good ideas.

@kylebeggs kylebeggs merged commit 5a44e39 into main Jul 28, 2025
22 of 23 checks passed
@kylebeggs kylebeggs deleted the kernel-abstractions branch July 28, 2025 13:48