
GPU updates #89

Merged
kylebeggs merged 8 commits into main from gpu
Feb 20, 2026

@kylebeggs
Member

add Adapt, device kwarg

Thread `device` kwarg through all operator constructors and weight-
building paths. Add Adapt.adapt_structure for RadialBasisOperator and
Interpolator to enable GPU array conversion. Introduce _to_cpu helper
for KDTree compatibility, _solve_system! dispatch for symmetric vs
generic solvers, and an informative error when GPU stencil solve is
attempted. Clean up verbose argument lists in execution.jl.
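
A minimal sketch of the two helper ideas named above, using only the standard library. The method bodies are assumptions for illustration: the thread does not show the real signatures of `_to_cpu` or `_solve_system!`, and the choice of Bunch-Kaufman for the symmetric path is a guess, not something the PR states.

```julia
using LinearAlgebra

# Hypothetical stand-ins; the PR's actual definitions are not shown in this thread.
_to_cpu(x::Array) = x                 # already in host memory: return as-is
_to_cpu(x::AbstractArray) = Array(x)  # views or device arrays: copy to a host Array

# Symmetric systems get a symmetric factorization (assumed here: Bunch-Kaufman).
function _solve_system!(A::Symmetric, b::AbstractVecOrMat)
    return ldiv!(bunchkaufman!(A), b)  # overwrites A and b
end

# Generic fallback: LU factorization.
function _solve_system!(A::AbstractMatrix, b::AbstractVecOrMat)
    return ldiv!(lu!(A), b)            # overwrites A and b
end
```

Because `Symmetric <: AbstractMatrix`, wrapping the matrix is enough to select the symmetric path; callers pass copies if they need the inputs preserved.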
@codecov

codecov bot commented Feb 19, 2026

Codecov Report

❌ Patch coverage is 93.65079% with 4 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/operators/operators.jl | 84.21% | 3 Missing ⚠️ |
| src/operators/operator_algebra.jl | 83.33% | 1 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| src/RadialBasisFunctions.jl | 100.00% <ø> (ø) |
| src/interpolation.jl | 100.00% <100.00%> (ø) |
| src/operators/directional.jl | 100.00% <100.00%> (ø) |
| src/solve/api.jl | 100.00% <100.00%> (ø) |
| src/solve/assembly.jl | 97.48% <100.00%> (+0.09%) ⬆️ |
| src/solve/execution.jl | 99.06% <100.00%> (+0.02%) ⬆️ |
| src/utils.jl | 81.81% <100.00%> (+3.24%) ⬆️ |
| src/operators/operator_algebra.jl | 80.00% <83.33%> (-20.00%) ⬇️ |
| src/operators/operators.jl | 84.48% <84.21%> (+0.22%) ⬆️ |

@github-actions
Contributor

github-actions bot commented Feb 19, 2026

Benchmark Results

Columns: benchmark name, main, 06f4efc..., main / 06f4efc...
Directional 2.4 ± 0.13 ms 2.52 ± 0.089 ms 0.95 ± 0.061
Directional (per point) 2.43 ± 0.13 ms 2.52 ± 0.097 ms 0.962 ± 0.064
Gradient 8.51 ± 0.32 ms 9.09 ± 0.23 ms 0.935 ± 0.043
MonomialBasis/dim=1/deg=0 0.045 ± 0.012 μs 0.044 ± 0.011 μs 1.02 ± 0.37
MonomialBasis/dim=1/deg=1 0.0729 ± 0.019 μs 0.0779 ± 0.019 μs 0.936 ± 0.33
MonomialBasis/dim=1/deg=2 0.0811 ± 0.018 μs 0.0651 ± 0.018 μs 1.25 ± 0.45
MonomialBasis/dim=2/deg=0 0.0393 ± 0.0031 μs 24.6 ± 1 ns 1.59 ± 0.14
MonomialBasis/dim=2/deg=1 0.0338 ± 0.011 μs 0.0378 ± 0.011 μs 0.894 ± 0.41
MonomialBasis/dim=2/deg=2 0.0391 ± 0.012 μs 0.0448 ± 0.012 μs 0.874 ± 0.35
MonomialBasis/dim=3/deg=0 0.0349 ± 0.012 μs 0.0386 ± 0.012 μs 0.904 ± 0.42
MonomialBasis/dim=3/deg=1 0.045 ± 0.012 μs 0.0445 ± 0.012 μs 1.01 ± 0.37
MonomialBasis/dim=3/deg=2 0.0452 ± 0.012 μs 0.0459 ± 0.011 μs 0.984 ± 0.35
Partial 2.58 ± 0.14 ms 2.73 ± 0.095 ms 0.943 ± 0.06
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∂ 9.76 ± 0.2 ns 9.9 ± 0.18 ns 0.986 ± 0.027
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∂² 10.2 ± 0.079 ns 10.1 ± 0.19 ns 1.01 ± 0.021
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∇ 17 ± 0.061 ns 17.1 ± 0.07 ns 0.991 ± 0.0054
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∇² 18.1 ± 0.061 ns 18.6 ± 0.061 ns 0.975 ± 0.0046
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∂ 9.73 ± 0.13 ns 9.91 ± 0.16 ns 0.982 ± 0.021
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∂² 10.1 ± 0.17 ns 10.2 ± 0.18 ns 0.988 ± 0.024
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∇ 17.2 ± 0.14 ns 17.1 ± 0.07 ns 1 ± 0.0092
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∇² 18.1 ± 0.07 ns 18.6 ± 0.069 ns 0.975 ± 0.0052
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∂ 9.71 ± 0.16 ns 9.9 ± 0.18 ns 0.981 ± 0.024
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∂² 10.2 ± 0.08 ns 10.1 ± 0.19 ns 1.01 ± 0.021
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∇ 17 ± 0.061 ns 17.1 ± 0.07 ns 0.991 ± 0.0054
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∇² 18.1 ± 0.07 ns 18.7 ± 0.17 ns 0.97 ± 0.0097
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∂ 6.27 ± 0.12 ns 6.32 ± 0.01 ns 0.992 ± 0.019
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∂² 14.2 ± 0.07 ns 14.2 ± 0.089 ns 0.999 ± 0.008
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∇ 8.7 ± 0.07 ns 8.54 ± 0.18 ns 1.02 ± 0.023
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∇² 15.8 ± 0.08 ns 15.7 ± 0.051 ns 1.01 ± 0.0061
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∂ 6.81 ± 0.011 ns 6.32 ± 0.01 ns 1.08 ± 0.0024
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∂² 14.2 ± 0.02 ns 14.2 ± 0.021 ns 1 ± 0.002
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∇ 8.72 ± 0.06 ns 8.6 ± 0.09 ns 1.01 ± 0.013
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∇² 15.8 ± 0.07 ns 15.7 ± 0.12 ns 1.01 ± 0.0089
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∂ 6.33 ± 0.08 ns 6.32 ± 0.001 ns 1 ± 0.013
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∂² 14.2 ± 0.021 ns 14.2 ± 0.1 ns 0.999 ± 0.0072
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∇ 8.72 ± 0.06 ns 8.53 ± 0.18 ns 1.02 ± 0.023
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∇² 15.8 ± 0.07 ns 15.7 ± 0.06 ns 1.01 ± 0.0059
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 0/0/∂ 3.73 ± 0.01 ns 3.73 ± 0.009 ns 1 ± 0.0036
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 0/0/∂² 4.7 ± 0.01 ns 4.7 ± 0.011 ns 1 ± 0.0032
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 0/0/∇ 5.69 ± 0.02 ns 5.61 ± 0.04 ns 1.01 ± 0.0081
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 0/0/∇² 3.11 ± 0.001 ns 3.11 ± 0 ns 1 ± 0.00032
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 1/1/∂ 3.73 ± 0.01 ns 3.72 ± 0.01 ns 1 ± 0.0038
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 1/1/∂² 4.7 ± 0.01 ns 4.7 ± 0.01 ns 1 ± 0.003
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 1/1/∇ 5.69 ± 0.02 ns 5.61 ± 0.04 ns 1.01 ± 0.0081
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 1/1/∇² 3.11 ± 0.001 ns 3.11 ± 0 ns 1 ± 0.00032
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 2/2/∂ 3.73 ± 0.01 ns 3.72 ± 0.01 ns 1 ± 0.0038
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 2/2/∂² 4.7 ± 0.01 ns 4.7 ± 0.01 ns 1 ± 0.003
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 2/2/∇ 5.69 ± 0.029 ns 5.61 ± 0.039 ns 1.01 ± 0.0087
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 2/2/∇² 3.11 ± 0.001 ns 3.11 ± 0 ns 1 ± 0.00032
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 0/0/∂ 4.27 ± 0.01 ns 4.27 ± 0.01 ns 1 ± 0.0033
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 0/0/∂² 5.56 ± 0.02 ns 5.54 ± 0.02 ns 1 ± 0.0051
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 0/0/∇ 7.11 ± 0.3 ns 7.12 ± 0.01 ns 0.999 ± 0.042
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 0/0/∇² 4.27 ± 0.01 ns 4.28 ± 0.01 ns 0.998 ± 0.0033
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 1/1/∂ 4.27 ± 0.01 ns 4.27 ± 0.01 ns 1 ± 0.0033
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 1/1/∂² 5.56 ± 0.02 ns 5.54 ± 0.02 ns 1 ± 0.0051
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 1/1/∇ 7.11 ± 0.26 ns 7.12 ± 0.011 ns 0.999 ± 0.037
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 1/1/∇² 4.27 ± 0.01 ns 4.28 ± 0.01 ns 0.998 ± 0.0033
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 2/2/∂ 4.27 ± 0.01 ns 4.27 ± 0.01 ns 1 ± 0.0033
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 2/2/∂² 5.56 ± 0.02 ns 5.54 ± 0.019 ns 1 ± 0.005
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 2/2/∇ 7.11 ± 0.3 ns 7.11 ± 0.19 ns 1 ± 0.05
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 2/2/∇² 4.27 ± 0.01 ns 4.28 ± 0.01 ns 0.998 ± 0.0033
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 0/0/∂ 4.96 ± 0.01 ns 5.27 ± 0.01 ns 0.941 ± 0.0026
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 0/0/∂² 4.96 ± 0.01 ns 4.96 ± 0.01 ns 1 ± 0.0029
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 0/0/∇ 6.11 ± 0.07 ns 6.09 ± 0.05 ns 1 ± 0.014
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 0/0/∇² 3.11 ± 0.01 ns 3.11 ± 0.01 ns 1 ± 0.0046
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 1/1/∂ 4.96 ± 0.009 ns 5.27 ± 0.01 ns 0.941 ± 0.0025
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 1/1/∂² 4.96 ± 0.01 ns 4.96 ± 0.001 ns 1 ± 0.002
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 1/1/∇ 6.11 ± 0.08 ns 6.09 ± 0.06 ns 1 ± 0.016
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 1/1/∇² 3.11 ± 0.01 ns 3.11 ± 0.009 ns 1 ± 0.0043
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 2/2/∂ 4.96 ± 0.011 ns 5.27 ± 0.01 ns 0.941 ± 0.0027
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 2/2/∂² 4.96 ± 0.01 ns 4.96 ± 0.01 ns 1 ± 0.0029
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 2/2/∇ 6.11 ± 0.07 ns 6.09 ± 0.041 ns 1 ± 0.013
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 2/2/∇² 3.11 ± 0.01 ns 3.11 ± 0.01 ns 1 ± 0.0046
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 0/0/∂ 10.2 ± 0.25 ns 10.1 ± 0.25 ns 1 ± 0.035
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 0/0/∂² 5.18 ± 0.019 ns 4.96 ± 0.001 ns 1.04 ± 0.0038
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 0/0/∇ 12.5 ± 0.07 ns 12.5 ± 0.07 ns 1 ± 0.0079
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 0/0/∇² 8.06 ± 0.09 ns 8.05 ± 0.081 ns 1 ± 0.015
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 1/1/∂ 10.2 ± 0.25 ns 10.1 ± 0.25 ns 1 ± 0.035
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 1/1/∂² 4.72 ± 0.04 ns 4.96 ± 0.001 ns 0.952 ± 0.0081
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 1/1/∇ 12.5 ± 0.071 ns 12.5 ± 0.07 ns 1 ± 0.008
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 1/1/∇² 8.58 ± 0.14 ns 8.06 ± 0.08 ns 1.06 ± 0.02
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 2/2/∂ 10.1 ± 0.25 ns 10.2 ± 0.24 ns 0.995 ± 0.034
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 2/2/∂² 4.72 ± 0.04 ns 4.96 ± 0.01 ns 0.952 ± 0.0083
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 2/2/∇ 12.9 ± 0.1 ns 12.5 ± 0.069 ns 1.04 ± 0.0099
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 2/2/∇² 8.06 ± 0.09 ns 8.05 ± 0.071 ns 1 ± 0.014
time_to_load 0.799 ± 0.0076 s 0.794 ± 0.0016 s 1.01 ± 0.0098

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

- Widen batch call from Vector to AbstractVector and use map for GPU parallelism
- Add test that non-CPU device throws ArgumentError
- Add breadcrumb comment in _construct_sparse for future GPU sparse conversion (#88)
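
The first two bullets can be sketched roughly as follows. Every name here (`CPUDevice`, `FakeGPUDevice`, `check_device`, `eval_batch`) is hypothetical and chosen for illustration; the package's real device types and batch function are not shown in this thread.

```julia
# Hypothetical device markers, not the package's actual types.
struct CPUDevice end
struct FakeGPUDevice end

# Reject anything but the CPU with an informative ArgumentError.
function check_device(device)
    device isa CPUDevice && return device
    throw(ArgumentError("stencil weights must be built on the CPU; got $(typeof(device))"))
end

# Accepting AbstractVector instead of Vector lets views, ranges and device
# arrays through; `map` leaves the iteration strategy to the array type.
eval_batch(f, xs::AbstractVector) = map(f, xs)
```

With the widened signature, `eval_batch(abs2, view(v, 2:3))` works without first copying the view into a `Vector`.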

On Julia 1.10/1.11, the return type of `_construct_sparse` is inferred as a Union,
causing an AugmentedRuleReturnError. Explicitly parameterize AugmentedReturn
with the primal type from the Enzyme return annotation so the type parameters
match exactly.

…edOperator

Replace `out[:, d] = op.weights[d] * x` with `mul!(view(...), ...)` in
three _eval_op methods, matching the pattern already used in the in-place
variants. Also use `similar` instead of explicit Array constructors for
GPU-safe allocation.
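
A rough sketch of the pattern described above, assuming one sparse weight matrix per spatial dimension. `eval_columns` is a hypothetical stand-in for the `_eval_op` methods, whose real signatures are not shown here.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical stand-in for an `_eval_op` method: `weights` holds one matrix
# per dimension and `x` holds the field values.
function eval_columns(weights, x::AbstractVector)
    W1 = weights[1]
    T = promote_type(eltype(W1), eltype(x))
    # `similar` allocates the same kind of array as `x`, so a device array
    # input would yield a device array output instead of a host Array.
    out = similar(x, T, size(W1, 1), length(weights))
    for (d, W) in enumerate(weights)
        # In-place product into column d; no temporary for W * x and no
        # element-wise copy into `out`.
        mul!(view(out, :, d), W, x)
    end
    return out
end
```

The result matches `hcat((W * x for W in weights)...)`, but each product is written straight into its column.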

Combine pre-computed weights directly instead of rebuilding from scratch
via the unified constructor, which always produces CPU SparseMatrixCSC.
This prevents scalar indexing errors when calling combined operators on
CuArrays.
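
The idea can be sketched with a toy operator type. `ToyOperator` is hypothetical and far simpler than the package's operator structs; it only shows that adding stored weights keeps whatever array type they already have, whereas a rebuild would go back through the constructor.

```julia
using SparseArrays

# Hypothetical minimal operator: just a weight matrix that can be applied.
struct ToyOperator{W}
    weights::W
end

# Applying the operator is a matrix-vector product.
(op::ToyOperator)(x) = op.weights * x

# Combine the stored weights directly; the result has the same array type as
# the inputs rather than a freshly rebuilt CPU matrix.
Base.:+(a::ToyOperator, b::ToyOperator) = ToyOperator(a.weights + b.weights)
Base.:-(a::ToyOperator, b::ToyOperator) = ToyOperator(a.weights - b.weights)
```

By linearity, `(A + B)(x)` equals `A(x) + B(x)` for operators built on the same point set.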

Avoids errors when collections are empty, and avoids calling first() on GPU
arrays, where it may not be efficient.
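
The commit message does not say what replaced `first()`. One common option, shown here purely as an assumption, is to query the container's element type instead of inspecting an element. Both helper names are hypothetical.

```julia
# Hypothetical helpers, not taken from the PR.

# Inspects an element: throws on an empty collection, and on a device array
# it has to fetch a single element back to the host.
point_type_via_first(points) = typeof(first(points))

# Uses only type information: works on empty collections and never reads data.
point_type(points) = eltype(points)
```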
@kylebeggs kylebeggs merged commit b373516 into main Feb 20, 2026
24 of 26 checks passed
@kylebeggs kylebeggs deleted the gpu branch February 20, 2026 20:02
