perf(solve): reduce memory allocations in solve system hot paths by kylebeggs · Pull Request #85 · JuliaMeshless/RadialBasisFunctions.jl

kylebeggs · 2026-02-17T14:32:42Z

Summary

- `forward_cache.jl`: Pre-allocate the `A_full` and `b` workspaces outside the loop and reuse them with `fill!`; replace the `[data[i] for i in neighbors]` array comprehension with `view(data, neighbors)`; replace the manual lower-triangle copy with `copytri!`; eliminate the intermediate `w = λ[1:k, :]` slice
- `assembly.jl`: Add an in-place `_build_stencil!(λ, A, b, ...)` variant that uses `ldiv!`/`bunchkaufman!` to write the solution into a pre-allocated buffer and returns a view instead of allocating a new array plus a slice
- `execution.jl`: Pre-allocate the `λ` solve buffer alongside `A` and `b` in `weight_kernel`; explicitly reset `A_full` and `b` on each stencil iteration; call the new in-place `_build_stencil!` variant
- `interpolation.jl`: Replace the scalar accumulation loop with `dot()` for polynomial evaluation; add `@inbounds` to the RBF accumulation loop
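
The in-place solve described for assembly.jl can be illustrated with a minimal sketch. This is not the package's `_build_stencil!`: `solve_into!` and its arguments are hypothetical names, and the sketch assumes dense `Float64` matrices with a symmetric `A` whose upper triangle is filled.

```julia
using LinearAlgebra

# Hypothetical helper, not the package's `_build_stencil!`.
# Solves A * X = B for symmetric A and writes X into the pre-allocated buffer λ.
function solve_into!(λ::Matrix{Float64}, A::Matrix{Float64}, B::Matrix{Float64}, k::Integer)
    copyto!(λ, B)                        # ldiv!(F, λ) overwrites λ, so stage B there first
    F = bunchkaufman!(Symmetric(A, :U))  # factorizes in place; A's contents are destroyed
    ldiv!(F, λ)                          # λ now holds the solution X
    return view(λ, 1:k, :)               # first k rows as a view, no copy
end

A = [4.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 2.0]
B = [1.0 2.0; 3.0 4.0; 5.0 6.0]
A_orig = copy(A)   # kept only to check the result afterwards
λ = similar(B)     # allocated once, reusable for every stencil of this size
w = solve_into!(λ, A, B, 2)
```

Because `bunchkaufman!` overwrites `A`, the caller has to refill `A` before the next solve, which is presumably why the workspaces are reset on each stencil iteration.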

Closes #73

Test plan

Full test suite passes (julia --project=. -e "using Pkg; Pkg.test()")
AD extension tests pass (Enzyme + Mooncake) since _forward_with_cache output format is preserved
Benchmark with @benchmark update_weights!(lap) shows reduced allocations
Benchmark Interpolator evaluation shows reduced allocations
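
For a quick allocation check without extra packages, Base's `@allocated` can stand in for `@benchmark`. The sketch below compares a workspace allocated on every iteration with one that is allocated once and reset with `fill!`. Both functions are hypothetical toy examples, not code from this PR.

```julia
# Toy example: a fresh workspace is allocated on every iteration.
function fresh_workspace(n, iters)
    s = 0.0
    for it in 1:iters
        b = zeros(n)      # new allocation each time through the loop
        b[1] = it
        s += sum(b)
    end
    return s
end

# Toy example: one workspace is allocated up front and reset with fill!.
function reused_workspace(n, iters)
    b = zeros(n)          # single allocation
    s = 0.0
    for it in 1:iters
        fill!(b, 0.0)     # reset in place
        b[1] = it
        s += sum(b)
    end
    return s
end

# Call each function once first so compilation is not counted.
fresh_workspace(64, 10)
reused_workspace(64, 10)

bytes_fresh = @allocated fresh_workspace(64, 1000)
bytes_reused = @allocated reused_workspace(64, 1000)
println("fresh: ", bytes_fresh, " bytes, reused: ", bytes_reused, " bytes")
```

The exact byte counts depend on the Julia version, so the sketch prints them rather than assuming specific values.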

Benchmark results (10k points, 2D, PHS3 poly_deg=2)

`update_weights!` (Laplacian)

Metric	`main`	this PR	Change
Allocations	358,703	300,703	-16.2%
Memory	139.37 MiB	110.48 MiB	-20.7%
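
The Change column can be reproduced from the raw figures as a relative difference. `rel_change` is a hypothetical helper name used only for this check.

```julia
# Relative change in percent, going from `before` to `after`.
rel_change(before, after) = 100 * (after - before) / before

alloc_change = round(rel_change(358_703, 300_703); digits=1)   # allocation count
memory_change = round(rel_change(139.37, 110.48); digits=1)    # memory in MiB
println(alloc_change, "%  ", memory_change, "%")
```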

Interpolator evaluation (500 pts)

Metric	`main`	this PR	Notes
Single-point allocs	2	2	Already minimal; `dot()` + `@inbounds` improve speed
Multi-point allocs	202	202	Already minimal

- Pre-allocate λ buffer and use in-place _build_stencil! with ldiv!/bunchkaufman!
- Use view() for local_data instead of allocating new vectors
- Replace polynomial evaluation allocation with dot() in Interpolator
- Add copytri! for efficient symmetric matrix caching in forward_cache
- Add _weight_view dispatch for Vector vs Matrix result slicing
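
Two of these changes, `view()` in place of a comprehension and `dot()` in place of a scalar loop, can be shown on toy data. The variable names below are illustrative and do not come from the package.

```julia
using LinearAlgebra

data = collect(1.0:10.0)
neighbors = [2, 5, 7]

local_copy = [data[i] for i in neighbors]  # allocates a new Vector and copies elements
local_view = view(data, neighbors)         # wraps `data`; elements are not copied

coeffs = [0.5, -1.0, 2.0]

# Scalar accumulation loop, the pattern that dot() replaces.
acc = 0.0
for i in eachindex(coeffs)
    acc += coeffs[i] * local_view[i]
end

result = dot(coeffs, local_view)           # same sum of products in one call
```

A view reflects later writes to `data`, whereas the comprehension takes a snapshot, so the swap is only safe where the underlying data is not mutated while the view is in use.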

codecov · 2026-02-17T15:19:44Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines	Coverage Δ
src/interpolation.jl	`100.00% <100.00%> (ø)`
src/solve/assembly.jl	`97.38% <100.00%> (-2.62%)`	⬇️
src/solve/execution.jl	`99.03% <100.00%> (+0.03%)`	⬆️
src/solve/forward_cache.jl	`100.00% <100.00%> (ø)`

github-actions · 2026-02-17T15:24:34Z

Benchmark Results

	main	`7a8035d`...	main / `7a8035d`...
Directional	2.46 ± 0.14 ms	2.45 ± 0.14 ms	1 ± 0.082
Directional (per point)	2.4 ± 0.13 ms	2.49 ± 0.14 ms	0.964 ± 0.075
Gradient	8.33 ± 0.38 ms	8.29 ± 0.39 ms	1 ± 0.066
MonomialBasis/dim=1/deg=0	0.0356 ± 0.014 μs	0.0463 ± 0.013 μs	0.77 ± 0.37
MonomialBasis/dim=1/deg=1	0.0753 ± 0.022 μs	0.0767 ± 0.022 μs	0.981 ± 0.4
MonomialBasis/dim=1/deg=2	0.0693 ± 0.022 μs	0.0878 ± 0.024 μs	0.789 ± 0.33
MonomialBasis/dim=2/deg=0	0.0351 ± 0.013 μs	0.0345 ± 0.001 μs	1.02 ± 0.37
MonomialBasis/dim=2/deg=1	0.0253 ± 0.013 μs	0.0353 ± 0.013 μs	0.717 ± 0.46
MonomialBasis/dim=2/deg=2	0.03 ± 0.013 μs	0.0413 ± 0.014 μs	0.726 ± 0.41
MonomialBasis/dim=3/deg=0	0.0292 ± 0.014 μs	0.0361 ± 0.014 μs	0.81 ± 0.49
MonomialBasis/dim=3/deg=1	0.0367 ± 0.015 μs	0.0429 ± 0.014 μs	0.855 ± 0.44
MonomialBasis/dim=3/deg=2	0.038 ± 0.014 μs	0.0489 ± 0.013 μs	0.779 ± 0.36
Partial	2.43 ± 0.14 ms	2.68 ± 0.12 ms	0.909 ± 0.069
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∂	9.68 ± 0.079 ns	9.84 ± 0.11 ns	0.984 ± 0.014
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∂²	10.1 ± 0.17 ns	10.2 ± 0.13 ns	0.99 ± 0.021
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∇	17.1 ± 0.12 ns	17.1 ± 0.07 ns	0.999 ± 0.0081
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∇²	18.6 ± 0.04 ns	18.6 ± 0.031 ns	1 ± 0.0027
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∂	9.68 ± 0.08 ns	9.81 ± 0.09 ns	0.987 ± 0.012
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∂²	10.2 ± 0.08 ns	10.1 ± 0.16 ns	1.01 ± 0.018
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∇	17.1 ± 0.15 ns	17.1 ± 0.07 ns	0.999 ± 0.0097
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∇²	18.6 ± 0.04 ns	18.6 ± 0.041 ns	1 ± 0.0031
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∂	9.68 ± 0.06 ns	9.89 ± 0.1 ns	0.979 ± 0.012
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∂²	10 ± 0.21 ns	10.1 ± 0.18 ns	0.995 ± 0.027
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∇	17.1 ± 0.15 ns	17.1 ± 0.069 ns	0.999 ± 0.0096
RBF/Gaussian, exp(-(ε*r)²)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∇²	18.6 ± 0.031 ns	18.6 ± 0.061 ns	1 ± 0.0037
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∂	6.32 ± 0.08 ns	6.14 ± 0.33 ns	1.03 ± 0.057
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∂²	14.2 ± 0.02 ns	14.2 ± 0.031 ns	1 ± 0.0026
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∇	8.53 ± 0.23 ns	8.67 ± 0.24 ns	0.984 ± 0.038
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 0/0/∇²	16 ± 0.16 ns	15.8 ± 0.11 ns	1.01 ± 0.012
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∂	6.32 ± 0.01 ns	6.5 ± 0.37 ns	0.972 ± 0.055
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∂²	14.2 ± 0.08 ns	14.2 ± 0.071 ns	1 ± 0.0075
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∇	8.54 ± 0.13 ns	8.63 ± 0.29 ns	0.99 ± 0.037
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 1/1/∇²	16 ± 0.16 ns	16.1 ± 0.09 ns	0.991 ± 0.011
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∂	6.32 ± 0.08 ns	6.14 ± 0.27 ns	1.03 ± 0.047
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∂²	14.2 ± 0.08 ns	14.2 ± 0.031 ns	1 ± 0.006
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∇	8.6 ± 0.08 ns	8.62 ± 0.33 ns	0.997 ± 0.039
RBF/Inverse Multiquadrics, 1/sqrt((r*ε)²+1)
├─Shape factor: ε = 1
└─Polynomial augmentation: degree 2/2/∇²	16 ± 0.099 ns	15.8 ± 0.13 ns	1.01 ± 0.01
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 0/0/∂	3.42 ± 0.001 ns	3.72 ± 0.01 ns	0.919 ± 0.0025
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 0/0/∂²	4.7 ± 0.01 ns	4.7 ± 0.01 ns	1 ± 0.003
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 0/0/∇	5.62 ± 0.039 ns	5.69 ± 0.011 ns	0.988 ± 0.0071
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 0/0/∇²	3.11 ± 0 ns	3.11 ± 0 ns	1 ± 0
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 1/1/∂	3.42 ± 0.001 ns	3.72 ± 0.01 ns	0.919 ± 0.0025
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 1/1/∂²	4.7 ± 0.01 ns	4.7 ± 0.011 ns	1 ± 0.0032
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 1/1/∇	5.61 ± 0.041 ns	5.7 ± 0.02 ns	0.984 ± 0.008
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 1/1/∇²	3.11 ± 0 ns	3.11 ± 0 ns	1 ± 0
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 2/2/∂	3.42 ± 0.01 ns	3.72 ± 0.01 ns	0.919 ± 0.0037
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 2/2/∂²	4.7 ± 0.01 ns	4.7 ± 0.01 ns	1 ± 0.003
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 2/2/∇	5.62 ± 0.049 ns	5.69 ± 0.02 ns	0.988 ± 0.0093
RBF/Polyharmonic spline (r³)
└─Polynomial augmentation: degree 2/2/∇²	3.11 ± 0 ns	3.11 ± 0 ns	1 ± 0
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 0/0/∂	4.27 ± 0.01 ns	4.27 ± 0.01 ns	1 ± 0.0033
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 0/0/∂²	5.58 ± 0.08 ns	5.54 ± 0.021 ns	1.01 ± 0.015
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 0/0/∇	8.6 ± 0.1 ns	6.85 ± 0.01 ns	1.26 ± 0.015
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 0/0/∇²	4.27 ± 0.01 ns	4.27 ± 0.01 ns	1 ± 0.0033
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 1/1/∂	4.27 ± 0.01 ns	4.27 ± 0.01 ns	1 ± 0.0033
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 1/1/∂²	5.58 ± 0.06 ns	5.52 ± 0.03 ns	1.01 ± 0.012
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 1/1/∇	7.04 ± 0.27 ns	6.85 ± 0.01 ns	1.03 ± 0.039
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 1/1/∇²	4.27 ± 0.01 ns	4.27 ± 0.01 ns	1 ± 0.0033
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 2/2/∂	4.27 ± 0.01 ns	4.27 ± 0.01 ns	1 ± 0.0033
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 2/2/∂²	5.58 ± 0.08 ns	5.54 ± 0.03 ns	1.01 ± 0.015
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 2/2/∇	6.97 ± 0.27 ns	6.85 ± 0.01 ns	1.02 ± 0.04
RBF/Polyharmonic spline (r¹)
└─Polynomial augmentation: degree 2/2/∇²	4.27 ± 0.01 ns	4.27 ± 0.01 ns	1 ± 0.0033
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 0/0/∂	4.96 ± 0.001 ns	5.26 ± 0.01 ns	0.943 ± 0.0018
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 0/0/∂²	4.65 ± 0.01 ns	4.65 ± 0.01 ns	1 ± 0.003
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 0/0/∇	6.19 ± 0.011 ns	6.11 ± 0.079 ns	1.01 ± 0.013
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 0/0/∇²	3.42 ± 0.001 ns	3.42 ± 0.001 ns	1 ± 0.00041
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 1/1/∂	4.96 ± 0.001 ns	5.26 ± 0.01 ns	0.943 ± 0.0018
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 1/1/∂²	4.65 ± 0.01 ns	4.65 ± 0.01 ns	1 ± 0.003
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 1/1/∇	6.19 ± 0.011 ns	6.1 ± 0.071 ns	1.01 ± 0.012
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 1/1/∇²	3.42 ± 0.001 ns	3.42 ± 0.001 ns	1 ± 0.00041
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 2/2/∂	4.96 ± 0.001 ns	5.26 ± 0.01 ns	0.943 ± 0.0018
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 2/2/∂²	4.65 ± 0.01 ns	4.65 ± 0.01 ns	1 ± 0.003
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 2/2/∇	6.19 ± 0.011 ns	6.11 ± 0.07 ns	1.01 ± 0.012
RBF/Polyharmonic spline (r⁵)
└─Polynomial augmentation: degree 2/2/∇²	3.42 ± 0.01 ns	3.42 ± 0.001 ns	1 ± 0.0029
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 0/0/∂	10.4 ± 0.081 ns	10.3 ± 0.11 ns	1.01 ± 0.013
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 0/0/∂²	4.96 ± 0.01 ns	4.96 ± 0.01 ns	1 ± 0.0029
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 0/0/∇	12.5 ± 0.071 ns	12.5 ± 0.081 ns	1 ± 0.0086
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 0/0/∇²	8.08 ± 0.09 ns	8.05 ± 0.071 ns	1 ± 0.014
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 1/1/∂	10.4 ± 0.089 ns	10.3 ± 0.11 ns	1.01 ± 0.014
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 1/1/∂²	4.96 ± 0.001 ns	4.96 ± 0.01 ns	1 ± 0.002
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 1/1/∇	12.5 ± 0.091 ns	12.6 ± 0.091 ns	0.995 ± 0.01
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 1/1/∇²	8.08 ± 0.09 ns	8.06 ± 0.08 ns	1 ± 0.015
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 2/2/∂	10.4 ± 0.09 ns	10.2 ± 0.16 ns	1.01 ± 0.018
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 2/2/∂²	4.96 ± 0.001 ns	4.96 ± 0.01 ns	1 ± 0.002
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 2/2/∇	12.5 ± 0.1 ns	12.5 ± 0.1 ns	1 ± 0.011
RBF/Polyharmonic spline (r⁷)
└─Polynomial augmentation: degree 2/2/∇²	8.08 ± 0.07 ns	8.05 ± 0.062 ns	1 ± 0.012
time_to_load	0.646 ± 0.0014 s	0.806 ± 0.0054 s	0.802 ± 0.0056

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

kylebeggs force-pushed the perf/reduce-allocations-solve-hotpaths branch from 5548b87 to 571e263 Compare February 17, 2026 15:11

kylebeggs added 2 commits February 17, 2026 10:34

Merge branch 'main' into perf/reduce-allocations-solve-hotpaths

5f473a5

Merge branch 'main' into perf/reduce-allocations-solve-hotpaths

7a8035d

kylebeggs merged commit 0089df9 into main Feb 17, 2026
25 of 26 checks passed

kylebeggs deleted the perf/reduce-allocations-solve-hotpaths branch February 20, 2026 20:02

kylebeggs merged 3 commits into main from perf/reduce-allocations-solve-hotpaths
