Various pointer-related improvements #765
Merged
christiangnrd merged 10 commits into main on Apr 11, 2026
Codecov Report
✅ All modified and coverable lines are covered by tests.

| Coverage Diff | main | #765 | +/- |
|---|---|---|---|
| Coverage | 82.66% | 83.02% | +0.36% |
| Files | 62 | 62 | |
| Lines | 2872 | 2852 | -20 |
| Hits | 2374 | 2368 | -6 |
| Misses | 498 | 484 | -14 |
Metal Benchmarks
| Benchmark suite | Current: a5acbf5 | Previous: 1af3aef | Ratio |
|---|---|---|---|
| array/accumulate/Float32/1d | 1157708 ns | 1113770.5 ns | 1.04 |
| array/accumulate/Float32/dims=1 | 1574625.5 ns | 1551084 ns | 1.02 |
| array/accumulate/Float32/dims=1L | 9826688 ns | 9823458.5 ns | 1.00 |
| array/accumulate/Float32/dims=2 | 1918875 ns | 1869125 ns | 1.03 |
| array/accumulate/Float32/dims=2L | 7253791.5 ns | 7234500 ns | 1.00 |
| array/accumulate/Int64/1d | 1270791 ns | 1259292 ns | 1.01 |
| array/accumulate/Int64/dims=1 | 1819479.5 ns | 1833292 ns | 0.99 |
| array/accumulate/Int64/dims=1L | 11682958 ns | 11548646 ns | 1.01 |
| array/accumulate/Int64/dims=2 | 2200208.5 ns | 2156000 ns | 1.02 |
| array/accumulate/Int64/dims=2L | 9796271 ns | 9765500 ns | 1.00 |
| array/broadcast | 610125 ns | 608792 ns | 1.00 |
| array/construct | 6709 ns | 6000 ns | 1.12 |
| array/permutedims/2d | 1191167 ns | 1167020.5 ns | 1.02 |
| array/permutedims/3d | 1699625 ns | 1679667 ns | 1.01 |
| array/permutedims/4d | 2419334 ns | 2380542 ns | 1.02 |
| array/private/copy | 585583 ns | 596666 ns | 0.98 |
| array/private/copyto!/cpu_to_gpu | 806583 ns | 786083.5 ns | 1.03 |
| array/private/copyto!/gpu_to_cpu | 812000 ns | 796625 ns | 1.02 |
| array/private/copyto!/gpu_to_gpu | 647958 ns | 631500 ns | 1.03 |
| array/private/iteration/findall/bool | 1408584 ns | 1411938 ns | 1.00 |
| array/private/iteration/findall/int | 1578875 ns | 1563375 ns | 1.01 |
| array/private/iteration/findfirst/bool | 2068437.5 ns | 2023833 ns | 1.02 |
| array/private/iteration/findfirst/int | 2133458.5 ns | 2045020.5 ns | 1.04 |
| array/private/iteration/findmin/1d | 2530542 ns | 2470417 ns | 1.02 |
| array/private/iteration/findmin/2d | 1828667 ns | 1774854 ns | 1.03 |
| array/private/iteration/logical | 2701334 ns | 2618500 ns | 1.03 |
| array/private/iteration/scalar | 4795375 ns | 4934917 ns | 0.97 |
| array/random/rand/Float32 | 584250 ns | 634500 ns | 0.92 |
| array/random/rand/Int64 | 794375 ns | 771417 ns | 1.03 |
| array/random/rand!/Float32 | 584833 ns | 578709 ns | 1.01 |
| array/random/rand!/Int64 | 553916 ns | 547250 ns | 1.01 |
| array/random/randn/Float32 | 976292 ns | 1008833.5 ns | 0.97 |
| array/random/randn!/Float32 | 755750 ns | 714625 ns | 1.06 |
| array/reductions/mapreduce/Float32/1d | 1073333 ns | 1041062.5 ns | 1.03 |
| array/reductions/mapreduce/Float32/dims=1 | 846583.5 ns | 838292 ns | 1.01 |
| array/reductions/mapreduce/Float32/dims=1L | 1346709 ns | 1563812.5 ns | 0.86 |
| array/reductions/mapreduce/Float32/dims=2 | 861917 ns | 864854 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 1828000 ns | 1798229.5 ns | 1.02 |
| array/reductions/mapreduce/Int64/1d | 1546000 ns | 1534250 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=1 | 1113250 ns | 1102709 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=1L | 2035687 ns | 1996770.5 ns | 1.02 |
| array/reductions/mapreduce/Int64/dims=2 | 1175167 ns | 1183062.5 ns | 0.99 |
| array/reductions/mapreduce/Int64/dims=2L | 3639562.5 ns | 3617646 ns | 1.01 |
| array/reductions/reduce/Float32/1d | 1051062 ns | 1029479.5 ns | 1.02 |
| array/reductions/reduce/Float32/dims=1 | 843166 ns | 830333 ns | 1.02 |
| array/reductions/reduce/Float32/dims=1L | 1333499.5 ns | 1317584 ns | 1.01 |
| array/reductions/reduce/Float32/dims=2 | 867958.5 ns | 856145.5 ns | 1.01 |
| array/reductions/reduce/Float32/dims=2L | 1816875 ns | 1795250 ns | 1.01 |
| array/reductions/reduce/Int64/1d | 1399437.5 ns | 1500500 ns | 0.93 |
| array/reductions/reduce/Int64/dims=1 | 1112792 ns | 1106458 ns | 1.01 |
| array/reductions/reduce/Int64/dims=1L | 2017437.5 ns | 2016083 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2 | 1173833 ns | 1154354.5 ns | 1.02 |
| array/reductions/reduce/Int64/dims=2L | 4246374.5 ns | 4213709 ns | 1.01 |
| array/shared/copy | 243000 ns | 276625 ns | 0.88 |
| array/shared/copyto!/cpu_to_gpu | 82042 ns | 97833 ns | 0.84 |
| array/shared/copyto!/gpu_to_cpu | 83958 ns | 98208 ns | 0.85 |
| array/shared/copyto!/gpu_to_gpu | 83666 ns | 122000 ns | 0.69 |
| array/shared/iteration/findall/bool | 1432541 ns | 1433249.5 ns | 1.00 |
| array/shared/iteration/findall/int | 1577875 ns | 1571812.5 ns | 1.00 |
| array/shared/iteration/findfirst/bool | 1653354.5 ns | 1630083 ns | 1.01 |
| array/shared/iteration/findfirst/int | 1676834 ns | 1635084 ns | 1.03 |
| array/shared/iteration/findmin/1d | 2134292 ns | 2098979 ns | 1.02 |
| array/shared/iteration/findmin/2d | 1819958.5 ns | 1779458 ns | 1.02 |
| array/shared/iteration/logical | 2265937.5 ns | 2292083 ns | 0.99 |
| array/shared/iteration/scalar | 213375 ns | 202833 ns | 1.05 |
| integration/byval/reference | 1589042 ns | 1549208 ns | 1.03 |
| integration/byval/slices=1 | 1604833 ns | 1564292 ns | 1.03 |
| integration/byval/slices=2 | 2669875 ns | 2597625 ns | 1.03 |
| integration/byval/slices=3 | 10157500 ns | 8449250 ns | 1.20 |
| integration/metaldevrt | 881833 ns | 863917 ns | 1.02 |
| kernel/indexing | 608500 ns | 594792 ns | 1.02 |
| kernel/indexing_checked | 621708 ns | 606541.5 ns | 1.03 |
| kernel/launch | 12667 ns | 11833 ns | 1.07 |
| kernel/rand | 570459 ns | 569167 ns | 1.00 |
| latency/import | 1424736708.5 ns | 1420952187.5 ns | 1.00 |
| latency/precompile | 25499036000 ns | 25612710875 ns | 1.00 |
| latency/ttfp | 2349867458.5 ns | 2339829875 ns | 1.00 |
| metal/synchronization/context | 20541 ns | 20000 ns | 1.03 |
| metal/synchronization/stream | 19667 ns | 19000 ns | 1.04 |
This comment was automatically generated by workflow using github-action-benchmark.
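
For readers checking the numbers by hand: the Ratio column appears to be the current timing divided by the previous one, so values above 1 mean the current commit was slower. The sketch below recomputes that ratio for a few rows copied from the benchmark report and lists the ones falling outside a noise band. The `NOISE_BAND` value and the helper names are hypothetical, chosen only for illustration. They are not part of github-action-benchmark or of this PR.

```python
# Sketch: recompute "Ratio" as current / previous. This formula is an
# assumption inferred from the reported rows, not read from the tool's source.

# A few (current_ns, previous_ns) pairs copied from the benchmark report.
SAMPLES = {
    "array/construct": (6709, 6000),
    "array/shared/copyto!/gpu_to_gpu": (83666, 122000),
    "integration/byval/slices=3": (10157500, 8449250),
    "kernel/rand": (570459, 569167),
}

# Hypothetical noise band: ratios within +/-10% of 1.0 are treated as noise.
NOISE_BAND = 0.10


def ratio(current_ns: float, previous_ns: float) -> float:
    """Return current / previous; above 1 means the current commit is slower."""
    return current_ns / previous_ns


def outside_band(samples, band=NOISE_BAND):
    """Return {name: ratio rounded to 2 places} for entries outside the band."""
    flagged = {}
    for name, (cur, prev) in samples.items():
        r = ratio(cur, prev)
        if abs(r - 1.0) > band:
            flagged[name] = round(r, 2)
    return flagged


if __name__ == "__main__":
    for name, r in outside_band(SAMPLES).items():
        print(f"{name}: {r}")
```

A ratio outside the band is only a prompt to look closer. Single-run timings on shared CI hardware can swing by this much without any code change.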
b93c98c to 65c2e85
Member
Author
Copy-related performance improvements and regressions seem to be noise.
A more specific version is defined above
65c2e85 to a5acbf5
Each commit should be a very quick review. Best viewed with whitespace hidden.
Some code coverage improvements, bug fixes, and simplification.