Fix sret alloca alignment to match callee's preferred type alignment #61192

Merged: maleadt merged 1 commit into master from tb/sret_align on Mar 4, 2026

maleadt (Member) commented Feb 27, 2026

The caller's sret alloca used julia_alignment (union_align), which can be smaller than the LLVM preferred type alignment that the callee assumes for its loads and stores. For example, a struct of floats gets julia_alignment = 4, but the callee uses DL.getPrefTypeAlign() = 8 and therefore emits 8-byte-aligned memcpy operations. On strict-alignment targets such as NVPTX, the resulting misaligned access fails with CUDA_ERROR_MISALIGNED_ADDRESS.

Fix by computing the sret type's preferred alignment from the callee's StructRet attribute and taking the max with union_align, matching the alignment the callee computes for its sret parameter.

Fixes JuliaGPU/CUDA.jl#3034
Regression introduced in 1.12 by #55730
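
The mismatch described in this comment can be modeled with plain address arithmetic. The sketch below is illustrative only and is not the codegen change itself: the helper names (`is_aligned`, `align_up`, `sret_slot_align`, `demo_sret_slot`) and the starting address 0x1004 are assumptions made for this example, while the alignments 4 and 8 are the values quoted above.

```cpp
#include <cassert>
#include <cstdint>

// True when `addr` is a multiple of `align` (`align` must be a power of two).
constexpr bool is_aligned(std::uint64_t addr, std::uint64_t align) {
    return (addr & (align - 1)) == 0;
}

// Round `addr` up to the next multiple of `align`, as a stack allocator would.
constexpr std::uint64_t align_up(std::uint64_t addr, std::uint64_t align) {
    return (addr + align - 1) & ~(align - 1);
}

// Hypothetical model of the caller-side choice described above: the slot must
// satisfy both the Julia-level alignment and the alignment the callee assumes.
constexpr std::uint64_t sret_slot_align(std::uint64_t julia_align,
                                        std::uint64_t callee_pref_align) {
    return julia_align > callee_pref_align ? julia_align : callee_pref_align;
}

// Walks through the numbers from the description (julia_alignment = 4,
// preferred alignment = 8) for an assumed next free stack address of 0x1004.
inline void demo_sret_slot() {
    // A slot placed with 4-byte alignment is not valid for 8-byte accesses.
    const std::uint64_t slot4 = align_up(0x1004, 4);
    assert(is_aligned(slot4, 4) && !is_aligned(slot4, 8));

    // Taking the max of both alignments moves the slot to an 8-byte boundary.
    const std::uint64_t slot8 = align_up(0x1004, sret_slot_align(4, 8));
    assert(is_aligned(slot8, 8));
}
```

In this model, a slot allocated with alignment 4 lands at 0x1004, where an 8-byte-aligned access is invalid, while the combined alignment places it at 0x1008.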

@maleadt maleadt requested a review from vtjnash February 27, 2026 14:37
@maleadt maleadt added the labels compiler:codegen (Generation of LLVM IR and native code), gpu (Affects running Julia on a GPU), backport 1.12 (Change should be backported to release-1.12), and backport 1.13 (Change should be backported to release-1.13) Feb 27, 2026
@maleadt maleadt force-pushed the tb/sret_align branch 2 times, most recently from a908e3c to 2464148 Compare March 2, 2026 14:43
gbaraldi (Member) commented Mar 2, 2026

LGTM

@gbaraldi gbaraldi added the merge me (PR is reviewed. Merge when all tests are passing) label Mar 2, 2026
@maleadt maleadt removed the merge me (PR is reviewed. Merge when all tests are passing) label Mar 2, 2026
@maleadt maleadt marked this pull request as draft March 3, 2026 08:38
@maleadt maleadt marked this pull request as ready for review March 3, 2026 09:12
@maleadt maleadt requested a review from vtjnash March 3, 2026 09:13
@KristofferC KristofferC mentioned this pull request Mar 3, 2026
56 tasks
@maleadt maleadt requested a review from topolarity March 3, 2026 19:41
The sret parameter's alignment attribute was set to LLVM's preferred type
alignment (getPrefTypeAlign), which can exceed julia_alignment. This caused
misaligned memory accesses on strict-alignment targets like NVPTX, since the
caller's alloca uses julia_alignment. Fix by setting the sret alignment to
julia_alignment and not overriding it in the function definition, so that
caller and callee agree on the same alignment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
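
The commit message above amounts to a contract: the callee may assume no more alignment for its sret pointer than the caller's alloca guarantees. A minimal sketch of that check follows, assuming hypothetical names (`SretContract`, `contract_is_sound`, `before_fix`, `after_fix`, `check_contracts`) that do not come from the Julia sources. The alignment values 4 and 8 are taken from the PR description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the caller/callee agreement on the sret slot.
struct SretContract {
    std::uint64_t caller_alloca_align;  // alignment the caller's alloca guarantees
    std::uint64_t callee_param_align;   // alignment the callee assumes for sret
};

// The callee may only assume alignment that the caller actually provides.
constexpr bool contract_is_sound(const SretContract &c) {
    return c.callee_param_align <= c.caller_alloca_align;
}

// Before the change: the caller uses julia_alignment = 4, while the callee's
// sret parameter carries the preferred type alignment 8.
constexpr SretContract before_fix{4, 8};

// After the change: both sides use julia_alignment.
constexpr SretContract after_fix{4, 4};

// Runtime check of the two scenarios.
inline void check_contracts() {
    assert(!contract_is_sound(before_fix));  // callee over-assumes alignment
    assert(contract_is_sound(after_fix));    // caller and callee agree
}
```

Lowering the callee's assumption, rather than raising the caller's guarantee, keeps the slot at the Julia-level alignment on both sides.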
topolarity (Member) left a comment

Thanks @maleadt !

@topolarity topolarity added the merge me (PR is reviewed. Merge when all tests are passing) label Mar 3, 2026
@maleadt maleadt merged commit f519f3e into master Mar 4, 2026
8 of 9 checks passed
@maleadt maleadt deleted the tb/sret_align branch March 4, 2026 07:05
@DilumAluthge DilumAluthge removed the merge me (PR is reviewed. Merge when all tests are passing) label Mar 4, 2026
maleadt added a commit that referenced this pull request Mar 6, 2026
maleadt added a commit that referenced this pull request Mar 6, 2026
@maleadt maleadt removed the backport 1.13 (Change should be backported to release-1.13) label Mar 6, 2026
@maleadt maleadt mentioned this pull request Mar 6, 2026
37 tasks
@KristofferC KristofferC removed the backport 1.12 (Change should be backported to release-1.12) label Mar 25, 2026
Labels

compiler:codegen (Generation of LLVM IR and native code), gpu (Affects running Julia on a GPU)

Development

Successfully merging this pull request may close these issues.

Julia 1.12: Misaligned address error

6 participants