runtime_intrinsics: Fix f16 ABI fallback #61094

Merged
Keno merged 1 commit into master from kf/f16abifix
Feb 20, 2026

@Keno
Member

@Keno Keno commented Feb 20, 2026

The x86 psABI specifies that f16 goes on the stack (but is returned in xmm0). Our emulation for compilers that do not support _Float16 did not handle this properly. Fixes #61072.

I think this fixes it at least - I'm too tired to actually go check at this point, so I'll just let CI try it ;). My remaining concern is that I don't quite understand why seeing this behavior is AVX512 dependent.

@giordano giordano added the system:32-bit (Affects only 32-bit systems) and float16 labels Feb 20, 2026
@DilumAluthge
Member

Okay, CI passed on Windows, but neither of the Windows jobs ran on rhea, so I don't know if we would have caught the bug.

@gbaraldi
Member

I suspect it's dependent on it because it changes if the cpu has native support or not

@DilumAluthge
Member

I suppose someone with access to rhea will need to run these tests manually on the machine, to confirm the fix? @gbaraldi Any chance you have access?

@giordano
Member

giordano commented Feb 20, 2026

$ julia +nightly~x86 --cpu-target=pentium4 -E 'Float16(3.0) * 2'
Float16(4.0)
$ julia +nightly~x86 -E 'Float16(3.0) * 2'
Float16(6.0)
$ ./julia-46c8113b1a/bin/julia --cpu-target=pentium4 -E 'Float16(3.0) * 2'
Float16(6.0)
$ ./julia-46c8113b1a/bin/julia -E 'Float16(3.0) * 2'
Float16(6.0)
$ julia +nightly~x86 -e 'using InteractiveUtils; versioninfo()'
Julia Version 1.14.0-DEV.1761
Commit 9aff288d6de (2026-02-20 13:53 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: Linux (i686-linux-gnu)
  CPU: 224 × Intel(R) Xeon(R) CPU Max 9480
  WORD_SIZE: 32
  LLVM: libLLVM-20.1.8 (ORCJIT, sapphirerapids)
  GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 224 virtual cores)
$ ./julia-46c8113b1a/bin/julia -e 'using InteractiveUtils; versioninfo()'
Julia Version 1.14.0-DEV.1760
Commit 46c8113b1a4 (2026-02-20 07:23 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: Linux (i686-linux-gnu)
  CPU: 224 × Intel(R) Xeon(R) CPU Max 9480
  WORD_SIZE: 32
  LLVM: libLLVM-20.1.8 (ORCJIT, sapphirerapids)
  GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 224 virtual cores)

Seems to work for me (not on rhea; I have no clue what machine that is).

@Keno Keno merged commit 2ba9b37 into master Feb 20, 2026
11 checks passed
@Keno Keno deleted the kf/f16abifix branch February 20, 2026 16:47
@Keno
Member Author

Keno commented Feb 20, 2026

> I suspect it's dependent on it because it changes if the cpu has native support or not

I don't understand this. On demeter (no AVX512FP16):

keno@demeter6:~$ julia +nightly~x86 --cpu-target=pentium4 -E 'Float16(3.0) * 2'
Float16(4.0)

So the bug at least is not dependent on having support.

@giordano
Member

What's the microarchitecture of demeter? I couldn't reproduce on a bunch of machines I tried (including znver4, which has avx512, but not avx512-fp16), only on sapphirerapids.

@DilumAluthge
Member

This needs backporting to all active release branches, I assume?

@Keno
Member Author

Keno commented Feb 20, 2026

> What's the microarchitecture of demeter?

znver4

@giordano
Member

Ah, checking the history of the znver4 machine I tested this on, I had tried release~x86, not nightly~x86. I can confirm I reproduce the bug with nightly~x86 (and 1.13-nightly~x86) on znver4, but not release~x86.

@giordano giordano added the backport 1.13 (Change should be backported to release-1.13) label Feb 20, 2026
@DilumAluthge DilumAluthge added the bugfix (This change fixes an existing bug) label Feb 20, 2026
KristofferC pushed a commit that referenced this pull request Mar 3, 2026
(cherry picked from commit 2ba9b37)
@KristofferC KristofferC mentioned this pull request Mar 3, 2026
56 tasks
@KristofferC KristofferC removed the backport 1.13 (Change should be backported to release-1.13) label Mar 13, 2026
Labels

bugfix (This change fixes an existing bug), float16, system:32-bit (Affects only 32-bit systems)

Development

Successfully merging this pull request may close these issues.

Float16 i686-w64-mingw32 CI tests fail consistently on rhea agent