Fix ARM64 unsigned div by const perf regression#57400
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT Issue DetailsThis fixes #54166. This is the alternative version of #57372 that just improves Diff: https://www.diffchecker.com/9mkVf4se Unfortunately this doesn't yet solve #47490, probably because the operand nodes are extended to 64-bit earlier. cc @AndyAyersMS @echesakovMSFT
|
|
CC @AndyAyersMS please review the code. |
|
Changes look reasonable to me, but I'll defer to @echesakov here... |
SingleAccretion
left a comment
There was a problem hiding this comment.
I suppose this is a low risk change for .NET 6, but longer term I'd think we'd want to support GT_MUL_LONG in ARM64 codegen directly and use the existing methods to detect it in lowering.
|
I ran Here are the results (baseline is e0f6071): BenchmarkDotNet=v0.12.1.1466-nightly, OS=Windows 10.0.19043
Microsoft SQ2 3.15 GHz, 1 CPU, 8 logical and 8 physical cores
.NET SDK=6.0.100-preview.7.21379.14
[Host] : .NET 5.0.9 (5.0.921.35908), Arm64 RyuJIT
Job-TRWEYK : .NET 6.0.0 (42.42.42.42424), Arm64 RyuJIT
Job-OBGLQV : .NET 6.0.0 (42.42.42.42424), Arm64 RyuJIT
Job-COTIHO : .NET 6.0.0 (42.42.42.42424), Arm64 RyuJIT
PowerPlanMode=00000000-0000-0000-0000-000000000000 Arguments=/p:DebugType=portable IterationTime=250.0000 ms
MaxIterationCount=20 MinIterationCount=15 WarmupCount=1
|
|
@echesakov diff seems improvements. Do you approve the change? |
|
@JulieLeeMSFT Yes, the numbers were mostly improvements with an exception of
|
|
The spmi-asmdiffs are also mostly positive with an exception of I will take a look at the diff to understand what is going on. benchmarks.run.windows.arm64.checked.mch:Detail diffscoreclr_tests.pmi.windows.arm64.checked.mch:Detail diffslibraries.crossgen2.windows.arm64.checked.mch:Detail diffslibraries.pmi.windows.arm64.checked.mch:Detail diffslibraries_tests.pmi.windows.arm64.checked.mch:Detail diffscoreclr_tests.pmi.Linux.arm64.checked.mch:Detail diffslibraries.crossgen2.Linux.arm64.checked.mch:Detail diffslibraries.pmi.Linux.arm64.checked.mch:Detail diffslibraries_tests.pmi.Linux.arm64.checked.mch:Detail diffs |
@echesakov ToString numbers show some improvments and some regression. So, do you want that to be addressed before approving the code? |
|
@JulieLeeMSFT I wouldn't worry about |
|
Thanks for the note @SingleAccretion - I wasn't aware of that issue. An example of such regression - ldr x0, [x22,#0xd1ffab1e]
+ movz x0, #0xd1ffab1e
+ movk x0, #0xd1ffab1e LSL #16
+ movk x0, #0xd1ffab1e LSL #32
+ ldr x0, [x0] |
|
@pentp Can you please address the feedback by @SingleAccretion, so we can merge the PR before RC1? |
|
@echesakov do you want to push the changes discussed above? We are running out of time for RC1. |
|
@pentp I pushed the changes suggested by @SingleAccretion. |
|
@AndyAyersMS Can you please also take a look? In case I missed anything |
|
runtime (Libraries Test Run checked coreclr windows x86 Release) failure is due to #57471 |
|
runtime (Installer Build and Test coreclr windows_x64 Release) is an infra issue: |
|
Since the change is arm64 specific I think we can merge despite those failures. Anyone disagree? |
|
Improvements: dotnet/perf-autofiling-issues#828 and dotnet/perf-autofiling-issues#1109 |
This fixes #54166. This is the alternative version of #57372 that just improves
GT_MULcodegen for arm64. Diffs are identical.Diff: https://www.diffchecker.com/9mkVf4se
Diff with baseline (no #52893): https://www.diffchecker.com/x2SRZJvW
Unfortunately this doesn't yet solve #47490, probably because the operand nodes are extended to 64-bit earlier.
cc @AndyAyersMS @echesakovMSFT