arm64: Add support for BFI and BFX instruction#123138
arm64: Add support for BFI and BFX instruction#123138jonathandavies-arm wants to merge 18 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
ce512bf to
1a8d58f
Compare
1a8d58f to
83a6004
Compare
|
Spmidiff errors |
- Remove Expect.cs
Fixed |
|
/azp run runtime-coreclr superpmi-replay |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Looks like there are replay failures.
|
| #ifdef TARGET_ARM64 | ||
| case GT_BFX: | ||
| if (operand == this->AsOp()->gtOp1) | ||
| { | ||
| *pUse = &this->AsOp()->gtOp1; | ||
| return true; | ||
| } | ||
| return false; | ||
| #endif |
There was a problem hiding this comment.
Please double check with another node type like GT_AND that this is being added to all the general utilities.
There was a problem hiding this comment.
I think in this case the BFI is fine falling into the default case. I can't see any other cases where BFI needs something other than the default.
| GTNODE(BFI , GenTreeBfm ,0,0,GTK_BINOP|GTK_EXOP|DBK_NOTHIR) // Bitfield Insert. | ||
| GTNODE(BFIZ , GenTreeOp ,0,0,GTK_BINOP|DBK_NOTHIR) // Bitfield Insert in Zero. | ||
| GTNODE(BFX , GenTreeBfm ,0,0,GTK_UNOP|GTK_EXOP|DBK_NOTHIR) // Bitfield Extract. |
There was a problem hiding this comment.
I assume bfiz is subsumed by bfi? It might be nice to replace uses of one by the other and confirm that you see no regressions.
There was a problem hiding this comment.
I think BFI and BFIZ aren't similar enough to merge. I also have a another task after this is work to look into more UBFIZ/UBFX opportunities and I'd that would be better with the separate node.
| return false; | ||
| } | ||
|
|
||
| BlockRange().Remove(tree); |
There was a problem hiding this comment.
The loop above seems to be able to match more nodes than are being removed here. I would expect that to hit asserts -- you remove some of the nodes and those nodes may have operands that you aren't marking as unused values.
Do you have test cases where the loop above matches a large number of nodes?
There was a problem hiding this comment.
The test case ComposeBits_Pack32Values_Int in Bfi.cs tries to pack 32 values into an int. It converts a lot of the ors into bfi but not all of them. There are several ors at the end. The JitDump of this test case doesn't have any asserts or errors.
ComposeBits_Pack32Values_Int JitDump
Change-Id: I6ed28f62956404fb7d803c20bd3d1f8113996e14
Change-Id: Ib707d9ca89d85e66f7d37ad593a95048f7ff7ced
Change-Id: Ie4d17eabc9710c88927be211863b07d4e2933650
Change-Id: I08b0a2e4515a0fb445f646d545648e180ebdb893
Change-Id: I015989fa1653edf79fb494ec727ca12555389420
Change-Id: I00caf785bfe1fb9ac1a56ea1f13248084200608d
This patch adds support for the Arm BFI and BFX instructions. I've added a optimisation in lowering for bit packing and unpacking using each instruction respectively.
BFI
This is used when you pack 2 or more values into an integer. e.g.
return (a & 0xf) | ((b & 0x3) << 4);This is the node pattern it is looking for
and changes to
BFX
When you extract a continuous range of bits from an integer. This is the inverse of above. e.g.
return (x >> 5) & 0x1F;This is the node pattern it's looking for
and changes to