AArch64: Allow rebasing pointers to make more use of stp/ldp #9678
Open
david-arm wants to merge 2 commits into facebook:master
We sometimes see sequences like this:

    store %123(...b), [addr + 8]
    store %124(...q), [addr]

The existing storepair simplify code requires that the stored registers be physical GP registers, presumably because the lowering for storepair/storepairl cannot handle FP/SIMD regs. However, during register allocation we materialise these constants and end up with the sequence:

    ldimmb ... => x0
    store x0, [addr + 8]
    ldimmq ... => x0
    store x0, [addr]

which then makes it very difficult to combine these into storepairs in the post-regalloc simplify pass. This PR permits combining pairs of stores prior to regalloc, provided we can show that each stored value is either:

1. A GP physical register, or
2. An integer constant that will be materialised into a GP reg.
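The pairing condition described above can be sketched in isolation. This is only an illustration under stated assumptions: `ValueKind`, `Store`, `endsUpInGP`, `canPair` and `pairingExample` are hypothetical names invented for this sketch, not HHVM's vasm types or the code in this PR, and the check assumes 8-byte stores off the same base register.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the operand kinds discussed above.
// These are NOT HHVM's vasm types; they exist only for this sketch.
enum class ValueKind {
  PhysGP,          // already a physical general-purpose register
  PhysSIMD,        // physical FP/SIMD register
  VirtualIntConst, // virtual reg holding an integer constant (ldimmb/ldimmq)
  VirtualOther     // any other virtual reg; final register class unknown
};

struct Store {
  ValueKind kind;  // what is being stored
  int base;        // id of the base register (assumed representation)
  int64_t disp;    // byte displacement from the base
};

// True when the stored value is known to live in a GP register after
// register allocation: either it already is one, or it is an integer
// constant that will be materialised into one.
inline bool endsUpInGP(ValueKind k) {
  return k == ValueKind::PhysGP || k == ValueKind::VirtualIntConst;
}

// Two 8-byte stores are candidates for a store pair when both values end
// up in GP registers and they write adjacent slots off the same base.
inline bool canPair(const Store& a, const Store& b) {
  if (!endsUpInGP(a.kind) || !endsUpInGP(b.kind)) return false;
  if (a.base != b.base) return false;
  const int64_t diff = a.disp - b.disp;
  return diff == 8 || diff == -8;
}

// The example from the description: two integer constants stored to
// [addr + 8] and [addr].
inline void pairingExample() {
  const Store hi{ValueKind::VirtualIntConst, /*base=*/0, /*disp=*/8};
  const Store lo{ValueKind::VirtualIntConst, /*base=*/0, /*disp=*/0};
  assert(canPair(hi, lo));
}
```

A real implementation would also need to check that nothing between the two stores clobbers the base register or the stored values; the sketch leaves that out.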
If the disp (offset) in a Vptr is negative and outside the range of stp/ldp, then it's also outside the range of str/stur/ldr/ldur. This will force the offset to be materialised in a register and to then use the reg+reg addressing mode for the loads and stores. We'll end up with assembly like this:

    mov x2, -1024
    str x0, [x29, x2]
    mov x2, -1032
    str x1, [x29, x2]

If the immediate can easily be encoded into an add or sub instruction, then we can rebase the pointer and it will still be worth it. We'll then end up with assembly like this:

    add x2, x29, -1024
    stp x0, x1, [x2]

which still saves two instructions.
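The range argument can be checked with a small sketch. The immediate ranges below are those of the 64-bit A64 encodings (stp/ldp signed offset, str/ldr unsigned offset, stur/ldur, and add/sub immediate). The function names and the `worthRebasing` policy are assumptions made for illustration and are not taken from the PR's code.

```cpp
#include <cassert>
#include <cstdint>

// 64-bit stp/ldp, signed-offset form: 7-bit signed immediate scaled by 8.
inline bool fitsPairOffset(int64_t disp) {
  return disp >= -512 && disp <= 504 && disp % 8 == 0;
}

// 64-bit str/ldr (12-bit unsigned immediate scaled by 8) or
// stur/ldur (9-bit signed, unscaled immediate).
inline bool fitsSingleOffset(int64_t disp) {
  const bool scaled = disp >= 0 && disp <= 32760 && disp % 8 == 0;
  const bool unscaled = disp >= -256 && disp <= 255;
  return scaled || unscaled;
}

// add/sub immediate: 12-bit unsigned value, optionally shifted left by 12.
// A negative disp is handled by using the opposite instruction, so only
// the magnitude matters.
inline bool fitsAddSubImm(int64_t disp) {
  const uint64_t u = static_cast<uint64_t>(disp);
  const uint64_t mag = disp < 0 ? 0 - u : u;
  return mag <= 0xfff || ((mag & 0xfff) == 0 && (mag >> 12) <= 0xfff);
}

// Hypothetical policy for this sketch: rebase when disp lies below the
// stp/ldp range (so each single store would need mov + reg+reg str) and
// a single add/sub can form the new base.
inline bool worthRebasing(int64_t disp) {
  return disp < -512 && fitsAddSubImm(disp);
}

// The displacement from the description: -1024 fits neither the pair form
// nor the single-store forms, but a single add/sub can rebase it.
inline void rebasingExample() {
  assert(!fitsPairOffset(-1024) && !fitsSingleOffset(-1024));
  assert(worthRebasing(-1024));
}
```

With the pointer rebased, the pair uses offset 0 from the new base, so only the first displacement needs to be encodable in the add or sub.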
NOTE: This PR includes #9676, so it should probably wait for that to land first.
@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this in D88008213. (Because this pull request was imported automatically, there will not be any future comments.)