Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue Detailsclass MyClass
{
public byte A;
public byte B;
public char C;
public int D;
}
void Test(MyClass c1, MyClass c2)
{
c1.A = c2.A;
c1.B = c2.B;
c1.C = c2.C;
c1.D = c2.D;
}Codegen diff for ; Method Program:Test(MyClass,MyClass):this (FullOpts)
- movzx rax, byte ptr [r8+0x0E]
- mov byte ptr [rdx+0x0E], al
- movzx rax, byte ptr [r8+0x0F]
- mov byte ptr [rdx+0x0F], al
- movzx rax, word ptr [r8+0x0C]
- mov word ptr [rdx+0x0C], ax
- mov eax, dword ptr [r8+0x08]
- mov dword ptr [rdx+0x08], eax
+ mov rax, qword ptr [r8+0x08]
+ mov qword ptr [rdx+0x08], rax
ret
-; Total bytes of code: 33
+; Total bytes of code: 9For more diffs it needs a sort of "forward sub" in lowering to handle cases like a[0] = b[0];
a[1] = b[1]etc
|
| // | ||
| // IND<byte> is always fine (and all IND<X> created here from such) | ||
| // IND<simd> is not required to be atomic per our Memory Model | ||
| const bool allowsNonAtomic = data1.allowsNonAtomic && data2.allowsNonAtomic; |
There was a problem hiding this comment.
Would it make a difference (cover more cases) if it was a positive check: allowAtomic?
There was a problem hiding this comment.
I am not sure I follow - can you clarify?
There was a problem hiding this comment.
Was just wondering if the condition below actually care for "if atomics are allowed, do the transformation" instead of "if non-atomics are allowed, skip the transformation", then maybe it can be based on allowAtomic? Not sure if it's feasible or would make any difference. 😅
There was a problem hiding this comment.
The idea that we assume that we can't use wider loads by default, we need to find a proof that we can and we only have a few hints for that today so it's quite conservative
|
@jakobbotsch PTAL since you reviewed the initial version cc @dotnet/jit-contrib Diffs are not too big due to conservative alias analysis, but I have some future improvements in mind which might increase the coverage (e.g. GT_STORE_LCL_FLD) |
| // * STOREIND int | ||
| // +--* LEA(b+8) byref | ||
| // | \--* LCL_VAR ref | ||
| // \--* IND int | ||
| // \--* LEA(b+8) byref | ||
| // \--* LCL_VAR ref | ||
| // | ||
| // * STOREIND int | ||
| // +--* LEA(b+12) byref | ||
| // | \--* LCL_VAR ref | ||
| // \--* IND int | ||
| // \--* LEA(b+12) byref | ||
| // \--* LCL_VAR ref | ||
| // | ||
| // is transformed into: | ||
| // | ||
| // * STOREIND long | ||
| // +--* LEA(b+8) byref | ||
| // | \--* LCL_VAR ref | ||
| // \--* IND long | ||
| // \--* LEA(b+8) byref | ||
| // \--* LCL_VAR ref |
There was a problem hiding this comment.
Indentation of these trees seems wrong.
| // Data is either a constant or GT_IND | ||
| // TODO-CoalescingStores: allow locals (to then broadcast them) | ||
| if (isStore && !ind->Data()->IsCnsIntOrI() && !ind->Data()->IsVectorConst() && !ind->Data()->OperIs(GT_IND)) | ||
| { | ||
| return false; | ||
| } |
There was a problem hiding this comment.
This logic seems like it needs some form of interference checking somewhere when the data is (nevermind, looks like GT_IND. What if the definition occurs 200 nodes before, with interfering stores?LowerStoreIndirCoalescing takes care of it by virtue of ensuring things are contiguous)
| // Get coalescing data for the previous STOREIND | ||
| GenTreeStoreInd* prevInd = prevTree->AsStoreInd(); | ||
| if (!GetStoreCoalescingData(comp, prevInd->AsStoreInd(), &prevData)) | ||
| if (!GetLoadStoreCoalescingData(comp, prevInd->AsStoreInd(), &prevData) || !CanBeCoalesced(prevData, currData)) |
There was a problem hiding this comment.
Ah, I guess the loop above this indirectly ensures that there won't be any interference between the two INDs.
| const int storeStart = min(currData.offset, prevData.offset); | ||
| const int loadStart = min(currValueData.offset, prevValueData.offset); | ||
|
|
||
| const int smallerOffset = min(storeStart, loadStart); | ||
| const int largerOffset = max(storeStart, loadStart); | ||
|
|
||
| if ((smallerOffset != largerOffset) && ((smallerOffset + (int)genTypeSize(newType)) > largerOffset)) | ||
| { | ||
| // May alias | ||
| return; | ||
| } |
There was a problem hiding this comment.
Not sure I totally understand this logic. Can it use a normal interval intersection check? Like e.g.
runtime/src/coreclr/jit/promotion.cpp
Lines 1510 to 1533 in c9088fe
When I checked |
|
I'll re-think this one later |
Codegen diff for
Test:For more diffs it needs a sort of "forward sub" in lowering to handle cases like