-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Invalid hoisting of indirections proven to be loop invariant #54118
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
which tree do we hoist? |
JIT dump: https://gist.github.com/jakobbotsch/3a71d78c99c3c267d697bbb267ddeb27
It seems like the compiler reasons that since |
@sandreenko assigning this to you for now. |
I'll try to take a look at this. |
Value numbering seems a bit inconsistent with regards to by-refs. For public static int Main()
{
uint[] arr = new uint[1];
ref uint r = ref arr[0];
for (int vr1 = 0; vr1 < 2; vr1++)
{
r = 1;
if (r != 1)
return 101;
}
return 100;
} we get for the second read of
so value numbering knows that the indirection is constant 1, and subsequently concludes that it is invariant and hoists it out. public static int Main()
{
uint loc = 0;
ref uint r = ref loc;
for (int vr1 = 0; vr1 < 2; vr1++)
{
r = 1;
if (r != 1)
return 101;
}
return 100;
} then we get
and |
The problem remains that we conclude that an indirection is constant (ok), conclude that constants are invariant (ok) and then hoist out the indirection (not ok). benchmarks.run.windows.x64.checked.mch:
Detail diffs
coreclr_tests.pmi.windows.x64.checked.mch:
Detail diffs
libraries.pmi.windows.x64.checked.mch:
Detail diffs
libraries_tests.pmi.windows.x64.checked.mch:
Detail diffs
The diff in benchmarks is an actual miscompilation where we hoist the first array access out in this loop: here. Thankfully, it has no effect here since the value is 0 and the loop condition check ends up reloading it from the array (diff after change). The biggest regressions are when we no longer hoist out an indirection into a register: - mov r12, qword ptr [(reloc)]
- ;; bbWeight=1.00 PerfScore 27.50
+ ;; bbWeight=1.00 PerfScore 25.50
G_M57024_IG04: ; gcrefRegs=0000C000 {r14 r15}, byrefRegs=00000000 {}, byref, isz
mov rcx, gword ptr [rbp-100H]
; gcrRegs +[rcx]
+ mov r12, qword ptr [(reloc)]
call gword ptr [r12+56]hackishModuleName:hackishMethodName()
; gcrRegs -[rcx] +[rax]
; gcr arg pop 0
inc edi
cmp edi, esi
jl SHORT G_M57024_IG04 But in many cases these hoists also seem costly, since often we end up needing spills and reloads in the loop because of it: jle SHORT G_M2715_IG07
- mov r8, qword ptr [(reloc)]
- mov r10, r8
- ;; bbWeight=1 PerfScore 17.00
+ ;; bbWeight=1 PerfScore 14.75
G_M2715_IG06: ; gcrefRegs=0000F068 {rbx rbp rsi r12 r13 r14 r15}, byrefRegs=00000000 {}, byref, isz
movzx r8, dil
mov rcx, gword ptr [rsp+50H]
; gcrRegs +[rcx]
mov rdx, rsi
; gcrRegs +[rdx]
- mov qword ptr [rsp+80H], r10
+ mov r10, qword ptr [(reloc)]
call qword ptr [r10+16]hackishModuleName:hackishMethodName()
; gcrRegs -[rcx rdx]
; gcr arg pop 0
@@ -220,29 +217,26 @@ G_M2715_IG06: ; gcrefRegs=0000F068 {rbx rbp rsi r12 r13 r14 r15}, byrefRe
; gcr arg pop 0
mov eax, dword ptr [rsp+B4H]
inc eax
- cmp dword ptr [rsp+74H], eax
+ cmp dword ptr [rsp+78H], eax
mov dword ptr [rsp+B4H], eax
- mov r10, qword ptr [rsp+80H]
jg SHORT G_M2715_IG06 |
It is possible they are constant because they were assigned inside the loop, so we cannot hoist them out in this situation. Fix dotnet#54118
I think the fix is that to consider an indir tree invariant we must also show that (the right kind of) memory is also invariant in the loop, eg: iff --git a/src/coreclr/jit/optimizer.cpp b/src/coreclr/jit/optimizer.cpp
index a7df5b95905..f4be1d3f70f 100644
--- a/src/coreclr/jit/optimizer.cpp
+++ b/src/coreclr/jit/optimizer.cpp
@@ -6102,6 +6102,24 @@ void Compiler::optHoistLoopBlocks(unsigned loopNum, ArrayStack<BasicBlock*>* blo
{
return false;
}
+
+ // For indirections, memory must also be invariant in the loop.
+ //
+ if (tree->OperIsIndir())
+ {
+ BasicBlock* const loopEntry = m_compiler->optLoopTable[m_loopNum].lpEntry;
+
+ // Todo: use knowledge of where address points to avoid considering some memory kinds
+ //
+ for (MemoryKind memoryKind : allMemoryKinds())
+ {
+ ValueNum loopMemoryVN = m_compiler->GetMemoryPerSsaData(loopEntry->bbMemorySsaNumIn[memoryKind])->m_vnPair.GetLiberal();
+ if (!m_compiler->optVNIsLoopInvariant(loopMemoryVN, m_loopNum, &m_hoistContext->m_curLoopVnInvariantCache))
+ {
+ return false;
+ }
+ }
+ }
} |
The change above (which is in
|
That fix sounds much better conceptually, thanks @AndyAyersMS. I think we need to do the same for |
Hmm, I'm seeing quite a few SPMI diffs with that patch applied. For example: diff --git "a/.\\base\\10172.dasm" "b/.\\diff\\10172.dasm"
index 6100b53..44568ad 100644
--- "a/.\\base\\10172.dasm"
+++ "b/.\\diff\\10172.dasm"
@@ -1,206 +1,187 @@
; Assembly listing for method System.Net.Primitives.Tests.CredentialCacheTests:CreateCredentialCache(int,int):System.Net.CredentialCache
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No matching PGO data
; 0 inlinees with PGO data; 4 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
-; V00 arg0 [V00,T14] ( 4, 7 ) int -> rsi
-; V01 arg1 [V01,T15] ( 4, 7 ) int -> rdi
+; V00 arg0 [V00,T13] ( 4, 7 ) int -> rsi
+; V01 arg1 [V01,T14] ( 4, 7 ) int -> rdi
; V02 loc0 [V02,T12] ( 4, 10 ) ref -> rbx class-hnd exact
; V03 loc1 [V03,T01] ( 5, 17 ) int -> rbp ld-addr-op
;* V04 loc2 [V04 ] ( 0, 0 ) ref -> zero-ref class-hnd exact
-; V05 loc3 [V05,T02] ( 5, 17 ) int -> r14 ld-addr-op
-; V06 loc4 [V06,T16] ( 2, 8 ) ref -> rbp class-hnd
+; V05 loc3 [V05,T02] ( 5, 17 ) int -> rsi ld-addr-op
+; V06 loc4 [V06,T15] ( 2, 8 ) ref -> rbp class-hnd
; V07 OutArgs [V07 ] ( 1, 1 ) lclBlk (40) [rsp+00H] "OutgoingArgSpace"
-; V08 tmp1 [V08,T19] ( 2, 4 ) ref -> rbx class-hnd exact "NewObj constructor temp"
-; V09 tmp2 [V09,T00] ( 3, 24 ) ref -> r12 class-hnd exact "NewObj constructor temp"
+; V08 tmp1 [V08,T16] ( 2, 4 ) ref -> rbx class-hnd exact "NewObj constructor temp"
+; V09 tmp2 [V09,T00] ( 3, 24 ) ref -> r14 class-hnd exact "NewObj constructor temp"
; V10 tmp3 [V10,T03] ( 2, 16 ) ref -> rdx "argument with side effect"
-; V11 tmp4 [V11,T04] ( 2, 16 ) ref -> r13 "argument with side effect"
+; V11 tmp4 [V11,T04] ( 2, 16 ) ref -> r15 "argument with side effect"
; V12 tmp5 [V12,T05] ( 2, 16 ) ref -> rdx "argument with side effect"
; V13 tmp6 [V13,T06] ( 2, 16 ) ref -> r9 "argument with side effect"
-; V14 tmp7 [V14,T07] ( 2, 16 ) ref -> r13 "argument with side effect"
+; V14 tmp7 [V14,T07] ( 2, 16 ) ref -> r15 "argument with side effect"
; V15 tmp8 [V15,T08] ( 2, 16 ) ref -> rdx "argument with side effect"
; V16 tmp9 [V16,T09] ( 2, 16 ) ref -> rbp "argument with side effect"
; V17 tmp10 [V17,T10] ( 2, 16 ) ref -> r9 "argument with side effect"
-; V18 tmp11 [V18,T11] ( 2, 16 ) ref -> r12 "argument with side effect"
-; V19 cse0 [V19,T13] ( 4, 10 ) ref -> r15 "CSE - aggressive"
-; V20 cse1 [V20,T17] ( 2, 5 ) ref -> r14 "CSE - moderate"
-; V21 cse2 [V21,T18] ( 2, 5 ) ref -> rsi "CSE - moderate"
+; V18 tmp11 [V18,T11] ( 2, 16 ) ref -> r14 "argument with side effect"
;
; Lcl frame size = 40
G_M38214_IG01: ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG
push r15
push r14
- push r13
- push r12
push rdi
push rsi
push rbp
push rbx
sub rsp, 40
mov esi, ecx
mov edi, edx
- ;; bbWeight=1 PerfScore 8.75
+ ;; bbWeight=1 PerfScore 6.75
G_M38214_IG02: ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
mov rcx, 0xD1FFAB1E
call CORINFO_HELP_NEWSFAST
; gcrRegs +[rax]
; gcr arg pop 0
mov rbx, rax
; gcrRegs +[rbx]
xor ebp, ebp
test esi, esi
jle G_M38214_IG04
- mov rcx, 0xD1FFAB1E
- mov r14, gword ptr [rcx]
- ; gcrRegs +[r14]
- mov rcx, 0xD1FFAB1E
- mov r15, gword ptr [rcx]
- ; gcrRegs +[r15]
- ;; bbWeight=1 PerfScore 7.50
-G_M38214_IG03: ; gcrefRegs=0000C008 {rbx r14 r15}, byrefRegs=00000000 {}, byref, isz
+ ;; bbWeight=1 PerfScore 3.00
+G_M38214_IG03: ; gcrefRegs=00000008 {rbx}, byrefRegs=00000000 {}, byref, isz
; gcrRegs -[rax]
mov rcx, 0xD1FFAB1E
call CORINFO_HELP_NEWSFAST
; gcrRegs +[rax]
; gcr arg pop 0
- mov r12, rax
- ; gcrRegs +[r12]
- mov r13, r14
- ; gcrRegs +[r13]
+ mov r14, rax
+ ; gcrRegs +[r14]
+ mov rcx, 0xD1FFAB1E
+ mov r15, gword ptr [rcx]
+ ; gcrRegs +[r15]
mov ecx, ebp
call System.Number:Int32ToDecStr(int):System.String
; gcr arg pop 0
mov rdx, rax
; gcrRegs +[rdx]
- mov rcx, r13
+ mov rcx, r15
; gcrRegs +[rcx]
call System.String:Concat(System.String,System.String):System.String
- ; gcrRegs -[rcx rdx r13]
+ ; gcrRegs -[rcx rdx r15]
; gcr arg pop 0
mov rdx, rax
; gcrRegs +[rdx]
- mov rcx, r12
+ mov rcx, r14
; gcrRegs +[rcx]
call System.Uri:.ctor(System.String):this
; gcrRegs -[rax rcx rdx]
; gcr arg pop 0
- mov r13, r15
- ; gcrRegs +[r13]
+ mov rcx, 0xD1FFAB1E
+ mov r15, gword ptr [rcx]
+ ; gcrRegs +[r15]
mov rcx, 0xD1FFAB1E
mov edx, 553
call CORINFO_HELP_GETSHARED_NONGCSTATIC_BASE
; gcr arg pop 0
mov r9, 0xD1FFAB1E
mov r9, gword ptr [r9]
; gcrRegs +[r9]
- mov r8, r13
+ mov r8, r15
; gcrRegs +[r8]
- mov rdx, r12
+ mov rdx, r14
; gcrRegs +[rdx]
mov rcx, rbx
; gcrRegs +[rcx]
call System.Net.CredentialCache:Add()
- ; gcrRegs -[rcx rdx r8-r9 r12-r13]
+ ; gcrRegs -[rcx rdx r8-r9 r14-r15]
; gcr arg pop 0
inc ebp
cmp ebp, esi
jl SHORT G_M38214_IG03
- ;; bbWeight=4 PerfScore 53.00
+ ;; bbWeight=4 PerfScore 69.00
G_M38214_IG04: ; gcrefRegs=00000008 {rbx}, byrefRegs=00000000 {}, byref, isz
- ; gcrRegs -[r14-r15]
- xor r14d, r14d
+ xor esi, esi
test edi, edi
jle SHORT G_M38214_IG06
+ ;; bbWeight=1 PerfScore 1.50
+G_M38214_IG05: ; gcrefRegs=00000008 {rbx}, byrefRegs=00000000 {}, byref, isz
mov rcx, 0xD1FFAB1E
- mov rsi, gword ptr [rcx]
- ; gcrRegs +[rsi]
- mov rcx, 0xD1FFAB1E
- mov r15, gword ptr [rcx]
- ; gcrRegs +[r15]
- ;; bbWeight=1 PerfScore 6.00
-G_M38214_IG05: ; gcrefRegs=00008048 {rbx rsi r15}, byrefRegs=00000000 {}, byref, isz
- mov rbp, rsi
+ mov rbp, gword ptr [rcx]
; gcrRegs +[rbp]
- mov ecx, r14d
+ mov ecx, esi
call System.Number:Int32ToDecStr(int):System.String
; gcrRegs +[rax]
; gcr arg pop 0
mov rdx, rax
; gcrRegs +[rdx]
mov rcx, rbp
; gcrRegs +[rcx]
call System.String:Concat(System.String,System.String):System.String
; gcrRegs -[rcx rdx rbp]
; gcr arg pop 0
mov rbp, rax
; gcrRegs +[rbp]
- mov r12, r15
- ; gcrRegs +[r12]
+ mov rcx, 0xD1FFAB1E
+ mov r14, gword ptr [rcx]
+ ; gcrRegs +[r14]
mov rcx, 0xD1FFAB1E
mov edx, 553
call CORINFO_HELP_GETSHARED_NONGCSTATIC_BASE
; gcrRegs -[rax]
; gcr arg pop 0
mov r9, 0xD1FFAB1E
mov r9, gword ptr [r9]
; gcrRegs +[r9]
mov gword ptr [rsp+20H], r9
; gcr arg write
- mov r9, r12
+ mov r9, r14
mov rcx, rbx
; gcrRegs +[rcx]
mov rdx, rbp
; gcrRegs +[rdx]
mov r8d, 80
call System.Net.CredentialCache:Add()
- ; gcrRegs -[rcx rdx rbp r9 r12]
+ ; gcrRegs -[rcx rdx rbp r9 r14]
; gcr arg pop 0
- inc r14d
- cmp r14d, edi
+ inc esi
+ cmp esi, edi
jl SHORT G_M38214_IG05
- ;; bbWeight=4 PerfScore 47.00
+ ;; bbWeight=4 PerfScore 63.00
G_M38214_IG06: ; gcrefRegs=00000008 {rbx}, byrefRegs=00000000 {}, byref
- ; gcrRegs -[rsi r15]
mov rax, rbx
; gcrRegs +[rax]
;; bbWeight=1 PerfScore 0.25
G_M38214_IG07: ; , epilog, nogc, extend
add rsp, 40
pop rbx
pop rbp
pop rsi
pop rdi
- pop r12
- pop r13
pop r14
pop r15
ret
- ;; bbWeight=1 PerfScore 5.25
+ ;; bbWeight=1 PerfScore 4.25
-; Total bytes of code 327, prolog size 16, PerfScore 160.45, instruction count 86, allocated bytes for code 327 (MethodHash=ed236ab9) for method System.Net.Primitives.Tests.CredentialCacheTests:CreateCredentialCache(int,int):System.Net.CredentialCache
+; Total bytes of code 303, prolog size 12, PerfScore 178.05, instruction count 78, allocated bytes for code 303 (MethodHash=ed236ab9) for method System.Net.Primitives.Tests.CredentialCacheTests:CreateCredentialCache(int,int):System.Net.CredentialCache
; ============================================================ That seems expected since the loops have function calls, doesn't it? Did you do something more or know what I'm missing? |
Ah, I see now that the patch is only doing it for constant VNs. This is also a problem for InitVal, phis and casts. Here's a test that has more cases: |
Ok, I'll look at the bigger set of cases. We might also be able to exempt |
Suspect there's a broader problem where we can lose track of the (implicit) memory dependence. That is, if VN can deduce an in-loop indir has a relatively simple VN, and then that VN propagates to other computations later in the loop yielding more relatively simple VNs, those second-level VNs won't be attached to scary-looking IR nodes and changes like the one I came up with will fail to detect the memory dependence. So if those second-level IR nodes look like attractive hosting candidates we could be in trouble. We might get lucky for now depending on whether trees are simplified, etc. |
Here's a more general version, it causes quite a few diffs, but fixes the all the cases in the more involved test:
|
I tried it in https://github.com/dotnet/runtime/compare/main...jakobbotsch:hoist-opt-trees?expand=1. |
@briansull as Jakob noted the general problem is that there's a tree with a VN that is loop invariant, but the tree will not evaluate to the same VN if it's hoisted out of the loop. I don't think we can const/copy prop all these cases away, nor should we have one optimization depend on another to avoid unsafe behavior. Hoisting needs to be able to prove that the tree will have the same value once hoisted. And it seemingly can't do that without more information. |
To clarify a bit, the fix above does not actually rely on const/copy prop to clean things up. Rather, it recognizes when we are hoisting a constant by liberal VN and creates the constant directly. So instead of hoisting out an indirection it hoists out a constant node. Similarly, it tries to do the same for InitVal and phis, creating loads of local variables. I'm not sure if this part works correctly as I'm not sure if loading locals outside the loop will give the right values. Also, the fix has some weird side effects, for example sometimes we hoist the constant out only to end up spilling it and reloading it from the stack in the loop. The current reason that things do not get cleaned by const prop/copy prop seems to be that in these cases we end up with conservative/liberal VNs that do not match, and const/copy prop works on conservative VNs. Which brings me to a couple of questions:
|
An indirection that reads a known constant from memory will get different ValueNumbers for liberal/conservative***. A simple constant node will (or should) have the same value for liberal/conservative ValueNumbers. If you create a new GT_CONST node because you recognize that the liberal ValueNumber is a constant, you also should update the ValueNumberPair so that both liberal/conservative have the constant ValueNumber. That should fix the const prop issues that you saw. *** - The reason for this behavior is that the liberal ValueNumber is allowed to assume that the method executes single-threaded, where as the conservative ValueNumber considers that a different thread can modify the memory being read, so that each indirection gets a different value number. |
@AndyAyersMS I agree that we need to insure correctness when hoisting loop-invariant indirections. If the indirection loads a known constant we can transform the hoisted node into a constant and try to make it into a CSE. Another alternative would be to add a pass that goes through all the nodes and replaces the ones that have a constant ValueNumber with their actual constant, thus removing all of the indirections. This would also allow some additional folding/tree reordering optimizations to occur. |
Here's what I have been working on: main...AndyAyersMS:FixLicmMemoryDependence I think the optimizer side captures what I was thinking of pretty well. It should give the "right" level of dependence for loop hoisting. It's not fine-grained enough for other kinds of code motion; that's ok for now. But I'm not super happy with the VN side. I was thinking of revising it to make clearer distinctions between reads and writes of memory, so that there aren't so many call site nullptrs floating about. Or maybe rely more on ambient state (we use It also seems fragile; if we miss one update on the VN side we'll draw the wrong conclusion and possibly do an invalid hoist, and all parties need to implicitly agree on which trees read memory (or at least read memory that can be disambiguated the way VN does it). One other angle to pursue here is to see how far back this bug goes; my guess is it has been there for a while now. Perhaps that reduces the urgency of a fix...? |
I don't think that I implemented any of this hoisting code |
The code does give the same VN number for both numbers of the pair, I didn't have any const prop issues. But I'm confused why const prop needs to use the conservative number instead of the liberal number, and why we should be allowed to use the liberal number in hoisting to do this optimization if const prop cannot. Is there a simple example where const prop would break memory safety if we allowed it to use liberal VNs? Or is it simply that we don't want to do the optimization because users might rely on current behavior in multi-threaded scenarios?
This does not cover the other cases that fail. For example: [MethodImpl(MethodImplOptions.NoInlining)]
static bool TestPhiArr()
{
int[] arr = { -1 };
int val = -1;
for (int i = 0; i < 2; i++)
{
for (int j = 0; j < 2; j++)
{
arr[0] = i;
val = arr[0];
}
}
return val == 1;
}
In .NET core 1.0 all the cases pass. in .NET core 1.1 the byref cases start to fail. In .NET core 2.0 the cls var cases (except the constant one, covered by dotnet/coreclr#26952) start failing. And finally in .NET 5, the array ones start failing. So it seems like a pretty old problem at this point. |
Using an ambient tree ( This seems fixable, but possibly requires a new range of VN chunks just for |
I suppose another option is to put |
FWIW here's the latest main...AndyAyersMS:FixLicmMemoryDependenceAmbientVnTree As you can see the plumbing in VN is minimal. It works on cases where the loop-dependent memory update is the first update in the loop... so most of the tests above. More complex patterns (with multiple disambiguable updates) or with statics aren't fixed yet because of the |
@AndyAyersMS The latest changes look good. It is best to keep the VN changes as small as possible. |
Use a unary function to capture opaque VN loop dependence, rather than factoring that out as part of the owning chunk. As a result the VN chunk data no longer has a per-loop component (no dependence on MAX_LOOP_NUM). This gives us the flexibility to address dotnet#54118 by doing something similar for `MapUpdate` VNs.
I've updated my fix main...AndyAyersMS:FixLicmMemoryDependenceAmbientVnTree to build upon #55853; it passes all the test cases here with minimal diffs per SPMI (5 methods total with diffs). Will put up a PR for this once #55853 is in and I've added the test cases. |
Use a unary function to capture opaque VN loop dependence, rather than factoring that out as part of the owning chunk. As a result the VN chunk data no longer has a per-loop component (no dependence on MAX_LOOP_NUM). This gives us the flexibility to address #54118 by doing something similar for `MapUpdate` VNs.
Leverage value numbering's alias analysis to annotate trees with the loop memory dependence of the tree's value number. First, refactor the `mapStore` value number so that it also tracks the loop number where the store occurs. This is done via an extra non-value-num arg, so add appropriate bypasses to logic in the jit that expect to only find value number args. Also update the dumping to display the loop information. Next, during VN computation, record loop memory dependence from `mapStores` with the tree currently being value numbered, whenever a value number comes from a particular map. There may be multiple such recording events per tree, so add logic on the recording side to track the most constraining dependence. Note value numbering happens in execution order, so there is an unambiguous current tree being value numbered. This dependence info is tracked via a side map. Finally, during hoisting, for each potentially hoistable tree, consult the side map to recover the loop memory dependence of a tree, and if that dependence is at or within the loop that we're hoisting from, block the hoist. I've also absorbed the former class var (static field) hosting exclusion into this new logic. This gives us slightly more relaxed dependence in some cases. Resolves dotnet#54118.
…5936) Leverage value numbering's alias analysis to annotate trees with the loop memory dependence of the tree's value number. First, refactor the `mapStore` value number so that it also tracks the loop number where the store occurs. This is done via an extra non-value-num arg, so add appropriate bypasses to logic in the jit that expect to only find value number args. Also update the dumping to display the loop information. Next, during VN computation, record loop memory dependence from `mapStores` with the tree currently being value numbered, whenever a value number comes from a particular map. There may be multiple such recording events per tree, so add logic on the recording side to track the most constraining dependence. Note value numbering happens in execution order, so there is an unambiguous current tree being value numbered. This dependence info is tracked via a side map. Finally, during hoisting, for each potentially hoistable tree, consult the side map to recover the loop memory dependence of a tree, and if that dependence is at or within the loop that we're hoisting from, block the hoist. I've also absorbed the former class var (static field) hosting exclusion into this new logic. This gives us slightly more relaxed dependence in some cases. Resolves #54118.
If a loop is removed (because of unrolling) then the loop dependence tracking introduced in dotnet#55936 and dotnet#56184 may not properly update. So when a loop is removed, walk up the chain of parent loops looking for one that is not removed, and record the dependence on that parent. Addresses last part of dotnet#54118.
…56436) If a loop is removed (because of unrolling) then the loop dependence tracking introduced in #55936 and #56184 may not properly update. So when a loop is removed, walk up the chain of parent loops looking for one that is not removed, and record the dependence on that parent. Addresses last part of #54118.
We hoist
vr1[0]
out of the loop because it's constant. Seems to be a variant of #13328 which was fixed for static fields in dotnet/coreclr#26952.Another variant:
cc @dotnet/jit-contrib
The text was updated successfully, but these errors were encountered: