JIT: Add a pass of liveness for the new locals inside physical promotion #86043
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details
Used when decomposing stores to skip the fields that are dying, and the remainder if it is dying. Also allows forward sub to kick in for the inserted replacement locals. Not sure yet whether this is how I want to go about the liveness here... we might instead combine this with the existing early liveness pass, or use the early liveness information to figure out where to look when calculating more precise liveness. But this at least gives us an implementation to experiment with.
/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress, runtime-jit-experimental
Azure Pipelines successfully started running 3 pipeline(s).
/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress, runtime-jit-experimental
Azure Pipelines successfully started running 3 pipeline(s).
/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress, runtime-jit-experimental
Azure Pipelines successfully started running 3 pipeline(s).
/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress, runtime-jit-experimental
Azure Pipelines successfully started running 3 pipeline(s).
The jitstress failures are a couple of timeouts in event source tests in a scenario without physical promotion enabled, so they are almost certainly unrelated. The runtime-jit-experimental failures were recently fixed, I believe, so I will rerun that one, but will push a comment cleanup PR first. No failures in libraries-jitstress.
/azp run runtime-jit-experimental
Azure Pipelines successfully started running 1 pipeline(s).
The failure looks like #86233.
cc @dotnet/jit-contrib PTAL @AndyAyersMS
The throughput impact is very small, so I don't think there will be a reason to try to be much smarter here. Well, unless it turns out that's because of a bug (but I have looked at a few examples and things look reasonable). After this is merged, I think most of the basics are in place, to the point where I can get started on readjusting the heuristics.
// TODO: We need a scalability limit on these, we cannot always track
// the remainder and all fields.
Will address this in a separate follow-up.
This is similar to the "coarsening" discussion we had re alias analysis the other day: you should be able to have models where you group together fields, and/or fields + remainder, and get approximate liveness for them.
Would be interesting to know if you see cases where there are large structs with "hot fields", so that it is worth tracking a subset but coarsening everything else, and whether that diverges somehow from just promoting the hot fields and leaving the rest.
Would be interesting to know if you see cases where there are large structs with "hot fields" so that it is worth tracking a subset but coarsening everything else, and if that diverges somehow from just promoting the hot fields and leaving the rest.
That's a good point... the heuristics should likely include something to avoid creating locals that are just going to end up untracked.
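As a minimal illustration of that idea (invented types and threshold, not the JIT's promotion data structures), coarsening could amount to assigning liveness indices so that hot fields are tracked precisely while all cold fields and the remainder share a single approximate index:

#include <cstdint>
#include <vector>

// Hypothetical coarsening model: fields above a hotness threshold get their
// own liveness index; everything else (plus the remainder) shares one index,
// so the ordinary bit-vector dataflow yields cheap but approximate liveness
// for the cold parts.
struct FieldInfo
{
    unsigned offset;
    unsigned size;
    unsigned accessCount; // assumed to come from ref counting / profile data
};

static std::vector<unsigned> AssignLivenessIndices(const std::vector<FieldInfo>& fields,
                                                   unsigned hotThreshold,
                                                   unsigned* indexCount)
{
    std::vector<unsigned> indexOfField(fields.size());
    unsigned nextIndex = 0;
    unsigned coldIndex = UINT32_MAX; // lazily created shared index

    for (size_t i = 0; i < fields.size(); i++)
    {
        if (fields[i].accessCount >= hotThreshold)
        {
            indexOfField[i] = nextIndex++; // hot field: precise tracking
        }
        else
        {
            if (coldIndex == UINT32_MAX)
            {
                coldIndex = nextIndex++; // one shared index for cold fields (and the remainder)
            }
            indexOfField[i] = coldIndex;
        }
    }

    *indexCount = nextIndex;
    return indexOfField;
}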
#ifdef DEBUG
    unsigned char lvTrackedWithoutIndex : 1; // Tracked but has no lvVarIndex (i.e. only valid GTF_VAR_DEATH flags, used
                                             // by physical promotion)
#endif
This is just used to print "(last use)" in the dump for the new locals introduced by physical promotion.
bool AggregateInfo::OverlappingReplacements(unsigned offset,
                                            unsigned size,
                                            Replacement** firstReplacement,
                                            Replacement** endReplacement)
{
    size_t firstIndex = Promotion::BinarySearch<Replacement, &Replacement::Offset>(Replacements, offset);
    if ((ssize_t)firstIndex < 0)
    {
        firstIndex = ~firstIndex;
        if (firstIndex > 0)
        {
            Replacement& lastRepBefore = Replacements[firstIndex - 1];
            if ((lastRepBefore.Offset + genTypeSize(lastRepBefore.AccessType)) > offset)
            {
                // Overlap with last entry starting before offs.
                firstIndex--;
            }
            else if (firstIndex >= Replacements.size())
            {
                // Starts after last replacement ends.
                return false;
            }
        }

        const Replacement& first = Replacements[firstIndex];
        if (first.Offset >= (offset + size))
        {
            // First candidate starts after this ends.
            return false;
        }
    }

    assert((firstIndex < Replacements.size()) && Replacements[firstIndex].Overlaps(offset, size));
    *firstReplacement = &Replacements[firstIndex];

    if (endReplacement != nullptr)
    {
        size_t lastIndex = Promotion::BinarySearch<Replacement, &Replacement::Offset>(Replacements, offset + size);
        if ((ssize_t)lastIndex < 0)
        {
            lastIndex = ~lastIndex;
        }

        // Since we verified above that there is an overlapping replacement
        // we know that lastIndex exists and is the next one that does not
        // overlap.
        assert(lastIndex > 0);
        *endReplacement = Replacements.data() + lastIndex;
    }

    return true;
}
I introduced a new AggregateInfo and moved this method into it.
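For context on the lookup above, the sketch below illustrates the binary-search convention it relies on (a standalone stand-in, not the actual Promotion::BinarySearch): a hit returns the index, a miss returns the bitwise complement of the insertion point, which is negative when viewed as a signed value.

#include <cstddef>
#include <vector>

// Standalone illustration of the search convention: on a hit return the index;
// on a miss return ~insertionPoint, so the caller can recover the insertion
// point with another bitwise complement.
static std::ptrdiff_t BinarySearchByOffset(const std::vector<unsigned>& offsets, unsigned key)
{
    std::size_t lo = 0;
    std::size_t hi = offsets.size();
    while (lo < hi)
    {
        std::size_t mid = lo + (hi - lo) / 2;
        if (offsets[mid] == key)
        {
            return (std::ptrdiff_t)mid;
        }
        if (offsets[mid] < key)
        {
            lo = mid + 1;
        }
        else
        {
            hi = mid;
        }
    }

    // Not found: ~lo encodes "index of the first entry with a larger offset".
    return ~(std::ptrdiff_t)lo;
}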
bool StructSegments::Segment::IntersectsOrAdjacent(const Segment& other) const
{
    if (End < other.Start)
    {
        return false;
    }

    if (other.End < Start)
    {
        return false;
    }

    return true;
}
These StructSegments methods are now being used as part of replacement picking, so I've moved them into the main promotion.cpp.
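A small self-contained example of why the two early-outs suffice when segments are half-open [Start, End) (the Segment below is a stand-in mirroring the shape of the real one, not the JIT's definition): touching segments such as [0, 4) and [4, 8) are reported as adjacent and can therefore be merged.

#include <cassert>

// Half-open segments: End == other.Start means "touching", which the method
// deliberately treats the same as overlapping so the caller can merge them.
struct Segment
{
    unsigned Start;
    unsigned End;

    bool IntersectsOrAdjacent(const Segment& other) const
    {
        if (End < other.Start)
            return false; // entirely before 'other', with a gap in between
        if (other.End < Start)
            return false; // entirely after 'other', with a gap in between
        return true;
    }
};

int main()
{
    assert(Segment{0, 4}.IntersectsOrAdjacent(Segment{4, 8}));  // adjacent: can be merged
    assert(!Segment{0, 4}.IntersectsOrAdjacent(Segment{5, 8})); // one byte of padding between them
    assert(Segment{0, 8}.IntersectsOrAdjacent(Segment{2, 6}));  // proper overlap
    return 0;
}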
Ping @AndyAyersMS, but no rush if you're busy with PGO things.
Looks good. Left some comments but those can be handled in follow-ups.
// For LCL_ADDR this is a retbuf and we expect it to be a def. We
// don't know the exact size here so we cannot mark anything as
// being fully defined, thus we can just return.
assert(isDef);
I wonder about this part -- for managed callees it seems like we could access their return type and figure out the extent of the writes they do.
In principle the callee must either write that entire extent or write nothing (nothing only if it throws rather than returns; then if we're in a try the previous definition can reach handlers/filters). But maybe we skip writing to padding in some cases.
Yes, the problem is finding the call node that contains the retbuf, since the linked locals list does not contain it. In principle it shouldn't be hard to quickly find it utilizing GTF_CALL, so we could be a bit more precise about this.
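A purely illustrative sketch of that search (all node and flag names below are invented stand-ins, not the JIT's GenTree API): descend only into subtrees whose flags say they contain a call, and stop at the call whose retbuf operand matches.

#include <vector>

// Mock tree node; FLAG_CONTAINS_CALL plays the role of a "this subtree
// contains a call" bit so unrelated subtrees can be skipped cheaply.
enum NodeFlags : unsigned
{
    FLAG_CONTAINS_CALL = 1 << 0,
};

struct Node
{
    bool               isCall = false;
    Node*              retBuf = nullptr; // for calls: the retbuf address operand
    unsigned           flags  = 0;
    std::vector<Node*> operands;
};

// Returns the call node whose retbuf operand is 'retBufAddr', or nullptr.
static Node* FindCallForRetBuf(Node* tree, Node* retBufAddr)
{
    if (tree->isCall && (tree->retBuf == retBufAddr))
    {
        return tree;
    }

    for (Node* op : tree->operands)
    {
        // Skip subtrees that cannot contain a call at all.
        if ((op->flags & FLAG_CONTAINS_CALL) == 0)
        {
            continue;
        }

        if (Node* found = FindCallForRetBuf(op, retBufAddr))
        {
            return found;
        }
    }

    return nullptr;
}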
for (BasicBlock* block = m_compiler->fgLastBB; block != nullptr; block = block->bbPrev)
{
    m_hasPossibleBackEdge |= block->bbNext && (block->bbNext->bbNum <= block->bbNum);
We know for sure we have proper bbNum ordering?
In cases with loops this will converge faster if you walk the blocks in post-order here, to maximize the propagation per iteration.
Looks like the same goes for regular liveness so perhaps we can fix that too -- maybe do both as a follow-up?
We know for sure we have proper bbNum ordering?
Even if we don't, it will just result in m_hasPossibleBackEdge being set to true, which might result in one more iteration of the fixpoint computation than necessary.
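A compressed standalone sketch (simplified mock types, not the JIT's LiveVarAnalysis) of the backward fixpoint being discussed: live-out is the union of the successors' live-in, live-in = use | (live-out & ~def), and the backward walk repeats only while something changed and a back edge is possible, so a spuriously set back-edge flag costs at most extra iterations, never correctness.

#include <cstdint>
#include <vector>

// Simplified mock of a backward liveness fixpoint over <= 64 tracked locals.
struct Block
{
    uint64_t use     = 0; // locals read before any write in this block
    uint64_t def     = 0; // locals fully written in this block
    uint64_t liveIn  = 0;
    uint64_t liveOut = 0;
    std::vector<int> succs; // successor block indices
};

static void ComputeLiveness(std::vector<Block>& blocks, bool hasPossibleBackEdge)
{
    bool changed;
    do
    {
        changed = false;
        // Walk blocks from last to first so that, in the common forward-flowing
        // case, each block sees up-to-date successor info within one pass.
        for (size_t i = blocks.size(); i-- > 0;)
        {
            Block&   block   = blocks[i];
            uint64_t liveOut = 0;
            for (int succ : block.succs)
            {
                liveOut |= blocks[succ].liveIn;
            }

            uint64_t liveIn = block.use | (liveOut & ~block.def);
            if ((liveIn != block.liveIn) || (liveOut != block.liveOut))
            {
                block.liveIn  = liveIn;
                block.liveOut = liveOut;
                changed       = true;
            }
        }
        // Without a possible back edge, one backward pass is already a fixpoint.
    } while (changed && hasPossibleBackEdge);
}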
When I did early liveness I considered computing strongly connected components and doing the localized fixpoint computation within each SCC, which should be the optimal thing. However, when I measured the actual throughput cost of the fixpoint computation, I felt it wasn't worth the extra complexity; from #79346 (comment):
Base: 232865278205, Diff: 236405285056, +1.5202%
?PerBlockAnalysis@LiveVarAnalysis@@AEAA_NPEAUBasicBlock@@_N1@Z : 616293246 : +30.93% : 11.44% : +0.2647%
?fgMarkUseDef@Compiler@@AEAAXPEAUGenTreeLclVarCommon@@@Z : 426414825 : +39.60% : 7.91% : +0.1831%
?fgInterBlockLocalVarLiveness@Compiler@@QEAAXXZ : 353468679 : +43.94% : 6.56% : +0.1518%
?fgPerBlockLocalVarLiveness@Compiler@@QEAAXXZ : 258048984 : +26.39% : 4.79% : +0.1108%
?lvaSortByRefCount@Compiler@@QEAAXXZ : 222829814 : +39.99% : 4.14% : +0.0957%
?GetSucc@BasicBlock@@QEAAPEAU1@IPEAVCompiler@@@Z : 142747231 : +11.06% : 2.65% : +0.0613%
?PostOrderVisit@ForwardSubVisitor@@QEAA?AW4fgWalkResult@Compiler@@PEAPEAUGenTree@@PEAU4@@Z : 136636902 : NA : 2.54% : +0.0587%
?WalkTree@?$GenTreeVisitor@VLocalSequencer@@@@QEAA?AW4fgWalkResult@Compiler@@PEAPEAUGenTree@@PEAU4@@Z : 134894223 : NA : 2.50% : +0.0579%
??0AllSuccessorIterPosition@@QEAA@PEAVCompiler@@PEAUBasicBlock@@@Z : 129181805 : +13.59% : 2.40% : +0.0555%
?fgComputeLifeTrackedLocalUse@Compiler@@QEAAXAEAPEA_KAEAVLclVarDsc@@PEAUGenTreeLclVarCommon@@@Z : 127997140 : +37.14% : 2.38% : +0.0550%
?fgForwardSubStatement@Compiler@@AEAA_NPEAUStatement@@@Z : 119968922 : +70.17% : 2.23% : +0.0515%
?fgMarkAddressExposedLocals@Compiler@@AEAA?AW4PhaseStatus@@XZ : 119662088 : +72.72% : 2.22% : +0.0514%
?FindNextRegSuccTry@EHSuccessorIterPosition@@AEAAXPEAVCompiler@@PEAUBasicBlock@@@Z : 116051428 : +13.69% : 2.15% : +0.0498%
?WalkTree@?$GenTreeVisitor@V?$GenericTreeWalker@$00$0A@$0A@$00@@@@QEAA?AW4fgWalkResult@Compiler@@PEAPEAUGenTree@@PEAU4@@Z : 111845378 : +56.69% : 2.08% : +0.0480%
?fgComputeLifeTrackedLocalDef@Compiler@@QEAA_NAEAPEA_KAEBQEA_KAEAVLclVarDsc@@PEAUGenTreeLclVarCommon@@@Z : 108064056 : +34.46% : 2.01% : +0.0464%
memset : 104530112 : +5.76% : 1.94% : +0.0449%
PerBlockLiveness is the method called for each basic block in each iteration of the fixpoint computation. So around 0.26% in TP cost for the fixpoint computation from adding early liveness.
It would still be an interesting experiment, and RPO traversal is also simpler than computing full SCCs.
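A minimal hedged sketch of that alternative (standalone types, not the JIT's flowgraph): compute a post-order over the successor graph once with a DFS, then run the backward pass in that order so most blocks are visited after their successors.

#include <utility>
#include <vector>

// Standalone sketch: iterative DFS post-order over a successor graph. Visiting
// blocks in this order during the backward liveness pass means most successors
// are processed before their predecessors, which speeds up convergence.
static std::vector<int> ComputePostOrder(const std::vector<std::vector<int>>& succs, int entry)
{
    std::vector<int>  postOrder;
    std::vector<bool> visited(succs.size(), false);
    // Each frame is (block index, next successor index to visit).
    std::vector<std::pair<int, size_t>> stack;

    visited[entry] = true;
    stack.push_back({entry, 0});

    while (!stack.empty())
    {
        auto& [block, succIndex] = stack.back();
        if (succIndex < succs[block].size())
        {
            int succ = succs[block][succIndex++];
            if (!visited[succ])
            {
                visited[succ] = true;
                stack.push_back({succ, 0});
            }
        }
        else
        {
            postOrder.push_back(block);
            stack.pop_back();
        }
    }

    return postOrder; // iterate this order directly for a backward dataflow pass
}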
Used when decomposing stores to skip the fields that are dying, and the remainder if it is dying. Also allows forward sub to kick in for the inserted replacement locals.
Not sure yet whether this is how I want to go about the liveness here... we might instead combine this with the existing early liveness pass, or use the early liveness information to figure out where to look when calculating more precise liveness. But this at least gives us an implementation to experiment with. The throughput impact seems very small, so doing something smarter seems unnecessary.
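To make the first sentence concrete, here is an illustrative-only sketch of how such death information could be consumed while decomposing a struct store; the types and helper below are invented and do not match the JIT's actual Replacement or decomposition code.

#include <vector>

// Per-field replacement created by physical promotion, plus the liveness
// verdict for it at the point of this particular store.
struct FieldReplacement
{
    unsigned offset;
    unsigned size;
    unsigned lclNum;
    bool     dies; // liveness says this replacement is dead after the store
};

struct FieldStore
{
    unsigned dstLclNum;
    unsigned srcOffset;
    unsigned size;
};

// Emit one store per replacement field, skipping fields (and the remainder)
// that liveness proved dead at this point.
static std::vector<FieldStore> DecomposeStore(const std::vector<FieldReplacement>& fields,
                                              bool remainderDies,
                                              unsigned structSize,
                                              unsigned remainderLclNum)
{
    std::vector<FieldStore> stores;
    for (const FieldReplacement& field : fields)
    {
        if (field.dies)
        {
            continue; // dead field: no need to materialize the store
        }
        stores.push_back({field.lclNum, field.offset, field.size});
    }

    if (!remainderDies)
    {
        // Very coarse stand-in for the real remainder handling: copy the whole
        // struct into the remainder location when it is still live.
        stores.push_back({remainderLclNum, 0, structSize});
    }

    return stores;
}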