[LoopUnroll] Ignore inlinable calls if unrolling is forced #88149

gburgessiv · 2024-04-09T16:26:37Z

NumInlineCandidates counts candidates that are very likely to be inlined. This is a useful metric, but causes linker warnings if:

the loop to be unrolled has had unrolling forced by the user, and
the inliner fails to inline the call (e.g., because it's considered a very cold callsite)

Fixes #88141

llvmbot · 2024-04-09T16:27:08Z

@llvm/pr-subscribers-llvm-transforms

Author: George Burgess IV (gburgessiv)

Changes

NumInlineCandidates counts candidates that are very likely to be inlined. This is a useful metric, but causes linker warnings if:

the loop to be unrolled has had unrolling forced by the user, and
the inliner fails to inline the call (e.g., because it's considered a very cold callsite)

Full diff: https://github.com/llvm/llvm-project/pull/88149.diff

2 Files Affected:

(modified) llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp (+8-3)
(added) llvm/test/Transforms/LoopUnroll/unroll-calls-if-forced.ll (+35)

diff --git a/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp b/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
index 75fb8765061edf..a026abc3267ca1 100644
--- a/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
@@ -1154,9 +1154,11 @@ tryToUnrollLoop(Loop *L, DominatorTree &DT, LoopInfo *LI, ScalarEvolution &SE,
   // automatic unrolling from interfering with the user requested
   // transformation.
   Loop *ParentL = L->getParentLoop();
+  const bool UnrollIsForcedByUser =
+      hasUnrollTransformation(L) == TM_ForcedByUser;
   if (ParentL != nullptr &&
       hasUnrollAndJamTransformation(ParentL) == TM_ForcedByUser &&
-      hasUnrollTransformation(L) != TM_ForcedByUser) {
+      !UnrollIsForcedByUser) {
     LLVM_DEBUG(dbgs() << "Not unrolling loop since parent loop has"
                       << " llvm.loop.unroll_and_jam.\n");
     return LoopUnrollResult::Unmodified;
@@ -1166,7 +1168,7 @@ tryToUnrollLoop(Loop *L, DominatorTree &DT, LoopInfo *LI, ScalarEvolution &SE,
   // loop has an explicit unroll-and-jam pragma. This is to prevent automatic
   // unrolling from interfering with the user requested transformation.
   if (hasUnrollAndJamTransformation(L) == TM_ForcedByUser &&
-      hasUnrollTransformation(L) != TM_ForcedByUser) {
+      !UnrollIsForcedByUser) {
     LLVM_DEBUG(
         dbgs()
         << "  Not unrolling loop since it has llvm.loop.unroll_and_jam.\n");
@@ -1216,7 +1218,10 @@ tryToUnrollLoop(Loop *L, DominatorTree &DT, LoopInfo *LI, ScalarEvolution &SE,
   if (OptForSize)
     UP.Threshold = std::max(UP.Threshold, LoopSize + 1);
 
-  if (UCE.NumInlineCandidates != 0) {
+  // Ignore the potential for inlining if unrolling is forced by the user. It's
+  // only _very likely_ that `NumInlineCandidates` functions will be inlined;
+  // things like profile data can make this significantly less likely.
+  if (!UnrollIsForcedByUser && UCE.NumInlineCandidates != 0) {
     LLVM_DEBUG(dbgs() << "  Not unrolling loop with inlinable calls.\n");
     return LoopUnrollResult::Unmodified;
   }
diff --git a/llvm/test/Transforms/LoopUnroll/unroll-calls-if-forced.ll b/llvm/test/Transforms/LoopUnroll/unroll-calls-if-forced.ll
new file mode 100644
index 00000000000000..37548920358147
--- /dev/null
+++ b/llvm/test/Transforms/LoopUnroll/unroll-calls-if-forced.ll
@@ -0,0 +1,35 @@
+; RUN: opt -passes=loop-unroll -disable-verify -S < %s | FileCheck %s
+;
+; Ensure that loop unrolling occurs if the user forces it, even if there are
+; calls that may be inlined.
+
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+
+define internal void @foo() {
+  ret void
+}
+
+; CHECK-LABEL: @forced(
+; CHECK: call {{.*}}void @foo
+; CHECK: call {{.*}}void @foo
+define void @forced(ptr nocapture %a) {
+entry:
+  br label %for.body
+
+for.body:
+  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
+  %arrayidx = getelementptr inbounds i32, ptr %a, i64 %indvars.iv
+  call void @foo()
+  %0 = load i32, ptr %arrayidx, align 4
+  %inc = add nsw i32 %0, 1
+  store i32 %inc, ptr %arrayidx, align 4
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  %exitcond = icmp eq i64 %indvars.iv.next, 64
+  br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
+
+for.end:
+  ret void
+}
+
+!0 = distinct !{!0, !{!"llvm.loop.unroll.enable"},
+                    !{!"llvm.loop.unroll.count", i32 2}}

gburgessiv · 2024-04-10T14:48:03Z

Hey Nikita, I see you've done a few reviews in this area. Would you kindly be able to take a look at this CL?

fhahn · 2024-04-10T14:59:01Z

This is a useful metric, but causes linker warnings if:

From the description, it's not clear what kind of warnings this would yield. Looking at the linked issue, it's remarks that get printed by default (loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering).

Would be good to clarify in the description.

Regardless of the patch at hand, I don't think we really should print those messages by default; there are other reasons why the compiler won't unroll a loop even if forced by the user.

Could you also add a link to #88141 in the description, not in the title as (#88141)? This could be confusing as Github will add the PR ID in a similar fashion when landing.

nikic · 2024-04-15T04:43:36Z

llvm/test/Transforms/LoopUnroll/unroll-calls-if-forced.ll

@@ -0,0 +1,35 @@
+; RUN: opt -passes=loop-unroll -disable-verify -S < %s | FileCheck %s


Why does this -disable-verify?

Please also use UTC.

Why does this -disable-verify?

Copy-paste from another test; I removed this.

Please also use UTC.

I assume for commit timestamps? Done.

UTC as in llvm/utils/update_test_checks.py to automatically generate CHECK lines

Ah, that makes more sense. Thanks

nikic · 2024-04-15T04:52:46Z

I think the change makes sense conceptually -- pragma unroll should override pretty much all inlining heuristics, and I don't see a strong reason why it wouldn't override this one as well.

A possible caveat is the interaction with the pre-link pipeline: Are we okay with unrolling the calls there, or would it be better to wait until post-link (in which case the call has likely been inlined)?

aeubanks · 2024-05-09T23:23:33Z

reverse ping, Chromium is seeing something similar that may be fixed by this

gburgessiv · 2024-05-10T17:43:21Z

Ah, sorry for the delay here.

A possible caveat is the interaction with the pre-link pipeline: Are we okay with unrolling the calls there, or would it be better to wait until post-link (in which case the call has likely been inlined)?

I added a commit that tries to funnel PrepareForLTO through properly. I'm not confident it caught everything, but it should be a step toward addressing this, at least. I agree pushing this out of prelink is good, since the prelink pass manager seems to favor transformations that reduce IR size.

aeubanks

the PrepareForLTO portion needs testing. you'll need to add parsing code for something like -passes='loop-unroll-full<prepare-for-lto>', grep for parseLoopRotateOptions for an example

also, isn't this still going to result in the remark getting printed in the pre-link optimization pipeline? I think we should treat the PrepareForLTO and the call heuristic as two separate issues and go back to the original PR, splitting out the PrepareForLTO change

please move the issue being fixed from the commit title to the commit message like Fixes #88141, otherwise it's confusing with the PR number

aeubanks · 2024-05-15T15:26:26Z

llvm/lib/Passes/PassBuilderPipelines.cpp

+    LPM2.addPass(LoopFullUnrollPass(
+        Level.getSpeedupLevel(),
+        /* OnlyWhenForced= */ !PTO.LoopUnrolling, PTO.ForgetAllSCEVInLoopUnroll,
+        /* PrepareForLTO= */ isLTOPreLink(ThinOrFullLTOPhase::ThinLTOPreLink)));


isLTOPreLink(Phase), ditto below

gburgessiv · 2024-05-20T17:32:38Z

also, isn't this still going to result in the remark getting printed in the pre-link optimization pipeline?

I only observed the remarks during LTO, so I'd assumed that the prelink pipeline has a way of bypassing these checks (which passes my smell test, at least, since it's reasonable to expect opts to be in an unfinished state after pre-link).

I think we should treat the PrepareForLTO and the call heuristic as two separate issues and go back to the original PR, splitting out the PrepareForLTO change

Yeah, fixing the bug first SGTM.

please move the issue being fixed from the commit title to the commit message like Fixes #88141, otherwise it's confusing with the PR number

Done

LeszekSwirski · 2024-11-04T12:22:09Z

Is there any progress here? We've had to disable this annotation in V8 for now because of this issue (crbug.com/377144577)

gburgessiv · 2024-11-18T20:16:28Z

I agree we should ideally push this through; let me rebase and check tests, and I'll go through this thread to see if AIs are blocked on me.

`NumInlineCandidates` counts candidates that are _very likely_ to be inlined. This is a useful metric, but causes linker warnings if: - the loop to be unrolled has had unrolling forced by the user, and - the inliner fails to inline the call (e.g., because it's considered a very cold callsite) Fixes llvm#88141

llvmbot added the llvm:transforms label Apr 9, 2024

gburgessiv requested a review from nikic April 10, 2024 14:47

nikic reviewed Apr 15, 2024

View reviewed changes

gburgessiv force-pushed the bug-88141 branch from dd0fea3 to 789592b Compare May 10, 2024 17:42

aeubanks reviewed May 15, 2024

View reviewed changes

gburgessiv changed the title ~~[LoopUnroll] Ignore inlinable calls if unrolling is forced (#88141)~~ [LoopUnroll] Ignore inlinable calls if unrolling is forced May 20, 2024

gburgessiv force-pushed the bug-88141 branch 2 times, most recently from 18de714 to e7eb490 Compare May 20, 2024 17:32

gburgessiv force-pushed the bug-88141 branch from e7eb490 to fdc96e4 Compare November 18, 2024 23:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[LoopUnroll] Ignore inlinable calls if unrolling is forced #88149

[LoopUnroll] Ignore inlinable calls if unrolling is forced #88149

gburgessiv commented Apr 9, 2024 •

edited

Loading

llvmbot commented Apr 9, 2024

gburgessiv commented Apr 10, 2024

fhahn commented Apr 10, 2024

nikic Apr 15, 2024

gburgessiv May 10, 2024

aeubanks May 15, 2024

gburgessiv May 20, 2024

nikic commented Apr 15, 2024

aeubanks commented May 9, 2024

gburgessiv commented May 10, 2024

aeubanks left a comment

aeubanks May 15, 2024

gburgessiv commented May 20, 2024

LeszekSwirski commented Nov 4, 2024

gburgessiv commented Nov 18, 2024

		@@ -0,0 +1,35 @@
		; RUN: opt -passes=loop-unroll -disable-verify -S < %s \| FileCheck %s

[LoopUnroll] Ignore inlinable calls if unrolling is forced #88149

Are you sure you want to change the base?

[LoopUnroll] Ignore inlinable calls if unrolling is forced #88149

Conversation

gburgessiv commented Apr 9, 2024 • edited Loading

llvmbot commented Apr 9, 2024

gburgessiv commented Apr 10, 2024

fhahn commented Apr 10, 2024

nikic Apr 15, 2024

Choose a reason for hiding this comment

gburgessiv May 10, 2024

Choose a reason for hiding this comment

aeubanks May 15, 2024

Choose a reason for hiding this comment

gburgessiv May 20, 2024

Choose a reason for hiding this comment

nikic commented Apr 15, 2024

aeubanks commented May 9, 2024

gburgessiv commented May 10, 2024

aeubanks left a comment

Choose a reason for hiding this comment

aeubanks May 15, 2024

Choose a reason for hiding this comment

gburgessiv commented May 20, 2024

LeszekSwirski commented Nov 4, 2024

gburgessiv commented Nov 18, 2024

gburgessiv commented Apr 9, 2024 •

edited

Loading