Wrong code at -O3 on x86_64 [14 regression since 0d95b20] #66986

shao-hua-li · 2023-09-21T07:25:42Z

clang at -O3 produced the wrong code.

Bisected to 0d95b20, which was committed by @xortator

Compiler explorer: https://godbolt.org/z/o9Mdccsfe

% cat a.c
int printf(const char *, ...);
int a, b, c, d, e, g, h;
unsigned int f, i;
long j;
int k() {
  if (f < 2)
    return d;
  g = e / f;
  return g;
}
int main() {
  i = -1;
  for (; i>0; ++i) {
    h = 0;
    for (; h < 5; h++)
      j = 1;
    for (; k() + i + j <= 4; j++) {
      while (c)
        ;
      for (; b - 5 + h < 6;)
        ;
    }
  }
  printf("%d\n", a);
}
%
% clang -O2 a.c && ./a.out
0
% clang -O3 a.c && ./a.out
floating point exception
% clang -fsanitize=address,undefined a.c && ./a.out
0
%

The text was updated successfully, but these errors were encountered:

nikic · 2023-09-21T08:57:54Z

Opt bisect points to an IndVars run with the following diff: https://gist.github.com/nikic/b241c38dcab5f27c652ebed27f371ebe

This is an LFTR transform:

INDVARS: Rewriting loop exit condition to:
      LHS:  %inc16.us.us55 = add nuw nsw i64 %inc1624.us36.us, 1
       op:	!=
      RHS:	  %smax = call i64 @llvm.smax.i64(i64 %indvars.iv, i64 2)
ExitCount:	(-2 + (2 smax {(5 + (-1 * (zext i32 (-1 + (%1 /u %0)) to i64))<nsw>)<nsw>,+,-1}<nuw><nsw><%for.cond1.preheader.us>))<nsw>
  was:   %cmp6.us.us56 = icmp slt i64 %inc16.us.us55, %invariant.op.us

I haven't tried reasoning about the correctness of this, but what strikes me is that this is a very non-profitable transform. Not only do we insert an expensive exit count calculation, we also add a whole extra IV, plus an extra smax in the loop.

nikic · 2023-09-21T09:07:36Z

Ah, the correctness problem is more obvious than I thought: We end up hoisting a division by zero here. We're missing an isSafeToExpand check in LFTR.

nikic · 2023-09-21T10:31:39Z

Reduced:

target datalayout = "n8:16:32:64"

define i32 @test(i1 %c, i32 %arg1, i32 %arg2) {
entry:
  br label %loop

loop:
  %phi = phi i32 [ %add9, %loop.latch ], [ 0, %entry ]
  br i1 %c, label %if, label %loop.latch

if:
  %udiv = udiv i32 %arg1, %arg2
  %add = add i32 %udiv, %phi
  %zext = zext i32 %add to i64
  br label %loop2

loop2:
  %phi6 = phi i64 [ %add7, %loop2 ], [ 0, %if ]
  %add7 = add i64 %phi6, 1
  %icmp = icmp slt i64 %phi6, %zext
  br i1 %icmp, label %loop2, label %loop.latch

loop.latch:
  %add9 = add i32 %phi, 1
  br label %loop
}

github-actions bot added the new issue label Sep 21, 2023

nikic added miscompilation and removed new issue labels Sep 21, 2023

nikic self-assigned this Sep 21, 2023

nikic added a commit that referenced this issue Sep 21, 2023

[IndVars] Add test for #66986 (NFC)

0fe905d

nikic closed this as completed in 7078993 Sep 21, 2023

kstoimenov mentioned this issue Sep 22, 2023

Add memcpm test kstoimenov/llvm-project#13

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Wrong code at -O3 on x86_64 [14 regression since 0d95b20] #66986

Wrong code at -O3 on x86_64 [14 regression since 0d95b20] #66986

shao-hua-li commented Sep 21, 2023

nikic commented Sep 21, 2023

nikic commented Sep 21, 2023

nikic commented Sep 21, 2023

Wrong code at -O3 on x86_64 [14 regression since 0d95b20] #66986

Wrong code at -O3 on x86_64 [14 regression since 0d95b20] #66986

Comments

shao-hua-li commented Sep 21, 2023

nikic commented Sep 21, 2023

nikic commented Sep 21, 2023

nikic commented Sep 21, 2023