Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong code at -O3 on x86_64 [14 regression since 0d95b20] #66986

Closed
shao-hua-li opened this issue Sep 21, 2023 · 3 comments
Closed

Wrong code at -O3 on x86_64 [14 regression since 0d95b20] #66986

shao-hua-li opened this issue Sep 21, 2023 · 3 comments
Assignees

Comments

@shao-hua-li
Copy link

clang at -O3 produced the wrong code.

Bisected to 0d95b20, which was committed by @xortator

Compiler explorer: https://godbolt.org/z/o9Mdccsfe

% cat a.c
int printf(const char *, ...);
int a, b, c, d, e, g, h;
unsigned int f, i;
long j;
int k() {
  if (f < 2)
    return d;
  g = e / f;
  return g;
}
int main() {
  i = -1;
  for (; i>0; ++i) {
    h = 0;
    for (; h < 5; h++)
      j = 1;
    for (; k() + i + j <= 4; j++) {
      while (c)
        ;
      for (; b - 5 + h < 6;)
        ;
    }
  }
  printf("%d\n", a);
}
%
% clang -O2 a.c && ./a.out
0
% clang -O3 a.c && ./a.out
floating point exception
% clang -fsanitize=address,undefined a.c && ./a.out
0
%
@nikic
Copy link
Contributor

nikic commented Sep 21, 2023

Opt bisect points to an IndVars run with the following diff: https://gist.github.com/nikic/b241c38dcab5f27c652ebed27f371ebe

This is an LFTR transform:

INDVARS: Rewriting loop exit condition to:
      LHS:  %inc16.us.us55 = add nuw nsw i64 %inc1624.us36.us, 1
       op:	!=
      RHS:	  %smax = call i64 @llvm.smax.i64(i64 %indvars.iv, i64 2)
ExitCount:	(-2 + (2 smax {(5 + (-1 * (zext i32 (-1 + (%1 /u %0)) to i64))<nsw>)<nsw>,+,-1}<nuw><nsw><%for.cond1.preheader.us>))<nsw>
  was:   %cmp6.us.us56 = icmp slt i64 %inc16.us.us55, %invariant.op.us

I haven't tried reasoning about the correctness of this, but what strikes me is that this is a very non-profitable transform. Not only do we insert an expensive exit count calculation, we also add a whole extra IV, plus an extra smax in the loop.

@nikic
Copy link
Contributor

nikic commented Sep 21, 2023

Ah, the correctness problem is more obvious than I thought: We end up hoisting a division by zero here. We're missing an isSafeToExpand check in LFTR.

@nikic nikic self-assigned this Sep 21, 2023
@nikic
Copy link
Contributor

nikic commented Sep 21, 2023

Reduced:

target datalayout = "n8:16:32:64"

define i32 @test(i1 %c, i32 %arg1, i32 %arg2) {
entry:
  br label %loop

loop:
  %phi = phi i32 [ %add9, %loop.latch ], [ 0, %entry ]
  br i1 %c, label %if, label %loop.latch

if:
  %udiv = udiv i32 %arg1, %arg2
  %add = add i32 %udiv, %phi
  %zext = zext i32 %add to i64
  br label %loop2

loop2:
  %phi6 = phi i64 [ %add7, %loop2 ], [ 0, %if ]
  %add7 = add i64 %phi6, 1
  %icmp = icmp slt i64 %phi6, %zext
  br i1 %icmp, label %loop2, label %loop.latch

loop.latch:
  %add9 = add i32 %phi, 1
  br label %loop
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants