[flang] Generate valid IR on GOTO DO body #66084

luporl · 2023-09-12T13:14:57Z

Flang was generating invalid IR when there was a GOTO to the body
of a DO loop. This happened because the value of step, computed at
the beginning of the loop, was being reused at the end of the loop,
that, for unstructured loops, is in another basic block. Because of
this, a GOTO could skip the beginning of the loop, that defined
step, and yet try to use it at the end of the loop, which is
invalid.

Instead of reusing the step value, it can be recomputed if it is a
constant, or stored and loaded to/from a temporary variable, for
non-constant step expressions.

Note that, while this change prevents the generation of invalid IR
on the presence of jumps to DO loop bodies, what happens if the
program reaches the end of a DO loop without ever passing through
its beginning is undefined behavior, as some control variables,
such as trip, will be uninitialized. It doesn't seem worth the
effort and overhead to ensure this legacy extension will behave
correctly in this case. This is consistent with at least gfortran,
that doesn't behave correctly if step is not equal to one.

Fixes: #65036

Flang was generating invalid IR when there was a GOTO to the body of a DO loop. This happened because the value of step, computed at the beginning of the loop, was being reused at the end of the loop, that, for unstructured loops, is in another basic block. Because of this, a GOTO could skip the beginning of the loop, that defined step, and yet try to use it at the end of the loop, which is invalid. Instead of reusing the step value, it can be recomputed if it is a constant, or stored and loaded to/from a temporary variable, for non-constant step expressions. Note that, while this change prevents the generation of invalid IR on the presence of jumps to DO loop bodies, what happens if the program reaches the end of a DO loop without ever passing through its beginning is undefined behavior, as some control variables, such as trip, will be uninitialized. It doesn't seem worth the effort and overhead to ensure this legacy extension will behave correctly in this case. This is consistent with at least gfortran, that doesn't behave correctly if step is not equal to one. Fixes: llvm#65036

llvmbot · 2023-09-12T13:15:55Z

@llvm/pr-subscribers-flang-fir-hlfir

Changes

Flang was generating invalid IR when there was a GOTO to the body
of a DO loop. This happened because the value of step, computed at
the beginning of the loop, was being reused at the end of the loop,
that, for unstructured loops, is in another basic block. Because of
this, a GOTO could skip the beginning of the loop, that defined
step, and yet try to use it at the end of the loop, which is
invalid.

Instead of reusing the step value, it can be recomputed if it is a
constant, or stored and loaded to/from a temporary variable, for
non-constant step expressions.

Note that, while this change prevents the generation of invalid IR
on the presence of jumps to DO loop bodies, what happens if the
program reaches the end of a DO loop without ever passing through
its beginning is undefined behavior, as some control variables,
such as trip, will be uninitialized. It doesn't seem worth the
effort and overhead to ensure this legacy extension will behave
correctly in this case. This is consistent with at least gfortran,
that doesn't behave correctly if step is not equal to one.

Fixes: #65036

--

Patch is 24.48 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/66084.diff

7 Files Affected:

(modified) flang/lib/Lower/Bridge.cpp (+44-26)
(added) flang/test/Lower/HLFIR/goto-do-body.f90 (+114)
(modified) flang/test/Lower/do_loop.f90 (+3-1)
(modified) flang/test/Lower/do_loop_unstructured.f90 (+12-6)
(added) flang/test/Lower/goto-do-body.f90 (+125)
(modified) flang/test/Lower/loops2.f90 (+9-6)
(modified) flang/test/Lower/mixed_loops.f90 (+2-1)

diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 5d5ac0f274773ed..66ee2f6ecaea4a0 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -111,7 +111,6 @@ struct IncrementLoopInfo {
   llvm::SmallVector localInitSymList;
   llvm::SmallVector sharedSymList;
   mlir::Value loopVariable = nullptr;
-  mlir::Value stepValue = nullptr; // possible uses in multiple blocks
 
   // Data members for structured loops.
   fir::DoLoopOp doLoop = nullptr;
@@ -119,6 +118,7 @@ struct IncrementLoopInfo {
   // Data members for unstructured loops.
   bool hasRealControl = false;
   mlir::Value tripVariable = nullptr;
+  mlir::Value stepVariable = nullptr;
   mlir::Block *headerBlock = nullptr; // loop entry and test block
   mlir::Block *maskBlock = nullptr;   // concurrent loop mask block
   mlir::Block *bodyBlock = nullptr;   // first loop body block
@@ -1708,29 +1708,45 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     genFIR(endDoEval, unstructuredContext);
   }
 
+  /// Generate FIR to evaluate loop control values (lower, upper and step).
+  mlir::Value genControlValue(const Fortran::lower::SomeExpr *expr,
+                              const IncrementLoopInfo &info,
+                              bool *isConst = nullptr) {
+    mlir::Location loc = toLocation();
+    mlir::Type controlType = info.isStructured() ? builder->getIndexType()
+                                                 : info.getLoopVariableType();
+    Fortran::lower::StatementContext stmtCtx;
+    if (expr) {
+      if (isConst)
+        *isConst = Fortran::evaluate::IsConstantExpr(*expr);
+      return builder->createConvert(loc, controlType,
+                                    createFIRExpr(loc, expr, stmtCtx));
+    }
+
+    if (isConst)
+      *isConst = true;
+    if (info.hasRealControl)
+      return builder->createRealConstant(loc, controlType, 1u);
+    return builder->createIntegerConstant(loc, controlType, 1); // step
+  }
+
   /// Generate FIR to begin a structured or unstructured increment loop nest.
   void genFIRIncrementLoopBegin(IncrementLoopNestInfo &incrementLoopNestInfo) {
     assert(!incrementLoopNestInfo.empty() && "empty loop nest");
     mlir::Location loc = toLocation();
-    auto genControlValue = [&](const Fortran::lower::SomeExpr *expr,
-                               const IncrementLoopInfo &info) {
-      mlir::Type controlType = info.isStructured() ? builder->getIndexType()
-                                                   : info.getLoopVariableType();
-      Fortran::lower::StatementContext stmtCtx;
-      if (expr)
-        return builder->createConvert(loc, controlType,
-                                      createFIRExpr(loc, expr, stmtCtx));
-
-      if (info.hasRealControl)
-        return builder->createRealConstant(loc, controlType, 1u);
-      return builder->createIntegerConstant(loc, controlType, 1); // step
-    };
     for (IncrementLoopInfo &info : incrementLoopNestInfo) {
       info.loopVariable =
           genLoopVariableAddress(loc, info.loopVariableSym, info.isUnordered);
       mlir::Value lowerValue = genControlValue(info.lowerExpr, info);
       mlir::Value upperValue = genControlValue(info.upperExpr, info);
-      info.stepValue = genControlValue(info.stepExpr, info);
+      bool isConst = true;
+      mlir::Value stepValue = genControlValue(
+          info.stepExpr, info, info.isStructured() ? nullptr : &isConst);
+      // Use a temp variable for unstructured loops with non-const step.
+      if (!isConst) {
+        info.stepVariable = builder->createTemporary(loc, stepValue.getType());
+        builder->create(loc, stepValue, info.stepVariable);
+      }
 
       // Structured loop - generate fir.do_loop.
       if (info.isStructured()) {
@@ -1739,14 +1755,14 @@ class FirConverter : public Fortran::lower::AbstractConverter {
         if (info.isUnordered) {
           // The loop variable value is explicitly updated.
           info.doLoop = builder->create(
-              loc, lowerValue, upperValue, info.stepValue, /*unordered=*/true);
+              loc, lowerValue, upperValue, stepValue, /*unordered=*/true);
           builder->setInsertionPointToStart(info.doLoop.getBody());
           loopValue = builder->createConvert(loc, loopVarType,
                                              info.doLoop.getInductionVar());
         } else {
           // The loop variable is a doLoop op argument.
           info.doLoop = builder->create(
-              loc, lowerValue, upperValue, info.stepValue, /*unordered=*/false,
+              loc, lowerValue, upperValue, stepValue, /*unordered=*/false,
               /*finalCountValue=*/true,
               builder->createConvert(loc, loopVarType, lowerValue));
           builder->setInsertionPointToStart(info.doLoop.getBody());
@@ -1775,18 +1791,17 @@ class FirConverter : public Fortran::lower::AbstractConverter {
         auto diff1 =
             builder->create(loc, upperValue, lowerValue);
         auto diff2 =
-            builder->create(loc, diff1, info.stepValue);
-        tripCount =
-            builder->create(loc, diff2, info.stepValue);
+            builder->create(loc, diff1, stepValue);
+        tripCount = builder->create(loc, diff2, stepValue);
         tripCount =
             builder->createConvert(loc, builder->getIndexType(), tripCount);
       } else {
         auto diff1 =
             builder->create(loc, upperValue, lowerValue);
         auto diff2 =
-            builder->create(loc, diff1, info.stepValue);
+            builder->create(loc, diff1, stepValue);
         tripCount =
-            builder->create(loc, diff2, info.stepValue);
+            builder->create(loc, diff2, stepValue);
       }
       if (forceLoopToExecuteOnce) { // minimum tripCount is 1
         mlir::Value one =
@@ -1874,12 +1889,15 @@ class FirConverter : public Fortran::lower::AbstractConverter {
       tripCount = builder->create(loc, tripCount, one);
       builder->create(loc, tripCount, info.tripVariable);
       mlir::Value value = builder->create(loc, info.loopVariable);
+      mlir::Value step;
+      if (info.stepVariable)
+        step = builder->create(loc, info.stepVariable);
+      else
+        step = genControlValue(info.stepExpr, info);
       if (info.hasRealControl)
-        value =
-            builder->create(loc, value, info.stepValue);
+        value = builder->create(loc, value, step);
       else
-        value =
-            builder->create(loc, value, info.stepValue);
+        value = builder->create(loc, value, step);
       builder->create(loc, value, info.loopVariable);
 
       genBranch(info.headerBlock);
diff --git a/flang/test/Lower/HLFIR/goto-do-body.f90 b/flang/test/Lower/HLFIR/goto-do-body.f90
new file mode 100644
index 000000000000000..383b839e591e33d
--- /dev/null
+++ b/flang/test/Lower/HLFIR/goto-do-body.f90
@@ -0,0 +1,114 @@
+! RUN: bbc --hlfir -o - %s | FileCheck %s
+
+! Test jumping to the body of a do loop.
+subroutine sub1()
+! CHECK-LABEL:  func @_QPsub1() {
+  implicit none
+  integer :: i
+  external foo
+! CHECK:    %[[C2:.*]] = arith.constant 2 : i32
+! CHECK:    %[[C0:.*]] = arith.constant 0 : i32
+! CHECK:    %[[C1:.*]] = arith.constant 1 : i32
+! CHECK:    %[[TRIP:.*]] = fir.alloca i32
+! CHECK:    %[[I_REF:.*]] = fir.alloca i32 {bindc_name = "i", {{.*}}}
+! CHECK:    %[[I:.*]]:2 = hlfir.declare %[[I_REF]] {uniq_name = "_QFsub1Ei"} : (!fir.ref) -> (!fir.ref, !fir.ref)
+
+  do i = 1, 3
+    if (i .eq. 2) goto 70
+! CHECK:    %[[TMP1:.*]] = fir.load %[[I]]#0 : !fir.ref
+! CHECK:    %[[COND:.*]] = arith.cmpi eq, %[[TMP1]], %[[C2]] : i32
+! CHECK:    cf.cond_br %[[COND]], ^[[BODY:.*]], ^{{.*}}
+
+  end do
+
+  call foo
+! CHECK:  fir.call @_QPfoo()
+
+  do i = 1, 2
+! CHECK:    fir.store %[[C2]] to %[[TRIP]] : !fir.ref
+! CHECK:    fir.store %[[C1]] to %[[I]]#1 : !fir.ref
+! CHECK:    cf.br ^[[HEADER:.*]]
+! CHECK:  ^[[HEADER]]:
+! CHECK:    %[[TMP2:.*]] = fir.load %[[TRIP]] : !fir.ref
+! CHECK:    %[[TMP3:.*]] = arith.cmpi sgt, %[[TMP2]], %[[C0]] : i32
+! CHECK:    cf.cond_br %[[TMP3]], ^[[BODY]], ^[[EXIT:.*]]
+
+    70 call foo
+! CHECK:  ^[[BODY]]:
+! CHECK:    fir.call @_QPfoo()
+! CHECK:    %[[TMP4:.*]] = fir.load %[[TRIP]] : !fir.ref
+! CHECK:    %[[TMP5:.*]] = arith.subi %[[TMP4]], %[[C1]] : i32
+! CHECK:    fir.store %[[TMP5]] to %[[TRIP]] : !fir.ref
+! CHECK:    %[[TMP6:.*]] = fir.load %[[I]]#1 : !fir.ref
+! CHECK:    %[[TMP7:.*]] = arith.addi %[[TMP6]], %[[C1]] : i32
+! CHECK:    fir.store %[[TMP7]] to %[[I]]#1 : !fir.ref
+! CHECK:    cf.br ^[[HEADER]]
+  end do
+end subroutine
+! CHECK: ^[[EXIT]]:
+! CHECK:    return
+! CHECK:  }
+
+! Test jumping to the body of a do loop with a step expression.
+subroutine sub2()
+! CHECK-LABEL:  func @_QPsub2() {
+  implicit none
+  integer :: i, j
+  external foo
+! CHECK:    %[[C_7:.*]] = arith.constant -7 : i32
+! CHECK:    %[[C8:.*]] = arith.constant 8 : i32
+! CHECK:    %[[C2:.*]] = arith.constant 2 : i32
+! CHECK:    %[[C0:.*]] = arith.constant 0 : i32
+! CHECK:    %[[C3:.*]] = arith.constant 3 : i32
+! CHECK:    %[[C1:.*]] = arith.constant 1 : i32
+! CHECK:    %[[TRIP:.*]] = fir.alloca i32
+! CHECK:    %[[I_REF:.*]] = fir.alloca i32 {bindc_name = "i", {{.*}}}
+! CHECK:    %[[I:.*]]:2 = hlfir.declare %[[I_REF]] {uniq_name = "_QFsub2Ei"} : (!fir.ref) -> (!fir.ref, !fir.ref)
+! CHECK:    %[[J_REF:.*]] = fir.alloca i32 {bindc_name = "j", {{.*}}}
+! CHECK:    %[[J:.*]]:2 = hlfir.declare %[[J_REF]] {uniq_name = "_QFsub2Ej"} : (!fir.ref) -> (!fir.ref, !fir.ref)
+
+  do i = 1, 3
+    if (i .eq. 2) goto 70
+! CHECK:    %[[TMP1:.*]] = fir.load %[[I]]#0 : !fir.ref
+! CHECK:    %[[COND:.*]] = arith.cmpi eq, %[[TMP1]], %[[C2]] : i32
+! CHECK:    cf.cond_br %[[COND]], ^[[BODY:.*]], ^{{.*}}
+
+  end do
+
+  call foo
+! CHECK:    fir.call @_QPfoo()
+
+  j = 3
+! CHECK:    hlfir.assign %[[C3]] to %[[J]]#0 : i32, !fir.ref
+
+  do i = 1, 2, 3 * j - 8
+! CHECK:    %[[TMP2:.*]] = fir.load %[[J]]#0 : !fir.ref
+! CHECK:    %[[TMP3:.*]] = arith.muli %[[TMP2]], %[[C3]] : i32
+! CHECK:    %[[STEP:.*]] = arith.subi %[[TMP3]], %[[C8]] : i32
+! CHECK:    fir.store %[[STEP]] to %[[STEP_VAR:.*]] : !fir.ref
+! CHECK:    %[[TMP4:.*]] = arith.addi %[[TMP3]], %[[C_7]] : i32
+! CHECK:    %[[TMP5:.*]] = arith.divsi %[[TMP4]], %[[STEP]] : i32
+! CHECK:    fir.store %[[TMP5]] to %[[TRIP]] : !fir.ref
+! CHECK:    fir.store %[[C1]] to %[[I]]#1 : !fir.ref
+! CHECK:    cf.br ^[[HEADER:.*]]
+! CHECK:  ^[[HEADER]]:
+! CHECK:    %[[TMP6:.*]] = fir.load %[[TRIP]] : !fir.ref
+! CHECK:    %[[TMP7:.*]] = arith.cmpi sgt, %[[TMP6]], %[[C0]] : i32
+! CHECK:    cf.cond_br %[[TMP7]], ^[[BODY]], ^[[EXIT:.*]]
+
+    70 call foo
+! CHECK:  ^[[BODY]]:
+! CHECK:    fir.call @_QPfoo()
+! CHECK:    %[[TMP8:.*]] = fir.load %[[TRIP]] : !fir.ref
+! CHECK:    %[[TMP9:.*]] = arith.subi %[[TMP8]], %[[C1]] : i32
+! CHECK:    fir.store %[[TMP9]] to %[[TRIP]] : !fir.ref
+! CHECK:    %[[TMP10:.*]] = fir.load %[[I]]#1 : !fir.ref
+! CHECK:    %[[STEP_VAL:.*]] = fir.load %[[STEP_VAR]] : !fir.ref
+! CHECK:    %[[TMP11:.*]] = arith.addi %[[TMP10]], %[[STEP_VAL]] : i32
+! CHECK:    fir.store %[[TMP11]] to %[[I]]#1 : !fir.ref
+! CHECK:    cf.br ^[[HEADER]]
+  end do
+end subroutine
+! CHECK: ^[[EXIT]]:
+! CHECK:    return
+! CHECK:  }
diff --git a/flang/test/Lower/do_loop.f90 b/flang/test/Lower/do_loop.f90
index 512168763ceb9bd..d6ca1fe70977dd9 100644
--- a/flang/test/Lower/do_loop.f90
+++ b/flang/test/Lower/do_loop.f90
@@ -245,6 +245,7 @@ subroutine loop_with_real_control(s,e,st)
   ! CHECK-DAG: %[[S:.*]] = fir.load %[[S_REF]] : !fir.ref
   ! CHECK-DAG: %[[E:.*]] = fir.load %[[E_REF]] : !fir.ref
   ! CHECK-DAG: %[[ST:.*]] = fir.load %[[ST_REF]] : !fir.ref
+  ! CHECK: fir.store %[[ST]] to %[[ST_VAR:.*]] : !fir.ref
   real :: x, s, e, st
 
   ! CHECK: %[[DIFF:.*]] = arith.subf %[[E]], %[[S]] {{.*}}: f32
@@ -267,7 +268,8 @@ subroutine loop_with_real_control(s,e,st)
     ! CHECK: %[[INC:.*]] = arith.subi %[[INDEX2]], %[[C1]] : index
     ! CHECK: fir.store %[[INC]] to %[[INDEX_REF]] : !fir.ref
     ! CHECK: %[[X2:.*]] = fir.load %[[X_REF]] : !fir.ref
-    ! CHECK: %[[XINC:.*]] = arith.addf %[[X2]], %[[ST]] {{.*}}: f32
+    ! CHECK: %[[ST_VAL:.*]] = fir.load %[[ST_VAR]] : !fir.ref
+    ! CHECK: %[[XINC:.*]] = arith.addf %[[X2]], %[[ST_VAL]] {{.*}}: f32
     ! CHECK: fir.store %[[XINC]] to %[[X_REF]] : !fir.ref
     ! CHECK: br ^[[HDR]]
   end do
diff --git a/flang/test/Lower/do_loop_unstructured.f90 b/flang/test/Lower/do_loop_unstructured.f90
index a8c849a48827934..2b5f1c6c0061a1d 100644
--- a/flang/test/Lower/do_loop_unstructured.f90
+++ b/flang/test/Lower/do_loop_unstructured.f90
@@ -37,7 +37,8 @@ subroutine simple_unstructured()
 ! CHECK:   %[[TRIP_VAR_NEXT:.*]] = arith.subi %[[TRIP_VAR]], %[[ONE_1]] : i32
 ! CHECK:   fir.store %[[TRIP_VAR_NEXT]] to %[[TRIP_VAR_REF]] : !fir.ref
 ! CHECK:   %[[LOOP_VAR:.*]] = fir.load %[[LOOP_VAR_REF]] : !fir.ref
-! CHECK:   %[[LOOP_VAR_NEXT:.*]] = arith.addi %[[LOOP_VAR]], %[[STEP_ONE]] : i32
+! CHECK:   %[[STEP_ONE_2:.*]] = arith.constant 1 : i32
+! CHECK:   %[[LOOP_VAR_NEXT:.*]] = arith.addi %[[LOOP_VAR]], %[[STEP_ONE_2]] : i32
 ! CHECK:   fir.store %[[LOOP_VAR_NEXT]] to %[[LOOP_VAR_REF]] : !fir.ref
 ! CHECK:   cf.br ^[[HEADER]]
 ! CHECK: ^[[EXIT]]:
@@ -75,7 +76,8 @@ subroutine simple_unstructured_with_step()
 ! CHECK:   %[[TRIP_VAR_NEXT:.*]] = arith.subi %[[TRIP_VAR]], %[[ONE_1]] : i32
 ! CHECK:   fir.store %[[TRIP_VAR_NEXT]] to %[[TRIP_VAR_REF]] : !fir.ref
 ! CHECK:   %[[LOOP_VAR:.*]] = fir.load %[[LOOP_VAR_REF]] : !fir.ref
-! CHECK:   %[[LOOP_VAR_NEXT:.*]] = arith.addi %[[LOOP_VAR]], %[[STEP]] : i32
+! CHECK:   %[[STEP_2:.*]] = arith.constant 2 : i32
+! CHECK:   %[[LOOP_VAR_NEXT:.*]] = arith.addi %[[LOOP_VAR]], %[[STEP_2]] : i32
 ! CHECK:   fir.store %[[LOOP_VAR_NEXT]] to %[[LOOP_VAR_REF]] : !fir.ref
 ! CHECK:   cf.br ^[[HEADER]]
 ! CHECK: ^[[EXIT]]:
@@ -151,7 +153,8 @@ subroutine nested_unstructured()
 ! CHECK:   %[[TRIP_VAR_K_NEXT:.*]] = arith.subi %[[TRIP_VAR_K]], %[[ONE_1]] : i32
 ! CHECK:   fir.store %[[TRIP_VAR_K_NEXT]] to %[[TRIP_VAR_K_REF]] : !fir.ref
 ! CHECK:   %[[LOOP_VAR_K:.*]] = fir.load %[[LOOP_VAR_K_REF]] : !fir.ref
-! CHECK:   %[[LOOP_VAR_K_NEXT:.*]] = arith.addi %[[LOOP_VAR_K]], %[[K_STEP]] : i32
+! CHECK:   %[[K_STEP_2:.*]] = arith.constant 1 : i32
+! CHECK:   %[[LOOP_VAR_K_NEXT:.*]] = arith.addi %[[LOOP_VAR_K]], %[[K_STEP_2]] : i32
 ! CHECK:   fir.store %[[LOOP_VAR_K_NEXT]] to %[[LOOP_VAR_K_REF]] : !fir.ref
 ! CHECK:   cf.br ^[[HEADER_K]]
 ! CHECK: ^[[EXIT_K]]:
@@ -160,7 +163,8 @@ subroutine nested_unstructured()
 ! CHECK:   %[[TRIP_VAR_J_NEXT:.*]] = arith.subi %[[TRIP_VAR_J]], %[[ONE_1]] : i32
 ! CHECK:   fir.store %[[TRIP_VAR_J_NEXT]] to %[[TRIP_VAR_J_REF]] : !fir.ref
 ! CHECK:   %[[LOOP_VAR_J:.*]] = fir.load %[[LOOP_VAR_J_REF]] : !fir.ref
-! CHECK:   %[[LOOP_VAR_J_NEXT:.*]] = arith.addi %[[LOOP_VAR_J]], %[[J_STEP]] : i32
+! CHECK:   %[[J_STEP_2:.*]] = arith.constant 1 : i32
+! CHECK:   %[[LOOP_VAR_J_NEXT:.*]] = arith.addi %[[LOOP_VAR_J]], %[[J_STEP_2]] : i32
 ! CHECK:   fir.store %[[LOOP_VAR_J_NEXT]] to %[[LOOP_VAR_J_REF]] : !fir.ref
 ! CHECK:   cf.br ^[[HEADER_J]]
 ! CHECK: ^[[EXIT_J]]:
@@ -169,7 +173,8 @@ subroutine nested_unstructured()
 ! CHECK:   %[[TRIP_VAR_I_NEXT:.*]] = arith.subi %[[TRIP_VAR_I]], %[[ONE_1]] : i32
 ! CHECK:   fir.store %[[TRIP_VAR_I_NEXT]] to %[[TRIP_VAR_I_REF]] : !fir.ref
 ! CHECK:   %[[LOOP_VAR_I:.*]] = fir.load %[[LOOP_VAR_I_REF]] : !fir.ref
-! CHECK:   %[[LOOP_VAR_I_NEXT:.*]] = arith.addi %[[LOOP_VAR_I]], %[[I_STEP]] : i32
+! CHECK:   %[[I_STEP_2:.*]] = arith.constant 1 : i32
+! CHECK:   %[[LOOP_VAR_I_NEXT:.*]] = arith.addi %[[LOOP_VAR_I]], %[[I_STEP_2]] : i32
 ! CHECK:   fir.store %[[LOOP_VAR_I_NEXT]] to %[[LOOP_VAR_I_REF]] : !fir.ref
 ! CHECK:   cf.br ^[[HEADER_I]]
 ! CHECK: ^[[EXIT_I]]:
@@ -215,7 +220,8 @@ subroutine nested_structured_in_unstructured()
 ! CHECK:   %[[TRIP_VAR_I_NEXT:.*]] = arith.subi %[[TRIP_VAR_I]], %[[C1_3]] : i32
 ! CHECK:   fir.store %[[TRIP_VAR_I_NEXT]] to %[[TRIP_VAR_I_REF]] : !fir.ref
 ! CHECK:   %[[LOOP_VAR_I:.*]] = fir.load %[[LOOP_VAR_I_REF]] : !fir.ref
-! CHECK:   %[[LOOP_VAR_I_NEXT:.*]] = arith.addi %[[LOOP_VAR_I]], %c1_i32_0 : i32
+! CHECK:   %[[I_STEP_2:.*]] = arith.constant 1 : i32
+! CHECK:   %[[LOOP_VAR_I_NEXT:.*]] = arith.addi %[[LOOP_VAR_I]], %[[I_STEP_2]] : i32
 ! CHECK:   fir.store %[[LOOP_VAR_I_NEXT]] to %[[LOOP_VAR_I_REF]] : !fir.ref
 ! CHECK:   cf.br ^[[HEADER]]
 ! CHECK: ^[[EXIT]]:
diff --git a/flang/test/Lower/goto-do-body.f90 b/flang/test/Lower/goto-do-body.f90
new file mode 100644
index 000000000000000..3854947b55f6294
--- /dev/null
+++ b/flang/test/Lower/goto-do-body.f90
@@ -0,0 +1,125 @@
+! RUN: bbc %s -emit-fir -o - | FileCheck %s
+
+! Test jumping to the body of a do loop.
+subroutine sub1()
+! CHECK-LABEL:  func @_QPsub1() {
+  implicit none
+  integer :: i
+  external foo
+! CHECK:    %[[TRIP:.*]] = fir.alloca i32
+! CHECK:    %[[I:.*]] = fir.alloca i32 {bindc_name = "i", {{.*}}}
+
+  do i = 1, 3
+    if (i .eq. 2) goto 70
+! CHECK:    %[[TMP1:.*]] = fir.load %[[I]] : !fir.ref
+! CHECK:    %[[C2_1:.*]] = arith.constant 2 : i32
+! CHECK:    %[[COND:.*]] = arith.cmpi eq, %[[TMP1]], %[[C2_1]] : i32
+! CHECK:    cf.cond_br %[[COND]], ^[[COND_TRUE:.*]], ^{{.*}}
+! CHECK:  ^[[COND_TRUE]]:
+! CHECK:    cf.br ^[[BODY:.*]]
+
+  end do
+
+  call foo
+! CHECK:  fir.call @_QPfoo()
+
+  do i = 1, 2
+! CHECK:    %[[C1_1:.*]] = arith.constant 1 : i32
+! CHECK:    %[[C2_2:.*]] = arith.constant 2 : i32
+! CHECK:    %[[C1_2:.*]] = arith.constant 1 : i32
+! CHECK:    %[[TMP2:.*]] = arith.subi %[[C2_2]], %[[C1_1]] : i32
+! CHECK:    %[[TMP3:.*]] = arith.addi %[[TMP2]], %[[C1_2]] : i32
+! CHECK:    %[[TMP4:.*]] = arith.divsi %[[TMP3]], %[[C1_2]] : i32
+! CHECK:    fir.store %[[TMP4]] to %[[TRIP]] : !fir.ref
+! CHECK:    fir.store %[[C1_1]] to %[[I]] : !fir.ref
+! CHECK:    cf.br ^[[HEADER:.*]]
+! CHECK:  ^[[HEADER]]:
+! CHECK:    %[[TMP5:.*]] = fir.load %[[TRIP]] : !fir.ref
+! CHECK:    %[[C0_1:.*]] = arith.constant 0 : i32
+! CHECK:    %[[TMP6:.*]] = arith.cmpi sgt, %[[TMP5]], %[[C0_1]] : i32
+! CHECK:    cf.cond_br %[[TMP6]], ^[[BODY]], ^[[EXIT:.*]]
+
+    70 call foo
+! CHECK:  ^[[BODY]]:
+! CHECK:    fir.call @_QPfoo()
+! CHECK:    %[[TMP7:.*]] = fir.load %[[TRIP]] : !fir.ref
+! CHECK:    %[[C1_3:.*]] = arith.constant 1 : i32
+! CHECK:    %[[TMP8:.*]] = arith.subi %[[TMP7]], %[[C1_3]] : i32
+! CHECK:    fir.store %[[TMP8]] to %[[TRIP]] : !fir.ref
+! CHECK:    %[[TMP9:.*]] = fir.load %[[I]] : !fir.ref
+! CHECK:    %[[C1_4:.*]] = arith.constant 1 : i32
+! CHECK:    %[[TMP10:.*]] = arith.addi %[[TMP9]], %[[C1_4]] : i32
+! CHECK:    fir.store %[[TMP10]] to %[[I]] : !fir.ref
+! CHECK:    cf.br ^[[HEADER]]
+  end do
+end subroutine
+! CHECK: ^[[EXIT]]:
+! CHECK:    return
+! CHECK:  }
+
+! Test jumping to the body of a do loop with a step expression.
+subroutine sub2()
+! CHECK-LABEL:  func @_QPsub2() {
+  implicit none
+  integer :: i, j
+  external foo
+! CHECK:    %[[TRIP:.*]] = fir.alloca i32
+! CHECK:    %[[I:.*]] = fir.alloca i32 {bindc_name = "i", {{.*}}}
+! CHECK:    %[[J:.*]] = fir.alloca i32 {bindc_name = "j", {{.*}}}
+
+  do i = 1, 3
+    if (i .eq. 2) goto 70
+! CHECK:    %[[TMP1:.*]] = fir.load %[[I]] : !fir.ref
+! CHECK...

vdonaldson · 2023-09-13T03:27:33Z

The problem isn't restricted to stride computations. Other loop control expressions are also problematic.

subroutine sub(j)
  integer :: i, j
  if (j .eq. 2) goto 70
  do i = 1, 2
    70 j = j + 1
  end do
end

$ flang blah.f90
./blah.f90:3:3: warning: Label '70' is in a construct that should not be used as a branch target here
    if (j .eq. 2) goto 70
    ^^^^^^^^^^^^^^^^^^^^^
error: loc("blah.f90":5:5): operand #1 does not dominate this use
mlir did not succeed

This is nonconformant code. f90 Clause 8.1.1.2 states: Transfer of control to the interior of a block from outside the block is prohibited. f18 Clause 11.1.2.1 has the same program restriction with a minor qualification.

Some compilers flag this as a hard error, which is one possibility. A question there is whether to generate errors for all branches into blocks, or allow some that are more likely to be benign.

Otherwise, the primary goal is to avoid a compiler crash and generate an executable that will likely crash at runtime. It might be possible to generalize this PR to do that, but at a cost of generating somewhat more complex code that isn't needed most of the time.

At this point I don't have a concrete recommendation.

luporl · 2023-09-13T12:19:11Z

The problem isn't restricted to stride computations. Other loop control expressions are also problematic.
subroutine sub(j)
  integer :: i, j
  if (j .eq. 2) goto 70
  do i = 1, 2
    70 j = j + 1
  end do
end

Your test program works fine with this patch:

$ flang-new -c blah.f90
./blah.f90:3:3: warning: Label '70' is in a construct that should not be used as a branch target here
    if (j .eq. 2) goto 70
    ^^^^^^^^^^^^^^^^^^^^^
$

The generated IR for this program actually has a step = 1 constant. The problem is that this constant is defined in the beginning of the loop and reused at its end, that will be in a different block.

This is nonconformant code. f90 Clause 8.1.1.2 states: Transfer of control to the interior of a block from outside the block is prohibited. f18 Clause 11.1.2.1 has the same program restriction with a minor qualification.

Some compilers flag this as a hard error, which is one possibility. A question there is whether to generate errors for all branches into blocks, or allow some that are more likely to be benign.

Otherwise, the primary goal is to avoid a compiler crash and generate an executable that will likely crash at runtime. It might be possible to generalize this PR to do that, but at a cost of generating somewhat more complex code that isn't needed most of the time.

Right, this is non-conformant code that is currently accepted as an extension by flang's semantics. I agree that if we need to come up with a complex solution to support this use case then it may not be worth it. But if what is done by this patch is enough to fix invalid IR generation when GOTO DO loop bodies are used, I think it is valid, as it adds little complexity and overhead.

Although jumping to the body of a loop may cause crashes at runtime, with care it can be avoided, if, for instance:

The end of the loop is skipped (e.g., by a GOTO).
The beginning of the loop is executed at least once (as with a GOTO out followed by a GOTO back inside the loop).

The test case in #65036 uses some constructions similar to the ones above and behaves correctly at runtime.

vdonaldson · 2023-09-13T20:37:28Z

Your test program works fine with this patch

@luporl Thanks for verifying that your patch addresses this test variant.

I discussed this compiler crash issue with several other engineers at nvidia. Although the executable generated for these test cases with your patch will avoid the compiler crash, the vast majority of such nonconformant tests will crash at runtime. We believe it would be preferable to handle this as some other compilers do by turning the current warning into a hard error. I have filed an internal issue to do that not only for DO loops, but also for most if not all other constructs as well. ASSOCIATE constructs for example are similar to DO loops in that they would most likely crash at runtime after taking a nonconformant branch. One question is whether or not to make an exception for IF/ELSE IF/ELSE sequences, which might occur in old f66/f77 code and are more likely to be benign and not crash at runtime.

My recommendation now is to put this PR on hold pending a probable error message fix.

Again, thanks for working on the problem.

klausler · 2023-09-13T20:47:16Z

It would be better to disallow attempts to branch into constructs. They're not standard-conforming, they're not allowed by many compilers, they're not portable, and they don't have well-defined semantics for constructs other than IF/THEN/ENDIF.

luporl · 2023-09-14T13:09:28Z

@vdonaldson, @klausler, thanks for your comments and for investigating this issue further.
I'll put this PR on hold, as suggested.

vdonaldson · 2023-09-19T23:11:01Z

@luporl After yet more research into Fortran standards and legacy code we've come to the conclusion that it is best to go ahead with the code in this PR after all.

Additional detail:

I had thought that f77 allowed branching into a loop, and it was first prohibited in f90. However, the f77 standard has an appendix section A2 Conflicts with ANSI 3.9-1966 that lists conflicts between f66 and f77. Item (7) in the list is:

This standard does not permit a transfer of control into the range of a DO-loop from outside the range. The range of a DO-loop may be entered only by the execution of a DO statement. ANSI X3.9-1966 permitted transfer of control into the range of a DO-loop under certain conditions. This involved the concept referred to as "extended range of a DO."

See also f77 Clause 11.10 DO Statement, especially 11.10.1 Range of a DO-Loop and 11.10.2 Active and Inactive DO_Loops.

So the prohibition against branching into a loop dates back to f77, and only f66 allows it. But f66 apparently did in fact explicitly allow it.

Despite the prohibition starting with f77 and the fact that executing such a branch will typically cause a crash or other bad behavior, enough compilers allow these f66 branches for rare legacy cases that it may be best to support it with the existing warning.

So please go ahead with the push. If someone eventually finds a case where doing this has negative consequences, such as an executable performance cost in a more general use case, we can revisit the issue at that time.

Thanks for working on this.

luporl requested a review from a team as a code owner September 12, 2023 13:14

llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels Sep 12, 2023

luporl requested review from kiranchandramohan and vdonaldson September 12, 2023 13:30

kiranchandramohan requested a review from jeanPerier September 12, 2023 13:46

vdonaldson requested a review from klausler September 13, 2023 20:41

vdonaldson approved these changes Sep 19, 2023

View reviewed changes

jeanPerier approved these changes Sep 20, 2023

View reviewed changes

luporl merged commit 3dbb055 into llvm:main Sep 21, 2023

kstoimenov mentioned this pull request Sep 22, 2023

Add memcpm test kstoimenov/llvm-project#13

Closed

luporl deleted the luporl-goto-do branch September 27, 2023 13:12

luporl restored the luporl-goto-do branch December 18, 2023 14:30

luporl deleted the luporl-goto-do branch January 17, 2024 14:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[flang] Generate valid IR on GOTO DO body #66084

[flang] Generate valid IR on GOTO DO body #66084

luporl commented Sep 12, 2023

llvmbot commented Sep 12, 2023

vdonaldson commented Sep 13, 2023

luporl commented Sep 13, 2023

vdonaldson commented Sep 13, 2023

klausler commented Sep 13, 2023

luporl commented Sep 14, 2023

vdonaldson commented Sep 19, 2023

[flang] Generate valid IR on GOTO DO body #66084

[flang] Generate valid IR on GOTO DO body #66084

Conversation

luporl commented Sep 12, 2023

llvmbot commented Sep 12, 2023

vdonaldson commented Sep 13, 2023

luporl commented Sep 13, 2023

vdonaldson commented Sep 13, 2023

klausler commented Sep 13, 2023

luporl commented Sep 14, 2023

vdonaldson commented Sep 19, 2023