Fix an incorrect outcome (corner cases) (#917)

Flank · Aug 5, 2020 · d5edb88 · d5edb88
1 parent c9e349a
commit d5edb88
Show file tree

Hide file tree

Showing 5 changed files with 158 additions and 13 deletions.
diff --git a/docs/bugs/914_falsy_positive_outcome_for_flaky_tests.md b/docs/bugs/914_falsy_positive_outcome_for_flaky_tests.md
@@ -0,0 +1,74 @@
+# Outcome incorrectly set by Flank [#914](https://github.com/Flank/flank/issues/914)
+
+##### Changelog
+|Date|Who?|Action|
+|---|---|---|
+|31th July 2020|[pawelpasterz](https://github.com/pawelpasterz)|created|
+|   |   |   |
+
+### Description
+```
+[task 2020-07-23T20:29:15.152Z] ┌─────────┬──────────────────────┬──────────────────────────────────────────┐
+[task 2020-07-23T20:29:15.153Z] │ OUTCOME │      MATRIX ID       │               TEST DETAILS               │
+[task 2020-07-23T20:29:15.153Z] ├─────────┼──────────────────────┼──────────────────────────────────────────┤
+[task 2020-07-23T20:29:15.153Z] │  flaky  │ matrix-1vj2pt9wih182 │ 1 test cases failed, 106 passed, 2 flaky │
+[task 2020-07-23T20:29:15.153Z] └─────────┴──────────────────────┴──────────────────────────────────────────┘
+```
+FTL confirmed the matrix outcome is a failure. Flank reported flaky.
+
+### Steps to reproduce
+So far no reliable way to reproduce it (with 10/10 rate) found. One can simulate error with the test:
+```
+SavedMatrixTest#`savedMatrix should have failed outcome when at least one test is failed and the last one is flaky`
+```
+
+Due to the incorrectly implemented logic regarding the status update, there were cases when flaky tests obscure failed results.
+The following must occur:
+1. The matrix needs to have both flaky and failed tests (at least one of each)
+2. TestExecutions need to be returned in a specific order:
+    1. Failed execution must be consumed by flank before flaky one
+    2. The following flaky execution must be actually failed but the outcome of all re-runs is success
+3. Then flank does what is the following:
+    1. The outcome is set to `failure` when failed test/shards/executions are processed
+    ```kotlin
+   private fun Outcome?.updateOutcome(
+       when {
+           ...
+           else -> this?.summary  // at this point SavedMatrix.outcome == success, flank changes it to failure
+           ...
+   })
+   ```
+    2. When flank reaches case where step summary is `failure` but execution is `success` it sets `SavedMatrix` outcome to flaky
+    ```kotlin
+   updateOutcome(flakyOutcome = it.step.outcome?.summary != this?.summary) // flakyOutcome == true 
+    ```
+    3. Due to incorrect order in `when` condition `failure` check is never reached 
+    ```kotlin
+   private fun Outcome?.updateOutcome(
+           flakyOutcome: Boolean
+       ) {
+           outcome = when {
+               flakyOutcome -> flaky 
+               // flank should escape here with failure status persisted, but since flakyOutcome == true SavedMatrix.outcome is changed to flaky
+               outcome == failure || outcome == inconclusive -> return 
+               outcome == flaky -> this?.summary?.takeIf { it == failure || it == inconclusive }
+               else -> this?.summary
+           } ?: outcome
+       }
+    ```
+
+### Proposed fix
+1. Change order in when condition to always check for `failure` first
+    ```kotlin
+   private fun Outcome?.updateOutcome(
+           flakyOutcome: Boolean
+       ) {
+           outcome = when {
+               outcome == failure || outcome == inconclusive -> return // escape when failure/inconclusive outcome is set
+               flakyOutcome -> flaky 
+               outcome == flaky -> this?.summary?.takeIf { it == failure || it == inconclusive }
+               else -> this?.summary
+           } ?: outcome
+       }
+    ```
+2. Add unit tests to cover this case.
diff --git a/release_notes.md b/release_notes.md
@@ -17,7 +17,7 @@
 - [#920](https://github.com/Flank/flank/pull/920) Improve .yml validation on `doctor` command. ([piotradamczyk5](https://github.com/piotradamczyk5))
 - [#934](https://github.com/Flank/flank/pull/934) Delete incorrect flank snapshot labels. ([piotradamczyk5](https://github.com/piotradamczyk5))
 - [#926](https://github.com/Flank/flank/pull/926) Flank should reflect gcloud exit codes. ([adamfilipow92](https://github.com/adamfilipow92))
--
+- [#917](https://github.com/Flank/flank/pull/917) Fix an incorrect outcome. ([pawelpasterz](https://github.com/pawelpasterz))
 -
 -
 

diff --git a/test_runner/src/main/kotlin/ftl/json/SavedMatrix.kt b/test_runner/src/main/kotlin/ftl/json/SavedMatrix.kt
@@ -149,11 +149,13 @@ class SavedMatrix(matrix: TestMatrix) {
         flakyOutcome: Boolean
     ) {
         outcome = when {
-            flakyOutcome -> flaky
-            // the matrix outcome is failure if any step fails
-            // if the matrix outcome is already set to failure then we can ignore the other step outcomes.
-            // inconclusive is treated as a failure
+            // The matrix outcome is failure if any step fails
+            // If the matrix outcome is already set to failure then we can ignore the other step outcomes.
+            // Inconclusive is treated as a failure
+            // This particular conditions order should not be changed, for more details check:
+            // https://github.com/Flank/flank/issues/914
             outcome == failure || outcome == inconclusive -> return
+            flakyOutcome -> flaky
             outcome == flaky -> this?.summary?.takeIf { it == failure || it == inconclusive }
             else -> this?.summary
         } ?: outcome

diff --git a/test_runner/src/main/kotlin/ftl/mock/MockServer.kt b/test_runner/src/main/kotlin/ftl/mock/MockServer.kt
@@ -30,6 +30,7 @@ import ftl.log.LogbackLogger
 import ftl.util.Bash
 import ftl.util.FlankGeneralError
 import ftl.util.StepOutcome.failure
+import ftl.util.StepOutcome.flaky
 import ftl.util.StepOutcome.inconclusive
 import ftl.util.StepOutcome.skipped
 import ftl.util.StepOutcome.success
@@ -98,6 +99,8 @@ object MockServer {
                 skippedDetail.incompatibleAppVersion = true
                 outcome.skippedDetail = skippedDetail
             }
+            "-4" -> outcome.summary = flaky
+            "-666" -> outcome.summary = null
             else -> outcome.summary = success
         }
 

diff --git a/test_runner/src/test/kotlin/ftl/json/SavedMatrixTest.kt b/test_runner/src/test/kotlin/ftl/json/SavedMatrixTest.kt
@@ -15,7 +15,8 @@ import ftl.util.MatrixState.FINISHED
 import ftl.util.MatrixState.INVALID
 import ftl.util.MatrixState.PENDING
 import ftl.util.webLink
-import org.junit.Assert
+import org.junit.Assert.assertEquals
+import org.junit.Ignore
 import org.junit.Test
 import org.junit.runner.RunWith
 
@@ -24,11 +25,11 @@ class SavedMatrixTest {
 
     companion object {
         // use -1 step id to get a failure outcome from the mock server
-        fun createStepExecution(stepId: Int, deviceModel: String = "shamu"): TestExecution {
+        fun createStepExecution(stepId: Int, deviceModel: String = "shamu", executionId: Int? = null): TestExecution {
             val toolResultsStep = ToolResultsStep()
             toolResultsStep.projectId = "1"
             toolResultsStep.historyId = "2"
-            toolResultsStep.executionId = stepId.toString()
+            toolResultsStep.executionId = executionId?.toString() ?: stepId.toString()
             toolResultsStep.stepId = stepId.toString()
 
             val testExecution = TestExecution()
@@ -161,7 +162,7 @@ class SavedMatrixTest {
         testMatrix.webLink()
         testExecutions.forEach { it.state = ERROR }
         savedMatrix.update(testMatrix)
-        Assert.assertEquals(0, savedMatrix.billableVirtualMinutes)
+        assertEquals(0, savedMatrix.billableVirtualMinutes)
     }
 
     @Test
@@ -182,7 +183,7 @@ class SavedMatrixTest {
         testMatrix.state = FINISHED
         testMatrix.webLink()
         savedMatrix.update(testMatrix)
-        Assert.assertEquals(1, savedMatrix.billableVirtualMinutes)
+        assertEquals(1, savedMatrix.billableVirtualMinutes)
     }
 
     @Test
@@ -199,8 +200,73 @@ class SavedMatrixTest {
 
         testMatrix.state = INVALID
         savedMatrix.update(testMatrix)
-        Assert.assertEquals(expectedOutcome, savedMatrix.outcome)
-        Assert.assertEquals(expectedOutcomeDetails, savedMatrix.outcomeDetails)
-        Assert.assertEquals(INVALID, savedMatrix.state)
+        assertEquals(expectedOutcome, savedMatrix.outcome)
+        assertEquals(expectedOutcomeDetails, savedMatrix.outcomeDetails)
+        assertEquals(INVALID, savedMatrix.state)
+    }
+
+    @Test
+    fun `savedMatrix should have failed outcome when at least one test is failed`() {
+        val expectedOutcome = "failure"
+        val successStepExecution = createStepExecution(1) // success
+        val failedStepExecution = createStepExecution(-1) // failure
+        val flakyStepExecution = createStepExecution(-4) // flaky
+        // https://github.com/Flank/flank/issues/914
+        // This test covers edge case where the last test execution to check is flaky
+        // based on different outcome from step (failed) and execution (success)
+        // step.outcome != execution.outcome => means flaky
+        val flakyOutcomeComparedStepExecution = createStepExecution(stepId = -1, executionId = 1) // flaky
+
+        // below order in the list matters!
+        val executions = listOf(
+            flakyStepExecution,
+            successStepExecution,
+            failedStepExecution,
+            flakyOutcomeComparedStepExecution
+        )
+
+        val testMatrix = TestMatrix().apply {
+            testMatrixId = "123"
+            state = FINISHED
+            resultStorage = createResultsStorage()
+            testExecutions = executions
+        }
+
+        val savedMatrix = SavedMatrix(testMatrix)
+
+        assertEquals(
+            "Does not return failed outcome when last execution is flaky",
+            expectedOutcome,
+            savedMatrix.outcome
+        )
+    }
+
+    @Ignore("Should be used to verify https://github.com/Flank/flank/issues/918 fix")
+    @Test
+    fun `savedMatrix should have flaky outcome when at least one test is flaky`() {
+        val expectedOutcome = "flaky"
+        val successStepExecution = createStepExecution(1) // success
+        // https://github.com/Flank/flank/issues/918
+        // This test covers edge case where summary for both step and execution is null and outcome of
+        // saved matrix was not changed and is set to success
+        val malformed = createStepExecution(stepId = -666, executionId = -666) // flaky
+
+        // below order in the list matters!
+        val executions = listOf(
+            successStepExecution,
+            successStepExecution,
+            malformed
+        )
+
+        val testMatrix = TestMatrix().apply {
+            testMatrixId = "123"
+            state = FINISHED
+            resultStorage = createResultsStorage()
+            testExecutions = executions
+        }
+
+        val savedMatrix = SavedMatrix(testMatrix)
+
+        assertEquals(expectedOutcome, savedMatrix.outcome)
     }
 }