Skip to content

Commit

Permalink
Fix an incorrect outcome (corner cases) (#917)
Browse files Browse the repository at this point in the history
  • Loading branch information
pawelpasterz authored Aug 5, 2020
1 parent c9e349a commit d5edb88
Show file tree
Hide file tree
Showing 5 changed files with 158 additions and 13 deletions.
74 changes: 74 additions & 0 deletions docs/bugs/914_falsy_positive_outcome_for_flaky_tests.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
# Outcome incorrectly set by Flank [#914](https://github.com/Flank/flank/issues/914)

##### Changelog
|Date|Who?|Action|
|---|---|---|
|31th July 2020|[pawelpasterz](https://github.com/pawelpasterz)|created|
| | | |

### Description
```
[task 2020-07-23T20:29:15.152Z] ┌─────────┬──────────────────────┬──────────────────────────────────────────┐
[task 2020-07-23T20:29:15.153Z] │ OUTCOME │ MATRIX ID │ TEST DETAILS │
[task 2020-07-23T20:29:15.153Z] ├─────────┼──────────────────────┼──────────────────────────────────────────┤
[task 2020-07-23T20:29:15.153Z] │ flaky │ matrix-1vj2pt9wih182 │ 1 test cases failed, 106 passed, 2 flaky │
[task 2020-07-23T20:29:15.153Z] └─────────┴──────────────────────┴──────────────────────────────────────────┘
```
FTL confirmed the matrix outcome is a failure. Flank reported flaky.

### Steps to reproduce
So far no reliable way to reproduce it (with 10/10 rate) found. One can simulate error with the test:
```
SavedMatrixTest#`savedMatrix should have failed outcome when at least one test is failed and the last one is flaky`
```

Due to the incorrectly implemented logic regarding the status update, there were cases when flaky tests obscure failed results.
The following must occur:
1. The matrix needs to have both flaky and failed tests (at least one of each)
2. TestExecutions need to be returned in a specific order:
1. Failed execution must be consumed by flank before flaky one
2. The following flaky execution must be actually failed but the outcome of all re-runs is success
3. Then flank does what is the following:
1. The outcome is set to `failure` when failed test/shards/executions are processed
```kotlin
private fun Outcome?.updateOutcome(
when {
...
else -> this?.summary // at this point SavedMatrix.outcome == success, flank changes it to failure
...
})
```
2. When flank reaches case where step summary is `failure` but execution is `success` it sets `SavedMatrix` outcome to flaky
```kotlin
updateOutcome(flakyOutcome = it.step.outcome?.summary != this?.summary) // flakyOutcome == true
```
3. Due to incorrect order in `when` condition `failure` check is never reached
```kotlin
private fun Outcome?.updateOutcome(
flakyOutcome: Boolean
) {
outcome = when {
flakyOutcome -> flaky
// flank should escape here with failure status persisted, but since flakyOutcome == true SavedMatrix.outcome is changed to flaky
outcome == failure || outcome == inconclusive -> return
outcome == flaky -> this?.summary?.takeIf { it == failure || it == inconclusive }
else -> this?.summary
} ?: outcome
}
```

### Proposed fix
1. Change order in when condition to always check for `failure` first
```kotlin
private fun Outcome?.updateOutcome(
flakyOutcome: Boolean
) {
outcome = when {
outcome == failure || outcome == inconclusive -> return // escape when failure/inconclusive outcome is set
flakyOutcome -> flaky
outcome == flaky -> this?.summary?.takeIf { it == failure || it == inconclusive }
else -> this?.summary
} ?: outcome
}
```
2. Add unit tests to cover this case.
2 changes: 1 addition & 1 deletion release_notes.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@
- [#920](https://github.com/Flank/flank/pull/920) Improve .yml validation on `doctor` command. ([piotradamczyk5](https://github.com/piotradamczyk5))
- [#934](https://github.com/Flank/flank/pull/934) Delete incorrect flank snapshot labels. ([piotradamczyk5](https://github.com/piotradamczyk5))
- [#926](https://github.com/Flank/flank/pull/926) Flank should reflect gcloud exit codes. ([adamfilipow92](https://github.com/adamfilipow92))
-
- [#917](https://github.com/Flank/flank/pull/917) Fix an incorrect outcome. ([pawelpasterz](https://github.com/pawelpasterz))
-
-

Expand Down
10 changes: 6 additions & 4 deletions test_runner/src/main/kotlin/ftl/json/SavedMatrix.kt
Original file line number Diff line number Diff line change
Expand Up @@ -149,11 +149,13 @@ class SavedMatrix(matrix: TestMatrix) {
flakyOutcome: Boolean
) {
outcome = when {
flakyOutcome -> flaky
// the matrix outcome is failure if any step fails
// if the matrix outcome is already set to failure then we can ignore the other step outcomes.
// inconclusive is treated as a failure
// The matrix outcome is failure if any step fails
// If the matrix outcome is already set to failure then we can ignore the other step outcomes.
// Inconclusive is treated as a failure
// This particular conditions order should not be changed, for more details check:
// https://github.com/Flank/flank/issues/914
outcome == failure || outcome == inconclusive -> return
flakyOutcome -> flaky
outcome == flaky -> this?.summary?.takeIf { it == failure || it == inconclusive }
else -> this?.summary
} ?: outcome
Expand Down
3 changes: 3 additions & 0 deletions test_runner/src/main/kotlin/ftl/mock/MockServer.kt
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ import ftl.log.LogbackLogger
import ftl.util.Bash
import ftl.util.FlankGeneralError
import ftl.util.StepOutcome.failure
import ftl.util.StepOutcome.flaky
import ftl.util.StepOutcome.inconclusive
import ftl.util.StepOutcome.skipped
import ftl.util.StepOutcome.success
Expand Down Expand Up @@ -98,6 +99,8 @@ object MockServer {
skippedDetail.incompatibleAppVersion = true
outcome.skippedDetail = skippedDetail
}
"-4" -> outcome.summary = flaky
"-666" -> outcome.summary = null
else -> outcome.summary = success
}

Expand Down
82 changes: 74 additions & 8 deletions test_runner/src/test/kotlin/ftl/json/SavedMatrixTest.kt
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,8 @@ import ftl.util.MatrixState.FINISHED
import ftl.util.MatrixState.INVALID
import ftl.util.MatrixState.PENDING
import ftl.util.webLink
import org.junit.Assert
import org.junit.Assert.assertEquals
import org.junit.Ignore
import org.junit.Test
import org.junit.runner.RunWith

Expand All @@ -24,11 +25,11 @@ class SavedMatrixTest {

companion object {
// use -1 step id to get a failure outcome from the mock server
fun createStepExecution(stepId: Int, deviceModel: String = "shamu"): TestExecution {
fun createStepExecution(stepId: Int, deviceModel: String = "shamu", executionId: Int? = null): TestExecution {
val toolResultsStep = ToolResultsStep()
toolResultsStep.projectId = "1"
toolResultsStep.historyId = "2"
toolResultsStep.executionId = stepId.toString()
toolResultsStep.executionId = executionId?.toString() ?: stepId.toString()
toolResultsStep.stepId = stepId.toString()

val testExecution = TestExecution()
Expand Down Expand Up @@ -161,7 +162,7 @@ class SavedMatrixTest {
testMatrix.webLink()
testExecutions.forEach { it.state = ERROR }
savedMatrix.update(testMatrix)
Assert.assertEquals(0, savedMatrix.billableVirtualMinutes)
assertEquals(0, savedMatrix.billableVirtualMinutes)
}

@Test
Expand All @@ -182,7 +183,7 @@ class SavedMatrixTest {
testMatrix.state = FINISHED
testMatrix.webLink()
savedMatrix.update(testMatrix)
Assert.assertEquals(1, savedMatrix.billableVirtualMinutes)
assertEquals(1, savedMatrix.billableVirtualMinutes)
}

@Test
Expand All @@ -199,8 +200,73 @@ class SavedMatrixTest {

testMatrix.state = INVALID
savedMatrix.update(testMatrix)
Assert.assertEquals(expectedOutcome, savedMatrix.outcome)
Assert.assertEquals(expectedOutcomeDetails, savedMatrix.outcomeDetails)
Assert.assertEquals(INVALID, savedMatrix.state)
assertEquals(expectedOutcome, savedMatrix.outcome)
assertEquals(expectedOutcomeDetails, savedMatrix.outcomeDetails)
assertEquals(INVALID, savedMatrix.state)
}

@Test
fun `savedMatrix should have failed outcome when at least one test is failed`() {
val expectedOutcome = "failure"
val successStepExecution = createStepExecution(1) // success
val failedStepExecution = createStepExecution(-1) // failure
val flakyStepExecution = createStepExecution(-4) // flaky
// https://github.com/Flank/flank/issues/914
// This test covers edge case where the last test execution to check is flaky
// based on different outcome from step (failed) and execution (success)
// step.outcome != execution.outcome => means flaky
val flakyOutcomeComparedStepExecution = createStepExecution(stepId = -1, executionId = 1) // flaky

// below order in the list matters!
val executions = listOf(
flakyStepExecution,
successStepExecution,
failedStepExecution,
flakyOutcomeComparedStepExecution
)

val testMatrix = TestMatrix().apply {
testMatrixId = "123"
state = FINISHED
resultStorage = createResultsStorage()
testExecutions = executions
}

val savedMatrix = SavedMatrix(testMatrix)

assertEquals(
"Does not return failed outcome when last execution is flaky",
expectedOutcome,
savedMatrix.outcome
)
}

@Ignore("Should be used to verify https://github.com/Flank/flank/issues/918 fix")
@Test
fun `savedMatrix should have flaky outcome when at least one test is flaky`() {
val expectedOutcome = "flaky"
val successStepExecution = createStepExecution(1) // success
// https://github.com/Flank/flank/issues/918
// This test covers edge case where summary for both step and execution is null and outcome of
// saved matrix was not changed and is set to success
val malformed = createStepExecution(stepId = -666, executionId = -666) // flaky

// below order in the list matters!
val executions = listOf(
successStepExecution,
successStepExecution,
malformed
)

val testMatrix = TestMatrix().apply {
testMatrixId = "123"
state = FINISHED
resultStorage = createResultsStorage()
testExecutions = executions
}

val savedMatrix = SavedMatrix(testMatrix)

assertEquals(expectedOutcome, savedMatrix.outcome)
}
}

0 comments on commit d5edb88

Please sign in to comment.