Improve test handling in CI #6025

michaelkaye · 2022-05-11T14:53:55Z

Type of change

Feature
Bugfix
Technical
Other :

Content

~~All tests I found in the most recent test runs that are failing have been marked @Ignore.~~

Having merged other PRs and discussed this, increasing timeouts seems to make the tests work (reflecting that everything underneath is running slowly). I have removed the @Ignore annotations for now and replaced them with increased timeouts and retries.

Additional test failures that have shown up as part of this have been fixed afterwards.

This should allow new failures of integration tests in CI to be instantly visible when the PR run fails.

Motivation and context

As discussed in a retro a few weeks ago, we want to know when the longer tests fail due to new code. As they're slow to build/run and need macOS runners, we're not running them on every commit, but we will verify after each PR is merged, as a reasonable intermediate between "too much" and "not enough".

Checklist

Changes have been tested on an Android device or Android emulator with API 21
UI change has been tested on both light and dark themes
Accessibility has been taken into account. See https://github.com/vector-im/element-android/blob/develop/CONTRIBUTING.md#accessibility
Pull request is based on the develop branch
Pull request includes a new file under ./changelog.d. See https://github.com/vector-im/element-android/blob/develop/CONTRIBUTING.md#changelog
Pull request includes screenshots or videos if containing UI changes
Pull request includes a sign off
You've made a self review of your PR
If you have modified the screen flow, or added new screens to the application, you have updated the test UiAllScreensSanityTest.allScreensTest()

github-actions · 2022-05-11T15:27:31Z

Unit Test Results

122 files ±0 122 suites ±0 2m 7s ⏱️ -13s
205 tests ±0 205 ✔️ ±0 0 💤 ±0 0 ❌ ±0
690 runs ±0 690 ✔️ ±0 0 💤 ±0 0 ❌ ±0

Results for commit 868c33a. ± Comparison against base commit 3674ae7.

♻️ This comment has been updated with latest results.

michaelkaye · 2022-05-12T08:57:28Z

TODO: after merge, create an issue (or add to an existing one) to track the tests that have been @Ignored.

michaelkaye · 2022-05-13T08:46:51Z

vector/src/androidTest/java/im/vector/app/ui/robot/ElementRobot.kt

+                // Which is convoluted especially as it involves the app state refreshing
+                // so; in order to make this be more stable
+                // I hereby cheat and write:
+                Thread.sleep(30_000)


To be clear: I feel bad about this, but I experimented with various options for ~1 day and none were more likely to work than this. Handling the restart requirement just makes this hard to test.

this seems fair, it sounds like we're missing registering the sync as an IdlingResource in order for espresso to be able to wait for the sync to finish

there's a helper for withRetry if we wanted to poll for the home appearing instead of a blanket 30 second sleep (assuming the original check wasn't providing false positives)

    fun waitForHome() {
        // each attempt has a 500ms delay
        withRetry(times = 60) {
            ... original logic
        }
    }

The problem is that waitForHome() is using an idling resource on initial sync; but it's idle at two points: Immediately (+/- thread racing) after we return from getting espresso to click the toggle, and then it's busy for 10-15s until after the app has restarted and done the initial sync.
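The "idle, then busy, then idle again" behaviour described above suggests a two-phase wait: first wait until the sync has been seen to start, then wait until it has finished. The following is only a rough sketch of that idea in plain Kotlin. `pollUntil`, `waitForRestartAndSync` and the `isSyncing` callback are hypothetical names made up for illustration; they are not the project's `withRetry` helper or its idling resource, whose signatures are not shown in this thread.

```kotlin
/**
 * Hypothetical helper: calls [condition] up to [times] times, sleeping [delayMs]
 * after each failed attempt. Returns true as soon as the condition holds.
 */
fun pollUntil(times: Int, delayMs: Long, condition: () -> Boolean): Boolean {
    repeat(times) {
        if (condition()) return true
        Thread.sleep(delayMs)
    }
    return false
}

/**
 * Hypothetical two-phase wait. [isSyncing] stands in for whatever signal the
 * app exposes (assumed here, not taken from the codebase).
 * Phase 1 waits for the sync to start, phase 2 waits for it to finish.
 */
fun waitForRestartAndSync(
    isSyncing: () -> Boolean,
    times: Int = 60,
    delayMs: Long = 500
): Boolean {
    val started = pollUntil(times, delayMs) { isSyncing() }
    if (!started) return false
    return pollUntil(times, delayMs) { !isSyncing() }
}
```

This still races if the whole sync happens between two polls, or before polling begins, so it would narrow the window rather than replace a properly registered idling resource.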

ouchadam · 2022-05-13T09:55:21Z

out of interest, do these tests pass locally but fail on the CI?

ouchadam

LGTM 👍 small comment about potentially reducing the 30 second wait but not a blocker (would prefer the CI and local to be passing)

michaelkaye · 2022-05-13T12:02:27Z

> out of interest, do these tests pass locally but fail on the CI?

They fail for me locally (on a linux machine) and on GHA CI, and also when using firebase test lab and a remote synapse server to run the UI test code.

Fundamentally changed what the PR does.

michaelkaye · 2022-05-16T15:03:44Z

With the above changes (recently rebased against develop) I'm now getting zero test fails locally (and I've generally been able to replicate all CI test fails locally).

I'm a little concerned that 8950aa3 is going to be slowing down the tests generally because of lots of initializations, but if we want the separation I'm not sure how else to do it.

michaelkaye · 2022-05-16T16:05:22Z

Specifically asking for review from @bmarty for 8950aa3, related to a recent change,

and @BillCarsonFr to confirm whether a19c1d6, which changes the actual logic of a crypto test, is right.

bmarty · 2022-05-17T07:54:50Z

> specifically asking for review from @bmarty for 8950aa3 related to a recent change,

Having double checked, I do not see what recent change may require this change. But if it fixes the tests, it's OK for me.

Asking for @ganfra's advice.

> and @BillCarsonFr to confirm whether a19c1d6 which changes the actual logic of a crypto test is right.

I do not want to answer in place of @BillCarsonFr, but the change is now matching the message in the checks.

bmarty

Small remarks

bmarty · 2022-05-17T07:59:03Z

...x-sdk-android/src/androidTest/java/org/matrix/android/sdk/internal/crypto/CryptoStoreTest.kt

@@ -37,6 +39,8 @@ private const val DUMMY_DEVICE_KEY = "DeviceKey"
 @RunWith(AndroidJUnit4::class)
 class CryptoStoreTest : InstrumentedTest {

+    @get:Rule var rule = RetryTestRule(3)


This could be val instead of var (= immutable). Applicable to all the occurrences in this PR.

👍 Have updated these.
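The implementation of `RetryTestRule` is not shown in this thread. As a rough idea of what "retry up to 3 times" could mean, here is a minimal sketch of the retry loop on its own, in plain Kotlin with no JUnit dependency. `runWithRetries` is a hypothetical name used only for illustration. In JUnit 4, a loop like this would typically live inside the `Statement` returned from a `TestRule`'s `apply` method. Since the rule object is never reassigned after construction, `val` is sufficient for the property.

```kotlin
/**
 * Hypothetical helper: runs [body] up to [maxAttempts] times and returns as soon
 * as one attempt completes without throwing. If every attempt throws, the
 * failure from the last attempt is rethrown.
 */
fun runWithRetries(maxAttempts: Int, body: (attempt: Int) -> Unit) {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var lastFailure: Throwable? = null
    for (attempt in 1..maxAttempts) {
        try {
            body(attempt)
            return
        } catch (t: Throwable) {
            // Throwable rather than Exception, so assertion errors are retried too.
            lastFailure = t
            println("Attempt $attempt of $maxAttempts failed: ${t.message}")
        }
    }
    throw lastFailure!!
}
```

One trade-off of this approach is that a retry can hide a genuinely intermittent bug, so logging each failed attempt (as above) keeps the flakiness visible in the CI output.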

BillCarsonFr · 2022-05-17T08:02:46Z

> > specifically asking for review from @bmarty for 8950aa3 related to a recent change,

> Having double checked, I do not see what recent change may require this change. But if it fixes the tests, it's OK for me.

> Asking for @ganfra's advice.

> > and @BillCarsonFr to confirm whether a19c1d6 which changes the actual logic of a crypto test is right.

> I do not want to answer in place of @BillCarsonFr, but the change is now matching the message in the checks.

Yes that is correct. Wonder why it was pushed like that :/

michaelkaye · 2022-05-17T11:21:55Z

> > specifically asking for review from @bmarty for 8950aa3 related to a recent change,

> Having double checked, I do not see what recent change may require this change. But if it fixes the tests, it's OK for me.

> Asking for @ganfra's advice.

It's the change merged here:

https://github.com/vector-im/element-android/pull/5887/files#diff-7ab56d26588e03ae923b3a9b5a37b0a3e6bdb1c7d560104072426c0e0361118bR68

that will create multiple TestMatrix instances; each eventually does a WorkManager.initialize() further on in the logic:

https://github.com/vector-im/element-android/pull/5887/files#diff-2a024565d457bef75c1870ebbf3454ed8f6f134bdd6a28cf29c71910e0f2697cR69

Which is causing the errors.
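The actual change made in 8950aa3 is not reproduced in this thread. One common way to tolerate several instances of a test entry point, each of which triggers a one-time initialization, is to put a process-wide guard in front of that call. The sketch below illustrates that pattern only. `FakeWorkManager`, `FakeTestMatrix` and `InitGuard` are hypothetical stand-ins, and the sketch assumes that a second initialization is rejected with an `IllegalStateException`.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

/** Hypothetical stand-in for a component that rejects a second initialization. */
object FakeWorkManager {
    private val initialized = AtomicBoolean(false)
    var initCount = 0
        private set

    fun initialize() {
        // Throws IllegalStateException if initialize() has already run (assumed behaviour).
        check(initialized.compareAndSet(false, true)) { "already initialized" }
        initCount++
    }
}

/** Process-wide guard: only the first caller triggers the real initialization. */
object InitGuard {
    private val done = AtomicBoolean(false)

    fun ensureInitialized() {
        if (done.compareAndSet(false, true)) {
            FakeWorkManager.initialize()
        }
    }
}

/** Hypothetical stand-in for a test entry point that may be created many times per run. */
class FakeTestMatrix {
    init {
        InitGuard.ensureInitialized()
    }
}
```

With the guard in place, creating several `FakeTestMatrix` instances results in a single `initialize()` call. The cost is shared state between tests, which is the separation concern raised earlier in the conversation.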

bmarty · 2022-05-17T15:44:09Z

I see, thanks @michaelkaye for the explanation!

github-actions · 2022-05-18T09:19:02Z

Matrix SDK

Integration Tests Results:

[org.matrix.android.sdk.session]
= passed=18 failures=2 errors=0 skipped=3
[org.matrix.android.sdk.account]
= passed=3 failures=0 errors=0 skipped=2
[org.matrix.android.sdk.internal]
= passed=53 failures=4 errors=0 skipped=1
[org.matrix.android.sdk.ordering]
= passed=16 failures=0 errors=0 skipped=0
[org.matrix.android.sdk.PermalinkParserTest]
= passed=2 failures=0 errors=0 skipped=0

bmarty requested a review from BillCarsonFr May 11, 2022 15:37

michaelkaye marked this pull request as ready for review May 12, 2022 08:56

michaelkaye commented May 13, 2022

ouchadam previously approved these changes May 13, 2022

michaelkaye mentioned this pull request May 13, 2022

Feature/bca/fix 5906 #5939

Merged

12 tasks

michaelkaye changed the title ~~@Ignore all tests currently failing in CI~~ Improve test handling in CI May 13, 2022

michaelkaye added 7 commits May 16, 2022 15:59

@ignore all tests currently failing in CI

e06682d

changelog.d

012b20f

Fix threading UI test failure by adding a sleep 30s.

010be91

Crypto tests are failing due to slow initialSync. Increase timeout by…

78140af

… 60s.

Increase timeout. Log timeout.

70682b4

Revert "@ignore all tests currently failing in CI"

fa26e2a

This reverts commit 8d234b4.

Fix linting error.

4ced6ca

michaelkaye force-pushed the michaelk/skip_tests_failing_on_ci branch from 18496f0 to 4ced6ca Compare May 16, 2022 15:00

michaelkaye added 3 commits May 16, 2022 16:01

Address repeated initialization of WorkManagerImpl in #5887

8950aa3

Rather than ignore them, put tests on a retry loop.

096cf92

I'm reliably getting passing tests locally with just these ignores.

Make test consistent with assert message.

a19c1d6

Lint fixes

1f89cfb

michaelkaye force-pushed the michaelk/skip_tests_failing_on_ci branch from 419824d to 1f89cfb Compare May 16, 2022 15:55

michaelkaye requested a review from bmarty May 16, 2022 16:01

bmarty reviewed May 17, 2022

Correct var -> val for @get:Rules

868c33a

bmarty approved these changes May 17, 2022

michaelkaye merged commit f730378 into develop May 18, 2022

michaelkaye deleted the michaelk/skip_tests_failing_on_ci branch May 18, 2022 08:51
