Improve test handling in CI #6025
TODO: after merge, create an issue to track these have been
// Which is convoluted especially as it involves the app state refreshing
// so; in order to make this be more stable
// I hereby cheat and write:
Thread.sleep(30_000)
To be clear: I feel bad about this, but I experimented with various options for ~1 day and none were more likely to work than this. Handling the restart requirement just makes this hard to test.
this seems fair, it sounds like we're missing registering the sync as an IdlingResource in order for espresso to be able to wait for the sync to finish
there's a helper for withRetry if we wanted to poll for the home appearing instead of a blanket 30 second sleep (assuming the original check wasn't providing false positives)
fun waitForHome() {
    // each attempt has a 500ms delay
    withRetry(times = 60) {
        // ... original logic
    }
}
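For readers unfamiliar with the polling approach suggested above, here is a minimal sketch of what such a retry helper could look like. `retryUntilSuccess` is a hypothetical stand-in written for illustration; it is not the project's `withRetry`, whose actual signature and behaviour may differ. The 500ms default delay mirrors the figure quoted in the comment.

```kotlin
// Hypothetical helper for illustration only; NOT the project's withRetry.
// Runs `action` up to `times` times, sleeping `delayMs` between failed attempts.
fun <T> retryUntilSuccess(times: Int, delayMs: Long = 500, action: () -> T): T {
    require(times > 0) { "times must be positive" }
    var lastError: Throwable? = null
    repeat(times) { attempt ->
        try {
            // Non-local return: exits retryUntilSuccess on the first success.
            return action()
        } catch (e: Throwable) {
            // Throwable is caught on purpose: test assertion failures are
            // usually AssertionError, which is not an Exception.
            lastError = e
            if (attempt < times - 1) Thread.sleep(delayMs)
        }
    }
    throw AssertionError("Gave up after $times attempts", lastError)
}

fun main() {
    var calls = 0
    // Simulated check that only starts passing on the third attempt.
    val result = retryUntilSuccess(times = 5, delayMs = 10) {
        calls++
        check(calls >= 3) { "home not visible yet" }
        "home"
    }
    println("$result after $calls attempts")
}
```

With `times = 60` and a 500ms delay, the worst case is roughly the same 30 seconds as the blanket sleep, but a passing check returns as soon as it succeeds.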
The problem is that waitForHome() is using an idling resource on initial sync; but it's idle at two points: Immediately (+/- thread racing) after we return from getting espresso to click the toggle, and then it's busy for 10-15s until after the app has restarted and done the initial sync.
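One way to picture the problem described above: waiting for "idle" alone can succeed too early, so a wait would have to see the resource go busy first and only then wait for it to become idle again. The sketch below shows that two-phase wait in plain Kotlin. `awaitBusyThenIdle`, `pollUntil`, and the `isIdle` callback are hypothetical names, not part of Espresso or of this PR, and whether the project's idling resource can be observed this way across an app restart is an assumption.

```kotlin
// Hypothetical two-phase wait, for illustration only.
// Returns true only if the resource was seen busy and then idle again.
fun awaitBusyThenIdle(
    isIdle: () -> Boolean,
    busyTimeoutMs: Long,
    idleTimeoutMs: Long,
    pollMs: Long = 100
): Boolean {
    // Phase 1: wait for the busy state, so the initial idle period
    // right after the click is not mistaken for "sync finished".
    if (!pollUntil(busyTimeoutMs, pollMs) { !isIdle() }) return false
    // Phase 2: wait for the resource to report idle again.
    return pollUntil(idleTimeoutMs, pollMs, isIdle)
}

// Polls `condition` every `pollMs` until it holds or `timeoutMs` elapses.
fun pollUntil(timeoutMs: Long, pollMs: Long, condition: () -> Boolean): Boolean {
    val deadline = System.nanoTime() + timeoutMs * 1_000_000
    while (true) {
        if (condition()) return true
        if (System.nanoTime() >= deadline) return false
        Thread.sleep(pollMs)
    }
}
```

A limitation of this sketch: if the busy period starts and ends between two polls, phase 1 times out. With the 10-15s busy window mentioned above that is unlikely, but it is still a race.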
out of interest, do these tests pass locally but fail on the CI?
LGTM 👍 small comment about potentially reducing the 30 second wait but not a blocker (would prefer the CI and local to be passing)
They fail for me locally (on a Linux machine) and on GHA CI, and also when using Firebase Test Lab and a remote Synapse server to run the UI test code.
Fundamentally changed what PR does.
This reverts commit 8d234b4.
18496f0 to 4ced6ca
I'm reliably getting passing tests locally with just these ignores.
With the above changes (recently rebased against develop) I'm now getting zero test fails locally (and I've generally been able to replicate all CI test fails locally). I'm a little concerned that 8950aa3 is going to be slowing down the tests generally because of lots of initializations, but if we want the separation I'm not sure how else to do it.
419824d to 1f89cfb
Specifically asking for review from @bmarty for 8950aa3, related to a recent change, and from @BillCarsonFr to confirm whether a19c1d6, which changes the actual logic of a crypto test, is right.
Having double checked, I do not see what recent change may require this change. But if it fixes the tests, it's OK for me. Asking for @ganfra's advice.
I do not want to answer in place of @BillCarsonFr, but the change is now matching the message in the checks.
Small remarks
@@ -37,6 +39,8 @@ private const val DUMMY_DEVICE_KEY = "DeviceKey"
@RunWith(AndroidJUnit4::class)
class CryptoStoreTest : InstrumentedTest {

    @get:Rule var rule = RetryTestRule(3)
This could be val instead of var (= immutable). Applicable to all the occurrences in this PR.
👍 Have updated these.
Yes that is correct. Wonder why it was pushed like that :/
It's the change merged here: that will create multiple TestMatrix instances; each eventually does a WorkManager.initialize() further on in the logic, which is causing the errors.
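To illustrate the failure mode described above, here is a small self-contained sketch of a process-wide component that rejects a second initialization, plus one possible guard that runs the initialization at most once however many instances are created. `FakeWorkScheduler` and `FakeTestMatrix` are hypothetical stand-ins, not the real WorkManager or TestMatrix classes, and the guard is not necessarily what 8950aa3 does.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for a component that may only be initialized
// once per process and throws on a second attempt.
object FakeWorkScheduler {
    private val initialized = AtomicBoolean(false)

    fun initialize() {
        check(initialized.compareAndSet(false, true)) { "already initialized" }
    }
}

// Hypothetical stand-in for a test fixture that is created many times.
class FakeTestMatrix {
    init {
        ensureSchedulerInitialized()
    }

    companion object {
        // `lazy` is thread-safe by default, so the initializer body
        // runs at most once even when many instances are constructed.
        private val schedulerInit: Unit by lazy { FakeWorkScheduler.initialize() }

        fun ensureSchedulerInitialized() = schedulerInit
    }
}

fun main() {
    // Creating several fixtures does not re-run the initialization.
    repeat(3) { FakeTestMatrix() }
    println("created 3 fixtures without a double-initialization error")
}
```

Without the guard, each constructor calling `FakeWorkScheduler.initialize()` directly would throw on the second instance.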
I see, thanks @michaelkaye for the explanation!
Matrix SDK Integration Tests Results:
Type of change
Content
All tests I found in the most recent test runs that are failing have been marked @Ignore.
Having merged other PRs and discussed, increasing timeouts seems to make the tests work (reflecting that everything underneath is simply slow). Have removed the @Ignore annotations for now and replaced them with increased timeouts and retries.
Additional test failures that have shown up as part of this have been fixed afterwards.
This should allow new failures of integration tests in CI to be instantly visible when the PR run fails.
Motivation and context
As discussed in a retro a few weeks ago, we want to know when the longer tests fail due to new code. As they're slow to build/run and need macOS runners, we're not running them on every commit, but we will verify after each PR is merged, as a reasonable intermediate between "too much" and "not enough".
Checklist