[Security Solution] Improving test automation reliability #173508
Pinging @elastic/security-solution (Team: SecuritySolution)
Thanks for putting this ticket together, @MadameSheema! I'm glad we're taking steps towards making further improvements to our flaky test process. I have a couple of questions/comments.
@peluja1012 many thanks for reviewing the ticket so thoroughly; here are my insights:
You are completely right about that; we could still potentially be causing problems. The proposed solution in the main ticket is a quick win, since it is easy to implement and would save us time. I don't know how complex automatically skipping tests would be; that is something we need to research. Another thing we need to investigate, or at least take into consideration, is the pros and cons. For instance, sometimes failures are due to problems unrelated to us, or to network issues that happen just once, and skipping the test right away would reduce our test coverage. But I really like the idea of automating this process, and we can research whether we can make it possible without compromising our coverage :)
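For illustration only (this is not an existing Kibana mechanism): one way automatic skipping could stay visible and reversible is to gate tests on a maintained skip manifest, so that every skip points at a tracking issue. The `skipped_tests.json` file and `maybeIt` helper below are hypothetical:

```ts
// Hypothetical sketch, not the actual Kibana mechanism: gate tests on a
// maintained "known flaky" manifest so every skip stays visible and points
// at a tracking issue. `skipped_tests.json` is an assumed file that maps
// test titles to issue URLs, e.g. { "creates a rule": "https://github.com/..." }.
import skipped from './skipped_tests.json'; // needs "resolveJsonModule": true

const skipList: Record<string, string> = skipped;

export function maybeIt(title: string, fn: () => void): void {
  const trackingIssue = skipList[title];
  if (trackingIssue) {
    // Register the test as skipped, keeping a pointer to the tracking issue
    // so the coverage loss is visible in the run output.
    it.skip(`${title} (skipped, see ${trackingIssue})`, fn);
  } else {
    it(title, fn);
  }
}
```

A spec would then declare `maybeIt('creates a rule', () => { ... })`, and restoring coverage is just a matter of removing the manifest entry. The trade-off called out above still applies: a one-off network failure should probably have to cross some retry threshold before a test ever lands in the manifest.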
I've researched this and it seems to be simpler than we thought. I'll ping you on Slack and provide the details.
It is functionality that Patryk integrated into Cypress which checks which tests have been modified or introduced and executes them several times in a row. That would help us get rid of the flaky test suite runner and make the flakiness check part of the PR process itself. This is a great idea that we need to re-implement, but there is some work to do first. For instance, we need to make sure that our tests can be executed several times in the same instance without producing false failures, and that the process will not impact teams outside Security Solution.
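Conceptually, that burn-in could look something like the sketch below, using the Cypress Module API (an illustration only, not Patryk's implementation; the `main` base branch, the `*.cy.ts` suffix, and the burn count of 5 are assumptions):

```ts
// Hedged sketch: rerun changed Cypress specs several times in a row via the
// Cypress Module API, failing fast on the first failed run.
import { execSync } from 'child_process';
import cypress from 'cypress';

const BURN_COUNT = 5; // assumed number of repetitions per changed spec

async function main(): Promise<void> {
  // Spec files added or modified on this branch relative to main.
  const changedSpecs = execSync('git diff --name-only main...HEAD')
    .toString()
    .split('\n')
    .filter((file) => file.endsWith('.cy.ts'));

  for (const spec of changedSpecs) {
    for (let run = 1; run <= BURN_COUNT; run++) {
      const result = await cypress.run({ spec });
      if (result.status === 'failed' || result.totalFailed > 0) {
        throw new Error(`${spec} failed on run ${run}/${BURN_COUNT}: flaky or broken`);
      }
    }
  }
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

Failing fast on the first failure keeps PR feedback quick; as noted above, a real version would also have to prove that repeated runs in the same instance do not leak state into false failures.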
Awesome writeup! I'm particularly looking forward to "Provid[ing] an extended guide of automation best practices focused on flakiness" coming to fruition. A couple of clarifying questions:
Is "rebasing" strictly required here? Could we say more broadly:
Is there a way to execute a single test file using the flaky test runner, or do team members only have the option to run tests at the "config file" (test group) level? I've gotten some team feedback that it would be nice to have more granularity within the flaky test runner, by being able to run specific test files as opposed to groups. Thank you!
Absolutely! Thanks for the suggestion.
Yes, it is. It is not 100% ideal because it requires a bit of manual intervention, but it is feasible. For your team, you must modify line 10 for ESS tests and line 29 for serverless tests of the
You just need to remember to revert the changes before merging the code into main :)
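As a purely hypothetical illustration of the kind of edit involved (the real file and line numbers are the ones mentioned above; the paths below are made up), pointing a Cypress `specPattern` at a single spec instead of a whole group might look like:

```ts
// Hypothetical Cypress config fragment; the actual file to edit is the one
// referenced above. Narrowing the glob to one spec yields per-file runs.
import { defineConfig } from 'cypress';

export default defineConfig({
  e2e: {
    // Before: the whole test group runs through the flaky test runner.
    // specPattern: 'cypress/e2e/**/*.cy.ts',
    // After: only the one spec under investigation runs (hypothetical path).
    specPattern: 'cypress/e2e/detection_rules/rule_creation.cy.ts',
  },
});
```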
If you are running 200x tests, please make sure that the test groups are small enough so as not to consume too many resources or incur too much cost. A good target time would be under 1h. For example, test group
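As a rough, back-of-the-envelope illustration (all numbers assumed): 200 runs of a group that takes 3 minutes per run cost 600 machine-minutes; if the runner spreads those runs across 25 parallel agents, that is about (200 / 25) × 3 = 24 minutes of wall-clock time, comfortably under the 1h target, whereas a group taking 15 minutes per run would need roughly 2 hours.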
@MadameSheema I think that, out of all the tickets in the "Tasks" list, by far the most valuable one for us short-term would be #174892, because it would save ~5-10 minutes every single day for the people who monitor flaky tests manually, and it would allow us to delegate the test triage process to any team member. Probably the 2nd one by priority would be #174902, because it would guarantee that any new or changed Cypress tests are run through a sort of "flaky test runner" before they get merged. Currently, PR authors need to remember to do that manually, and PR reviewers need to remember to check that it has been done.
Update
This ticket was created to minimise the blocking of other teams' PRs because of our tests. That has already been achieved by #173815.
There is still work pending to minimise the flakiness in our automation and to make our tests more reliable.
From now on, this ticket is going to track all the efforts around that work.
Tasks
/skip comments on failed test issues #173430
Useful information
Flaky test vs blocker test
A flaky test is an automated test that behaves inconsistently: sometimes it passes, sometimes it fails. In our context, a flaky test is an automated test that has failed at least one time in:
- the on-merge pipeline
- a Flaky test suite runner execution

without blocking the merge of a PR and without breaking the on-merge pipeline.
A blocker test is an automated test that is consistently failing in:
- the on-merge pipeline
- a Flaky test suite runner execution

blocking the merge of a PR or breaking the on-merge pipeline.
In both cases, we are talking about failing tests.
How are we tracking failed tests?
- When a test fails for the first time on the on-merge pipeline, a GitHub ticket is automatically created in the Kibana repository, labelled with failed-test and the owner of the test.
- When a test that is already reported on a GitHub ticket fails again, the existing ticket is updated with the new failure.
- When the test has failed several times, the kibana-operations team skips it.