Run Fleet's 'upgrade agent' tests in nightly builds (keep them skipped in PR CI) #652
Comments
Can we link the blocker issue that forces us to skip these tests?
Thinking about this issue more thoroughly: I'd prefer to define what to run in the PR build instead, and run all tests in the nightly build. For that, I'd like to engage with the entire Elastic Agent team (devs and PM) so that we all define the priorities for what to run together. IMHO we must:
After this prioritisation session we should have a clear understanding of what is tested, when, and most importantly why. As a benefit of this initiative, the PR jobs will have shorter build times, improving the time to receive feedback after a PR is sent (builds are starting to take 30-40 minutes, including a lot of wait-for-results situations; sometimes it takes 5-10 minutes to receive agent events). @ph @kseniia-kolpakova I'm open to any new ideas on this.
I have a few thoughts. I like prioritization, but 'prioritization' here overloads the term in a bad way: the upgrade tests are extremely important, we just can't easily test them during a PR. I'd rather have a '@skip-pr' tag for any tests that should be skipped during PR CI, regardless of their priority. And we can prioritize them too, for when we need to reduce scope. We'll need to update the linting rules to allow more tags, FYI. More discourse: I think we should attempt to keep as many tests as possible running during PR CI in the short term. We don't yet have so many tests that I'd recommend tackling the problem by reducing the number of tests executed. One theory is that if a test is important enough to be written, we should keep it running with *some* expected value (even if we adjust it over time). About this one test: the 'upgrade' test is the only one we've found that can't easily be written to run against PRs, and it may be the only one for a long time. I think we'll appreciate all the test work more if it informs developers precisely, when they push commits, as to whether they broke a test. It is the absolute best way to engage them to fix / update tests, and to scale the team. Having said that, I don't mind using P1/P2/P3 notation; it has advantages for sure, and pulling in the devs / PM / leads is a great practice.
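A rough sketch of how a `@skip-pr` tag could sit on a scenario. The feature, scenario, and step names here are hypothetical, not taken from the actual suite, and the comment assumes the runner supports Cucumber-style tag expressions:

```gherkin
# Hypothetical feature file; scenario and step names are not from the suite.
Feature: Fleet Mode Agent

  # A scenario tagged @skip-pr would be filtered out of PR CI with a tag
  # expression such as "~@skip-pr", but would still run in scheduled builds.
  @skip-pr
  Scenario: Upgrading the installed agent
    Given an agent is deployed to Fleet
    When the agent is upgraded to the latest version
    Then the agent version is updated in Fleet
```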
Related to the implementation details, adding a …
I would like to raise that, as @EricDavisX said, we want to keep as many tests as possible. Looking at prioritization and tagging, are we effectively looking at the right problem? Given @mdelapenya's comment concerning the wait-for-results situations, should we instead focus on reducing the overall run time? As the suite grows we might want a way to select by priority, but I am not sure we need to cross that bridge yet. @mdelapenya, can you provide some stats concerning the run time so we can take a look at how we could make the tests faster?
Yes, I'm building a dashboard in Kibana for test times, using our jenkins-stats cluster. Will share results soon. SPOILER: I'm not a Kibana user.
We got tests passing today, and I see some merges, like the one above. Can we re-sync and see what else needs to be done? If nothing, we can make notes in the e2e-testing docs to make sure devs know the expectations and how to work with the system.
This task has been accomplished. Closing |
These tests are currently skipped, but we'd like to run them in our nightly builds.
For that:
- Add the `@nightly` annotation to the scenarios, removing `@skip`
- Exclude `@nightly` from PRs
- Include the `@nightly` annotation in the scheduled, nightly jobs (a sketch follows below)

Thoughts? @elastic/observablt-robots @ph @EricDavisX @michalpristas
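For concreteness, a minimal sketch of the annotation swap on a feature file, assuming the runner supports Cucumber-style tag expressions; the feature, scenario, and step names are illustrative, not the suite's actual ones:

```gherkin
Feature: Fleet Mode Agent

  # Before: @skip kept this scenario out of every build.
  # After: @nightly lets PR CI exclude it (tag expression "~@nightly"),
  # while the scheduled nightly job drops that exclusion and runs it.
  @nightly
  Scenario: Upgrading the installed agent
    Given an agent is deployed to Fleet
    When the agent is upgraded to the latest version
    Then the agent version is updated in Fleet
```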