Speed up CI test workflows in GHA #2798

nibix · 2023-05-28T18:50:31Z

The CI job build can take more than one hour. As low turnaround times are desirable for speedy development, this should be reduced.

https://github.com/opensearch-project/security/actions/runs/4983823949/jobs/8921355684

nibix · 2023-05-28T18:52:00Z

@davidlago @peternied @scrawfor99 @cwperks FYI. We are already working on this. I filed this to document and discuss our findings and suggestions for improvement.

nibix · 2023-05-30T14:30:07Z

We did some benchmarks and tests and would like to share our first results:

Gradle task build profile

We profiled the Gradle task that is called from the CI:

gradlew -x integrationTest -x spotlessCheck -x checkstyleMain -x checkstyleTest -x spotbugsMain build test

task	time
total	46m33.49s
:checkstyleIntegrationTest	15m28.70s
:spotbugsIntegrationTest	15m17.72s
:test	13m10.24s
:opensslTest	2m4.77s
:compileJava	12.681s
:compileTestJava	9.541s
:compileIntegrationTestJava	4.068s
:jacocoTestReport	2.168s
:bundlePlugin	2.030s

What is remarkable is that the profile contains three major parts which take quite a while:

test
spotbugsIntegrationTest
checkstyleIntegrationTest

I would have expected only test to take this long. Perhaps exclusions for spotbugsIntegrationTest and checkstyleIntegrationTest are missing from the command?
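To put numbers on this, here is a small Python sketch that converts the durations from the profile above into seconds and computes how much of the total the two unexpected tasks account for. The helper name `parse_duration` is hypothetical and introduced only for illustration, and the savings estimate assumes the tasks run one after another, which the profile suggests but does not confirm.

```python
import re


def parse_duration(text: str) -> float:
    """Convert a duration such as '15m28.70s' or '12.681s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


# Durations copied from the profile table above.
PROFILE = {
    "total": "46m33.49s",
    ":checkstyleIntegrationTest": "15m28.70s",
    ":spotbugsIntegrationTest": "15m17.72s",
    ":test": "13m10.24s",
    ":opensslTest": "2m4.77s",
}

seconds = {task: parse_duration(value) for task, value in PROFILE.items()}
total = seconds["total"]

# The two tasks that were not expected to run in this CI job.
unexpected = (
    seconds[":checkstyleIntegrationTest"] + seconds[":spotbugsIntegrationTest"]
)
share = unexpected / total

# Assumption: tasks run sequentially, so skipping them saves their full duration.
remaining = total - unexpected

print(f"unexpected tasks: {unexpected:.2f}s ({share:.1%} of total)")
print(f"estimated build time without them: {remaining / 60:.1f} min")
```

With the numbers from the table, the two tasks account for roughly two thirds of the total, so excluding them (for example with additional `-x` flags, if that matches the intent of the CI job) would be the largest single saving.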

General test resource consumption

We also profiled the individual tests. The general problem is that many tests spin up embedded clusters and thus take an enormous amount of time.

A short term solution would be to parallelize the tests in maybe 4 CI jobs. This should be achievable with additional Gradle tasks. We are working on a prototype for this.
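One way such a split could work is sketched below in Python: given per-class durations, a greedy longest-first assignment distributes test classes over a fixed number of CI jobs so that the jobs finish at roughly the same time. This is only an illustration of the idea, not the mechanism of the prototype mentioned above; the function name `split_into_shards`, the class names, and the durations are all hypothetical.

```python
import heapq


def split_into_shards(
    durations: dict[str, float], shard_count: int = 4
) -> list[list[str]]:
    """Greedily assign test classes to shards, longest tests first.

    Each test class goes to the shard with the smallest accumulated runtime,
    which keeps the shards reasonably balanced.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")

    # Min-heap of (accumulated seconds, shard index).
    heap = [(0.0, index) for index in range(shard_count)]
    heapq.heapify(heap)
    shards: list[list[str]] = [[] for _ in range(shard_count)]

    # Sort by descending duration; the name breaks ties deterministically.
    ordered = sorted(durations.items(), key=lambda item: (-item[1], item[0]))
    for name, secs in ordered:
        load, index = heapq.heappop(heap)
        shards[index].append(name)
        heapq.heappush(heap, (load + secs, index))
    return shards


# Hypothetical test classes with made-up durations in seconds.
example_durations = {
    "ClusterStartupTests": 300.0,
    "RoleMappingTests": 200.0,
    "AuditLogTests": 200.0,
    "TlsConfigTests": 100.0,
    "RestApiTests": 100.0,
    "ConfigParserTests": 100.0,
}

example_shards = split_into_shards(example_durations, shard_count=4)
for number, shard in enumerate(example_shards, start=1):
    runtime = sum(example_durations[name] for name in shard)
    print(f"job {number}: {runtime:.0f}s {shard}")
```

Each resulting list could then be handed to a separate CI job, for example as a test filter. When no timing data is available, a simpler alternative is to assign classes by name (alphabetical ranges or a hash), at the cost of less even job durations.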

The mid- to long-term solution would be to redo the tests with a stronger focus on performance. This work has already been started in the integration-test project. With additional lightweight unit tests, it would be possible to completely replace the resource-hungry old tests and thus speed up the CI.

Windows tests

At the moment, the Java-based integration tests are executed in both a Linux and a Windows environment. The Windows tests take about twice as long because the Windows test runners are less powerful than the Linux test runners.

I would recommend reviewing the test strategy here: Does it make sense to run the complete Java-based test suites under both operating systems? I think that for Windows testing it would be sufficient to select a relatively small number of tests that either exercise a large section of the stack or are likely to hit OS-specific issues.

peternied · 2023-06-13T21:58:08Z

Does it make sense to run the complete Java-based test suites under both operating systems?

We have caught several platform-specific issues by running the whole suite, so we should not alter this strategy. When we officially support macOS releases, we will add CI tests for that platform as well.

peternied · 2023-06-13T21:59:00Z

A short term solution would be to parallelize the tests in maybe 4 CI jobs. This should be achievable with additional Gradle tasks. We are working on a prototype for this.

What mechanism are you using to subdivide the test runs? Are there multiple options? I'd be curious how flexible they are.

nibix · 2023-06-16T06:33:44Z

@peternied

We have caught several platform-specific issues by running the whole suite, so we should not alter this strategy. When we officially support macOS releases, we will add CI tests for that platform as well.

Just out of curiosity: Do you have an example of such a platform-specific bug? I would like to understand the nature of these bugs to be able to judge this better.

github-actions bot added the untriaged label May 28, 2023

davidlago added the triaged label and removed the untriaged label May 31, 2023

davidlago assigned nibix May 31, 2023

pawel-gudel-eliatra mentioned this issue Jun 14, 2023 in [Enhancement] Parallel test jobs for CI #2861 (merged)

davidlago closed this as completed Jul 18, 2023
