Speed up CI test workflows in GHA #2798

Closed
nibix opened this issue May 28, 2023 · 5 comments
Labels
triaged Issues labeled as 'Triaged' have been reviewed and are deemed actionable.

Comments

@nibix
Collaborator

nibix commented May 28, 2023

The CI job build can take more than one hour. As low turnaround times are desirable for speedy development, this should be reduced.

https://github.com/opensearch-project/security/actions/runs/4983823949/jobs/8921355684

github-actions bot added the untriaged label on May 28, 2023
@nibix
Collaborator Author

nibix commented May 28, 2023

@davidlago @peternied @scrawfor99 @cwperks FYI. We are already working on this. I filed this to document and discuss our findings and suggestions for improvement.

@nibix
Collaborator Author

nibix commented May 30, 2023

We did some benchmarks and tests and would like to share our first results:

Gradle task build profile

We profiled the Gradle invocation that is called from the CI:

gradlew -x integrationTest -x spotlessCheck -x checkstyleMain -x checkstyleTest -x spotbugsMain build test 
task                          time
total                         46m33.49s
:checkstyleIntegrationTest    15m28.70s
:spotbugsIntegrationTest      15m17.72s
:test                         13m10.24s
:opensslTest                  2m4.77s
:compileJava                  12.681s
:compileTestJava              9.541s
:compileIntegrationTestJava   4.068s
:jacocoTestReport             2.168s
:bundlePlugin                 2.030s
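
To put the numbers above in proportion, the following is a small sketch that converts the reported durations into seconds and computes the share of the total taken by the two static-analysis tasks. The helper name `to_seconds` is made up for this illustration; the durations are copied from the profile.

```shell
# Convert a duration such as "15m28.70s" into seconds (hypothetical helper).
to_seconds() {
  echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.2f\n", $1 * 60 + $2 }'
}

# Values copied from the profile above.
total=$(to_seconds 46m33.49s)
checkstyle=$(to_seconds 15m28.70s)
spotbugs=$(to_seconds 15m17.72s)

# Print the combined time of both tasks and their share of the total.
awk -v t="$total" -v c="$checkstyle" -v s="$spotbugs" \
  'BEGIN { printf "static analysis: %.2fs of %.2fs (%.0f%%)\n", c + s, t, 100 * (c + s) / t }'
```

Together, the two tasks account for roughly two thirds of the measured build time.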

What is remarkable is that three tasks dominate the profile:

  • test
  • spotbugsIntegrationTest
  • checkstyleIntegrationTest

I would have expected only test to take this long. Is an exclusion for spotbugsIntegrationTest and checkstyleIntegrationTest possibly missing?
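
If these two tasks are indeed not meant to run in this job, the exclusion list could be extended as sketched below. This is only an assumption about the intended behavior, not a confirmed fix; the task names are taken from the profile above, and the function name `gradle_args` is made up for this illustration. The script only prints the resulting command instead of running it.

```shell
# Build the Gradle argument list, one "-x <task>" per skipped task.
gradle_args() {
  # Exclusions already present in the CI command quoted above.
  skipped="integrationTest spotlessCheck checkstyleMain checkstyleTest spotbugsMain"
  # Assumption: these two checks are not wanted in this job either.
  skipped="$skipped checkstyleIntegrationTest spotbugsIntegrationTest"
  args=""
  for task in $skipped; do
    args="$args -x $task"
  done
  # Strip the leading space and append the requested tasks.
  echo "${args# } build test"
}

# Dry run: print the command for review rather than executing it.
echo "gradlew $(gradle_args)"
```

Whether the integration-test checks should instead run in a separate job is a question the snippet does not answer.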

General test resource consumption

We also profiled the individual tests. The general problem is that many tests spin up embedded clusters and thus require an enormous amount of time.

A short-term solution would be to parallelize the tests across maybe 4 CI jobs. This should be achievable with additional Gradle tasks. We are working on a prototype for this.
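
To illustrate the general idea of splitting a test suite across several jobs, here is a minimal sketch that uses shell-level round-robin sharding. It is not the Gradle-task-based prototype mentioned above, which is not shown in this thread. The function name `shard_tests` and the class names in the example are hypothetical.

```shell
# Print the test classes that belong to one shard.
# The input list is sorted first so that every CI job computes the same split.
# usage: shard_tests <shard_index> <shard_count> < list_of_test_classes
shard_tests() {
  index="$1"
  count="$2"
  sort | awk -v i="$index" -v n="$count" '(NR - 1) % n == i'
}

# Example with hypothetical class names: select shard 1 of 4 (indices start at 0).
printf '%s\n' ATest BTest CTest DTest ETest FTest | shard_tests 1 4
```

Each CI job would call this with its own index and could pass the selected classes to Gradle's `--tests` filter. Round-robin assignment ignores how long each test takes, so shards may end up unevenly loaded; balancing by measured duration would need the per-test profile.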

The mid- to long-term solution would be to redo the tests with a stronger focus on performance. This work has already been started in the integration-test project. With additional lightweight unit tests, it would be possible to completely replace the resource-hungry old tests and thus speed up the CI.

Windows tests

At the moment, the Java-based integration tests are executed in both a Linux and a Windows environment. The Windows tests take about twice as long because the Windows test runners are not as powerful as the Linux test runners.

I would recommend reviewing the test strategy here: Does it make sense to run the complete Java-based test suites under both operating systems? I think that, for Windows testing, it would be sufficient to select a relatively small number of tests which either exercise a large section of the stack or are likely to hit OS-specific issues.

davidlago added the triaged label and removed the untriaged label on May 31, 2023
@peternied
Member

> Does it make sense to run the complete Java-based test suites under both operating systems?

We have caught several platform-specific issues by running the whole suite, so we should not alter this strategy. When we officially support Mac releases, we will add CI tests for them as well.

@peternied
Member

> A short-term solution would be to parallelize the tests across maybe 4 CI jobs. This should be achievable with additional Gradle tasks. We are working on a prototype for this.

What mechanism are you using to subdivide the test runs? Are there multiple options? I'd be curious how flexible they are.

@nibix
Collaborator Author

nibix commented Jun 16, 2023

@peternied

> We have caught several platform-specific issues by running the whole suite, so we should not alter this strategy. When we officially support Mac releases, we will add CI tests for them as well.

Just out of curiosity: Do you have an example of such a platform-specific bug? I would like to understand the nature of these bugs so that I can judge this better.

3 participants