Commit
gitlab-ci: add concurrent jobs in run stage
- This commit splits the run stage (~40 mins) into four smaller jobs.
- Prior to this commit, typical turnaround for a pipeline was ~1 hour, but two consecutive tests of this refactoring finished in 23 minutes.
- The old run stage used all executables in one run script, so it could not start until the pgi executable was ready, even though the gnu executable was ready 10 minutes earlier.
- Breaking the run stage into tests grouped by compiler allows some "tetris" to be played to minimize wait time between jobs.
- Implemented by making four copies of MOM6-examples: three to allow concurrency across the compilers (gnu, intel, pgi), and a fourth for restart tests (gnu only).
- The results are copied into sub-directories under results/ for later comparison; tar files are no longer used for caching output.
- Added "needs:" so jobs can start as soon as their dependency is ready.
- Re-ordered jobs in the .gitlab-ci.yml files so that the slowest compilation (pgi) starts first.

Considerations:
- We can't run two tests in the same directory at the same time because their output would collide. The old CI therefore launched tests of all experiments/configurations concurrently but cycled through each group of tests (compilers, layout, etc.) sequentially, copying the output and reusing the same workspace. Making copies of the workspace is slow, and running more concurrent jobs requires more nodes to be available at once, so four copies was found to be optimal for gaea and the current workload.
- We only have six runners (on the six compilation nodes), which limits the pipeline to six jobs at once. Allowing multiple jobs per runner could remove this limitation but would put more load on the system.
- The restart testing is the slowest section of the run stage, even though it covers only a subset of experiments. Separating the restarts out allows more concurrency. Doing restart tests for more experiments and all compilers would be very expensive.
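
The job layout described above could look roughly like the following .gitlab-ci.yml fragment. This is a minimal sketch, not the actual file from this commit: the stage names, job names, and the build.sh / run_tests.sh scripts are all hypothetical placeholders. Only the "needs:" keyword, the three compilers, the gnu-only restart job, and the results/ sub-directories come from the commit message.

```yaml
# Hypothetical sketch; job names, stage names and scripts are assumptions.
stages:
  - builds
  - run

# Slowest compilation is listed first, following the re-ordering in this commit.
compile-pgi:
  stage: builds
  script:
    - ./build.sh pgi            # placeholder build command

compile-intel:
  stage: builds
  script:
    - ./build.sh intel          # placeholder build command

compile-gnu:
  stage: builds
  script:
    - ./build.sh gnu            # placeholder build command

# Each run job uses its own copy of MOM6-examples, so outputs cannot collide.
# "needs:" lets a run job start as soon as its own compile job has finished,
# without waiting for the rest of the builds stage.
run-gnu:
  stage: run
  needs: ["compile-gnu"]
  script:
    - ./run_tests.sh gnu results/gnu            # placeholder test driver

run-intel:
  stage: run
  needs: ["compile-intel"]
  script:
    - ./run_tests.sh intel results/intel        # placeholder test driver

run-pgi:
  stage: run
  needs: ["compile-pgi"]
  script:
    - ./run_tests.sh pgi results/pgi            # placeholder test driver

# Restart tests are gnu only and run in a fourth copy of the workspace.
run-restarts:
  stage: run
  needs: ["compile-gnu"]
  script:
    - ./run_tests.sh restarts results/restarts  # placeholder test driver
```

With this shape, run-gnu and run-restarts can begin while the pgi build is still compiling, which is the "tetris" effect the commit describes. The four run jobs still count against the six-runner limit noted under Considerations.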