Implementing CI #207
Conversation
runscript: [FV3checkoutStartClusters.py, FV3swStartClusters.py, FV3nhStartClusters.py, FV3hydroStartClusters.py]
steps:
Does each of these run through the cluster spin-up part of the API script? Do they all build something?
They all run through the start-up script and just use a different testcmd.
The first only checks out code; the next three each build a different configuration.
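As a sketch, the four runscripts quoted above could be driven as a single job matrix. The job name, runner label, and `run` command below are hypothetical; the real workflow is the yaml file in this PR, and the scripts themselves live on the Parallelworks resource:

```yaml
jobs:
  shield-ci:                  # hypothetical job name
    runs-on: self-hosted      # the Parallelworks runner
    strategy:
      fail-fast: true
      max-parallel: 1         # checkout and builds run one after the other
      matrix:
        runscript: [FV3checkoutStartClusters.py, FV3swStartClusters.py, FV3nhStartClusters.py, FV3hydroStartClusters.py]
    steps:
      - run: python3 ${{ matrix.runscript }}
```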
name: SOLO SHiELD build
strategy:
  fail-fast: true
  max-parallel: 1
How many runners do you have? You have 4 jobs; do they all need to run one after the other? You could set up 4 runners to run all 4 at the same time.
I only have 1 right now, but I will set up multiple runners so that we can run multiple tests at the same time. For the build step I need them all to build in series (not in parallel).
I believe the builds could be done in parallel if we have the CI system build the FMS library first.
Would there be an issue with the builds occurring simultaneously when they all use the same Build/exec folder? I could alter the build system so that each configuration builds into its own exec folder.
The current build system uses exec/${config}_${compiler}/. It could be altered to use a combination of config, hydro, comp, bit, and compiler.
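A minimal sketch of that naming scheme in shell. The variable names mirror the options mentioned above (config, hydro, comp, bit, compiler), but the values and the script itself are illustrative, not the actual SHiELD_build compile script:

```shell
# Illustrative values; in the real build system these come from
# the compile script's options.
config="solo"
hydro="nh"
comp="prod"
bit="64bit"
compiler="intel"

# Each distinct combination gets its own build directory, so
# simultaneous builds never write into the same exec/ folder.
exec_dir="exec/${config}_${hydro}_${comp}_${bit}_${compiler}"
mkdir -p "${exec_dir}"
echo "${exec_dir}"
```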
I will start making those changes; building in parallel would make everything much faster.
Remember to run the FMS build as a standalone process. Once we bring in the updates to build the few NCEPlibs needed by SHiELD, that would become part of that first step.
I have created PR NOAA-GFDL/SHiELD_build#15 to add scripts that build libFMS and the NCEPlibs. There is also the ability to store these libraries external to SHiELD_build, so that libFMS does not need to be rebuilt with every new CI instance. I have updated the compile so that there are unique exec/ subdirectories for each configuration, letting us run all 3 builds in parallel. I have also added more GitHub runners (3 in total) so we can run builds and tests in parallel. Once this PR from SHiELD_build is merged, I will rerun the CI on this PR so we can see how it works.
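The shape this discussion converged on might look like the following in the workflow. This is a sketch only: the job names, runner label, and the library build script name are hypothetical, while the standard keys (`needs`, `strategy.matrix`) are what let the three builds fan out once the shared libraries exist:

```yaml
jobs:
  build-libs:                 # hypothetical: libFMS (and later NCEPlibs) built once, up front
    runs-on: self-hosted
    steps:
      - run: ./build_libfms.sh    # illustrative script name
  build-shield:
    needs: build-libs         # fan out only after the shared libraries exist
    runs-on: self-hosted
    strategy:
      matrix:                 # no max-parallel: the three configurations build concurrently
        config: [sw, nh, hydro]
    steps:
      - run: python3 FV3${{ matrix.config }}StartClusters.py
```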
@laurenchilutti After this gets merged, submit a PR with code that should fail, to make sure the tests fail when expected.
Tests passed and builds ran in parallel. Future improvements I want to make:
* updated fv_regional_bc.F90 to read levsp from GFS_ctl file (#152)
* Updating a continue statement in fv_nudge (#153)
  * update in fv_nudge to fix 666 label logic
* fix logic errors (#138)
* update to external_ic::remap_scale to skip remapping non-existent IC tracers (#159)
* Fix nudge logic (#157)
  * fix logic errors
  * fix answer changes
* adding back a line that was mistakenly deleted in a previous commit (#166)
* Release 042022 (#184): Merge of updates from GFDL Weather and Climate Dynamics Division (202204):
  * add license header to missing files and fix typo in header
  * updates needed for fv3_gfsphysics to have access to bounded_domain
  * remove obsoleted driver/SHiELD files
  * updating to fix bug where long_name and units attributes were not being captured in the RESTARTS
  * remove unused function fv_diagnostics::max_vorticity_hy1
  * remove avec timer remnants
  * adding ability to specify prefix and directory when reading and writing restarts
  * remove old style namelist read in favor of read from internal character variable
  * Added option for a mean wind
  * The planetary radius and rotation rate are now re-scalable by a namelist parameter (small_earth_scale) instead of using exclusively the hard-coded FMS constant.
  * fv_mapz: Cleanup and added helpful annotations to avoid getting lost so easily
* remove duplicate code and fix lstatus on all grids depending on gfs_data and gfs_data.tile1
* New idealized tests
  * Makes the non-hydrostatic restart variables optional for reads to allow hydrostatic ICs
  * Fix the hydrostatic TE remapping; Add GMAO cubic for TE remapping, which is used if kord_tm=0 and remap_te=.true.
  * Add a TE remapping option (kord_tm=0)
  * Addressing GNU Warnings
  * Add the L75 vertical config from HAFS
* clean up fms_mp_mod and remove mp_bcst
* cherry pick 5193c6b from dev/emc
* Attempt at integrating fixes on top of dev/emc branch. (#173)
* remove outdated GFDL_tools/fv_diag_column.F90, which exists as the result of an improper merge. The correct file is tools/fv_diag_column.F90 (#191)
* various io fixes (#192)
  * fixes io performance issues by making everyone a reader when io_layout=1,1
  * adds capability to use FMS feature to ignore data integrity checksums in restarts
* rename enforce_rst_cksum to ignore_rst_cksum and change the default value for compatibility
* fix index error (#196)
* New notebooks (#190)
  * Moving source files into source directory
  * Added advection notebook
  * Fixed subplot spacing
  * New 3D case notebooks
  * New idealized tests; updated mtn_wave_tests_1km to restore missing graphics
  * first try at held-suarez
  * Updated H-S superrotation notebook
  * New level placement tool in an interactive notebook
  * Minor updates to notebooks; deleted fv_eta binary
* Regional decomposition test fix (when nrows_blend > 0) (#194) (#200)
  * Regional bc blend changes to extend into interior halos and overlap on corners. Still not working for u and v.
  * dyn_core.F90: Fix from Jun Wang to correct sync of u,v
  * fv_regional_bc.F90: add check for nrows_blend > tile size; fix error when nrows_blend=1
  * Conflicts (not taken during cherry-pick): driver/SHiELD/atmosphere.F90, model/fv_sg.F90
  * Co-authored-by: Ted Mansell <[email protected]>
* Implementing CI (#207)
* CI Parallelworks update (#211)
* Update call to read_input_nml and remove unnecessary code. (#161)
  * Removing use of INTERNAL_FILE_NML and cleaning up read_namelist_test_cases to remove unused argument
  * deleting duplicate call to read_namelist_test_case_nml in fv_control
  * removing commented code in fv_control

Co-authored-by: Rusty Benson <[email protected]>
Co-authored-by: menzel-gfdl <[email protected]>
Co-authored-by: Rusty Benson <[email protected]>
Co-authored-by: MatthewPyle-NOAA <[email protected]>
Co-authored-by: lharris4 <[email protected]>
Co-authored-by: Ted Mansell <[email protected]>
Description
Parallelworks is currently down, so I will keep this as a draft until it comes back online. (This is now resolved.)
This PR brings in a GitHub Actions yaml file that uses a self-hosted runner (Parallelworks rdhpcs cloud resources) to automatically compile SOLO SHiELD and run idealized tests on pull requests to the main branch of the GFDL_atmos_cubed_sphere repository. The actual scripts referenced in this yaml are stored on the Parallelworks resource. The following is a description of how the CI works:
How Has This Been Tested?
This has been tested with generating test PRs in my fork:
laurenchilutti#14 shows that when a PR targets a branch not named "main", the tests do not run
laurenchilutti#15 shows that when a PR targets the main branch, the tests run
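The branch behavior exercised by these two test PRs comes from the workflow's trigger filter. The keys below are standard GitHub Actions syntax; a minimal fragment consistent with that behavior would be roughly:

```yaml
on:
  pull_request:
    branches:
      - main    # PRs targeting any other branch do not trigger the CI
```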
Checklist:
Please check all items, whether they apply or not:
(Relies on Updates to build system for use in CI on Parallelworks SHiELD_build#13)
(Relies on Build nceplibs, use external libraries, and build different configurations in parallel SHiELD_build#15)