[develop] Added return code to srw_build.sh script. #461

MichaelLueken
Collaborator

@MichaelLueken MichaelLueken commented Nov 9, 2022

DESCRIPTION OF CHANGES:

The Jenkins build script does not exit with a nonzero return code when an executable fails to build, so Jenkins continues to the test phase, where the tests then fail. The .cicd/scripts/srw_build.sh script now exits with a return code that reflects the build result.
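
The failure mode described above can be sketched as follows. This is only an illustration, not the actual Jenkins configuration: `run_pipeline` and the stage functions are hypothetical stand-ins, and the sketch assumes the pipeline runner decides whether to continue from each stage's exit status.

```shell
#!/usr/bin/env bash
# Hypothetical stages. The first build hits an error but still reports 0;
# the second reports the error through its return code.
build_hides_failure()   { false; return 0; }
build_reports_failure() { return 1; }
test_stage()            { return 0; }

# Hypothetical runner: execute stages in order and stop at the first one
# that returns a nonzero status.
run_pipeline() {
  local stage
  for stage in "$@"; do
    if ! "${stage}"; then
      echo "stopped at ${stage}"
      return 1
    fi
    echo "passed ${stage}"
  done
}

# The hidden failure lets the test stage run anyway.
run_pipeline build_hides_failure test_stage

# The reported failure stops the pipeline before the test stage.
run_pipeline build_reports_failure test_stage || echo "pipeline failed"
```

With the hidden failure, both stages are reported as passed; with the propagated return code, the runner stops at the build stage.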

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

TESTS CONDUCTED:

  • hera.intel
  • Tested build manually on Hera using devbuild.sh and srw_build.sh
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss2.intel
  • NOAA Cloud (indicate which platform)
  • Jenkins
  • fundamental test suite
  • comprehensive tests (specify which if a subset was used)

ISSUE:

Fixes #460

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

@MichaelLueken MichaelLueken added ci-hera-intel-build Kicks off automated build test on hera with intel ci-jet-intel-build Kicks off automated build test on jet with gnu run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests labels Nov 9, 2022
@venitahagerty venitahagerty removed ci-jet-intel-build Kicks off automated build test on jet with gnu ci-hera-intel-build Kicks off automated build test on hera with intel labels Nov 9, 2022
@venitahagerty
Collaborator

Machine: jet
Compiler: intel
Job: build
Repo location: /lfs1/BMC/nrtrr/rrfs_ci/autoci/pr/1116565961/20221109193509/ufs-srweather-app
Build failed
If test failed, please make changes and add the following label back:
ci-jet-intel-build

@venitahagerty
Collaborator

Machine: hera
Compiler: intel
Job: build
Repo location: /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1116565961/20221109193515/ufs-srweather-app
Build failed
If test failed, please make changes and add the following label back:
ci-hera-intel-build

@MichaelLueken
Collaborator Author

The changes correctly stop the Jenkins CI pipeline when an executable fails to build during the build phase. I will remove the incorrect component name and resubmit the tests.

@MichaelLueken MichaelLueken added ci-hera-intel-WE Kicks off automated workflow test on hera with intel ci-jet-intel-WE Kicks off automated workflow test on jet with intel labels Nov 9, 2022
@venitahagerty venitahagerty removed ci-jet-intel-WE Kicks off automated workflow test on jet with intel ci-hera-intel-WE Kicks off automated workflow test on hera with intel labels Nov 9, 2022
@venitahagerty
Collaborator

Machine: jet
Compiler: intel
Job: WE
Repo location: /lfs1/BMC/nrtrr/rrfs_ci/autoci/pr/1116565961/20221109203512/ufs-srweather-app
Build failed
If test failed, please make changes and add the following label back:
ci-jet-intel-WE

@venitahagerty
Collaborator

Machine: hera
Compiler: intel
Job: WE
Repo location: /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1116565961/20221109203518/ufs-srweather-app
Build failed
If test failed, please make changes and add the following label back:
ci-hera-intel-WE

@MichaelLueken MichaelLueken added ci-hera-intel-WE Kicks off automated workflow test on hera with intel ci-jet-intel-WE Kicks off automated workflow test on jet with intel labels Nov 9, 2022
@venitahagerty venitahagerty removed ci-hera-intel-WE Kicks off automated workflow test on hera with intel ci-jet-intel-WE Kicks off automated workflow test on jet with intel labels Nov 9, 2022
@venitahagerty
Collaborator

venitahagerty commented Nov 9, 2022

Machine: hera
Compiler: intel
Job: WE
Repo location: /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1116565961/20221109212013/ufs-srweather-app
Build was Successful
Rocoto jobs started
Long term tracking will be done on 10 experiments
If test failed, please make changes and add the following label back:
ci-hera-intel-WE
Experiment Succeeded on hera: pregen_grid_orog_sfc_climo
Experiment Succeeded on hera: community_ensemble_2mems_stoch
Experiment Succeeded on hera: grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_2017_gfdlmp_regional
Experiment Succeeded on hera: grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta
Experiment Succeeded on hera: grid_RRFS_CONUS_25km_ics_GSMGFS_lbcs_GSMGFS_suite_GFS_v15p2
Experiment Succeeded on hera: grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2
Experiment Succeeded on hera: grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_RAP_suite_HRRR
Experiment Succeeded on hera: grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
Experiment Succeeded on hera: grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_HRRR
Experiment Succeeded on hera: MET_ensemble_verification
All experiments completed

@venitahagerty
Collaborator

venitahagerty commented Nov 9, 2022

Machine: jet
Compiler: intel
Job: WE
Repo location: /lfs1/BMC/nrtrr/rrfs_ci/autoci/pr/1116565961/20221109212020/ufs-srweather-app
Build was Successful
Rocoto jobs started
Long term tracking will be done on 10 experiments
If test failed, please make changes and add the following label back:
ci-jet-intel-WE
Experiment Succeeded on jet: specify_DT_ATMOS_LAYOUT_XY_BLOCKSIZE
Experiment Succeeded on jet: nco_grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_HRRR
Experiment Succeeded on jet: specify_RESTART_INTERVAL
Experiment Succeeded on jet: custom_GFDLgrid
Experiment Succeeded on jet: grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2
Experiment Succeeded on jet: custom_ESGgrid
Experiment Succeeded on jet: grid_RRFS_CONUS_25km_ics_GSMGFS_lbcs_GSMGFS_suite_GFS_v15p2
Experiment Succeeded on jet: grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_RAP_suite_HRRR
Experiment Succeeded on jet: specify_DOT_OR_USCORE
Experiment Succeeded on jet: grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
All experiments completed

Collaborator

@christinaholtNOAA christinaholtNOAA left a comment

This is an interesting concept. I'm not at all opposed, but do have a couple of comments/questions for thought.

I'm curious what mechanism is catching the error. There are some reserved Linux exit codes, and I wonder whether this unpredictable exit code might collide with the standard ones, or even cause problems in handling the error properly.
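
To make the collision concern concrete, here is a small sketch. It assumes, which this thread does not confirm, that the script exited with a raw count of failures; `observed_status` is a hypothetical helper. A parent process only sees the low 8 bits of an exit status, and shells conventionally use values such as 127 ("command not found") and 128+N (terminated by signal N) themselves.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the status a parent sees when a child process
# calls `exit N`. The subshell stands in for a script.
observed_status() {
  local rc=0
  ( exit "$1" ) || rc=$?
  echo "${rc}"
}

# A raw failure count is an ambiguous exit code:
echo "count 3   -> $(observed_status 3)"    # 3
echo "count 127 -> $(observed_status 127)"  # 127, same as "command not found"
echo "count 256 -> $(observed_status 256)"  # 0, indistinguishable from success
```

A fixed nonzero value such as 1 avoids both the overlap with shell-reserved values and the wrap-around to 0.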

@christopherwharrop-noaa
Collaborator

Would it be more direct to simply check the status of the build command itself? If ./build.sh isn't returning a meaningful/correct exit code then it should also be fixed. If you want the exit code of this script to be the success/failure of the build, then I'd suggest something like below. Were you planning to make use of the number of FAIL: messages for some other purpose?

# Build and install
cd "${workspace}/tests"
export PID=$$
./build.sh "${platform}" "${SRW_COMPILER}"
build_exit=$?   # save the build status before another command overwrites $?
cd -

# Create combined log file for upload to s3
build_dir="${workspace}/build_${SRW_COMPILER}"
cat "${build_dir}/log.cmake" "${build_dir}/log.make" \
    >"${build_dir}/srw_build-${platform}-${SRW_COMPILER}.log"

TEST_OUTPUT="${workspace}/tests/build_test${PID}.out"

# Exit with the success/failure status of the build itself
exit "${build_exit}"
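
One detail of the suggestion above is that the status is saved immediately after the build command. A short sketch of why that ordering matters, using hypothetical stand-ins (`fake_build` for `./build.sh`, `true` for the directory change and log handling):

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a build command that reports failure.
fake_build() { return 1; }

# Reads $? too late: the intervening command has already replaced it.
late_capture() {
  fake_build
  true                  # stands in for `cd -` and the log concatenation
  echo $?               # status of `true`, not of the build
}

# Saves the status right after the build, then does the other work.
early_capture() {
  fake_build
  local build_exit=$?   # $? is expanded before `local` itself runs
  true
  echo "${build_exit}"
}

echo "late=$(late_capture) early=$(early_capture)"
```

Only the early capture still holds the build's status by the time the script is ready to exit.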

@MichaelLueken MichaelLueken added ci-hera-intel-build Kicks off automated build test on hera with intel ci-jet-intel-build Kicks off automated build test on jet with gnu labels Nov 10, 2022
@MichaelLueken
Collaborator Author

@christinaholtNOAA Thank you very much for the review! I have moved the return code handling from .cicd/scripts/srw_build.sh to tests/build.sh. In tests/build.sh, the return code is now 1 whenever any test fails, regardless of how many fail.
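
A minimal sketch of that behavior, not the actual contents of tests/build.sh: the function name `report_build_status`, the results file, and its line format are all hypothetical. Only the idea of looking for `FAIL:` messages and returning a fixed 1 comes from this discussion.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 1 if the results file is missing or contains
# any "FAIL:" line, and 0 otherwise. The count is printed for information
# only and is never used as the return code.
report_build_status() {
  local results_file=$1
  local n_fail

  [ -f "${results_file}" ] || { echo "missing ${results_file}"; return 1; }

  # grep -c prints the number of matching lines; it exits 1 when that
  # number is 0, so `|| true` keeps a clean run from looking like an error.
  n_fail=$(grep -c 'FAIL:' "${results_file}" || true)

  echo "failed: ${n_fail}"
  [ "${n_fail}" -eq 0 ]
}

# Demo with made-up results in a temporary file.
demo_file=$(mktemp)
printf 'PASS: a\nFAIL: b\nFAIL: c\n' >"${demo_file}"
if report_build_status "${demo_file}"; then rc=0; else rc=$?; fi
echo "return code: ${rc}"
rm -f "${demo_file}"
```

Two failures and two hundred failures both yield a return code of 1, so the caller never has to interpret the count.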

@MichaelLueken
Collaborator Author

@christopherwharrop-noaa Thank you very much for your comment! I have removed the grep for FAIL: from .cicd/scripts/srw_build.sh (it isn't needed for anything else), and the script now simply uses the return code from tests/build.sh.

@venitahagerty venitahagerty removed ci-jet-intel-build Kicks off automated build test on jet with gnu ci-hera-intel-build Kicks off automated build test on hera with intel labels Nov 10, 2022
Collaborator

@danielabdi-noaa danielabdi-noaa left a comment

Looks good to me!

@venitahagerty
Collaborator

Machine: hera
Compiler: intel
Job: build
Repo location: /scratch1/BMC/zrtrr/rrfs_ci/autoci/pr/1116565961/20221110165017/ufs-srweather-app
Build was Successful
If test failed, please make changes and add the following label back:
ci-hera-intel-build

@venitahagerty
Collaborator

Machine: jet
Compiler: intel
Job: build
Repo location: /lfs1/BMC/nrtrr/rrfs_ci/autoci/pr/1116565961/20221110165011/ufs-srweather-app
Build was Successful
If test failed, please make changes and add the following label back:
ci-jet-intel-build

@MichaelLueken MichaelLueken merged commit 0ba66f9 into ufs-community:develop Nov 10, 2022
@MichaelLueken MichaelLueken deleted the bugfix/add-returncode-to-srw_build branch November 10, 2022 17:30
Labels
run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests
Development

Successfully merging this pull request may close these issues.

When the build phase fails on Jenkins, Jenkins reports a pass and moves on to the test phase
6 participants