Add install hooks for ATDM Trilinos configuration and ctest -S driver #2689
Comments
CC: @dridzal, @trilinos/framework NOTE: Adding a Therefore, adding a |
Agreed. I would like ROL's nightly testing to catch this as well (and not just PR testing). Are these options added at the Trilinos level or do I need to modify my scripts?
@dridzal, yes, once I add that option |
CC: @bathmatt, @jmgate, @trilinos/framework, @fryeguy52 Note that the install hook defined here and the testing it would represent would not have caught the Kokkos installation problem reported in #2883. In order to catch that, you would need to delete (or move) the Trilinos source directories and binary directories after the install of Trilinos, and then have a separate process try to build against these. That type of process should be used for testing installation of Trilinos as part of upgrades of Trilinos for ATDM APPs but that goes beyond the scope of this story. However, this story should consider implementing the minimum features needed for that process which are to:
Then when the follow-on Jenkins job tries to build against this installation of Trilinos, the build will fail if anything points into the source or build trees. If this Jenkins driver had been in place for the EMPIRE Jenkins Pipeline process, then EMPIRE would never have accepted this version of Trilinos and a lot of wasted time would have been avoided. I will add these moves to the scope of this issue. That is not a lot of work.
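The moves discussed in this comment could be sketched roughly as follows. This is only an illustration, assuming the `$WORKSPACE/SRC_AND_BUILD/` and `$WORKSPACE/moved/SRC_AND_BUILD/` directory names from the issue description; the function names and the stand-in for the update/build/install step are hypothetical and are not part of any Trilinos script.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the stand-in build step are hypothetical.
set -eu

# Jenkins normally sets WORKSPACE; a temp dir is used here for the demo.
WORKSPACE=$(mktemp -d)

restore_trees() {
  # Undo the move from a previous run so the existing clone and build
  # artifacts can be reused instead of starting from scratch.
  if [ -d "$WORKSPACE/moved/SRC_AND_BUILD" ]; then
    mv "$WORKSPACE/moved/SRC_AND_BUILD" "$WORKSPACE/SRC_AND_BUILD"
  fi
}

hide_trees() {
  # Move the source and build trees away so that anything in the install
  # that still points into them fails in the follow-on build.
  mkdir -p "$WORKSPACE/moved"
  mv "$WORKSPACE/SRC_AND_BUILD" "$WORKSPACE/moved/SRC_AND_BUILD"
}

# Two consecutive "job runs" to exercise both the restore and the move.
for run in 1 2; do
  restore_trees
  mkdir -p "$WORKSPACE/SRC_AND_BUILD/BUILD"  # stand-in for update, build, install
  hide_trees
  echo "run $run: trees moved to $WORKSPACE/moved/SRC_AND_BUILD"
done
```

Moving rather than deleting keeps the clone and the build artifacts around for the next run and for inspection on Jenkins.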
CC: @fryeguy52 FYI: I updated the above "Proposed Solutions" section to describe how to provide a |
@bathmatt, is the name of the script that gets installed with Trilinos that loads the env negotiable? Currently it looks like the EM-Plasma/BuildScripts Trilinos install scripts install
? If I don't hear back from you, I think I will just implement this with the CMake cache var |
…AME (trilinos#2689) See the README.md file for details. I also renamed the env var set in atdm/load-env.sh from JOB_NAME to ATDM_CONFIG_JOB_NAME. This avoids confusion and other problems with the un-namespaced var JOB_NAME, which is set by Jenkins. This does not change the behavior of the CTest -S Jenkins drivers that read directly from JOB_NAME.
FYI: PR #3521 implements the installation of the env loading scripts and allows you to configure with:
and then after installation run:
and you should be off to the races. Please review the updated documentation at: and comment in PR #3521. |
…g-install Automatically Merged using Trilinos Pull Request AutoTester PR Title: Add install of ATDM env scripts, rename JOB_NAME to ATDM_CONFIG_JOB_NAME (#2689) PR Author: bartlettroscoe
FYI: PR #3521 was just merged that installs a Now we just need the install hooks for our CTest -S driver. |
Actually, EMPIRE is going to use their own Jenkins drivers so they don't need the ATDM Trilinos CTest -S driver to do installs at this time. Therefore, the needs of EMPIRE for this story are satisfied, so I will remove the "EMPIRE" label and add the "SPARC" label since we do plan to install Trilinos as a byproduct of the ATDM Trilinos builds for SPARC (or run another set of ATDM Trilinos builds for the merge of SPARC Trilinos 'master' and Trilinos GitHub 'develop').
I added tests for passing and failing installs. I only implemented this for the all-at-once approach since that is all we need right now for ATDM Trilinos. But we could implement it for the package-by-package mode if needed without much trouble. Note that as part of this I fixed an oversight where build failures that did not cause test failures would be ignored in that all-at-once approach. That was wrong. Now if there are any build or install failures, it will assume that any (read that 'all') of the tested packages may have failed. See the long "NOTE" comment about this. Long-story-short, this will only really impact CI builds where there are build failures. If there are just test failures (the more common case), then only the packages with failing tests are flagged as failed and will be enabled on the next CI iteration.
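As a rough illustration of the rule described in this commit message for the all-at-once approach, the decision could be written like the sketch below. The helper name, its arguments, and the package lists are hypothetical; the real logic lives in TriBITS CMake code and is not reproduced here.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not TriBITS code.
# Args: <num-build-failures> <num-install-failures>
#       <all-tested-packages> <packages-with-failing-tests>
failed_packages() {
  local build_fails=$1 install_fails=$2 tested=$3 test_failed=$4
  if [ "$build_fails" -gt 0 ] || [ "$install_fails" -gt 0 ]; then
    # A build or install failure cannot be pinned on one package in the
    # all-at-once mode, so every tested package is treated as failed.
    echo "$tested"
  else
    # Only test failures: flag just the packages whose tests failed.
    echo "$test_failed"
  fi
}

failed_packages 0 0 "Kokkos Teuchos ROL" "ROL"   # prints "ROL"
failed_packages 0 1 "Kokkos Teuchos ROL" ""      # prints "Kokkos Teuchos ROL"
```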
My last commit broke the use case where an optional component (TriBITS package) is missing (i.e. was not enabled). This broke the SPARC use case where it looks for TriKota but TriKota can be missing. I added tests for that use case and strengthened the existing tests. I broke the failing test into two ctest tests since you have to reconfigure from scratch anyway so there is no benefit to keeping these together. Build/Test Cases Summary Enabled Packages: Enabled all Packages 0) MPI_DEBUG => passed: passed=355,notpassed=0 (1.16 min) 1) SERIAL_RELEASE => passed: passed=355,notpassed=0 (1.22 min)
Origin repo remote tracking branch: 'github/master' Origin repo remote repo URL: 'github = [email protected]:TriBITSPub/TriBITS.git' At commit: commit 1d1334bbe67bb82184c64445a6d946d4a5ad35b7 Author: Roscoe A. Bartlett <[email protected]> Date: Tue Apr 23 10:17:36 2019 -0600 Summary: Fix case for optional missing component (#2689)
…dm-nightly (#2689, #4993) Merging PR #4993 branch directly to 'atdm-nightly' to ensure it gets into ATDM builds tomorrow. This should fix SPARC build problems described in #4993 related to install changes related to #4993. NOTE: Since the Trilinos PR tester does not currently test the installed TrilinosConfig.cmake file it provides no value in testing a change like this. Build/Test Cases Summary Enabled Packages: Kokkos, Teuchos Disabled Packages: PyTrilinos,Claps,TriKota 0) MPI_RELEASE_DEBUG_SHARED_PT_OPENMP => passed: passed=170,notpassed=0 (3.01 min) Other local commits for this build/test group: f1b5c79
Automatically Merged using Trilinos Pull Request AutoTester PR Title: Fix case for optional missing component (#2689) PR Author: bartlettroscoe
Origin repo remote tracking branch: 'github/master' Origin repo remote repo URL: 'github = [email protected]:TriBITSPub/TriBITS.git' At commit: commit c6b9a85872eacce51dfaadbb87e61c9ed10cedf6 Author: Roscoe A. Bartlett <[email protected]> Date: Thu Apr 25 12:47:14 2019 -0600 Summary: Add tribits/core install_package_by_package tests and <Project>_INSTALL_PBP_RUNNER option (#2689, ATDV-156)
…velop (trilinos#2689, ATDV-156) Allow doing install using 'run-as-atdm-devops-admin' (see ATDV-156). This is part of the general install story trilinos#2689.
…s#2689, ATDV-151, ATDV-156) This is mainly to allow installation as the 'atdm-devops-admin' user account using the 'jenkins' account on various systems and therefore protect the installs of Trilinos from bad 'jenkins' jobs. There are a lot of little things you need to do to get this to work:
* ATDM_CONFIG_WORKSPACE_BASE[_DEFAULT]: Use a different workspace for SRC_AND_BUILD (which will allow for reading by the 'atdm-devops-admin' account).
* ATDM_CONFIG_INSTALL_PBP_RUNNER[_DEFAULT]: Allow inserting the 'run-as-atdm-devops-admin' setuid program to run the install command as the 'atdm-devops-admin' user.
* ATDM_CONFIG_TRIL_CMAKE_INSTALL_PREFIX_DATE_BASE[_DEFAULT]: Set a default base directory for a given system.
* ATDM_CONFIG_USE_JENKINS_INSTALL_DEFAULTS: Set to '1' to use the defaults for the above three vars.
See updated documentation for more details.
…s#2689, ATDV-151, ATDV-156) This is mainly to allow installation as the 'atdm-devops-admin' user account using the 'jenkins' entity account on various systems and therefore protect the installs of Trilinos from bad 'jenkins' jobs. There are a lot of little things you need to do to get this to work:
* ATDM_CONFIG_WORKSPACE_BASE[_DEFAULT]: Use a different workspace for SRC_AND_BUILD (which will allow for reading by the 'atdm-devops-admin' account). (And you must 'cd' into that workspace.)
* ATDM_CONFIG_INSTALL_PBP_RUNNER[_DEFAULT]: Allow inserting the 'run-as-atdm-devops-admin' setuid program to run the install command as the 'atdm-devops-admin' user.
* ATDM_CONFIG_TRIL_CMAKE_INSTALL_PREFIX_DATE_BASE[_DEFAULT]: Set a default base directory for a given system.
* ATDM_CONFIG_USE_JENKINS_INSTALL_DEFAULTS: Set to '1' to use the defaults for the above three vars.
See updated documentation for more details.
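One way the `[_DEFAULT]` pattern named in this commit message might behave is sketched below. The fallback rule (an explicitly set var wins, otherwise the `_DEFAULT` value is used when `ATDM_CONFIG_USE_JENKINS_INSTALL_DEFAULTS` is '1') is an assumption made for illustration, and every path is a made-up placeholder; the updated ATDM documentation is the authority on the actual behavior.

```shell
#!/usr/bin/env bash
# Sketch only: the fallback rule and every path below are assumptions.
set -eu

# Hypothetical per-system defaults (placeholder paths).
ATDM_CONFIG_WORKSPACE_BASE_DEFAULT=/hypothetical/atdm-devops/workspace
ATDM_CONFIG_INSTALL_PBP_RUNNER_DEFAULT=/hypothetical/bin/run-as-atdm-devops-admin
ATDM_CONFIG_TRIL_CMAKE_INSTALL_PREFIX_DATE_BASE_DEFAULT=/hypothetical/atdm-devops/installs

# A Jenkins job would opt in with this single setting.
ATDM_CONFIG_USE_JENKINS_INSTALL_DEFAULTS=1

if [ "${ATDM_CONFIG_USE_JENKINS_INSTALL_DEFAULTS:-0}" = "1" ]; then
  for var in \
    ATDM_CONFIG_WORKSPACE_BASE \
    ATDM_CONFIG_INSTALL_PBP_RUNNER \
    ATDM_CONFIG_TRIL_CMAKE_INSTALL_PREFIX_DATE_BASE
  do
    default_var="${var}_DEFAULT"
    # Keep an explicitly set value; otherwise fall back to the default.
    if [ -z "${!var:-}" ]; then
      export "$var=${!default_var}"
    fi
  done
fi

echo "workspace base: $ATDM_CONFIG_WORKSPACE_BASE"
echo "install runner: $ATDM_CONFIG_INSTALL_PBP_RUNNER"
echo "install base:   $ATDM_CONFIG_TRIL_CMAKE_INSTALL_PREFIX_DATE_BASE"
```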
…s:develop' (bad6c8f). * trilinos-develop: Add ctest -S driver for waterman_cuda-9.2_shared_opt for SPARC (ATDV-151) Add install stuff for 'waterman' (ATDV-156, ATDV-151) Set up for alternate workspace, install as atdm-devops, etc. (trilinos#2689, ATDV-151, ATDV-156) Automatic snapshot commit from tribits at c6b9a85
Technically, more progress on improving installs was made in PRs #7285 and #7297. The last bit of scope described above that keeps this story from being closed is moving the build and source trees out of the way after the install. That would make the install testing a little stronger, but I fear it may cause some confusion and may not interact well with other scripts (like the various ATDM Trilinos driver scripts). I think I should just close this issue and leave that last bit of scope undone.
CC: @fryeguy52
Description
In order for the ATDM projects to adopt the new ATDM Trilinos Configuration and Jenkins Drivers (that submit to CDash as a byproduct), these scripts must support the installation of Trilinos and using Trilinos from that install.
For this to occur:
Possible solutions
First, a new env var ATDM_CONFIG_TRIL_CMAKE_INSTALL_PREFIX could be added that would be read in the ATDMDevEnvSettings.cmake file and would set the CMake var CMAKE_INSTALL_PREFIX. That would give the correct configuration of Trilinos, which would then know where to install.
As for doing the installation itself, one simple solution would be to provide an atdm/install-prebuilt-trilinos.sh script that runs make install on the built Trilinos version. This script would need to be called explicitly by some Jenkins driver script, and any failures would not be posted to CDash.
If we wanted installation errors to be reported to CDash, then a CTEST_DO_INSTALL option could be added to the TRIBITS_CTEST_DRIVER() function, which would run a cmake_build( ... ) command on the install target (in parallel) after the project build was complete. That would also allow logic for not installing if there was a build failure and, again, would allow any install failures to be reported to CDash.
Also, to make installation testing for downstream customer codes stronger, the smart-jenkins-driver.sh script could be updated to move the source and build directories out of the way to catch mistakes where the installed configuration files point into the source or build trees. This could be accomplished by adding the steps:
* $WORKSPACE/Trilinos
* If $WORKSPACE/moved/SRC_AND_BUILD/ already exists, then move it back to $WORKSPACE/SRC_AND_BUILD/.
* Move $WORKSPACE/SRC_AND_BUILD/ to $WORKSPACE/moved/SRC_AND_BUILD/.
NOTE: The reason to move $WORKSPACE/SRC_AND_BUILD/ to $WORKSPACE/moved/SRC_AND_BUILD/ instead of, say, $WORKSPACE/SRC_AND_BUILD.moved/ is to catch errors where the installation under $WORKSPACE/local_install/ might have relative paths to $WORKSPACE/SRC_AND_BUILD/BUILD/ which would also work for $WORKSPACE/SRC_AND_BUILD.moved/BUILD/ but would not work for $WORKSPACE/moved/SRC_AND_BUILD/BUILD/. Also, the reason to move $WORKSPACE/SRC_AND_BUILD/ instead of deleting it is to avoid having to re-clone the Trilinos git repo from scratch every time the job runs and to allow looking through the build artifacts on Jenkins after the job is complete.
Lastly, in order to support loading the correct env from a Trilinos install, the relevant files from Trilinos/cmake/std/atdm/ that match the configuration of Trilinos being installed need to be installed. At a minimum, this must include, for example:
* Trilinos/cmake/std/atdm/load-env.sh
* Trilinos/cmake/std/atdm/utils/
* Trilinos/cmake/std/atdm/<system-name>/environment.sh
But that would allow loading any env, including those that don't match the install. So the Trilinos install hooks should also be updated to install a script <install-prefix>/load_matching_env.sh that takes no arguments and will source the installed atdm/load-env.sh script with the right job name. For example, this installed script load_matching_env.sh might look like:
Then a client like EMPIRE or SPARC would just source:
and then inspect the exported ATDM_CONFIG_* vars and know that the right compilers, MPI, TPL locations, etc. were loaded to correctly use that installation of Trilinos.
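To make the load_matching_env.sh idea concrete, here is a self-contained sketch. It is not the script that was eventually installed: the install layout (`<install-prefix>/atdm/load-env.sh`), the stub that stands in for the real load-env.sh, and the job name are all assumptions chosen so the example runs anywhere. The only details taken from this thread are that the wrapper takes no arguments, that it sources the installed load-env.sh with the job name, and that load-env.sh sets `ATDM_CONFIG_JOB_NAME`.

```shell
#!/usr/bin/env bash
# Sketch only: install layout, stub contents, and job name are assumptions.
set -eu

prefix=$(mktemp -d)                  # stand-in for <install-prefix>
job_name="example-system-gnu-opt"    # hypothetical job name
mkdir -p "$prefix/atdm"

# Stub standing in for the installed atdm/load-env.sh, which takes the
# job name as its first argument. The real script does far more.
cat > "$prefix/atdm/load-env.sh" <<'EOF'
export ATDM_CONFIG_JOB_NAME="$1"
EOF

# What an install hook might generate: the prefix and the job name that
# match this install are baked in, so the client passes no arguments.
cat > "$prefix/load_matching_env.sh" <<EOF
source "$prefix/atdm/load-env.sh" "$job_name"
EOF

# Client side (for example EMPIRE or SPARC): source the wrapper and
# inspect the exported ATDM_CONFIG_* vars.
source "$prefix/load_matching_env.sh"
echo "Loaded env for job: $ATDM_CONFIG_JOB_NAME"
```

Baking the job name into the generated file is what keeps a client from loading an env that does not match the install.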