Tempus_*_MPI_1 failing in Trilinos-atdm-ats1-knl_intel-19.0.4_mpich-7.7.15_openmp_static_opt build starting 2021-10-15 #9881

Closed
ZUUL42 opened this issue Oct 28, 2021 · 2 comments
Labels
  • impacting: tests - The defect (bug) is primarily a test failure (vs. a build failure)
  • PA: Discretizations - Issues that fall under the Trilinos Discretizations Product Area
  • PA: Nonlinear Solvers - Issues that fall under the Trilinos Nonlinear Linear Solvers Product Area
  • pkg: Tempus
  • Primary Build - Added by triager to mark failures affecting primary builds
  • type: bug - The primary issue is a bug in Trilinos code or tests

Comments

ZUUL42 commented Oct 28, 2021

CC: @trilinos/tempus, @ccober6 (Trilinos Nonlinear Solvers / Discretizations Triage/ATDM Contact)

Next Action Status

Description

As shown in this query (click "Show Matching Output" in the upper right), the tests:

  • Tempus_BDF2_MPI_1
  • Tempus_BackwardEuler_MPI_1
  • Tempus_DIRK_MPI_1
  • Tempus_ExplicitRK_MPI_1
  • Tempus_UnitTest_ERK_MPI_1

in the builds:

  • Trilinos-atdm-ats1-knl_intel-19.0.4_mpich-7.7.15_openmp_static_opt

started failing on testing day 2021-10-15.

Error: srun: error: nid00195: task 0: Floating point exception (core dumped)
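
For context: the srun message means the process was killed by SIGFPE. Under default floating-point settings a division by zero only sets an IEEE status flag and yields inf; it typically becomes a fatal signal only when floating-point exception trapping is enabled. That the failing tests run with trapping enabled is an assumption here, based on the "FPE testing" wording in the commit referenced in this issue. The sketch below uses only standard <cfenv> calls to detect such a division without trapping; the helper name raisesDivByZero is hypothetical and not part of Trilinos.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper for illustration only (not Trilinos code).
// Returns true if evaluating num / den sets the IEEE divide-by-zero flag.
// Assumes floating-point traps are NOT enabled; with traps enabled the
// division itself would raise SIGFPE instead of returning.
bool raisesDivByZero(double num, double den) {
  std::feclearexcept(FE_ALL_EXCEPT);  // start from a clean flag state
  volatile double n = num;            // volatile keeps the compiler from
  volatile double d = den;            // folding the division at compile time
  volatile double q = n / d;
  (void)q;
  return std::fetestexcept(FE_DIVBYZERO) != 0;
}
```

Calling raisesDivByZero(1.0, 0.0) should report the flag, while raisesDivByZero(1.0, 2.0) should not. Note that 0.0 / 0.0 sets FE_INVALID rather than FE_DIVBYZERO, so a check for that case would need to test that flag as well.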

Current Status on CDash

Run the above query, adjusting the "Begin" and "End" dates to match today or any other date range, or just click "CURRENT" in the top bar to see results for the current testing day.

Steps to Reproduce

One should be able to reproduce this failure as described in:

and the system-specific instructions at:

Log into any of the associated machines, copy the full CDash build name <build-name> listed above, and run commands like:

$ cd <some_build_dir>/

$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh <build-name>

$ cmake \
 -GNinja \
 -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
 -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_<package-name>=ON \
 $TRILINOS_DIR

$ make NP=16

$ <command-to-run-on-compute-node> ctest -j4

where <package-name> is any package that you want to enable to reproduce build and/or test results.

Again, for exact system-specific details on what commands to run to build and run tests, see:

If you can't figure out what commands to run to reproduce the problem given this documentation, then please post a comment here and we will give you the exact minimal commands.

ZUUL42 added the labels type: bug, impacting: tests, pkg: Tempus, PA: Nonlinear Solvers, PA: Discretizations, and Primary Build on Oct 28, 2021
ccober6 added a commit that referenced this issue Nov 13, 2021
Investigated failing tests from Tempus_*_MPI_1 failing in Trilinos-atdm-ats1-knl_intel-19.0.4_mpich-7.7.15_openmp_static_opt build starting 2021-10-15 #9881

 * Excluded the Intel compiler for FPE testing.
 * Fixed an FPE divide-by-zero in Stepper_ErrorNorm
   - Added unit test to cover it
 * Added some doxygen on Stepper_ErrorNorm
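
To illustrate the kind of divide-by-zero the commit message describes, here is a minimal sketch of a scaled error norm with a guard for a zero weight. It is not the Tempus implementation and not the actual fix (see the referenced commit for that). The function name scaledErrorNorm, the weight formula absTol + relTol * max(|x|, |xPrev|), and the choice to return infinity for a nonzero error with zero weight are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical sketch, not the Tempus Stepper_ErrorNorm code.
// Computes max_i |err_i| / w_i with w_i = absTol + relTol * max(|x_i|, |xPrev_i|).
// If absTol == 0 and both solution entries are zero, w_i == 0 and a naive
// division would raise a floating-point divide-by-zero.
double scaledErrorNorm(const std::vector<double>& x,
                       const std::vector<double>& xPrev,
                       const std::vector<double>& err,
                       double relTol, double absTol) {
  assert(x.size() == xPrev.size() && x.size() == err.size());
  double norm = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double weight =
        absTol + relTol * std::max(std::abs(x[i]), std::abs(xPrev[i]));
    if (weight <= 0.0) {
      // Guard: never divide by a zero weight.
      if (err[i] != 0.0) {
        // Assumed policy: a nonzero error with zero tolerance is unbounded.
        return std::numeric_limits<double>::infinity();
      }
      continue;  // zero error with zero weight contributes nothing
    }
    norm = std::max(norm, std::abs(err[i]) / weight);
  }
  return norm;
}
```

For example, scaledErrorNorm({0.0}, {0.0}, {0.0}, 0.5, 0.0) returns 0 instead of evaluating 0/0. Other policies, such as flooring the weight at a small positive value, are equally plausible; which one is appropriate depends on how the step controller interprets the norm.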

ccober6 commented Dec 6, 2021

@ZUUL42 is this still a problem? I think the above merge should have fixed it.

ZUUL42 commented Dec 6, 2021

@ccober6, the CDash query seems to indicate this failure hasn't presented itself since 11/24.
So, ya, I think we can call this one closed.

ZUUL42 closed this as completed Dec 6, 2021