Build errors in Sacado_GTestSuite.exe for 'arm-20.1' on 'stria' and 'intel-18.0.0.20170811' on 'tlcc2' starting 2020-08-04 #7778

Closed
bartlettroscoe opened this issue Aug 4, 2020 · 11 comments
Labels
ATDM Sev: Nonblocker Problems with Trilinos that should not block ATDM APPs from getting updates client: ATDM Any issue primarily impacting the ATDM project impacting: configure or build The issue is primarily related to configuring or building PA: Nonlinear Solvers Issues that fall under the Trilinos Nonlinear Linear Solvers Product Area pkg: Sacado type: bug The primary issue is a bug in Trilinos code or tests

Comments

@bartlettroscoe
Member

bartlettroscoe commented Aug 4, 2020

CC: @trilinos/sacado, @rppawlo (Trilinos Nonlinear Solvers Product Lead), @etphipp

Next Action Status

Description

As shown in this query, Sacado is experiencing build errors in the following builds:

  • Trilinos-atdm-tlcc2-intel-debug-openmp
  • Trilinos-atdm-tlcc2-intel-opt-openmp
  • Trilinos-atdm-van1-tx2_arm-20.1_openmpi-4.0.3_openmp_static_dbg
  • Trilinos-atdm-van1-tx2_arm-20.1_openmpi-4.0.3_openmp_static_opt

starting testing day 2020-08-04.

As shown here and here, the build errors all seem to be related to building object files for the executable:

  • sacado/test/GTestSuite/Sacado_GTestSuite.exe

where the intel-18.0.0.20170811 build errors on 'tlcc2' shown here show:

In file included from /gpfs1/jenkins/skybridge-slave/workspace/Trilinos-atdm-tlcc2-intel-opt-openmp/SRC_AND_BUILD/Trilinos/packages/sacado/test/GTestSuite/googletest/googletest/include/gtest/gtest.h(67),
                 from /gpfs1/jenkins/skybridge-slave/workspace/Trilinos-atdm-tlcc2-intel-opt-openmp/SRC_AND_BUILD/Trilinos/packages/sacado/test/GTestSuite/FadUnitTests.hpp(41),
                 from /gpfs1/jenkins/skybridge-slave/workspace/Trilinos-atdm-tlcc2-intel-opt-openmp/SRC_AND_BUILD/Trilinos/packages/sacado/test/GTestSuite/CacheFadUnitTests.cpp(30):
/gpfs1/jenkins/skybridge-slave/workspace/Trilinos-atdm-tlcc2-intel-opt-openmp/SRC_AND_BUILD/Trilinos/packages/sacado/test/GTestSuite/googletest/googletest/include/gtest/internal/gtest-internal.h(1173): error: no instance of function template "testing::internal::ElemFromListImpl<testing::internal::IndexSequence<I...>>::Apply [with I=<0UL>]" matches the argument list
            argument types are: (bool (*)(), bool (*)())
        decltype(ElemFromListImpl<typename MakeIndexSequence<N>::type>::Apply(
                 ^
/gpfs1/jenkins/skybridge-slave/workspace/Trilinos-atdm-tlcc2-intel-opt-openmp/SRC_AND_BUILD/Trilinos/packages/sacado/test/GTestSuite/googletest/googletest/include/gtest/internal/gtest-internal.h(1167): note: this candidate was rejected because arguments do not match
    static R Apply(Ignore<0 * I>..., R (*)(), ...);
             ^
          detected during:
            instantiation of class "testing::internal::ElemFromList<N, T...> [with N=1UL, T=<bool, bool>]" at line 1185
            instantiation of class "testing::internal::FlatTupleElemBase<testing::internal::FlatTuple<T...>, I> [with T=<bool, bool>, I=1UL]" at line 1196
            instantiation of class "testing::internal::FlatTupleBase<testing::internal::FlatTuple<T...>, testing::internal::IndexSequence<Idx...>> [with Idx=<0UL, 1UL>, T=<bool, bool>]" at line 1214
            instantiation of class "testing::internal::FlatTuple<T...> [with T=<bool, bool>]" at line 802 of "/gpfs1/jenkins/skybridge-slave/workspace/Trilinos-atdm-tlcc2-intel-opt-openmp/SRC_AND_BUILD/Trilinos/packages/sacado/test/GTestSuite/googletest/googletest/include/gtest/internal/gtest-param-util.h"
            instantiation of class "testing::internal::ValueArray<Ts...> [with Ts=<bool, bool>]" at line 360 of "/gpfs1/jenkins/skybridge-slave/workspace/Trilinos-atdm-tlcc2-intel-opt-openmp/SRC_AND_BUILD/Trilinos/packages/sacado/test/GTestSuite/googletest/googletest/include/gtest/gtest-param-test.h"

compilation aborted for /gpfs1/jenkins/skybridge-slave/workspace/Trilinos-atdm-tlcc2-intel-opt-openmp/SRC_AND_BUILD/Trilinos/packages/sacado/test/GTestSuite/CacheFadUnitTests.cpp (code 2)

and the 'arm-20.1' build errors on 'stria' shown here show errors like:

In file included from /lustre/jenkins/stria/workspace/Trilinos-atdm-van1-tx2_arm-20.1_openmpi-4.0.3_openmp_static_dbg/SRC_AND_BUILD/Trilinos/packages/sacado/test/GTestSuite/RealELRCacheFadUnitTests.cpp:30:
/lustre/jenkins/stria/workspace/Trilinos-atdm-van1-tx2_arm-20.1_openmpi-4.0.3_openmp_static_dbg/SRC_AND_BUILD/Trilinos/packages/sacado/test/GTestSuite/FadUnitTests2.hpp:98:1: error: unknown type name 'TYPED_TEST_SUITE_P'
TYPED_TEST_SUITE_P(FadOpsUnitTest2);
^
/lustre/jenkins/stria/workspace/Trilinos-atdm-van1-tx2_arm-20.1_openmpi-4.0.3_openmp_static_dbg/SRC_AND_BUILD/Trilinos/packages/sacado/test/GTestSuite/FadUnitTests2.hpp:99:1: error: unknown type name 'TYPED_TEST_SUITE_P'
TYPED_TEST_SUITE_P(RealFadOpsUnitTest2);
^
/lustre/jenkins/stria/workspace/Trilinos-atdm-van1-tx2_arm-20.1_openmpi-4.0.3_openmp_static_dbg/SRC_AND_BUILD/Trilinos/packages/sacado/test/GTestSuite/FadUnitTests2.hpp:101:1: error: use of undeclared identifier 'gtest_typed_test_case_p_state_FadOpsUnitTest2_'
TYPED_TEST_P(FadOpsUnitTest2, testAddition) {
^
atse/libs/arm/yaml-cpp/0.6.2/include/gtest/gtest-typed-test.h:234:7: note: expanded from macro 'TYPED_TEST_P'
      GTEST_TYPED_TEST_CASE_P_STATE_(CaseName).AddTestName(\
      ^
atse/libs/arm/yaml-cpp/0.6.2/include/gtest/gtest-typed-test.h:208:3: note: expanded from macro 'GTEST_TYPED_TEST_CASE_P_STATE_'
  gtest_typed_test_case_p_state_##TestCaseName##_
  ^
<scratch space>:11:1: note: expanded from here
gtest_typed_test_case_p_state_FadOpsUnitTest2_
^

Looking at the updates for testing day 2020-08-04 here, it seems likely that commits from the PR #7736 are causing this.

Current Status on CDash

Steps to Reproduce

One should be able to reproduce this failure on the machine as described in:

More specifically, the commands given for the system 'tlcc2' are provided at:

The exact commands to reproduce this issue on a TLCC-2 machine like 'chama' or 'skybridge' should be:

$ cd <some_build_dir>/

$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh \
    Trilinos-atdm-tlcc2-intel-opt-openmp

$ cmake \
 -GNinja \
 -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
 -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Sacado=ON \
 $TRILINOS_DIR

$ make NP=16   # Or 'ninja -j16'

One can also log into 'stria' to reproduce the build errors for the 'van1-tx2' builds as described at:

by doing:

$ cd <some_build_dir>/

$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh \
    Trilinos-atdm-van1-tx2_arm-20.1_openmpi-4.0.3_openmp_static_dbg

$ cmake \
 -GNinja \
 -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
 -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Sacado=ON \
 $TRILINOS_DIR

$ make NP=16   # Or 'ninja -j16'
@bartlettroscoe bartlettroscoe added pkg: Sacado impacting: configure or build The issue is primarily related to configuring or building client: ATDM Any issue primarily impacting the ATDM project ATDM Sev: Nonblocker Problems with Trilinos that should not block ATDM APPs from getting updates PA: Nonlinear Solvers Issues that fall under the Trilinos Nonlinear Linear Solvers Product Area labels Aug 4, 2020
@bartlettroscoe bartlettroscoe added the type: bug The primary issue is a bug in Trilinos code or tests label Aug 4, 2020
etphipp added a commit to etphipp/Trilinos that referenced this issue Aug 4, 2020
This adds the option Sacado_ENABLE_GTest for turning off the Sacado GTest suite
(previously it was required if tests were turned on).  The included gtest does
not build on all platforms (for various reasons), so this allows one to turn
it off.

For issue trilinos#7778
@etphipp
Contributor

etphipp commented Aug 4, 2020

OK, I looked into what was going wrong on both platforms. For tlcc, it is the use of Intel 18, which is failing on perfectly valid C++ code inside gtest. Not much can be done with that. For arm, the issue is the yaml-cpp module on that machine includes an old copy of gtest, and the build is picking up those headers instead of the ones in Trilinos (because Sacado optionally depends on Teuchos, which has an optional YAML TPL dependency). This is primarily a problem with the yaml module on that machine.

But both cases can be addressed by just turning off the gtest suite in Sacado. PR #7780 adds such an option. Someone would just need to add

-D Sacado_ENABLE_GTest=OFF \

to the configure scripts for those machines. I don't want it off by default because the whole point of moving the tests to gtest was to make sure those tests run on most machines.

@etphipp
Contributor

etphipp commented Aug 4, 2020

I should add the arm issue would arise for anyone enabling the yaml tpl and gtest in Trilinos, not just Sacado.
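
One way to check for the kind of header shadowing described above is to scan a colon-separated include search path for directories that contain gtest/gtest.h. The sketch below is only an illustration: the function name find_gtest_headers is hypothetical, and it assumes the module exposes its include directory through a variable such as CPATH or CPLUS_INCLUDE_PATH, which the thread does not confirm for 'stria'.

```shell
# Print every directory on a colon-separated search path that contains
# gtest/gtest.h. The function body runs in a subshell so that changing
# IFS does not leak into the calling shell.
find_gtest_headers() (
  IFS=':'
  for dir in $1; do
    if [ -n "$dir" ] && [ -f "$dir/gtest/gtest.h" ]; then
      printf '%s\n' "$dir"
    fi
  done
)

# Hypothetical usage (the variable names are assumptions):
#   find_gtest_headers "${CPATH:-}"
#   find_gtest_headers "${CPLUS_INCLUDE_PATH:-}"
```

Any directory printed that lies outside the Trilinos source tree is a candidate for the stray gtest copy.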

@bartlettroscoe
Member Author

@etphipp, the problem with setting Sacado_ENABLE_GTest=OFF is that this will disable a bunch of tests in Sacado, no? If we are going to disable tests, we should do so on the finest grained level we can. For this one tlcc2 intel-18 build error, we could just disable the building and running of that one test with:

  -DSacado_GTestSuite_EXEC_DISABLE=ON \
  -DSacado_GTestSuite_MPI_1_DISABLE=ON \

as described here:

That will disable the building and running of that test.
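
As a sketch of the fine-grained approach above, a small helper could emit both options for a given executable and test name. The function name disable_test_flags is hypothetical; the _EXEC_DISABLE and _DISABLE suffixes are taken from the two flags quoted above.

```shell
# Emit the two cache options that disable one test executable and one
# test built from it. Both names are supplied by the caller.
disable_test_flags() {
  exec_name="$1"   # e.g. Sacado_GTestSuite
  test_name="$2"   # e.g. Sacado_GTestSuite_MPI_1
  printf '%s\n' "-D${exec_name}_EXEC_DISABLE=ON"
  printf '%s\n' "-D${test_name}_DISABLE=ON"
}

# Hypothetical usage: splice the flags into a configure command.
#   cmake $(disable_test_flags Sacado_GTestSuite Sacado_GTestSuite_MPI_1) ... $TRILINOS_DIR
```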

As for the 'arm-20.1' build on 'stria', note that the yaml-cpp TPL is not being enabled, as can be seen here, which shows:

Explicitly disabled TPLs on input (by user or by default):  yaml-cpp ...  9

...

Final set of non-enabled TPLs:  ... yaml-cpp ... 101

Where are you seeing that the 'yaml-cpp' TPL is getting enabled?

Therefore, it must be picking up gtest.h from the PATH or something?

But I do see the yaml-cpp module is being loaded:

$ . cmake/std/atdm/load-env.sh Trilinos-atdm-van1-tx2_arm-20.1_openmpi-4.0.3_openmp_static_dbg
Hostname 'stria-login1' matches known ATDM host 'stria-login1' and system 'van1-tx2'
Setting compiler and build options for build-name 'Trilinos-atdm-van1-tx2_arm-20.1_openmpi-4.0.3_openmp_static_dbg'
Using ARM ATSE compiler stack ARM-20.1_OPENMPI-4.0.3 to build DEBUG code with Kokkos node type OPENMP

$ module list

Currently Loaded Modules:
  1) autotools       17) superlu_dist/5.4.0
  2) git/2.26.2      18) boost/1.72.0
  3) zlib/1.2.11     19) fftw/3.3.8
  4) bzip2/1.0.6     20) singularity/3.5.3
  5) xz/5.2.4        21) devpack-arm/20200529
  6) yaml-cpp/0.6.2  22) python/2.7.18-arm
  7) numactl/2.0.12  23) sparc-tools/aerotools/2
  8) hwloc/1.11.11   24) sparc-tools/taos
  9) pmix/2.2.3      25) binutils/2.33.1
 10) netcdf/4.6.3    26) arm/20.1
 11) pnetcdf/1.11.1  27) openucx/1.7.0
 12) phdf5/1.10.5    28) openmpi4/4.0.3
 13) cgns/3.4.0      29) armpl/20.1.0
 14) parmetis/4.0.3  30) sparc-dev/arm-20.1_openmpi-4.0.3
 15) metis/5.1.0     31) ninja/1.8.2
 16) superlu/5.2.1   32) cmake/3.12.2

and no yaml-cpp module is being loaded for the arm-20.0 env.

But we can fix that by just unloading the yaml-cpp module after loading the sparc-dev module with:

$ module unload yaml-cpp/0.6.2

which then shows:

$ module list

Currently Loaded Modules:
  1) autotools
  2) git/2.26.2
  3) zlib/1.2.11
  4) bzip2/1.0.6
  5) xz/5.2.4
  6) numactl/2.0.12
  7) hwloc/1.11.11
  8) pmix/2.2.3
  9) netcdf/4.6.3
 10) pnetcdf/1.11.1
 11) phdf5/1.10.5
 12) cgns/3.4.0
 13) parmetis/4.0.3
 14) metis/5.1.0
 15) superlu/5.2.1
 16) superlu_dist/5.4.0
 17) boost/1.72.0
 18) fftw/3.3.8
 19) singularity/3.5.3
 20) devpack-arm/20200529
 21) python/2.7.18-arm
 22) sparc-tools/aerotools/2
 23) sparc-tools/taos
 24) binutils/2.33.1
 25) arm/20.1
 26) openucx/1.7.0
 27) openmpi4/4.0.3
 28) armpl/20.1.0
 29) sparc-dev/arm-20.1_openmpi-4.0.3
 30) ninja/1.8.2
 31) cmake/3.12.2

Do those two tweaks fix the problems in these two sets of builds?
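
The unload step above could also be guarded so that it only runs when the module is actually loaded. This is a minimal sketch, assuming an environment-modules style 'module' command is available in the shell; the function name unload_if_loaded is hypothetical and is not taken from load-env.sh.

```shell
# Unload a module only if 'module list' reports it as loaded.
# Many installations print the listing on stderr, so stderr is merged
# into stdout before searching. grep -F treats the name as a fixed string.
unload_if_loaded() {
  mod="$1"
  if module list 2>&1 | grep -qF -- "$mod"; then
    module unload "$mod"
  fi
}

# Hypothetical usage, after the sparc-dev module has been loaded:
#   unload_if_loaded yaml-cpp/0.6.2
```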

@etphipp
Contributor

etphipp commented Aug 5, 2020

OK, I verified unloading the yaml module fixes the stria error, so that is an acceptable solution.

With regards to tlcc, disabling that one test has exactly the same effect as disabling GTest in Sacado (that one test is actually O(2k) individual unit tests) since that is the only test executable that uses GTest. Personally I think it is better to just disable GTest, since if another executable using GTest was ever added to Sacado, it would have to be explicitly disabled too. But regardless, I don't really care which way it is done.

@bartlettroscoe
Member Author

@etphipp

OK, I verified unloading the yaml module fixes the stria error, so that is an acceptable solution.

Okay, I will post a PR for that (after I test on 'stria').

With regards to tlcc, disabling that one test has exactly the same effect as disabling GTest in Sacado (that one test is actually O(2k) individual unit tests) since that is the only test executable that uses GTest. Personally I think it is better to just disable GTest, since if another executable using GTest was ever added to Sacado, it would have to be explicitly disabled too. But regardless, I don't really care which way it is done.

Okay, then we just need to get PR #7780 merged and then create another PR that sets Sacado_ENABLE_GTest=OFF for those 'tlcc2' builds. For that matter, can we just add a commit to PR #7780 that does that? I can take care of that if you think that is a good idea.

@etphipp
Contributor

etphipp commented Aug 5, 2020

Okay, then we just need to get PR #7780 merged and then create another PR that sets Sacado_ENABLE_GTest=OFF for those 'tlcc2' builds. For that matter, can we just add a commit to PR #7780 that does that? I can take care of that if you think that is a good idea.

That's fine with me.

bartlettroscoe added a commit to bartlettroscoe/Trilinos that referenced this issue Aug 5, 2020
This was being unloaded for the arm-20.0 env already!  This should hopefully
fix the Sacado build problem reported in trilinos#7778.
trilinos-autotester added a commit that referenced this issue Aug 5, 2020
…1-tx2-no-yaml-cpp

Automatically Merged using Trilinos Pull Request AutoTester
PR Title: ATDM: Unload yaml-cpp module for arm-20.1 env (#7778)
PR Author: bartlettroscoe
bartlettroscoe added a commit to etphipp/Trilinos that referenced this issue Aug 5, 2020
Code related to this test does not build with this intel-18.0.0.20170811
compiler.
@bartlettroscoe
Member Author

FYI: PR #7784 just merged, so the build errors in the 'van1-tx2' builds should be gone tomorrow. The updated PR #7780 should be ready to merge. It just needs to run through and pass PR testing.

@bartlettroscoe
Member Author

FYI: PR #7780 just merged. With that, these Sacado build errors should be cleaned up tomorrow. Putting in review and will close once we get back results from CDash tomorrow. (NOTE: @rmmilewi's tool Grover will automatically update issues like this in the near future.)

@bartlettroscoe
Member Author

FYI: I added tracking of this test to the drivers in:

so once the Grover tool (see #3887) is deployed, then it will automatically update this issue with the status of these tests. That tool should be deployed very soon (Friday?) so I would like to leave this issue open so that it will get that update as a test case.

jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Aug 6, 2020
…s:develop' (d8a0634).

* trilinos-develop:
  Update version for release 13.1
  ATDM: Set Sacado_ENABLE_GTest=OFF for tlcc2 intel-18 (trilinos#7778)
  ATDM: Unload yaml-cpp module for arm-20.1 env (trilinos#7778)
  Sacado:  Add option for turning off Gtest if desired.
  I commented out an assert to get debug tests to run
@grover-trilinos

Test results for issue #7778 as of 2020-08-16

Tests with issue trackers Passed: twip=2
Tests with issue trackers Missing: twim=2

Detailed test results:

Tests with issue trackers Passed: twip=2

| Site | Build Name | Test Name | Status | Details | Consecutive Pass Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
|---|---|---|---|---|---|---|---|---|
| stria | Trilinos-atdm-van1-tx2_arm-20.1_openmpi-4.0.3_openmp_static_dbg | Sacado_GTestSuite_MPI_1 | Passed | Completed | 11 | 2 | 11 | #7778 |
| stria | Trilinos-atdm-van1-tx2_arm-20.1_openmpi-4.0.3_openmp_static_opt | Sacado_GTestSuite_MPI_1 | Passed | Completed | 11 | 2 | 11 | #7778 |

Tests with issue trackers Missing: twim=2

| Site | Build Name | Test Name | Status | Details | Consecutive Missing Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
|---|---|---|---|---|---|---|---|---|
| chama | Trilinos-atdm-tlcc2-intel-debug-openmp | Sacado_GTestSuite_MPI_1 | Missing | Missing | 12 | 1 | 0 | #7778 |
| skybridge | Trilinos-atdm-tlcc2-intel-opt-openmp | Sacado_GTestSuite_MPI_1 | Missing | Missing | 11 | 2 | 0 | #7778 |

This is an automated comment generated by Grover. Each week, Grover collates and reports data from CDash in an automated way to make it easier for developers to stay on top of their issues. Grover saw that there are tests being tracked on CDash that are associated with this open issue. If you have a question, please reach out to Ross. I'm just a cat.

@bartlettroscoe
Member Author

As shown above, the test Sacado_GTestSuite_MPI_1 associated with this executable is passing or is disabled (missing) in these builds and has been so for 12 days, so we can close this.

Thank you Grover!
