[Bug]: Builds but fails many (not all) tests #42

Open
mathomp4 opened this issue May 31, 2023 · 4 comments

@mathomp4

What happened?

First, my system is:

  • macOS 13.4
  • GCC 12.2
  • Open MPI 4.1.5
  • HDF5 1.10.10
  • python 3.11.3

This is part of my ongoing attempt to make neural-fortran work with pre-compiled libraries (see modern-fortran/neural-fortran#128 and modern-fortran/neural-fortran#129).

Also, I've tried this with both h5fortran 4.6.3 (the version neural-fortran points to) and 4.10.2 (the current latest release). All errors below are for 4.10.2, but I had failures with 4.6.3 as well.

Now, I have to do some weird things due to how we build HDF5 in our library stack (autotools, static, and then weird install paths on that), but if I add:

-DHDF5_ROOT="$(prefix);$(prefix)/include/hdf5;$(prefix)/include/szlib"

to the cmake line, everything seems to be found:

-- Looking for H5_HAVE_FILTER_SZIP
-- Looking for H5_HAVE_FILTER_SZIP - found
-- Looking for H5_HAVE_FILTER_DEFLATE
-- Looking for H5_HAVE_FILTER_DEFLATE - found
-- Looking for H5_HAVE_PARALLEL
-- Looking for H5_HAVE_PARALLEL - found
...
-- Looking for H5Pset_fapl_mpio
-- Looking for H5Pset_fapl_mpio - found
-- Performing Test HDF5_C_links
-- Performing Test HDF5_C_links - Success
-- Performing Test HDF5_Fortran_links
-- Performing Test HDF5_Fortran_links - Success
-- Found HDF5: /Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/aarch64-apple-darwin22.5.0/gfortran/Darwin/lib/libhdf5_hl.a;/Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/aarch64-apple-darwin22.5.0/gfortran/Darwin/lib/libhdf5.a (found version "1.10.10") found components: Fortran
...

and a make install runs fine as well.
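For anyone reproducing the configure step, here is a minimal sketch of how the HDF5_ROOT hint list above could be assembled in a shell script. The helper name `hdf5_root_arg` and the example prefix are hypothetical; only the three path components come from the report.

```shell
# Hypothetical helper: print the -DHDF5_ROOT argument for a given install prefix.
# The three entries mirror the hint list quoted in the report.
hdf5_root_arg() {
  prefix="$1"
  printf -- '-DHDF5_ROOT=%s;%s/include/hdf5;%s/include/szlib\n' \
    "$prefix" "$prefix" "$prefix"
}

# Usage (not executed here; /path/to/prefix is a placeholder):
#   cmake -B build "$(hdf5_root_arg /path/to/prefix)"
```

Passing the result inside double quotes keeps the semicolons from being read as shell command separators.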

But on ctest:

Test project /Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/src/h5fortran/build
      Start  1: minimal
 1/27 Test  #1: minimal ..........................   Passed    0.70 sec
      Start  2: array
 2/27 Test  #2: array ............................***Failed    0.28 sec
      Start  3: attributes
 3/27 Test  #3: attributes .......................***Failed    0.27 sec
      Start 24: PythonAttributes
 4/27 Test #24: PythonAttributes .................   Passed    0.13 sec
      Start  4: attributes_read
 5/27 Test  #4: attributes_read ..................   Passed    0.27 sec
      Start  5: cast
 6/27 Test  #5: cast .............................***Failed    0.28 sec
      Start  6: deflate_write
 7/27 Test  #6: deflate_write ....................***Failed    0.26 sec
      Start  7: deflate_read
Failed test dependencies: deflate_write
 8/27 Test  #7: deflate_read .....................***Not Run   0.00 sec
      Start  8: deflate_props
Failed test dependencies: deflate_write
 9/27 Test  #8: deflate_props ....................***Not Run   0.00 sec
      Start  9: destructor
10/27 Test  #9: destructor .......................   Passed    0.26 sec
      Start 10: exist
11/27 Test #10: exist ............................***Failed    0.26 sec
      Start 11: fill
12/27 Test #11: fill .............................***Failed    0.25 sec
      Start 12: groups
13/27 Test #12: groups ...........................***Failed    0.26 sec
      Start 20: write
14/27 Test #20: write ............................   Passed    0.25 sec
      Start 13: layout
15/27 Test #13: layout ...........................***Failed    0.25 sec
      Start 14: lt
16/27 Test #14: lt ...............................***Failed    0.25 sec
      Start 15: scalar
17/27 Test #15: scalar ...........................   Passed    0.26 sec
      Start 16: shape
18/27 Test #16: shape ............................   Passed    0.25 sec
      Start 17: string
19/27 Test #17: string ...........................***Failed    0.26 sec
      Start 26: PythonString
20/27 Test #26: PythonString .....................   Passed    0.12 sec
      Start 18: string_read
21/27 Test #18: string_read ......................   Passed    0.28 sec
      Start 19: version
22/27 Test #19: version ..........................   Passed    0.25 sec
      Start 21: fail_read_size_mismatch
23/27 Test #21: fail_read_size_mismatch ..........   Passed    0.26 sec
      Start 22: fail_read_rank_mismatch
24/27 Test #22: fail_read_rank_mismatch ..........   Passed    0.25 sec
      Start 23: fail_nonexist_variable
25/27 Test #23: fail_nonexist_variable ...........   Passed    0.25 sec
      Start 25: PythonShape
26/27 Test #25: PythonShape ......................   Passed    0.13 sec
      Start 27: h5ls
27/27 Test #27: h5ls .............................***Not Run (Disabled)   0.00 sec

54% tests passed, 12 tests failed out of 26

Label Time Summary:
h5fortran    =   6.28 sec*proc (27 tests)
python       =   0.38 sec*proc (3 tests)
shaky        =   0.76 sec*proc (3 tests)

Total Test time (real) =   6.29 sec

The following tests did not run:
	 27 - h5ls (Disabled)

The following tests FAILED:
	  2 - array (Failed)
	  3 - attributes (Failed)
	  5 - cast (Failed)
	  6 - deflate_write (Failed)
	  7 - deflate_read (Not Run)
	  8 - deflate_props (Not Run)
	 10 - exist (Failed)
	 11 - fill (Failed)
	 12 - groups (Failed)
	 13 - layout (Failed)
	 14 - lt (Failed)
	 17 - string (Failed)
Errors while running CTest
Output from these tests are in: /Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/src/h5fortran/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

I'll paste below the --rerun-failed --output-on-failure output.

Any ideas what is happening? The weird thing is, I'm pretty sure I got this working a few weeks back. I had to move on to other projects, but I had time today and...not working. You can see below that there seem to be a lot of:

ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

errors.
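To see which tests hit that message, the verbose ctest output can be filtered. This is a rough sketch, assuming the output has the same layout as the log in this report; the function name `tests_with_init_error` is made up for illustration.

```shell
# Hypothetical helper: read verbose ctest output on stdin and print the name
# of each test whose output contains the HDF5 initialization error.
tests_with_init_error() {
  awk '
    # Result lines look like " 1/13 Test  #2: array ....***Failed"; field 4 is the name.
    / Test +#[0-9]+: /        { name = $4 }
    # The error line follows the result line of the test that produced it.
    /HDF5 library initialize/ { print name }
  '
}

# Usage (not executed here):
#   ctest --rerun-failed --output-on-failure | tests_with_init_error
```

Applied to the log in this report, it should list the ten tests marked Failed, while the two Not Run tests are skipped because they never started.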

Relevant log output

Test project /Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/src/h5fortran/build
      Start  2: array
 1/13 Test  #2: array ............................***Failed    0.01 sec
  1  2  3  4
  2  4  6  8
  3  6  9 12
  4  8 12 16
 PASSED: array write
 PASSED: slice read
 PASSED: create dataset and write slice 1D
 PASSED: overwrite slice 1d, stride=1
 PASSED: overwrite slice 1d, no stride
 h5fortran:TRACE:create: deflate: /int32a-2d
 create and write slice 2d, stride=1
 PASSED: slice write
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start  3: attributes
 2/13 Test  #3: attributes .......................***Failed    0.01 sec
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start  5: cast
 3/13 Test  #5: cast .............................***Failed    0.01 sec
OK: cast write
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start  6: deflate_write
 4/13 Test  #6: deflate_write ....................***Failed    0.01 sec
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start  7: deflate_read
Failed test dependencies: deflate_write
 5/13 Test  #7: deflate_read .....................***Not Run   0.00 sec
      Start  8: deflate_props
Failed test dependencies: deflate_write
 6/13 Test  #8: deflate_props ....................***Not Run   0.00 sec
      Start 10: exist
 7/13 Test #10: exist ............................***Failed    0.01 sec
 OK: is_hdf5
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start 11: fill
 8/13 Test #11: fill .............................***Failed    0.01 sec
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start 12: groups
 9/13 Test #12: groups ...........................***Failed    0.01 sec
 OK: HDF5 group
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start 20: write
10/13 Test #20: write ............................   Passed    0.01 sec
      Start 13: layout
11/13 Test #13: layout ...........................***Failed    0.01 sec
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start 14: lt
12/13 Test #14: lt ...............................***Failed    0.01 sec
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start 17: string
13/13 Test #17: string ...........................***Failed    0.01 sec
 OK: HDF5 string write
 OK: HDF5 string read
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize


8% tests passed, 12 tests failed out of 13

Label Time Summary:
h5fortran    =   0.13 sec*proc (13 tests)

Total Test time (real) =   0.13 sec

The following tests FAILED:
	  2 - array (Failed)
	  3 - attributes (Failed)
	  5 - cast (Failed)
	  6 - deflate_write (Failed)
	  7 - deflate_read (Not Run)
	  8 - deflate_props (Not Run)
	 10 - exist (Failed)
	 11 - fill (Failed)
	 12 - groups (Failed)
	 13 - layout (Failed)
	 14 - lt (Failed)
	 17 - string (Failed)
Errors while running CTest
@mathomp4 mathomp4 added the bug Something isn't working label May 31, 2023
@scivision
Member

This seems to be a bug with specific versions of the HDF5 library: 1.10.10 and 1.14.0 so far. I am trying to figure out a solution.

@scivision scivision self-assigned this Jul 24, 2023
@scivision scivision pinned this issue Jul 24, 2023
@scivision
Member

I think this was fixed in HDF5 1.14.2. Does it work for you now?

@mathomp4
Author

I'll try and see. I've been holding off on moving to HDF5 1.14 (as we have some code that assumes 1.10). But maybe this is the time to move forward...

@scivision
Member

I added an option to force a local HDF5 build (under the build directory):

cmake -Bbuild -Dfind=no

cmake --build build

That would build HDF5 1.14.2 and then h5fortran.
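A sketch of how those two commands could be wrapped with a test run afterwards. The `-Dfind=no` option is the one mentioned above; the `run` and `build_with_local_hdf5` helpers and the `DRY_RUN` variable are hypothetical, and `ctest --test-dir` assumes CMake 3.20 or newer.

```shell
# Hypothetical wrapper: echo the command when DRY_RUN is set, otherwise run it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Configure with the local HDF5 build forced, build, then run the tests.
build_with_local_hdf5() {
  build_dir="${1:-build}"
  run cmake -B "$build_dir" -Dfind=no &&
    run cmake --build "$build_dir" &&
    run ctest --test-dir "$build_dir" --output-on-failure
}

# Usage (not executed here):
#   build_with_local_hdf5 build
```

Setting `DRY_RUN=1` first prints the three commands without running them, which is handy for checking the sequence before a long HDF5 build.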
