Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: Add HDF5 to a windows build #3732

Merged

Conversation

scottwittenburg
Copy link
Collaborator

@scottwittenburg scottwittenburg commented Aug 1, 2023

Turns on HDF5 for windows builds using the vs2022 compiler.

Fixes: #3662

@scottwittenburg
Copy link
Collaborator Author

The build turned up warnings (which were treated as errors) like this one:

  D:\a\ADIOS2\ADIOS2\source\source\adios2\engine\hdf5\HDF5ReaderP.cpp(266,1): error C2220: the following warning is treated as an error [D:\a\ADIOS2\ADIOS2\win2022-vs2022-ompi\source\adios2\adios2_hdf5.vcxproj]
  ...
  D:\a\ADIOS2\ADIOS2\source\source\adios2\engine\hdf5\HDF5ReaderP.cpp(266,1): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\a\ADIOS2\ADIOS2\win2022-vs2022-ompi\source\adios2\adios2_hdf5.vcxproj]
  D:\a\ADIOS2\ADIOS2\source\source\adios2\engine\hdf5\HDF5ReaderP.cpp(186,1): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\a\ADIOS2\ADIOS2\win2022-vs2022-ompi\source\adios2\adios2_hdf5.vcxproj]

Looking at the code, I wonder why in VariableBase the m_StepStart is defined as size_t, while the HDF5Common api thinks of steps as int? Should steps be considered as size_t across the entire codebase? Or should we just continue with the implicit conversion but do some runtime check/assert first? Or something else?

@vicentebolea

@vicentebolea
Copy link
Collaborator

vicentebolea commented Aug 2, 2023

Looking at the code, I wonder why in VariableBase the m_StepStart is defined as size_t, while the HDF5Common api thinks of steps as int? Should steps be considered as size_t across the entire codebase? Or should we just continue with the implicit conversion but do some runtime check/assert first? Or something else?

There reason might be to make it interoperable with HDF5, @guj more expertise in those modules.

@scottwittenburg scottwittenburg force-pushed the ci-add-hdf5-to-windows branch 5 times, most recently from d398627 to 7e3b840 Compare August 18, 2023 00:58
@guj
Copy link
Contributor

guj commented Aug 18, 2023

Looking at the code, I wonder why in VariableBase the m_StepStart is defined as size_t, while the HDF5Common api thinks of steps as int? Should steps be considered as size_t across the entire codebase? Or should we just continue with the implicit conversion but do some runtime check/assert first? Or something else?

There reason might be to make it interoperable with HDF5, @guj more expertise in those modules.

Sorry I just saw this. Scott is right, steps should be size_t. I will correct it in HDF5Common.* in another pull request.

@scottwittenburg
Copy link
Collaborator Author

Sorry I just saw this. Scott is right, steps should be size_t. I will correct it in HDF5Common.* in another pull request.

Sounds good, thanks @guj!

@guj
Copy link
Contributor

guj commented Aug 27, 2023

Sorry I just saw this. Scott is right, steps should be size_t. I will correct it in HDF5Common.* in another pull request.

Sounds good, thanks @guj!

@guj guj closed this Aug 27, 2023
@scottwittenburg
Copy link
Collaborator Author

@guj I think we still want this, since it will allow testing HDF5+windows in CI. I'm re-opening and will rebase on the latest master, which should allow verifying the fixes you merged last week.

@scottwittenburg
Copy link
Collaborator Author

@vicentebolea I can reproduce the 6 errors in the Windows/VS2022 builds locally, and I added the latest WIP commit to try to see what's happening in CI when the test passes vs fails. I'm guessing the spack-installed hdf5 in our ubuntu images is missing the extra COMPONENT=HL I had to add for debugging, so those all failed to configure in the latest builds.

But still I can see the debug output from the failed tests as well as from at least one passing. The test failure I instrumented with extra debug output is happening at this line. And what's weird to me is that the h5Type variable which is supposed to equal H5T_NATIVE_LONG has the same description (H5T_STD_I64LE) in both the failed and passing tests.

The HDF5 docs have a table (a little ways down here) which says that the "native" type H5T_NATIVE_LONG corresponds to, among a few other "standard" types, H5T_STD_I64LE. So I don't understand why that isn't holding true for Windows/VS2022.

If you or @guj have any thoughts about what could be going on, please let me know. Thanks!

@guj
Copy link
Contributor

guj commented Aug 31, 2023

@vicentebolea I can reproduce the 6 errors in the Windows/VS2022 builds locally, and I added the latest WIP commit to try to see what's happening in CI when the test passes vs fails. I'm guessing the spack-installed hdf5 in our ubuntu images is missing the extra COMPONENT=HL I had to add for debugging, so those all failed to configure in the latest builds.

But still I can see the debug output from the failed tests as well as from at least one passing. The test failure I instrumented with extra debug output is happening at this line. And what's weird to me is that the h5Type variable which is supposed to equal H5T_NATIVE_LONG has the same description (H5T_STD_I64LE) in both the failed and passing tests.

The HDF5 docs have a table (a little ways down here) which says that the "native" type H5T_NATIVE_LONG corresponds to, among a few other "standard" types, H5T_STD_I64LE. So I don't understand why that isn't holding true for Windows/VS2022.

If you or @guj have any thoughts about what could be going on, please let me know. Thanks!

Thanks @scottwittenburg for checking on windows!!
Looking from these two docs:
https://learn.microsoft.com/en-us/windows/win32/winprog/windows-data-types
https://learn.microsoft.com/en-us/cpp/c-runtime-library/standard-types?view=msvc-170
My guess is, 64 bit integer type H5T_STD_I64LE is treated as long long.
so asserting with H5T_NATIVE_LLONG might pass for windows.

similarly for the later assertion of type H5T_NATIVE_ULONG ..

@scottwittenburg
Copy link
Collaborator Author

so asserting with H5T_NATIVE_LLONG might pass for windows, similarly for the later assertion of type H5T_NATIVE_ULONG ..

You were correct @guj, thank you for your help 😁

In 2e37421 I've taken a stab at what you suggested, and it has resolved the remaining windows+hdf5 failures. If you get a chance, can you take a look and make sure that fix looks good to you? Thanks again!

@scottwittenburg
Copy link
Collaborator Author

@vicentebolea I think this is finally going to pass all the tests 😅 Can you please review when you get a chance? Thanks!

@guj
Copy link
Contributor

guj commented Aug 31, 2023

so asserting with H5T_NATIVE_LLONG might pass for windows, similarly for the later assertion of type H5T_NATIVE_ULONG ..

You were correct @guj, thank you for your help 😁

In 2e37421 I've taken a stab at what you suggested, and it has resolved the remaining windows+hdf5 failures. If you get a chance, can you take a look and make sure that fix looks good to you? Thanks again!

Looks good to me! Thanks @scottwittenburg !

Copy link
Collaborator

@vicentebolea vicentebolea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good. I proposed a change in one the examples, apply the suggestion if you think that it makes sense and I will approve again.

In any case already approved.

examples/heatTransfer/write/IO_hdf5_a.cpp Show resolved Hide resolved
@scottwittenburg
Copy link
Collaborator Author

@vicentebolea The crusher-kokkos-hip build failed with:

slurmstepd: error: *** JOB 367818 ON crusher189 CANCELLED AT 2023-08-31T15:37:19 DUE TO TIME LIMIT ***

Have you seen that before? Should I just retry that job?

@vicentebolea
Copy link
Collaborator

@vicentebolea The crusher-kokkos-hip build failed with:

slurmstepd: error: *** JOB 367818 ON crusher189 CANCELLED AT 2023-08-31T15:37:19 DUE TO TIME LIMIT ***

Have you seen that before? Should I just retry that job?

Hiccups at the olcf ci, i will retry the job a bit later

@vicentebolea
Copy link
Collaborator

@scottwittenburg CI passing

@vicentebolea
Copy link
Collaborator

@scottwittenburg feel free to merge if you see fit, approved :)

@scottwittenburg scottwittenburg merged commit fe61f77 into ornladios:master Sep 5, 2023
@scottwittenburg scottwittenburg deleted the ci-add-hdf5-to-windows branch September 5, 2023 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add HDF5 on the win builds
3 participants