-
Notifications
You must be signed in to change notification settings - Fork 559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new NetCDF_jll v400.702.402+0 broken on Windows #4511
Comments
Smells of upstream bug to me (and given the track record of troubles with this library I'm not even surprised) |
I retried to make a NetCDF 4.8.1 binary (with HDF5 1.12.1) and we got the same |
There is no concept of platform-specificity in the registry nor the package manager, so that's unrealistic |
Should this release be yanked then from all platforms? Unfortunately, this will remove the Apple M1 binary, but Windows is pretty common ... (but the real solution we an updated working binary for all platforms). |
That may be necessary, yes. But it'd also be great to understand why the Windows build is broken again. We haven't touched the mingw toolchain for quite some time, the hdf5 source is always the same (just a newer version maybe?), version of netcdf source is also the same as the last working version, no? However, please make sure no dependents of netcdf_jll require the version you want to yank, otherwise you'll make those packages completely broken |
It might be related to the HDF5_jll update. HDF5.jl runs fine but maybe an issue with the header files in HDF5_jll?
yes, that is exactly the same version of NetCDF sources. If we yank NetCDF_jll 400.702.402+0 we will fall-back to 400.702.400+0, I guess we will have a problem with I did not see any other package incompatible with
|
The funny thing is that Windows is the only platform for which we always used the same source: the msys2 libraries. |
Yes, indeed. I was wondering if there is reason to use msys2 over conda-forge on Windows for HDF5... |
Msys2 is probably more consistent with the toolchain we use here (mingw) and we still need to provide the runtime dependencies. You'd need to investigate whether conda-forge build for Windows is compatible with our other libraries. |
You are right, conda-forge uses the Windows Visual C/C++ compiler (https://conda-forge.org/docs/maintainer/knowledge_base.html) Given that TempestRemap_jll depends on NetCDF_jll 400.702.402+0, does this means that we out of luck with the yanking this NetCDF_jll version? Unfortunately, the diffs in the headers file quite massive (for a patch release!):
|
As an alternative to yanking this version, could we also release a NetCDF 4.8.1 binary (which I got to build after applying several patches) with the "old" HDF5 1.12.0 ? |
With HDF5 1.12.2 from https://packages.msys2.org/package/mingw-w64-x86_64-hdf5, I don't see this error anymore. I guess we would be able to upgrade HDF5 to 1.12.2 once this is merged: |
That's great news! Only 10 pending reviewers I see, haha. |
So I helped a bit to land conda-forge/hdf5-feedstock#175, updating HDF5_jll to 1.12.2 in #5248. This however needed to be yanked from the registry again after finding out that #5249 conda-forge now builds against a libcurl version that is too new for us. Does anyone have ideas how to get our hands on HDF5 1.12.2 builds for these platforms: Yggdrasil/H/HDF5/build_tarballs.jl Lines 20 to 25 in ed865e2
See also #5249 (comment), in case we can use conda infrastructure. |
@visr thanks a lot for your work in updating HDF5 1.12.2. Too bad that we have now this libcurl issue, apparently only on MacOS). (I had a similar issue recently: #5031, but I think it is unrelated). I am not sure how to proceed. What about these possibilities:
Would option 1 or 2 have a chance to get accepted? |
For me personally (1) sounds like a good quick fix that I had not thought of. Since the difference is only a patch release it might be acceptable, though I'm curious to see what @giordano thinks. Here are the patch release notes: https://www.hdfgroup.org/2022/04/release-of-hdf5-1-12-2-newsletter-183/. (2) sounds like a good medium term solution to get more platforms supported. Will probably be some work to organize though. Using julia's buildkite for this might make it easier to get more platforms in one setup. I wonder how that effort compares to getting HDF5 to cross compile at this point (different skills though). |
I'm not a huge fan of either solution (especially mixing and lying about version numbers), but hey, don't we lie all the time? (but at least we don't usually mix different versions....). My problem with 2 is who's going to maintain that? Certaintly not me, I can barely keep up with one project, another one using tools I'm completely unfamiliar with is out of reach for me at the moment (moment which will last fairly long). If it's someone else who ensures everything works fine and here we only need to click on merge, then it's ok.
If you want to understand what the error is about: https://github.com/giordano/macos-compatibility-version |
While keeping the MinGW binaries at 1.12.2, for libnetcdf compatibility. See discussion at JuliaPackaging#4511 (comment).
I submitted option 1 from @Alexander-Barth in #5251. Nobody loves this solution I think, but it should give us a build that we can work with for now, buying us time to get to better solutions. |
For the record: I tried to compile a Linux binary for HDF5 within binary builder (first part of option 2). Unfortunately, it turned out more complicated than I thought (as usual). In fact, the build system uses x86_64-pc-linux-musl and the target is x86_64-linux-gnu, so the build system considers this as cross-compilation HDF5 and fails with:
(even when setting I have seen that conda-forge was able to cross-compile HDF5 for MacOS - aarch64. I guess that this patch would be allow us to by-pass this configure test: Unfortunately, this patch is quite complicated and against an automatically generated file ( |
Yes, cross compilation is challenging with HDF5. The main issue is that it requires one to obtain configuration from the target platform by executing test programs on that platform. Last time I looked into this, I think could be done via the CMakeCache. It probably should be changed so that we can figure out the calculation by just compiling a program since the compiler already knows many of these configuration details. See https://forum.hdfgroup.org/t/cross-compiling-for-windows/6735/6 |
While keeping the MinGW binaries at 1.12.2, for libnetcdf compatibility. See discussion at #4511 (comment).
I tried to compile a sample HDF5 program from https://github.com/HDFGroup/hdf5-examples within BinaryBuilder , but I got a segmentation fault. Here are my steps: From a BinaryBuilder session (with mingw64 target):
I transferred the binary
The example does work on my Linux system and on Windows when compiled natively with MSYS2. Should this test not also succeed on Windows with cross-compilation using BinaryBuilder? |
If I also extract
The example program seems to abort at this line: |
I'm getting confused. The HDF5 libraries are coming from msys2: Yggdrasil/H/HDF5/build_tarballs.jl Lines 15 to 18 in f81af38
Shouldn't you be able to grab that package from msys2 and compile within msys2? |
It is this package exactly: |
Yes, I am confused too, HDF5/NetCDF binaries has been a never-ending stream of moments of confusion :-) The checksums of the HDF5 lib for native compilation in MSYS and BinaryBuilder are identical. I mentioned a similar problem here: Version of GCC is different (12.1 in MSYS, 4.8.5 in BinaryBuilder)... The error is also reproducible with this smaller example This example stops at H5Dcreate when running a cross-compiled binary: |
Yes, I installed this library using the package manager of MSYS (pacman). |
We can specify a GCC version in BinaryBuilder. What I would like to know is if you can compile and run the example completely within msys2. If so, then what is the difference between running it within msys2 and outside msys2? |
Yes, this what I referred as native compilation before.
The difference from what I have seen, is that when you run the binary within msys for any missing DLL (like libwinpthread-1.dll), the DLL at the standard location from MSYS will be used while when you run it outside of MSYS2 an error message is produced. Within MSYS you might get silently get an incompatible (or at least different) DLL. I run only the cross-compiled version outside of MSYS. |
As a test, I tried to cross-compile with the mingw cross compile from Ubuntu 20.04:
I got the HDF5 dll from https://mirror.msys2.org/mingw/mingw64/mingw-w64-x86_64-hdf5-1.12.2-1-any.pkg.tar.zst, extracted in
Surprisingly, the cross compiled binary using the Ubuntu tool chain worked on Windows! So maybe it is a (cross-)compiler bug in gcc triggered by a change in HDF5 ? It is possible to update the compiler version in BinaryBuilder? We use currently version gcc 4.8.5 from 2015. |
Oh nice find! Are you looking for Yggdrasil/G/GDAL/build_tarballs.jl Lines 129 to 130 in 3af3a11
|
Perfect! Yes, indeed with
x64_64 Linux and MacOS work too. I will make a PR soon! |
Oh that's awesome. So in the end it came down to a compiler bug. I guess we should've tried newer GCC sooner haha. It never occured to me. Thanks for the great effort! |
I did not realize that it was so easy to change compiler version. I assumed somehow that we use the same version across all packages. It is great to know that we have to flexibility. |
Yeah I believe for maximum compatibility with old systems and clusters the default GCC is so old. Though if that doesn't work, it can be bumped. |
@Alexander-Barth right now there are three separate build scripts, for julia 1.3+, 1.6+ and 1.8+. HDF5_jll and many others have stopped building new versions for 1.3+, shall we drop it as well? Do you know if having separate build scripts for 1.6+ and 1.8+ is required? This is the only difference: Lines 106 to 116 in df11ddf
and I don't see any other packages requiring libcurl 7.84. Using |
@visr You are right, we can consolidate all this (and drop julia 1.6). I just made a test on Linux/Windows/MacOS (x86_64) and julia 1.6/1.8 and all tests pass: https://github.com/Alexander-Barth/NCDatasets.jl/actions/runs/2860834779 I made a PR here: #5319 with a single build script for current julia versions. |
Oh that's great news, that it works and that all tests pass! Should hopefully make it a bit easier to maintain as well. |
I think that this can be closed with the release of NetCDF_jll v400.902.5+1. |
Unfortunately, the new NetCDF_jll v400.702.402+0 does not work on Windows (as far as I know Linux x86_64 and apple M1 are fine).
The new version of NetCDF_jll was created in this commit:
#4481
The errors seem to be related to the upgrade of HDF5_jll v1.12.1+0.
Related bug reports:
Alexander-Barth/NCDatasets.jl#164
JuliaGeo/NetCDF.jl#151
This is the full error message when a Windows user (on julia 1.7.2) creates a NetCDF (with HDF5 backend) as reported @visr is below.
Is there a way to test via CI natively the library before releasing them?
The text was updated successfully, but these errors were encountered: