new NetCDF_jll v400.702.402+0 broken on Windows #4511

Closed
Alexander-Barth opened this issue Feb 27, 2022 · 40 comments
Comments

@Alexander-Barth
Contributor

Alexander-Barth commented Feb 27, 2022

Unfortunately, the new NetCDF_jll v400.702.402+0 does not work on Windows (as far as I know, Linux x86_64 and Apple M1 are fine).

The new version of NetCDF_jll was created in this commit:
#4481

The errors seem to be related to the upgrade of HDF5_jll v1.12.1+0.

Related bug reports:
Alexander-Barth/NCDatasets.jl#164
JuliaGeo/NetCDF.jl#151

Below is the full error message that a Windows user (on julia 1.7.2) gets when creating a NetCDF file (with the HDF5 backend), as reported by @visr.

Is there a way to natively test the libraries via CI before releasing them?

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x2374cb3 -- .text at C:\Users\visser_mn\.julia\artifacts\2b6e2ce84250e36811c3019c1ad253c1739c888f\bin\libnetcdf-18.dll (unknown line)
in expression starting at REPL[13]:1
.text at C:\Users\visser_mn\.julia\artifacts\2b6e2ce84250e36811c3019c1ad253c1739c888f\bin\libnetcdf-18.dll (unknown line)
NC4_create at C:\Users\visser_mn\.julia\artifacts\2b6e2ce84250e36811c3019c1ad253c1739c888f\bin\libnetcdf-18.dll (unknown line)
NC_create at C:\Users\visser_mn\.julia\artifacts\2b6e2ce84250e36811c3019c1ad253c1739c888f\bin\libnetcdf-18.dll (unknown line)
nc__create at C:\Users\visser_mn\.julia\artifacts\2b6e2ce84250e36811c3019c1ad253c1739c888f\bin\libnetcdf-18.dll (unknown line)
nc_create at C:\Users\visser_mn\.julia\artifacts\2b6e2ce84250e36811c3019c1ad253c1739c888f\bin\libnetcdf-18.dll (unknown line)
top-level scope at .\REPL[13]:1
jl_toplevel_eval_flex at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:876
jl_toplevel_eval_flex at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:830
jl_toplevel_eval_flex at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:830
jl_toplevel_eval at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:894 [inlined]
jl_toplevel_eval_in at /cygdrive/c/buildbot/worker/package_win64/build/src\toplevel.c:944
eval at .\boot.jl:373 [inlined]
eval_user_input at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\REPL\src\REPL.jl:150
repl_backend_loop at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\REPL\src\REPL.jl:246
start_repl_backend at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\REPL\src\REPL.jl:231
#run_repl#47 at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\REPL\src\REPL.jl:364
run_repl at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\REPL\src\REPL.jl:351
#930 at .\client.jl:394
jfptr_YY.930_36349.clone_1 at C:\Users\visser_mn\.julia\juliaup\julia-1.7.2+0~x64\lib\julia\sys.dll (unknown line)
jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
jl_f__call_latest at /cygdrive/c/buildbot/worker/package_win64/build/src\builtins.c:757
#invokelatest#2 at .\essentials.jl:716 [inlined]
invokelatest at .\essentials.jl:714 [inlined]
run_main_repl at .\client.jl:379
exec_options at .\client.jl:309
_start at .\client.jl:495
jfptr__start_21275.clone_1 at C:\Users\visser_mn\.julia\juliaup\julia-1.7.2+0~x64\lib\julia\sys.dll (unknown line)
jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
true_main at /cygdrive/c/buildbot/worker/package_win64/build/src\jlapi.c:559
jl_repl_entrypoint at /cygdrive/c/buildbot/worker/package_win64/build/src\jlapi.c:701
mainCRTStartup at /cygdrive/c/buildbot/worker/package_win64/build/cli\loader_exe.c:42
BaseThreadInitThunk at C:\WINDOWS\System32\KERNEL32.DLL (unknown line)
RtlUserThreadStart at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
Allocations: 9681000 (Pool: 9675485; Big: 5515); GC: 13
@giordano
Member

Smells of upstream bug to me (and given the track record of troubles with this library I'm not even surprised)

@Alexander-Barth
Contributor Author

Alexander-Barth commented Mar 3, 2022

I tried again to build a NetCDF 4.8.1 binary (with HDF5 1.12.1) and got the same EXCEPTION_ACCESS_VIOLATION on Windows. However, if we downgrade HDF5 to 1.12.0 (without Apple M1 support), NetCDF 4.8.1 and 4.7.4 work on Windows. For Linux, all tested combinations seem to work. Is it possible to release different versions for different platforms? Can we yank only the Windows version of NetCDF_jll v400.702.402+0?

@giordano
Member

giordano commented Mar 3, 2022

> Can we yank only the Windows version of NetCDF_jll v400.702.402+0?

There is no concept of platform-specificity in the registry nor the package manager, so that's unrealistic

@Alexander-Barth
Contributor Author

Should this release then be yanked from all platforms? Unfortunately, this will remove the Apple M1 binary, but Windows is pretty common ... (though the real solution would be an updated, working binary for all platforms).

@giordano
Member

That may be necessary, yes. But it'd also be great to understand why the Windows build is broken again. We haven't touched the mingw toolchain for quite some time, the hdf5 source is always the same (just a newer version maybe?), version of netcdf source is also the same as the last working version, no?

However, please make sure no dependents of netcdf_jll require the version you want to yank, otherwise you'll make those packages completely broken

@Alexander-Barth
Contributor Author

> But it'd also be great to understand why the Windows build is broken again.

It might be related to the HDF5_jll update. HDF5.jl runs fine, but maybe there is an issue with the header files in HDF5_jll?
The error does not occur in NetCDF when you do not use the HDF5 backend format.

> version of netcdf source is also the same as the last working version, no?

yes, that is exactly the same version of NetCDF sources.

If we yank NetCDF_jll 400.702.402+0, we will fall back to 400.702.400+0. I guess we will then have a problem with the latest version of TempestRemap_jll:

59d2816

I did not see any other package that would become incompatible:

 grep -r  NetCDF_jll .
./T/TempestModel_jll/Compat.toml:NetCDF_jll = "400.701.400-400.799"
./T/TempestModel_jll/Deps.toml:NetCDF_jll = "7243133f-43d8-5620-bbf4-c2c921802cf3"
./T/TempestRemap_jll/Compat.toml:NetCDF_jll = "400.701.400-400.799"
./T/TempestRemap_jll/Compat.toml:NetCDF_jll = "400.702.402-400.799"
./T/TempestRemap_jll/Deps.toml:NetCDF_jll = "7243133f-43d8-5620-bbf4-c2c921802cf3"
./I/IOAPI_jll/Compat.toml:NetCDF_jll = "400.701.400-400.799"
./I/IOAPI_jll/Deps.toml:NetCDF_jll = "7243133f-43d8-5620-bbf4-c2c921802cf3"
./N/NetcdfIO/Compat.toml:NetCDF_jll = "400.701.400-400.702.400"
./N/NetcdfIO/Deps.toml:NetCDF_jll = "7243133f-43d8-5620-bbf4-c2c921802cf3"
./N/NetCDF/Compat.toml:NetCDF_jll = "4.7.4-4"
./N/NetCDF/Compat.toml:NetCDF_jll = "400.701.400-400"
./N/NetCDF/Deps.toml:NetCDF_jll = "7243133f-43d8-5620-bbf4-c2c921802cf3"
./N/NCDatasets/Compat.toml:NetCDF_jll = "4.7.4-4"
./N/NCDatasets/Compat.toml:NetCDF_jll = "400.701.400-400"
./N/NCDatasets/Compat.toml:NetCDF_jll = ["400.701.400", "400.702.400"]
./N/NCDatasets/Deps.toml:NetCDF_jll = "7243133f-43d8-5620-bbf4-c2c921802cf3"
./N/NetCDFF_jll/Compat.toml:NetCDF_jll = "400.701.400-400.799"
./N/NetCDFF_jll/Deps.toml:NetCDF_jll = "7243133f-43d8-5620-bbf4-c2c921802cf3"
./N/NetCDF_jll/Package.toml:name = "NetCDF_jll"
./N/NetCDF_jll/Package.toml:repo = "https://github.com/JuliaBinaryWrappers/NetCDF_jll.jl.git"
./M/MDAL_jll/Compat.toml:NetCDF_jll = "400.701.400-400.799"
./M/MDAL_jll/Deps.toml:NetCDF_jll = "7243133f-43d8-5620-bbf4-c2c921802cf3"
./V/VMEC_jll/Compat.toml:NetCDF_jll = "400.702.400-400.799"
./V/VMEC_jll/Deps.toml:NetCDF_jll = "7243133f-43d8-5620-bbf4-c2c921802cf3"
./Registry.toml:7243133f-43d8-5620-bbf4-c2c921802cf3 = { name = "NetCDF_jll", path = "N/NetCDF_jll" }
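
The manual check above can also be scripted. The following is a minimal sketch, not part of any registry tooling: the function name `floor_at_or_above` is made up for illustration, it assumes a `sort` that supports `-V` (GNU coreutils or BusyBox), and it only looks at the first quoted version range on each grep-style line. A range is flagged when its lower bound is at or above the version that would be yanked. Real compat resolution is more involved (Compat.toml entries are grouped by package version), so the output is a hint rather than a verdict.

```shell
# Sketch: print grep-style compat lines whose lower bound is at or above
# the version that would be yanked. Reads lines on stdin.
# Assumption: `sort -V` (version sort) is available.
floor_at_or_above() (
    yanked=$1
    while IFS= read -r line; do
        # Extract the first double-quoted string on the line.
        spec=$(printf '%s\n' "$line" | sed -n 's/^[^"]*"\([^"]*\)".*$/\1/p')
        # The lower bound is everything before the first "-".
        lower=${spec%%-*}
        # Skip anything that is not a plain dotted version (UUIDs, names, URLs).
        case $lower in
            '' | *[!0-9.]*) continue ;;
        esac
        # If the yanked version sorts first (or ties), the lower bound is >= it.
        first=$(printf '%s\n%s\n' "$yanked" "$lower" | sort -V | head -n 1)
        if [ "$first" = "$yanked" ]; then
            printf '%s\n' "$line"
        fi
    done
)
```

Piping the `grep -r NetCDF_jll .` output into `floor_at_or_above 400.702.402` should leave only the TempestRemap_jll entry with the `400.702.402-400.799` range, which matches the concern raised above.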

@giordano
Member

> It might be related to the HDF5_jll update.

The funny thing is that Windows is the only platform for which we always used the same source: the msys2 libraries.

@Alexander-Barth
Contributor Author

Yes, indeed. I was wondering if there is a reason to use msys2 over conda-forge on Windows for HDF5...

@giordano
Member

Msys2 is probably more consistent with the toolchain we use here (mingw) and we still need to provide the runtime dependencies. You'd need to investigate whether conda-forge build for Windows is compatible with our other libraries.

@Alexander-Barth
Contributor Author

You are right, conda-forge uses the Windows Visual C/C++ compiler (https://conda-forge.org/docs/maintainer/knowledge_base.html)

Given that TempestRemap_jll depends on NetCDF_jll 400.702.402+0, does this mean that we are out of luck with yanking this NetCDF_jll version?

Unfortunately, the diff between the header files is quite massive (for a patch release!):

diff  HDF5.v1.12.0.x86_64-w64-mingw32/include/  HDF5.v1.12.1.x86_64-w64-mingw32/include/ | wc -l
# 44843
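
A line count says how much changed but not where. As a rough sketch (the function name `changed_headers` is hypothetical, and `diff -q` is assumed to be available, as it is in GNU diffutils), the following lists which header files differ or exist on only one side, which may help narrow down where to look:

```shell
# Sketch: list files that differ between two include directories.
# Prints one line per differing file and one per file present on only one side.
changed_headers() (
    old_dir=$1
    new_dir=$2
    # diff exits with status 1 when differences exist; that is expected here,
    # so the exit status is deliberately ignored.
    diff -rq "$old_dir" "$new_dir" || true
)
```

It would be called with the same two `include/` directories as the `diff` command above; adding `| wc -l` gives the number of affected files instead of the number of changed lines.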

@Alexander-Barth
Contributor Author

As an alternative to yanking this version, could we also release a NetCDF 4.8.1 binary (which I got to build after applying several patches) with the "old" HDF5 1.12.0 ?

Alexander-Barth/NCDatasets.jl#165 (comment)

@Alexander-Barth
Contributor Author

With HDF5 1.12.2 from https://packages.msys2.org/package/mingw-w64-x86_64-hdf5, I don't see this error anymore.

I guess we would be able to upgrade HDF5 to 1.12.2 once this is merged:

conda-forge/hdf5-feedstock#175

@visr
Contributor

visr commented Jun 14, 2022

That's great news! Only 10 pending reviewers I see, haha.

@visr
Contributor

visr commented Jul 28, 2022

So I helped a bit to land conda-forge/hdf5-feedstock#175, updating HDF5_jll to 1.12.2 in #5248. This, however, needed to be yanked from the registry again after finding out in #5249 that conda-forge now builds against a libcurl version that is too new for us.

Does anyone have ideas how to get our hands on HDF5 1.12.2 builds for these platforms:

# x86_64 and aarch64 for Linux and macOS from https://anaconda.org/conda-forge/hdf5/files
# NOTE: make sure to select those compatible with OpenSSL 1.1.1 (click info icon)
ArchiveSource("https://anaconda.org/conda-forge/hdf5/1.12.2/download/linux-64/hdf5-1.12.2-nompi_h2386368_100.tar.bz2", "6f0a1e7fe9c76fee4f490b33a91465a3a4690a5ccf1df21c7bde141ab320ea96"; unpack_target="x86_64-linux-gnu"),
ArchiveSource("https://anaconda.org/conda-forge/hdf5/1.12.2/download/linux-aarch64/hdf5-1.12.2-nompi_h7bde11e_100.tar.bz2", "934324fea28f82ceb0f57c881bf49687154c17af8cc7a1a57753133e5224db2c"; unpack_target="aarch64-linux-gnu"),
ArchiveSource("https://anaconda.org/conda-forge/hdf5/1.12.2/download/osx-64/hdf5-1.12.2-nompi_hc782337_100.tar.bz2", "066c8feca3d77184e7cb38cd58d9918409f7395a66b16e7d330339225f9c0bea"; unpack_target="x86_64-apple-darwin14"),
ArchiveSource("https://anaconda.org/conda-forge/hdf5/1.12.2/download/osx-arm64/hdf5-1.12.2-nompi_h8968d4b_100.tar.bz2", "3efe747b24b173c6c3be71009c1831049032d81d0414d07d920b0650b8022a58"; unpack_target="aarch64-apple-darwin20"),

See also #5249 (comment), in case we can use conda infrastructure.

@Alexander-Barth
Contributor Author

@visr thanks a lot for your work in updating HDF5 to 1.12.2. Too bad that we now have this libcurl issue (apparently only on MacOS). I had a similar issue recently (#5031), but I think it is unrelated.

I am not sure how to proceed. What about these possibilities:

  1. ship HDF5 1.12.1 (on Linux/MacOS), pretending it is HDF5 1.12.2, and ship the actual HDF5 1.12.2 version for Windows (fixing NetCDF for Windows users)
  2. make native builds of HDF5 on Linux x86_64 and Linux i686 with BinaryBuilder, set up an external GitHub action to make a native build for MacOS X x86_64 (which could also be done for Windows), and rely on user contributions for MacOS aarch64 and Linux aarch64. Linux aarch64 could also be emulated via qemu.

Would option 1 or 2 have a chance to get accepted?

@visr
Contributor

visr commented Jul 28, 2022

For me personally (1) sounds like a good quick fix that I had not thought of. Since the difference is only a patch release it might be acceptable, though I'm curious to see what @giordano thinks. Here are the patch release notes: https://www.hdfgroup.org/2022/04/release-of-hdf5-1-12-2-newsletter-183/.

(2) sounds like a good medium term solution to get more platforms supported. Will probably be some work to organize though. Using julia's buildkite for this might make it easier to get more platforms in one setup. I wonder how that effort compares to getting HDF5 to cross compile at this point (different skills though).

@giordano
Member

I'm not a huge fan of either solution (especially mixing and lying about version numbers), but hey, don't we lie all the time? (but at least we don't usually mix different versions....). My problem with 2 is who's going to maintain that? Certainly not me, I can barely keep up with one project, another one using tools I'm completely unfamiliar with is out of reach for me at the moment (a moment which will last fairly long). If it's someone else who ensures everything works fine and here we only need to click on merge, then it's ok.

> apparently only on MacOS

If you want to understand what the error is about: https://github.com/giordano/macos-compatibility-version

visr added a commit to visr/Yggdrasil that referenced this issue Jul 29, 2022
While keeping the MinGW binaries at 1.12.2, for libnetcdf compatibility.

See discussion at JuliaPackaging#4511 (comment).
@visr
Contributor

visr commented Jul 29, 2022

I submitted option 1 from @Alexander-Barth in #5251. Nobody loves this solution I think, but it should give us a build that we can work with for now, buying us time to get to better solutions.

@Alexander-Barth
Contributor Author

Alexander-Barth commented Jul 29, 2022

For the record: I tried to compile a Linux binary for HDF5 within BinaryBuilder (first part of option 2). Unfortunately, it turned out to be more complicated than I thought (as usual). In fact, the build host is x86_64-pc-linux-musl while the target is x86_64-linux-gnu, so configure treats this as a cross-compilation of HDF5 and fails with:

checking maximum decimal precision for C... configure: error: in `/workspace/srcdir/hdf5-1.12.2':
configure: error: cannot run test program while cross compiling

(even when setting export PAC_C_MAX_REAL_PRECISION=33, value from native compilation).

I have seen that conda-forge was able to cross-compile HDF5 for MacOS aarch64. I guess that this patch would allow us to bypass this configure test:

https://github.com/conda-forge/hdf5-feedstock/blob/main/recipe/patches/osx_cross_configure.patch

Unfortunately, this patch is quite complicated and is applied against an automatically generated file (configure, not configure.ac).

@mkitti
Contributor

mkitti commented Jul 29, 2022

Yes, cross compilation is challenging with HDF5. The main issue is that it requires one to obtain configuration from the target platform by executing test programs on that platform. Last time I looked into this, I think it could be done via the CMakeCache. It probably should be changed so that we can figure out the calculation by just compiling a program, since the compiler already knows many of these configuration details.

See https://forum.hdfgroup.org/t/cross-compiling-for-windows/6735/6
https://github.com/stevengj/hdf5

giordano pushed a commit that referenced this issue Jul 29, 2022
While keeping the MinGW binaries at 1.12.2, for libnetcdf compatibility.

See discussion at #4511 (comment).
@Alexander-Barth
Contributor Author

I tried to compile a sample HDF5 program from https://github.com/HDFGroup/hdf5-examples within BinaryBuilder, but I got a segmentation fault. Here are my steps:

From a BinaryBuilder session (with mingw64 target):

sandbox:${WORKSPACE}/srcdir/netcdf-c-4.9.0 # wget https://raw.githubusercontent.com/HDFGroup/hdf5-examples/master/C/H5T/h5ex_t_array.c
sandbox:${WORKSPACE}/srcdir/netcdf-c-4.9.0 # gcc -o h5ex_t_array.exe -I/workspace/destdir/include/ h5ex_t_array.c -L/workspace/destdir/bin/  -lhdf5-0
sandbox:${WORKSPACE}/srcdir/netcdf-c-4.9.0 # sha256sum /workspace/destdir/bin/zlib1.dll /workspace/destdir/bin/libsz.dll /workspace/destdir/bin/libhdf5-0.dll
a26b41bb482967b170453c93edf8f108052ab00f0c7d1134761f625c085f175e  /workspace/destdir/bin/zlib1.dll
07e014276e614e91ff1ff55e6e3b465e1d03f736aa38b408f07c3159416060e8  /workspace/destdir/bin/libsz.dll
c2f5d5c789396d7b8f68eeff683433ade42e8feb1832aefe04d921ebe8b85470  /workspace/destdir/bin/libhdf5-0.dll

I transferred the binary h5ex_t_array.exe and the 3 dlls (zlib1.dll, libsz.dll, libhdf5-0.dll) to a Windows system (msys shell):

$ ldd ./h5ex_t_array.exe
        ntdll.dll => /c/WINDOWS/SYSTEM32/ntdll.dll (0x7ff939f10000)
        KERNEL32.DLL => /c/WINDOWS/System32/KERNEL32.DLL (0x7ff938ad0000)
        KERNELBASE.dll => /c/WINDOWS/System32/KERNELBASE.dll (0x7ff9379d0000)
        msvcrt.dll => /c/WINDOWS/System32/msvcrt.dll (0x7ff939900000)
        libhdf5-0.dll => /home/Alexander Barth/libhdf5-0.dll (0x7ff90e180000)   # <- from BinaryBuilder
        ADVAPI32.dll => /c/WINDOWS/System32/ADVAPI32.dll (0x7ff939e20000)
        sechost.dll => /c/WINDOWS/System32/sechost.dll (0x7ff938c40000)
        RPCRT4.dll => /c/WINDOWS/System32/RPCRT4.dll (0x7ff9380c0000)
        zlib1.dll => /home/Alexander Barth/zlib1.dll (0x7ff9328a0000)   # <- from BinaryBuilder
        libwinpthread-1.dll => /mingw64/bin/libwinpthread-1.dll (0x7ff931ba0000)
        libsz.dll => /home/Alexander Barth/libsz.dll (0x7ff931a90000)  # <- from BinaryBuilder

$ ./h5ex_t_array.exe
Segmentation fault

$ sha256sum zlib1.dll libsz.dll libhdf5-0.dll
a26b41bb482967b170453c93edf8f108052ab00f0c7d1134761f625c085f175e *zlib1.dll
07e014276e614e91ff1ff55e6e3b465e1d03f736aa38b408f07c3159416060e8 *libsz.dll
c2f5d5c789396d7b8f68eeff683433ade42e8feb1832aefe04d921ebe8b85470 *libhdf5-0.dll

The example does work on my Linux system and on Windows when compiled natively with MSYS2. Should this test not also succeed on Windows with cross-compilation using BinaryBuilder?
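
The manual checksum comparison above can be automated with the checking mode of `sha256sum`. This is only a sketch under assumptions: `record_hashes` and `verify_hashes` are illustrative names, and it assumes a `sha256sum` that accepts `-c -` to read the manifest from standard input (GNU coreutils does).

```shell
# Sketch: record hashes on the build side and verify them on the test side.

# Print "hash  name" lines for the named files inside directory $1.
record_hashes() (
    cd "$1" || exit 1
    shift
    sha256sum "$@"
)

# Verify the files inside directory $1 against a manifest read from stdin.
# Succeeds only if every listed file is present and matches its hash.
verify_hashes() (
    cd "$1" || exit 1
    sha256sum -c - > /dev/null 2>&1
)
```

For example, the manifest could be written in the BinaryBuilder session with `record_hashes /workspace/destdir/bin zlib1.dll libsz.dll libhdf5-0.dll > dlls.sha256` and then checked on the Windows side with `verify_hashes . < dlls.sha256`. The `*` that MSYS prints in front of the file names marks binary mode and is accepted when checking.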

@Alexander-Barth
Contributor Author

Alexander-Barth commented Aug 6, 2022

If I also extract libwinpthread-1.dll from BinaryBuilder, the program ./h5ex_t_array.exe no longer returns an error (but there is no screen output, unlike with native compilation). An output file h5ex_t_array.h5 is created, but it is too small and not readable:

$ h5dump h5ex_t_array.h5
h5dump error: unable to open file "h5ex_t_array.h5"

The example program seems to abort at this line:
https://github.com/HDFGroup/hdf5-examples/blob/master/C/H5T/h5ex_t_array.c#L51

@mkitti
Contributor

mkitti commented Aug 6, 2022

I'm getting confused. The HDF5 libraries are coming from msys2:

# 64-bit Windows from https://packages.msys2.org/package/mingw-w64-x86_64-hdf5
ArchiveSource("https://mirror.msys2.org/mingw/mingw64/mingw-w64-x86_64-hdf5-1.12.2-1-any.pkg.tar.zst", "3ba6521d45368aabb334131e10282b25fab9891a20fb9129d897c65c8b6cdbda"; unpack_target="x86_64-w64-mingw32"),
ArchiveSource("https://mirror.msys2.org/mingw/mingw64/mingw-w64-x86_64-libaec-1.0.6-2-any.pkg.tar.zst", "d970bd71e55fc5bd4a55e95ef22355d8c479631973860f2a9c37b49c931c5f35"; unpack_target="x86_64-w64-mingw32"),
ArchiveSource("https://mirror.msys2.org/mingw/mingw64/mingw-w64-x86_64-zlib-1.2.12-1-any.pkg.tar.zst", "e728df08b4db7b291a52d8fd60b96f19016f059ab15170fc98120e5d580c86ac"; unpack_target="x86_64-w64-mingw32"),

Shouldn't you be able to grab that package from msys2 and compile within msys2?

@mkitti
Contributor

mkitti commented Aug 6, 2022

It is this package exactly:
https://packages.msys2.org/package/mingw-w64-x86_64-hdf5

@Alexander-Barth
Contributor Author

Alexander-Barth commented Aug 6, 2022

Yes, I am confused too. HDF5/NetCDF binaries have been a never-ending stream of moments of confusion :-)

The checksums of the HDF5 lib for native compilation in MSYS and BinaryBuilder are identical.
I included all the steps because maybe there is a problem with how I tested it (I also ran the exe from a cmd shell to avoid any interaction with my local MSYS installation). Maybe somebody has the time to reproduce it.

I mentioned a similar problem here:
Alexander-Barth/NCDatasets.jl#164 (comment)
Native compilation in MSYS of NetCDF with HDF5 worked but it fails with cross-compilation in BinaryBuilder.

Version of GCC is different (12.1 in MSYS, 4.8.5 in BinaryBuilder)...

The error is also reproducible with this smaller example
https://github.com/HDFGroup/hdf5-examples/blob/master/C/H5T/h5ex_t_int.c

This example stops at H5Dcreate when running a cross-compiled binary:
https://github.com/HDFGroup/hdf5-examples/blob/master/C/H5T/h5ex_t_int.c#L55

@Alexander-Barth
Contributor Author

> Shouldn't you be able to grab that package from msys2 and compile within msys2?

Yes, I installed this library using the package manager of MSYS (pacman).

@mkitti
Contributor

mkitti commented Aug 6, 2022

We can specify a GCC version in BinaryBuilder.

What I would like to know is if you can compile and run the example completely within msys2. If so, then what is the difference between running it within msys2 and outside msys2?

@Alexander-Barth
Contributor Author

Alexander-Barth commented Aug 6, 2022

> We can specify a GCC version in BinaryBuilder.

I see gcc-6 and gcc-7 in the build image. Can we use a more recent version than gcc 7.5.0?
How can we do that?

> What I would like to know is if you can compile and run the example completely within msys2.

Yes, this is what I referred to as native compilation before.

> If so, then what is the difference between running it within msys2 and outside msys2?

The difference, from what I have seen, is how missing DLLs (like libwinpthread-1.dll) are resolved. When you run the binary within MSYS, the DLL at the standard MSYS location is used; when you run it outside of MSYS2, an error message is produced. Within MSYS you might therefore silently get an incompatible (or at least different) DLL.

I run only the cross-compiled version outside of MSYS.
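
To see which copy of a DLL a given shell session would pick up from its search path, one can walk the path by hand. The sketch below is a simplification: `first_provider` is a made-up helper, it takes a colon-separated list as used for `PATH` in an MSYS2 shell, and it ignores the locations Windows consults before `PATH` (such as the directory of the executable and the system directories).

```shell
# Sketch: print the first file named $1 found in the colon-separated
# directory list $2; exit non-zero if no directory provides it.
first_provider() (
    name=$1
    search_path=$2
    set -f      # no glob expansion while splitting the list
    IFS=:       # split on ":" only, so directories with spaces survive
    for dir in $search_path; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            exit 0
        fi
    done
    exit 1
)
```

Running `first_provider libwinpthread-1.dll "$PATH"` inside and outside the MSYS2 shell would show whether `/mingw64/bin` (or nothing at all) supplies the DLL in each environment.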

@Alexander-Barth
Contributor Author

Alexander-Barth commented Aug 10, 2022

As a test, I tried to cross-compile with the mingw cross-compiler from Ubuntu 20.04:

 $ x86_64-w64-mingw32-gcc  --version
x86_64-w64-mingw32-gcc (GCC) 9.3-win32 20200320
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I got the HDF5 dll from https://mirror.msys2.org/mingw/mingw64/mingw-w64-x86_64-hdf5-1.12.2-1-any.pkg.tar.zst, extracted in mingw64

$ x86_64-w64-mingw32-gcc  -o h5ex_t_int  -Imingw64/include/ h5ex_t_int.c -Lmingw64/lib -lhdf5

Surprisingly, the cross compiled binary using the Ubuntu tool chain worked on Windows!

So maybe it is a (cross-)compiler bug in gcc triggered by a change in HDF5?

Is it possible to update the compiler version in BinaryBuilder? We currently use GCC 4.8.5 from 2015.
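
Since three toolchains with different GCC versions are involved here (MSYS, BinaryBuilder and Ubuntu), it can help to have the build print or assert the compiler version it actually runs with. A small sketch, assuming the compiler understands GCC's `-dumpversion` flag; the helper names `gcc_major` and `gcc_at_least` are invented for this example:

```shell
# Sketch: query a compiler's major version and compare it to a minimum.

# Print the major version reported by the compiler command $1.
gcc_major() (
    compiler=$1
    # -dumpversion prints e.g. "4.8.5"; some builds print only the major number.
    "$compiler" -dumpversion | cut -d. -f1
)

# Succeed only if the major version of compiler $1 is at least $2.
gcc_at_least() (
    major=$(gcc_major "$1")
    [ -n "$major" ] || exit 1
    [ "$major" -ge "$2" ]
)
```

For instance, `gcc_at_least x86_64-w64-mingw32-gcc 5 || echo "compiler older than GCC 5"` could be placed near the top of a build script as a guard.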

@visr
Contributor

visr commented Aug 10, 2022

Oh nice find! Are you looking for preferred_gcc_version?

https://docs.binarybuilder.org/dev/troubleshooting/#Building-with-an-old-GCC-version-a-library-that-has-dependencies-built-with-newer-GCC-versions

build_tarballs(ARGS, name, version, sources, script, platforms, products, dependencies;
julia_compat="1.6", preferred_gcc_version=v"7")

@Alexander-Barth
Contributor Author

Perfect! Yes, indeed with preferred_gcc_version=v"5", I get a working binary for NetCDF 4.9.0 on Windows 🥳 !

NetCDF_jll.libnetcdf = "C:\\Users\\runneradmin\\.julia\\artifacts\\a6675d43b84627dcab6111bc4186b53bd3497992\\bin\\libnetcdf-19.dll"
NetCDF library: C:\Users\runneradmin\.julia\artifacts\a6675d43b84627dcab6111bc4186b53bd3497992\bin\libnetcdf-19.dll
NetCDF version: 4.9.0 of Aug 10 2022 13:19:08 $
Test Summary: | Pass  Total     Time
NCDatasets    |  829    829  1m32.5s
Test Summary:  | Pass  Total  Time
NetCDF4 groups |    9      9  1.1s
Test Summary:          | Pass  Total  Time
Variable-length arrays |   22     22  1.5s
Test Summary:  | Pass  Total  Time
Compound types |   16     16  2.0s
Test Summary:      | Pass  Total  Time
Time and calendars |   25     25  1.1s
Test Summary:       | Pass  Total   Time
Multi-file datasets |   70     70  28.9s
Test Summary:     | Pass  Total  Time
Deferred datasets |   13     13  0.8s
Test Summary: | Pass  Total   Time
@select macro |   33     33  16.1s

x86_64 Linux and MacOS work too.

I will make a PR soon!

@visr
Contributor

visr commented Aug 10, 2022

Oh that's awesome. So in the end it came down to a compiler bug. I guess we should've tried a newer GCC sooner haha. It never occurred to me. Thanks for the great effort!

@Alexander-Barth
Contributor Author

I did not realize that it was so easy to change the compiler version. I somehow assumed that we use the same version across all packages. It is great to know that we have the flexibility.

@visr
Contributor

visr commented Aug 10, 2022

Yeah, I believe the default GCC is so old for maximum compatibility with old systems and clusters. If that doesn't work, though, it can be bumped.

@visr
Contributor

visr commented Aug 14, 2022

@Alexander-Barth right now there are three separate build scripts, for julia 1.3+, 1.6+ and 1.8+. HDF5_jll and many others have stopped building new versions for 1.3+, shall we drop it as well?

Do you know if having separate build scripts for 1.6+ and 1.8+ is required? This is the only difference:

jll_stdlibs = Dict(
    v"1.3" => [
        Dependency("LibCURL_jll"; compat = "~7.71.1"),
    ],
    v"1.6" => [
        Dependency("LibCURL_jll"; compat = "~7.73.0"),
    ],
    v"1.8" => [
        Dependency("LibCURL_jll"; compat = "~7.84.0"),
    ]
)

and I don't see any other packages requiring libcurl 7.84.

Using Dependency("LibCURL_jll"; compat="7.73") and julia_compat="1.6" normally seems enough for builds to pass on everything from 1.6 to nightly. See for instance #5301.

@Alexander-Barth
Contributor Author

@visr You are right, we can consolidate all this (and drop julia 1.3). I just ran a test on Linux/Windows/MacOS (x86_64) with julia 1.6/1.8 and all tests pass:

https://github.com/Alexander-Barth/NCDatasets.jl/actions/runs/2860834779

I made a PR here: #5319 with a single build script for current julia versions.

@visr
Contributor

visr commented Aug 16, 2022

Oh that's great news, that it works and that all tests pass! Should hopefully make it a bit easier to maintain as well.

@Alexander-Barth
Contributor Author

I think that this can be closed with the release of NetCDF_jll v400.902.5+1.
