Segmentation fault reading file with NCDatasets v0.12.6 #187
Comments
As a test, does the error persist if you comment out this line (assuming that you do not need OPENDAP over HTTPS support)? https://github.com/Alexander-Barth/NCDatasets.jl/blob/master/src/NCDatasets.jl#L31 If the error is still present without the call to |
The error is still there if I comment out
Here's a cut-down (although probably not minimal) code example that still fails, although less frequently than the full app. file
and then the test was:
Example stacktrace (this is the most common failure, although it can fail in different ways, see below):
|
A couple of examples of less common failures (with
With a similar, but not identical, test script:
With the full app:
|
Also fails with a minimal netcdf file
Test script is modified with:
Example stack trace of failure:
|
Can you also test whether the issue is present in NCDatasets v0.12.6 with NetCDF_jll v400.702.400+0 by forcing the use of the version
A smaller reproducer:
crashes with:
(using NetCDF_jll v400.902.5+0) |
This is likely to be an upstream issue. |
I can reproduce the first failure above ('classic' format netcdf files, fails in nc_open) using a single publicly available test file downloaded from the Unidata website. The 'inner' and 'outer' loops do seem to be necessary (at least to provoke a failure quickly; a test with a single loop is still running after >1500 outer iterations). To me it looks like this is a different error, and plausibly a different issue, from the netCDF-4 case?
This is using:
(also there is no failure after changing to NetCDF_jll v400.702.400+0 using
File testnc6.jl contains:
with test:
and stacktrace:
|
Thanks for your additional testing! There also seems to be an issue during initialization of NetCDF:
Hopefully this is the same problem, because this error happens right away. |
Can you test with https://github.com/Alexander-Barth/NetCDF_jll.jl/releases/tag/NetCDF-v400.902.29%2B0 ?
On my end, it does not crash any more after 3000 outer iterations using this reproducer:

    import NCDatasets

    dataarrays = []
    niter = 1
    netcdf_filename1 = "coords.nc"
    fields1 = ["latitude"]
    netcdf_filename2 = "coords.nc"
    fields2 = ["latitude", "longitude"]

    @noinline function prepare_data(darrays, fields, ds)
        for f in fields
            push!(darrays, ds[f][:])
        end
    end

    while niter < 100
        #println("niter: ", niter)
        NCDatasets.Dataset(netcdf_filename1) do ds
            prepare_data(dataarrays, fields1, ds)
        end
        NCDatasets.Dataset(netcdf_filename2) do ds
            prepare_data(dataarrays, fields2, ds)
        end
        global niter += 1
    end

Run with:

    nouter = 1; while true; println("nouter: ", nouter);include("testnc2.jl");global nouter += 1;end

|
Looks good using https://github.com/Alexander-Barth/NetCDF_jll.jl/releases/tag/NetCDF-v400.902.29%2B0 !! I've run three tests:
|
Thanks a lot for this comprehensive testing! In this build I removed the NetCDF c-flag |
Many thanks for addressing this issue, as well as for your work on NCDatasets! (and I can confirm all still looks good here after updating to the latest released packages) |
Great! Thank you for testing and creating the reproducer! The new NetCDF_jll has been released. I think that an |
Describe the bug
Julia exits with segmentation fault when attempting to read from a netcdf file, apparently at random.
To Reproduce
No issues seen with NCDatasets v0.12.5 (with NetCDF_jll v400.702.400+0, using Julia v1.7), or with earlier versions (going back approximately 1 year).
Occurs with NCDatasets v0.12.6 (with NetCDF_jll v400.902.5+0), using either Julia v1.7 or v1.8.
This happens while running an application that repeatedly opens and closes two netcdf files. It fails seemingly at random while opening either file, after ~10 successful open/read/close cycles, and doesn't seem to be associated with opening or reading any particular field or file.
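For anyone wanting to repeat the version comparison above, here is a minimal sketch of switching the NetCDF_jll binary in the active environment. It only uses standard Pkg calls (`Pkg.add`, `Pkg.pin`, `Pkg.status`); the helper name `use_netcdf_jll` is hypothetical and not taken from this thread, and the resolver may refuse or change other package versions if NCDatasets' compat bounds exclude the requested NetCDF_jll version.

```julia
using Pkg

# Hypothetical helper (an assumption, not from the thread): install one
# specific NetCDF_jll version into the active environment and pin it, so
# that later Pkg.update calls do not move it.
function use_netcdf_jll(version::String)
    Pkg.add(name="NetCDF_jll", version=version)  # request that exact version
    Pkg.pin("NetCDF_jll")                        # keep it fixed afterwards
    Pkg.status("NetCDF_jll")                     # print what was actually installed
end

# Example call (needs network access, so it is left commented out):
# use_netcdf_jll("400.702.400")
```

Running the reproducer once per pinned version in a fresh Julia session is one way to separate a NetCDF_jll regression from a change in NCDatasets itself.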
Apologies, this isn't an example or dataset I can share. The code is of the form:
and it looks like the failure is when opening the netcdf file (see stacktrace below)
julia> Pkg.test("NCDatasets")
passes all tests.
Environment
julia> versioninfo()
Julia Version 1.7.3
Commit 742b9abb4d (2022-05-06 12:58 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-12.0.1 (ORCJIT, broadwell)
Environment:
JULIA_NUM_PRECOMPILE_TASKS = 4
JULIA_DEPOT_PATH = /data/sd336/software/julia/depot
Full output
signal (11): Segmentation fault
in expression starting at /data/sd336/runtests.jl:11
posixio_open at /data/sd336/software/julia/depot/artifacts/b51d396f750184183d5594c558b417883f749e07/lib/libnetcdf.so (unknown line)
ncio_open at /data/sd336/software/julia/depot/artifacts/b51d396f750184183d5594c558b417883f749e07/lib/libnetcdf.so (unknown line)
NC3_open at /data/sd336/software/julia/depot/artifacts/b51d396f750184183d5594c558b417883f749e07/lib/libnetcdf.so (unknown line)
NC_open at /data/sd336/software/julia/depot/artifacts/b51d396f750184183d5594c558b417883f749e07/lib/libnetcdf.so (unknown line)
nc_open at /data/sd336/software/julia/depot/artifacts/b51d396f750184183d5594c558b417883f749e07/lib/libnetcdf.so (unknown line)
nc_open at /data/sd336/software/julia/depot/packages/NCDatasets/gV1QR/src/netcdf_c.jl:267
unknown function (ip: 0x7f91e3a5fcf9)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
#NCDataset#12 at /data/sd336/software/julia/depot/packages/NCDatasets/gV1QR/src/dataset.jl:203
NCDataset at /data/sd336/software/julia/depot/packages/NCDatasets/gV1QR/src/dataset.jl:172 [inlined]
NCDataset at /data/sd336/software/julia/depot/packages/NCDatasets/gV1QR/src/dataset.jl:172 [inlined]
#NCDataset#13 at /data/sd336/software/julia/depot/packages/NCDatasets/gV1QR/src/dataset.jl:239
NCDataset at /data/sd336/software/julia/depot/packages/NCDatasets/gV1QR/src/dataset.jl:239 [inlined]
prepare_do_force_grid at /data/sd336/software/julia/depot/packages/PALEOboxes/iA0AD/src/reactioncatalog/GridForcings.jl:125
...etc...