-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GDAL Error when reading many files: GDALOpen() called with too many recursion levels #455
Comments
Oh it has no cleanup mechanism? That is concerning. |
There is something in place here (aftercare and gdaljl_errorhandler), but I'm not sure whether they also close open files automatically; seems like no. |
Probably @visr will understand this better than I do |
So, I managed to produce a similar error without reading with Rasters: using Rasters
using ArchGDAL
s = 1000
r = Raster(rand(s,s,1); dims=(X(1:s), Y(1:s), Band(1:1)))
# this can jam your session but should produce a corrupted file when it works
Threads.@threads for _ in 1:8 write("corrupted.tif", r; force=true) end
for i in 1:3
try
ds = ArchGDAL.read("corrupted.tif")
catch err
@warn err
end
end
┌ Warning: GDAL.GDALError(GDAL.CE_Failure, 1, "corrupted.tif: TIFFFetchDirectory:Sanity check on directory count failed, this is probably not a valid IFD offset")
┌ Warning: GDAL.GDALError(GDAL.CE_Failure, 1, "GDALOpen() called on corrupted.tif recursively")
┌ Warning: GDAL.GDALError(GDAL.CE_Failure, 1, "GDALOpen() called on corrupted.tif recursively") @visr is this to be expected? If not, should I move this issue upstream to AG or GDAL.jl? |
So in this last example you read a bad GeoTIFF, and get a good error the first time, and a different error afterwards, right? And you suspect that it is because the file is not properly closed on such an error. Does the error also occur when you use the do-block syntax? Because then things can be cleaned up directly rather than relying on the GC to trigger the finalizer. https://yeesian.com/ArchGDAL.jl/stable/memory/ I'm not sure how to best improve this for the interactive usecase. I guess it would be nice if we could directly call |
right
yes
Probably yes. I think as of now all gdal functions receive the same "aftercare". Maybe the IO functions should have their own "ioaftercare" which closes the dataset right away. Because closing them after they threw an error is almost impossible for the user. In my case, the system almost crashed, because I had too many files open. |
Rasters.jl is using the do block syntax in this case too. ioaftercare sounds like a solution (also, kudos to the creative corrupted file generation...) |
It would be good to test the solution first with some manual edits in GDAL.jl, and if it works we can build it into the generator and regenerate the code. I see that https://github.com/JuliaGeo/GDAL.jl/blob/8e2ef58f694e6022301a0923f0cbf7ef4aaa6b81/src/libgdal.jl#L6997 |
So, another minimum failing example: # run twice
ret = ccall(
(:GDALOpenEx, GDAL.libgdal),
GDAL.GDALDatasetH,
(Cstring, Cuint, Ptr{Cstring}, Ptr{Cstring}, Ptr{Cstring}),
"corrupted.tif",
0x40,
Ptr{Cvoid}(C_NULL),
Ptr{Cvoid}(C_NULL),
Ptr{Cvoid}(C_NULL),
) I think the problem is unrelated to Rasters and AG, as it also occurs with plain GDAL. What I think happens is that the gdaljl_errorhandler gets triggered immediately when the call to gdalopenex fails and before the C side has closed the file. When I comment the error handler out, the recursion is gone. I think it boils down to that |
You mean plain GDAL.jl or gdal proper? |
GDAL.jl |
Yes, right now with the error handler GDAL calls Julia when an error is encountered, and only if it is fatal we throw. Though since we also have |
You mean CE_Fatal? Currently it throws for CE_Failure.
Yes, I tested it and the aftercare catches this error. I noticed that the error handler actually runs twice during a single call to |
Ah I meant CE_Failure. Is there a benefit to keep using it but only for CE_Fatal versus not registering the error handler at all? I'd probably be fine with either. |
I think if you don't register an error handler all errors will be printed to stdout. But yes, a simple return C_NULL in the handler should do it, aftercare can also throw after CE_Fatal (if gdal didnt crash completely by this point). |
Closing in favour of JuliaGeo/GDAL.jl#153 |
Hi! I'm trying to implement a sanity check for image files downloaded from the web. I want to check ~600k images of size 2240x2240x4. After 105 corrupted files, I always get a
LoadError: GDALError (CE_Failure, code 1): GDALOpen() called with too many recursion levels
. I suspect the file descriptors are not properly closed. A minimum failing example looks like this:Edit: Ok, I think this is an upstream issue with GDAL.jl not closing the file descriptors after the read function threw an error. If you have any further thoughts I'm happy to hear them!
The text was updated successfully, but these errors were encountered: