Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suppress canonicalization for lazy IO [DNM] #51

Merged
merged 1 commit into from
Jun 23, 2022
Merged

Conversation

timholy
Copy link
Member

@timholy timholy commented May 25, 2022

Also supports the upcoming TiffImages 0.6, but do not merge until @tlnagy gives the word.

closes #52

Also supports the upcoming TiffImages 0.6.
# load(io; mmap=true)
# load(io; lazyio=true)
# do not canonicalize by default. A package that supports both is TiffImages v0.6+.
uses_lazy(kwargs) = get(kwargs, :mmap, false) || get(kwargs, :lazyio, false)
Copy link
Member

@johnnychen94 johnnychen94 May 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be aware of the world-age issue

With this branch and tlnagy/TiffImages.jl#79:

julia> using TestImages, FileIO

julia> function f()
           file = testimage("cameraman", download_only=true)
           img = load(file; mmap=true)
           return length(img)
       end
f (generic function with 1 method)

julia> f()
ERROR: MethodError: no method matching size(::TiffImages.MmappedTIFF{ColorTypes.Gray{FixedPointNumbers.N0f8}, 2, UInt32, Base.ReinterpretArray{ColorTypes.Gray{FixedPointNumbers.N0f8}, 2, UInt8, Base.ReshapedArray{UInt8, 2, SubArray{UInt8, 1, Vector{UInt8}, Tuple{UnitRange{Int64}}, true}, Tuple{}}, true}})
The applicable method may be too new: running in world age 31358, while current world is 31367.
Closest candidates are:
  size(::TiffImages.MmappedTIFF{T, 2, O, A} where {O<:Unsigned, A<:AbstractMatrix{T}}) where T at ~/Documents/Julia/TiffImages.jl/src/types/mmapped.jl:60 (method too new to be called from this world context.)
  size(::AbstractArray{T, N}, ::Any) where {T, N} at ~/packages/julias/julia-1.7/share/julia/base/abstractarray.jl:42
  size(::Union{LinearAlgebra.Adjoint{T, var"#s859"}, LinearAlgebra.Transpose{T, var"#s859"}} where {T, var"#s859"<:(AbstractVector)}) at ~/packages/julias/julia-1.7/share/julia/stdlib/v1.7/LinearAlgebra/src/adjtrans.jl:172
  ...
Stacktrace:
 [1] length(t::TiffImages.MmappedTIFF{ColorTypes.Gray{FixedPointNumbers.N0f8}, 2, UInt32, Base.ReinterpretArray{ColorTypes.Gray{FixedPointNumbers.N0f8}, 2, UInt8, Base.ReshapedArray{UInt8, 2, SubArray{UInt8, 1, Vector{UInt8}, Tuple{UnitRange{Int64}}, true}, Tuple{}}, true}})
   @ Base ./abstractarray.jl:273
 [2] f()
   @ Main ./REPL[2]:4
 [3] top-level scope
   @ REPL[3]:1

julia> f()
262144

We should probably eagerly check if TiffImages is already loaded when canonicalize=false.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably eagerly check if TiffImages is already loaded when canonicalize=false.

This can be done via the new helper: johnnychen94/LazyModules.jl#6

Copy link
Member Author

@timholy timholy May 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(EDIT: I should have refreshed the browser before responding, so this doesn't reflect some of your later comments.)

But this would also be a problem:

img = load(file; mmap=true)
# now wait 2.5 hours for that 4TB file to be converted to an Array, and it will fail because it exhausts memory on your machine

So canonicalization just turns it into a different, harder to circumvent problem. The world age issue is circumvented merely by trying again.

This does seem to be a good case for an error hint, and it seems feasible to do since MethodError gets a lot of detail.

Copy link
Member

@johnnychen94 johnnychen94 May 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably eagerly check if TiffImages is already loaded when canonicalize=false.

I mean, how about

 function load(f::File{DataFormat{:TIFF}}; canonicalize::Union{Bool,Nothing}=nothing, kwargs...)
     canonicalize = something(canonicalize, !uses_lazy(kwargs))
+    # If we skip canonicalization, we're open to world age issues unless TiffImages
+    # is already loaded.
+    canonicalize || LazyModules.require(TiffImages)
     data = TiffImages.load(f.filename; kwargs...)
     return canonicalize ? enforce_canonical_type(f, data) : data
 end
 function load(s::Stream{DataFormat{:TIFF}}; canonicalize::Union{Bool,Nothing}=nothing, kwargs...)
     canonicalize = something(canonicalize, !uses_lazy(kwargs))
+    canonicalize || LazyModules.require(TiffImages)
     data = TiffImages.load(stream(s); kwargs...)
     return canonicalize ? enforce_canonical_type(s, data) : data
 end

and this gives a perhaps useful enough error message already -- if we ignore the FileIO's complicated error stack.

julia> using TestImages, FileIO

julia> function f()
           file = testimage("cameraman", download_only=true)
           img = load(file; mmap=true)
           return length(img)
       end
f (generic function with 1 method)

julia> f()
[ Info: Precompiling ImageIO [82e4d734-157c-48bb-816b-45c225c6df19]
[ Info: Precompiling ImageMagick [6218d12a-5da1-5696-b52f-db25d2ecc6d1]
Errors encountered while load File{DataFormat{:TIFF}, String}("/home/jc/.julia/artifacts/27a4c26bcdd47eb717bee089ec231a899cb8ef69/cameraman.tif").
All errors:
===========================================
TiffImages is required to be loaded first, maybe `using TiffImages` or `import TiffImages` and try again.
===========================================
MethodError: no method matching load_(::String, ::Bool; mmap=true)
Closest candidates are:
  load_(::Union{AbstractString, IO, Vector{UInt8}}, ::Any; ImageType, extraprop, extrapropertynames, view) at ~/.julia/packages/ImageMagick/Fh2BX/src/ImageMagick.jl:137 got unsupported keyword argument "mmap"
  load_(::Union{AbstractString, IO, Vector{UInt8}}) at ~/.julia/packages/ImageMagick/Fh2BX/src/ImageMagick.jl:137 got unsupported keyword argument "mmap"
===========================================

Fatal error:
ERROR: TiffImages is required to be loaded first, maybe `using TiffImages` or `import TiffImages` and try again.
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] require
   @ ~/Documents/Julia/LazyModules.jl/src/LazyModules.jl:170 [inlined]
 [3] load(f::File{DataFormat{:TIFF}, String}; canonicalize::Nothing, kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:mmap,), Tuple{Bool}}})
   @ ImageIO ~/Documents/Julia/ImageIO.jl/src/ImageIO.jl:113
 [4] invokelatest(f::Any, args::Any; kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:mmap,), Tuple{Bool}}})
   @ Base ./essentials.jl:718
 [5] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Formatted; options::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:mmap,), Tuple{Bool}}})
   @ FileIO ~/.julia/packages/FileIO/Nl9Lh/src/loadsave.jl:219
 [6] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Symbol, ::String; options::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:mmap,), Tuple{Bool}}})
   @ FileIO ~/.julia/packages/FileIO/Nl9Lh/src/loadsave.jl:185
 [7] load(::String; options::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:mmap,), Tuple{Bool}}})
   @ FileIO ~/.julia/packages/FileIO/Nl9Lh/src/loadsave.jl:113
 [8] f()
   @ Main ./REPL[3]:3
 [9] top-level scope
   @ REPL[4]:1

julia> using TiffImages

julia> f()
262144

Copy link
Member Author

@timholy timholy May 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, the problem would be that calling LazyModules.require(TiffImages) would always force users to explicitly load TiffImages---at that point, FileIO.load is pretty useless and we might as well suggest that people use TiffImages.load directly.

Now consider what happens if we handle this with an error hint instead:

  • the world-age error only happens if you load images from inside a function and then perform operations on them before returning from that function. A hint will at least tell users how to cope with the error, much like your LazyModules.require strategy
  • users who load image(s) from the REPL and then return to the REPL before doing anything with them (probably the most common usage in interactive data analysis, I'd guess) would not get an error.

To me that seems strictly better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh... I knew there must be something I missed when I opened PR #48 now I get it:

julia> f() = collect(load("tmp.png"))
f (generic function with 1 method)

julia> f()
ERROR: MethodError: no method matching size(::IndirectArrays.IndirectArray{ColorTypes.RGB{FixedPointNumbers.N0f8}, 2, UInt8, Matrix{UInt8}, OffsetArrays.OffsetVector{ColorTypes.RGB{FixedPointNumbers.N0f8}, Vector{ColorTypes.RGB{FixedPointNumbers.N0f8}}}})
The applicable method may be too new: running in world age 31349, while current world is 31382.
Closest candidates are:
  size(::IndirectArrays.IndirectArray) at ~/.julia/packages/IndirectArrays/BUQO3/src/IndirectArrays.jl:52 (method too new to be called from this world context.)
  size(::AbstractArray{T, N}, ::Any) where {T, N} at ~/packages/julias/julia-1.7/share/julia/base/abstractarray.jl:42
  size(::Union{LinearAlgebra.Adjoint{T, var"#s859"}, LinearAlgebra.Transpose{T, var"#s859"}} where {T, var"#s859"<:(AbstractVector)}) at ~/packages/julias/julia-1.7/share/julia/stdlib/v1.7/LinearAlgebra/src/adjtrans.jl:172

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if there is a real solution to battle the world-age issue with all the lazy loading benefits.

But maybe for now we'll just let this happen, i.e., merge this PR, and document the issue somewhere JuliaIO/FileIO.jl#362

@tlnagy
Copy link
Contributor

tlnagy commented Jun 21, 2022

I haven't heard any complaints about TiffImages.jl v0.6.0. So this PR has the all clear from me @timholy @johnnychen94

@johnnychen94
Copy link
Member

johnnychen94 commented Jun 23, 2022

Since ImageIO is likely to be used together with many packages, I'm going to merge this to resolve the version incompatibility.

@johnnychen94 johnnychen94 merged commit f5e27bc into master Jun 23, 2022
@johnnychen94 johnnychen94 deleted the teh/canonicalize branch June 23, 2022 10:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants