Flux.Conv type instability #1178
Comments
Bumping this issue, which is still persisting on Julia v1.5.3 and v1.6.0-beta1:
Can someone maybe explain why we need to specialize on these settings? Looking, for example, at how the output array is allocated: y = similar(x, promote_type(xT, wT), output_size(cdims)..., channels_out(cdims), size(x, N))
These configuration settings can then be specialised on at compile time in many cases, which is why we need these.
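To illustrate that point with a toy example (hypothetical FlippedDims type, not NNlib's actual definition): when a setting lives in the type, methods specialize on it and the branch is resolved at compile time.

struct FlippedDims{F} end                      # hypothetical: the flag lives in the type
flipped(::FlippedDims{F}) where {F} = F

function kernel_index(d::FlippedDims, i, len)
    # flipped(d) is a compile-time constant for each concrete FlippedDims{F},
    # so this branch is folded away when the method is specialized.
    flipped(d) ? len - i + 1 : i
end

kernel_index(FlippedDims{true}(), 1, 3)   # == 3, with the branch eliminated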
Worth looking at which type params are absolutely needed for performance and are used in the wild. For the rest, if one can demonstrate that removing them helps significantly with compilation latency while not compromising the targets in the original PR, that should be a shoo-in.
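As a rough first-pass check of the latency side (a sketch, not a rigorous benchmark; m and x as in the test function below), one can compare the first call in a fresh session, which includes compilation, against a later call:

julia> t_first  = @elapsed m(x);   # compilation + execution
julia> t_second = @elapsed m(x);   # mostly execution, for comparison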
As a proof of concept, I've implemented a custom replacement for the dims types; it allows for full type inference on the forward pass. Side note: pooling layers have the same issue with type inference. Function used to test:
function main()
x = zeros(Float32, 28, 28, 1, 5)
y = zeros(Float32, 10, 5)
m = Chain(
Conv((3, 3), 1 => 2, relu),
Conv((3, 3), 2 => 2, relu),
Conv((3, 3), 2 => 3, relu),
Conv((3, 3), 3 => 3, relu),
Conv((3, 3), 3 => 3, relu),
Conv((3, 3), 3 => 3, relu),
x -> reshape(x, :, size(x, 4)),
Dense(768, 10), softmax)
θ = params(m)
@time m(x)
@time gradient(θ) do
Flux.crossentropy(m(x), y)
end
end

Before:
julia> @code_warntype m(x)
MethodInstance for (::Chain{Tuple{Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Flux.var"#312#314", Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}, typeof(softmax)}})(::Array{Float32, 4})
from (c::Chain)(x) in Flux at /home/pxl-th/.julia/dev/Flux/src/layers/basic.jl:49
Arguments
c::Chain{Tuple{Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Flux.var"#312#314", Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}, typeof(softmax)}}
x::Array{Float32, 4}
Body::Any
1 ─ %1 = Base.getproperty(c, :layers)::Tuple{Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Flux.var"#312#314", Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}, typeof(softmax)}
│ %2 = Flux.Tuple(%1)::Tuple{Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Flux.var"#312#314", Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}, typeof(softmax)}
│ %3 = Flux.applychain(%2, x)::Any
└── return %3

After:
julia> @code_warntype m(x)
MethodInstance for (::Chain{Tuple{Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Flux.var"#312#314", Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}, typeof(softmax)}})(::Array{Float32, 4})
from (c::Chain)(x) in Flux at /home/pxl-th/.julia/dev/Flux/src/layers/basic.jl:49
Arguments
c::Chain{Tuple{Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Flux.var"#312#314", Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}, typeof(softmax)}}
x::Array{Float32, 4}
Body::Matrix{Float32}
1 ─ %1 = Base.getproperty(c, :layers)::Tuple{Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Flux.var"#312#314", Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}, typeof(softmax)}
│ %2 = Flux.Tuple(%1)::Tuple{Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Conv{2, 4, typeof(relu), Array{Float32, 4}, Vector{Float32}}, Flux.var"#312#314", Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}, typeof(softmax)}
│ %3 = Flux.applychain(%2, x)::Matrix{Float32}
└── return %3
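For readers skimming the thread, a minimal sketch of the general idea (hypothetical SimpleConvDims type and field names, not the actual proof-of-concept code): keep the dimensionality N as the only type parameter and move the remaining configuration into ordinary fields, so constructing the dims object from runtime values has a concrete, inferable type.

struct SimpleConvDims{N}                  # hypothetical sketch, not NNlib's DenseConvDims
    kernel_size::NTuple{N,Int}
    channels_in::Int
    channels_out::Int
    stride::NTuple{N,Int}
    padding::NTuple{N,Int}                # one entry per spatial dim, for brevity
    dilation::NTuple{N,Int}
    flipkernel::Bool
end

# The concrete type depends only on N (the length of the kernel tuple),
# so the constructor's return type is inferable even though stride,
# padding, etc. are plain runtime values.
cdims = SimpleConvDims((3, 3), 1, 2, (1, 1), (0, 0), (1, 1), false)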
Very nice, IIUC this can be done as a backwards compatible change to NNlib? For the backwards pass, I guess we'd have to break down how much time is going into pullback function generation vs pullback execution. At least on my machine, it takes at least 10s just to compile the Zygote compiler itself (for any function).
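One way to split that measurement (a sketch using Zygote.pullback with implicit params; m, x, y and θ as in the test snippet above; first calls include compilation):

using Zygote

out, back = @time Zygote.pullback(() -> Flux.crossentropy(m(x), y), θ)  # forward pass + pullback construction
@time back(one(out))                                                     # pullback execution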
I'd say it is breaking if someone expects …
Hi all,
As I was playing around with Flux I noticed that the convolutional layers I was using (Flux.Conv) have some un-inferable types, at least that is my understanding. The following is a minimal example:
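A sketch of the kind of minimal example described (hypothetical, not the author's original snippet):

using Flux

x = rand(Float32, 28, 28, 1, 1)
layer = Conv((3, 3), 1 => 2, relu)
@code_warntype layer(x)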
On my laptop, with Julia version 1.4.1 and Flux version 0.10.4 this results in:
It seems to be caused by the un-inferable type of cdims, causing the output type of Flux.conv to be inferred as AbstractArray{yT,4} where yT.
While looking at the type definition and constructors of DenseConvDims, I noticed that its parametric type depends on the contents of, for instance, the variable corresponding to the pad keyword of the Flux.Conv constructor. It might be a stupid question, but why are all the parameters of the type DenseConvDims stored in its type directly instead of in its fields? Doesn't that make the type uncertain by default, since the compiler has no way of knowing the contents of variables beforehand? I have only recently started programming in Julia, so please correct me if I am wrong.
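A tiny illustration of the concern (hypothetical Padded type, not DenseConvDims itself): when a type parameter is filled in from a runtime variable, inference can no longer pin down a concrete return type for the constructor.

struct Padded{P} end                 # hypothetical: padding value stored in the type
make_padded(pad) = Padded{pad}()     # pad is an ordinary runtime value

# Inference only knows pad::Int, not its value, so the best it can say
# about the return type is the non-concrete UnionAll Padded:
Base.return_types(make_padded, (Int,))   # -> Any[Padded]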