DimensionMismatch("array could not be broadcast to match destination") #1457
I presume this is an adaptation of the code at SciML/DiffEqFlux.jl#387? The pullback code likely works because it's not actually differentiating through the loss function. This would be the proper equivalent:

l, back = Zygote.pullback(() -> loss(img, lab))
back(one(l))

I would try running this model on CPU first and verifying there is no dimension mismatch there.
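For reference, a minimal runnable sketch of that pattern, assuming Zygote is available alongside Flux; the model, img, and lab here are placeholders, not the issue's actual code, and the pullback is taken with an explicit params call so the loss is differentiated with respect to the model parameters, as train! does internally.

using Flux, Zygote, Statistics

# Placeholder model and data, just to make the pullback pattern concrete.
model = Chain(Dense(10, 5, relu), Dense(5, 2))
img = rand(Float32, 10, 4)   # 10 features x 4 samples
lab = rand(Float32, 2, 4)

loss(x, y) = mean(abs2, model(x) .- y)

# Take the pullback with respect to the model parameters, then seed it with one(l).
l, back = Zygote.pullback(() -> loss(img, lab), params(model))
grads = back(one(l))   # a Zygote.Grads keyed by each parameter array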
Likely not Flux related, since BatchNorm and friends shouldn't be changing the output dimensions. Could you test by checking the input and output dims?
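For example, a sketch along these lines (model and x stand in for the actual network and input batch) walks a Chain on the CPU and prints each layer's output size:

using Flux

# Run each layer of a Chain in turn and print the output size, to spot
# where the dimensions stop matching expectations.
function show_dims(m::Chain, x)
    h = x
    for (i, layer) in enumerate(m.layers)
        h = layer(h)
        println("after layer $i: size = ", size(h))
    end
    return h
end

# `model` and `x` are placeholders for the actual model and input:
# show_dims(cpu(model), cpu(x))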
You did a reshape that seems really wrong:

return reshape(xarr, size(xarr)[1:end-1])

You should preserve the total length of an array.
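A small illustration of that point, using a made-up array rather than the issue's code:

# reshape must preserve the total number of elements, so dropping the last
# dimension only works when that dimension has length 1, and even then
# dropdims makes the intent clearer.
xarr = rand(Float32, 28, 28, 1, 5, 1)    # e.g. a WHCN batch with a trailing axis of length 1

a = reshape(xarr, size(xarr)[1:end-1])   # works only because size(xarr)[end] == 1
b = dropdims(xarr, dims = ndims(xarr))   # same result, intent explicit

# If the last dimension were the batch axis instead (length 5 here), the same
# reshape would throw a DimensionMismatch: 28*28*1*5 elements cannot fill a
# 28x28x1 array.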
I receive the same error without using a reshape function. Funnily enough, the error only happens for the recurrent layers GRU or RNN, but not for LSTM. Maybe it's a different bug from the above, not sure; at least the error message is the same.

using Flux, Statistics
# some settings
nT = 100
ndata = 20
batchsize = 5
ninputs = 3
noutputs = 1
# create artificial data
struct SeqData
x::AbstractVector
y::AbstractVector
end
data = Vector{SeqData}(undef, 0)
for i = 1:ndata
input = [randn(Float32, ninputs, batchsize) for i = 1:nT]
output = [randn(Float32, noutputs, batchsize) for i = 1:nT]
push!(data, SeqData(input, output) )
end
train_loader = Flux.Data.DataLoader(data)
# Create a model
model = Chain(GRU(ninputs, ninputs), Dense(ninputs, noutputs)) # broken for GRU and RNN, works for LSTM
# Loss function
function loss(x, y)
Flux.reset!(model)
y_model = model.(x)
diff = [mean(abs2, y[i] .- y_model[i]) for i = 1:length(y) ]
return mean(diff)
end
loss(data::SeqData) = loss(data.x, data.y)
loss(data::Vector{SeqData}) = mean( loss(seq) for seq in data )
# Evaluate the loss and try training the model
loss(data) # This works for all types of rnn
Flux.train!(loss, params(model), train_loader, ADAM()) # This does not work for GRU and RNN

I used Flux version 0.11.3 on Windows with Julia 1.5.3.
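In case it helps with debugging, a small sketch reusing the MWE above: it computes the gradients explicitly and flags any parameter whose gradient has a different shape, since the trace in the next comment fails inside the optimiser's apply! on exactly such a pair.

# Compute gradients explicitly and compare each parameter's shape with the
# shape of its gradient; train! only fails later, inside ADAM's apply!.
ps = Flux.params(model)
gs = Flux.gradient(() -> loss(data), ps)
for p in ps
    g = gs[p]
    g === nothing && continue
    if size(g) != size(p)
        println("mismatch: param ", size(p), " vs grad ", size(g))
    end
end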
I should probably provide the full error message for my code above:

ERROR: LoadError: DimensionMismatch("cannot broadcast array to have fewer dimensions")
Stacktrace:
[1] check_broadcast_shape(::Tuple{}, ::Tuple{Base.OneTo{Int64}}) at .\broadcast.jl:518
[2] check_broadcast_shape(::Tuple{Base.OneTo{Int64}}, ::Tuple{Base.OneTo{Int64},Base.OneTo{Int64}}) at .\broadcast.jl:521
[3] check_broadcast_axes at .\broadcast.jl:523 [inlined]
[4] check_broadcast_axes at .\broadcast.jl:527 [inlined]
[5] instantiate at .\broadcast.jl:269 [inlined]
[6] materialize! at .\broadcast.jl:848 [inlined]
[7] materialize!(::Array{Float32,1}, ::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2},Nothing,typeof(+),Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Nothing,typeof(*),Tuple{Float64,Array{Float32,1}}},Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2},Nothing,typeof(*),Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0},Nothing,typeof(-),Tuple{Int64,Float64}},Array{Float32,2}}}}}) at .\broadcast.jl:845
[8] apply!(::ADAM, ::Array{Float32,1}, ::Array{Float32,2}) at C:\Users\christian.dengler\.julia\packages\Flux\sY3yx\src\optimise\optimisers.jl:175
[9] update!(::ADAM, ::Array{Float32,1}, ::Array{Float32,2}) at C:\Users\christian.dengler\.julia\packages\Flux\sY3yx\src\optimise\train.jl:23
[10] update!(::ADAM, ::Zygote.Params, ::Zygote.Grads) at C:\Users\christian.dengler\.julia\packages\Flux\sY3yx\src\optimise\train.jl:29
[11] macro expansion at C:\Users\christian.dengler\.julia\packages\Flux\sY3yx\src\optimise\train.jl:105 [inlined]
[12] macro expansion at C:\Users\christian.dengler\.julia\packages\Juno\n6wyj\src\progress.jl:134 [inlined]
[13] train!(::Function, ::Zygote.Params, ::Flux.Data.DataLoader{Array{SeqData,1}}, ::ADAM; cb::Flux.Optimise.var"#16#22") at C:\Users\christian.dengler\.julia\packages\Flux\sY3yx\src\optimise\train.jl:100
[14] train!(::Function, ::Zygote.Params, ::Flux.Data.DataLoader{Array{SeqData,1}}, ::ADAM) at C:\Users\christian.dengler\.julia\packages\Flux\sY3yx\src\optimise\train.jl:98
[15] top-level scope at d:\User\CDE\Tapping_Pred_Maint\test.jl:39
[16] include_string(::Function, ::Module, ::String, ::String) at .\loading.jl:1091
[17] invokelatest(::Any, ::Any, ::Vararg{Any,N} where N; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at .\essentials.jl:710
[18] invokelatest(::Any, ::Any, ::Vararg{Any,N} where N) at .\essentials.jl:709
[19] inlineeval(::Module, ::String, ::Int64, ::Int64, ::String; softscope::Bool) at c:\Users\christian.dengler\.vscode\extensions\julialang.language-julia-1.0.10\scripts\packages\VSCodeServer\src\eval.jl:185
[20] (::VSCodeServer.var"#61#65"{String,Int64,Int64,String,Module,Bool,VSCodeServer.ReplRunCodeRequestParams})() at c:\Users\christian.dengler\.vscode\extensions\julialang.language-julia-1.0.10\scripts\packages\VSCodeServer\src\eval.jl:144
[21] withpath(::VSCodeServer.var"#61#65"{String,Int64,Int64,String,Module,Bool,VSCodeServer.ReplRunCodeRequestParams}, ::String) at c:\Users\christian.dengler\.vscode\extensions\julialang.language-julia-1.0.10\scripts\packages\VSCodeServer\src\repl.jl:124
[22] (::VSCodeServer.var"#60#64"{String,Int64,Int64,String,Module,Bool,Bool,VSCodeServer.ReplRunCodeRequestParams})() at c:\Users\christian.dengler\.vscode\extensions\julialang.language-julia-1.0.10\scripts\packages\VSCodeServer\src\eval.jl:142
[23] hideprompt(::VSCodeServer.var"#60#64"{String,Int64,Int64,String,Module,Bool,Bool,VSCodeServer.ReplRunCodeRequestParams}) at c:\Users\christian.dengler\.vscode\extensions\julialang.language-julia-1.0.10\scripts\packages\VSCodeServer\src\repl.jl:36
[24] (::VSCodeServer.var"#59#63"{String,Int64,Int64,String,Module,Bool,Bool,VSCodeServer.ReplRunCodeRequestParams})() at c:\Users\christian.dengler\.vscode\extensions\julialang.language-julia-1.0.10\scripts\packages\VSCodeServer\src\eval.jl:110
[25] with_logstate(::Function, ::Any) at .\logging.jl:408
[26] with_logger at .\logging.jl:514 [inlined]
[27] (::VSCodeServer.var"#58#62"{VSCodeServer.ReplRunCodeRequestParams})() at c:\Users\christian.dengler\.vscode\extensions\julialang.language-julia-1.0.10\scripts\packages\VSCodeServer\src\eval.jl:109
[28] #invokelatest#1 at .\essentials.jl:710 [inlined]
[29] invokelatest(::Any) at .\essentials.jl:709
[30] macro expansion at c:\Users\christian.dengler\.vscode\extensions\julialang.language-julia-1.0.10\scripts\packages\VSCodeServer\src\eval.jl:27 [inlined]
[31] (::VSCodeServer.var"#56#57")() at .\task.jl:356
As @CarloLucibello pointed out, layers in Flux expect the last dim to be the batch dimension, and the reshape above seems to drop that. Also note that normalisation on a batch size of 1 is not meaningful.
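A tiny illustration of that convention (the shapes here are made up):

using Flux

x = rand(Float32, 10, 32)    # 10 features x 32 samples: the batch is the last dim
d = Dense(10, 4)
size(d(x))                   # (4, 32): the batch axis is preserved

# With a single sample the batch statistics are degenerate (the variance over
# one sample is zero), which is why BatchNorm on a batch of size 1 is not meaningful.
bn = BatchNorm(4)
size(bn(d(x[:, 1:1])))       # (4, 1)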
I am also running into this issue with Optim BFGS and Optim LBFGS. I think this issue is related: https://discourse.julialang.org/t/optimization-with-lbfgs-gives-dimensionmismatch-dimensions-must-match/22167
The DimensionMismatch error could come from a great many places, and Optim is not a FluxML package, so perhaps it would be better to seek help there? If things are only reproducible with Flux + Optim, then a separate issue + MWE would be very much appreciated. In the meantime, I think this thread is safe to close, because both the original and the follow-up example have answers.
I'm trying to make a neural ODE using conv layers. After I build the model, the forward pass works fine, but when I try to get the gradient using

g = gradient(() -> loss(x, y), params(model))

I get a DimensionMismatch("array could not be broadcast to match destination").

To reproduce the error:

Running the above code results in a DimensionMismatch error. Full stack trace: https://pastebin.com/P0iV2ihP
Further investigation:
- Removing BatchNorm from convode_base gets rid of the error.
- Using GroupNorm in convode_base also results in the same error.
- Running convode with the output of c1(img) works.

Code for pullback:

Pullback with x1 and x2 doesn't work.
I was not able to narrow down what is causing the original error.
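Since the convode_base / convode / c1 definitions did not survive in the text above, here is a purely hypothetical sketch of the kind of block those experiments vary; the channel counts, kernel sizes, and GroupNorm group count are assumptions, not the issue's actual code.

using Flux

# Hypothetical stand-ins for the issue's `convode_base`, illustrating the three
# variants described above: with BatchNorm, with GroupNorm, and with neither.
convode_base_bn = Chain(
    Conv((3, 3), 16 => 16; pad = 1), BatchNorm(16, relu),
    Conv((3, 3), 16 => 16; pad = 1), BatchNorm(16, relu),
)

convode_base_gn = Chain(
    Conv((3, 3), 16 => 16; pad = 1), GroupNorm(16, 4, relu),
    Conv((3, 3), 16 => 16; pad = 1), GroupNorm(16, 4, relu),
)

convode_base_plain = Chain(
    Conv((3, 3), 16 => 16, relu; pad = 1),
    Conv((3, 3), 16 => 16, relu; pad = 1),
)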