Gradient incorrect for Conv-layer and complex numbers #1876

Closed
zsoerenm opened this issue Feb 15, 2022 · 0 comments · Fixed by FluxML/NNlib.jl#389
zsoerenm commented Feb 15, 2022

I created a real-valued convolution layer to verify whether the gradients calculated for the complex-valued Conv layer are correct:

using Flux
struct ComplexWeight w_re; w_im end # a 1-dim conv layer whose real and imaginary weight parts are kept separate
eachmat(A) = (view(A, :, :, i) for i in axes(A, 3))
# complex multiplication written out on the real/imaginary parts: (re, im) -> (re*w_re - im*w_im, im*w_re + re*w_im)
(cw::ComplexWeight)(A) = reshape(mapreduce(x -> [x[:,1:4] * cw.w_re .- x[:,5:8] * cw.w_im x[:,5:8] * cw.w_re .+ x[:,1:4] * cw.w_im], hcat, eachmat(A)), (size(A,1), 2, 1))
Flux.@functor ComplexWeight
complex_init = randn(ComplexF32, 1, 4, 1)
real_convl = ComplexWeight(real.(vec(complex_init)), imag.(vec(complex_init)))
convl = Conv((1,), 4 => 1, identity; pad=SamePad(), init=(dims...) -> complex_init, bias=false)
xs = randn(ComplexF32, 256, 4, 1);
ys = randn(ComplexF32, 256, 1, 1);
to_real(A) = hcat(real.(A), imag.(A))
to_complex(A) = complex.(A[:,1:size(A,2) >> 1,:], A[:,size(A,2) >> 1 + 1:end,:])

# Check if layers produce the same output
real_y = real_convl(to_real(xs));
convl(xs) ≈ complex.(real_y[:,1], real_y[:,2]) # true

# Create loss functions and check if they result in the same output
loss_real(model, xs, ys) = Flux.Losses.mse(to_complex(model(xs)), ys)
loss(model, xs, ys) = Flux.Losses.mse(model(xs), ys)
loss_real(real_convl, to_real(xs), ys) ≈ loss(convl, xs, ys) # true

# Calculate gradients
params_real = Flux.params(real_convl)
grads_real = Flux.gradient(params_real) do
    loss_real(real_convl, to_real(xs), ys)
end
params = Flux.params(convl)
grads = Flux.gradient(params) do
    loss(convl, xs, ys)
end
vec(grads[params[1]]) ≈ complex.(grads_real[params_real[1]], grads_real[params_real[2]]) # false

The layers and the loss functions produce the same output given the same weights, yet the gradients differ. To rule out Zygote's complex gradient rules themselves, I checked a basic gradient calculation:

using Statistics: mean
using Flux
xs = randn(ComplexF64, 12, 4)
w = randn(ComplexF64, 4)
y = xs * w + randn(ComplexF64, 12)
f(w) = mean(abs2.(xs * w - y))
f2(w) = Flux.Losses.mse(xs * w, y)
function f_real(w) 
    re_part = real.(xs) * w[1] - imag.(xs) * w[2] - real.(y)
    im_part = imag.(xs) * w[1] + real.(xs) * w[2] - imag.(y)
    mean(re_part .* re_part + im_part .* im_part)
end
f_real2(w) = mean(abs2.(complex.(real.(xs) * w[1] - imag.(xs) * w[2], imag.(xs) * w[1] + real.(xs) * w[2]) - y))
f(w) ≈ f2(w) ≈ f_real([real.(w), imag.(w)]) ≈ f_real2([real.(w), imag.(w)]) # true
df(w) = gradient(f, w)[1]
df2(w) = gradient(f2, w)[1]
df_real(w) = gradient(f_real, w)[1]
df_real2(w) = gradient(f_real2, w)[1]
df_real_w = df_real([real.(w), imag.(w)])
df_real2_w = df_real2([real.(w), imag.(w)])
df(w) ≈ df2(w) ≈ complex.(df_real_w[1], df_real_w[2]) ≈ complex.(df_real2_w[1], df_real2_w[2]) # true

This is correct, so Zygote's basic complex gradient rules are not the problem. My guess is that the error is somewhere in the Conv layer.
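
To narrow this down, the convolution can be checked in isolation at the NNlib level, since Flux's Conv calls NNlib.conv under the hood. Below is a minimal, untested sketch of such a check (the helper fd and the basis vector e are just for illustration, and it assumes NNlib.conv accepts complex arrays, which the working forward pass above suggests). Per the convention verified in the previous example, for a real-valued loss Zygote returns ∂loss/∂(re w) + i·∂loss/∂(im w) entrywise, so both comparisons at the end should be true if the conv pullback is correct:

using NNlib, Zygote

x = randn(ComplexF64, 8, 4, 1)   # (width, in_channels, batch)
w = randn(ComplexF64, 1, 4, 1)   # (kernel_width, in_channels, out_channels)
loss(w) = sum(abs2, conv(x, w))  # real-valued loss of the complex kernel

g = Zygote.gradient(loss, w)[1]

# central finite difference of the loss along a (complex) direction dw
fd(dw; h=1e-6) = (loss(w .+ h .* dw) - loss(w .- h .* dw)) / (2h)

e = zeros(ComplexF64, size(w)); e[1] = 1  # perturb only the first weight entry
real(g[1]) ≈ fd(e)        # ∂loss/∂(re w[1]); true if the pullback is correct
imag(g[1]) ≈ fd(im .* e)  # ∂loss/∂(im w[1]); true if the pullback is correct

Checking real and imaginary perturbations separately per entry keeps the test independent of any complex-gradient convention beyond the one already verified above.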
