Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug when using Matrix as differential for Adjoint #133

Closed
sethaxen opened this issue Jan 5, 2021 · 3 comments
Closed

Bug when using Matrix as differential for Adjoint #133

sethaxen opened this issue Jan 5, 2021 · 3 comments

Comments

@sethaxen
Copy link
Member

sethaxen commented Jan 5, 2021

#131 introduced a bug when differentials of Adjoint are regular matrices. Minimal example:

julia> using FiniteDifferences, LinearAlgebra

julia> n, T = 3, ComplexF64;

julia> x = randn(T, n);

julia> y = adjoint(x);

julia> ȳ_adjoint = adjoint(randn(T, n))
1×3 adjoint(::Vector{ComplexF64}) with eltype ComplexF64:
 -0.0351965+0.649028im  0.868707+0.237612im  -1.45535-0.149933im

julia> ȳ_mat = collect(ȳ_adjoint);

julia> ȳ_composite = FiniteDifferences.Composite{typeof(y)}(parent=parent(ȳ_adjoint));

julia> ȳ_adjoint' # expected
3-element Vector{ComplexF64}:
 -0.03519650654436998 - 0.6490275763507989im
   0.8687071179992576 - 0.23761165442658064im
  -1.4553505347699434 + 0.14993263689809244im

julia> j′vp(central_fdm(5, 1), adjoint, ȳ_adjoint, x)[1] # fine
3-element Vector{ComplexF64}:
 -0.035196506544367144 - 0.6490275763508034im
    0.8687071179992422 - 0.2376116544265735im
   -1.4553505347699498 + 0.14993263689809289im

julia> j′vp(central_fdm(5, 1), adjoint, ȳ_composite, x)[1] # fine
3-element Vector{ComplexF64}:
 -0.035196506544367144 - 0.6490275763508034im
    0.8687071179992422 - 0.2376116544265735im
   -1.4553505347699498 + 0.14993263689809289im

julia> j′vp(central_fdm(5, 1), adjoint, ȳ_mat, x)[1] # conjugated
3-element Vector{ComplexF64}:
 -0.03519650654436785 + 0.6490275763508094im
    0.868707117999241 + 0.23761165442657906im
  -1.4553505347699507 - 0.1499326368980859im

I've stared at the PR but can't figure out where it went wrong. Or maybe the issue is that a regular matrix is not an acceptable cotangent for an Adjoint. This was caught by the ChainRules test suite.
@willtebbutt I also noticed that PR disabled testing of to_vec for many complex types. Was that intentional?

@willtebbutt
Copy link
Member

@willtebbutt I also noticed that PR disabled testing of to_vec for many complex types. Was that intentional?

Oh god, no, that was most certainly not intentional.

I'll take a look at this now.

@willtebbutt
Copy link
Member

willtebbutt commented Jan 5, 2021

So I think that the problem is a fundamental one to do with how we compute thej′vp.

The current implementation is as follows:

function _j′vp(fdm, f, ȳ::Vector{<:Real}, x::Vector{<:Real})
    isempty(x) && return eltype(ȳ)[] # if x is empty, then so is the jacobian and x̄
    return transpose(first(jacobian(fdm, f, x))) *end

where (crucially) x and have been to_veced.

Since transpose(first(jacobian(fdm, f, x))) is unaffected by the value of , the problem must be that the different ways of representing that you provided above map to different values when to_vec is applied. Indeed, this is the case:

julia> to_vec(ȳ_mat)[1]
6-element Array{Float64,1}:
  0.6348048873881045
  2.0834602173828207
 -0.7874817791499601
  0.33312169706657996
  0.09073736368627416
 -0.37933784525930764

julia> to_vec(ȳ_adjoint)[1]
6-element Array{Float64,1}:
  0.6348048873881045
 -2.0834602173828207
 -0.7874817791499601
 -0.33312169706657996
  0.09073736368627416
  0.37933784525930764

I think what's going wrong is clear from this -- the to_vec form of ȳ_adjoint is appropriately conjugated because the parent is pulled out, while no conjugation happens with ȳ_mat.

I think the temporary solution is pretty clear (revert to the old implementation with some reshaping), but I'm very confused about how to resolve this moving forward, and how it relates to the discussion in #132 .

edit: I'm also not sure how to write tests to prevent this from happening in the future.

@sethaxen
Copy link
Member Author

sethaxen commented Jan 5, 2021

I haven't checked the discussion in #132 , but I think to_vec is doing two different things. First, it maps primal types to a real Euclidean space (maybe using a chart, or maybe not) based on their type. Second, it maps differential types to a real Euclidean space based on their type. But these shouldn't be the same thing. The former map is fine, but for the latter map, it's important that different differential types that represent the same cotangent vector map to the same (co)tangent vector. Perhaps what we need is a separate function for vectorizing (co)tangents than primals that also takes the primal point. Then we can ensure that a Matrix cotangent for an Adjoint primal is treated differently than a Matrix cotangent for a Matrix primal.

@sethaxen sethaxen closed this as completed Jan 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants