to_vec doesn't preserve equality #132

willtebbutt · 2021-01-03T15:23:27Z

Firstly, I'm very fond of to_vec -- it's served us well over the last couple of years. We know it has some shortcomings though, that we will need to address at some point.

Presently we have implementations of to_vec like this instead of implementations that look like this.

IIRC the reason for this is that if you define things in the latter fashion then you'll find if you do something like

A = randn(5, 4)
A_vec, _ = to_vec(A)
At_vec, _ = to_vec(transpose(A))

then A_vec == At_vec. My suspicion is that at some point in time, we were trying to compare the equality of two things based on their to_vec representation and it came up that things weren't being compared properly. We do a similar thing with Diagonal etc.

Unfortunately, it seems quite straightforward to construct examples which don't admit such a straightforward resolution as we employed for Adjoint, Transpose, and Diagonal. For a trivial example, consider that

A = randn(5)
A == (A, ) # is false -- a `Tuple` is never equal to a `Vector`

However, since to_vec(::Tuple) simply concatenates to_vec of each of the fields, we'll find that

to_vec(A)[1] == to_vec((A, ))[1]

This leads me to believe that it's likely not possible to define to_vec such that equality is preserved.

As such, I propose that we

embrace, and explicitly document, that knowing whether to_vec(x)[1] == to_vec(y)[1] in general tells you nothing about whether x == y, and vice versa. We should point this out to explicitly warn people away from trying to check whether two things are equal based on their to_vec representation.
refactor some of the things (like Tranpose and Adjoint) to use the struct representation, which is more computationally efficient.
to_vec should purely be thought of as a way to make FiniteDifferences' life easier. We might even consider ceasing to export it.

Now that @oxinabox has added some new approximate equality checking to ChainRulesTestUtils, this ought to be less of a problem anyway, because it's now more straightforward to check whether things are approximately equal.

Of course, if anyone can see a different path forward, I'm all ears.

The text was updated successfully, but these errors were encountered:

willtebbutt · 2021-01-03T15:36:32Z

I also suspect that this change would make it much safer to rely on recursive definitions of to_vec. For example, we ought to be able to define a single (maybe @generated) fallback version of to_vec for struct-types.

oxinabox · 2021-01-03T15:59:28Z

Would be solved if we just entirely stopped using to_vec by doing #97
but til then your plan seems good.

Lets keep exporting it though, it is part of the public API, in that users are expected to extend it.

willtebbutt · 2021-01-07T20:17:34Z

The other thing to say here is that the present design means that to_vec of a structured array can be of a different dimensionality from the tangent generated by rand_tangent which is, by design, an appropriate tangent in the ChainRulesCore-sense.

In particular, to_vec(Diagonal(randn(3)))[1] will have length 9, while to_vec(rand_tangent(Diagonal(3)))[1] will have length 3. The causes problems for our implementation of jvp, (presumably) among other things.

Again, this isn't to say that what we're presently doing in to_vec is necessarily wrong, just that it's inconsistent with what we're doing in other places.

willtebbutt · 2021-01-07T20:17:59Z

Also, regarding the above plan, @sethaxen what are your thoughts on it in light of our adjoint-adventures the other day? I'm nervous that if I start changing things, things will start breaking in strange ways.

sethaxen · 2021-01-10T00:43:28Z

Also, regarding the above plan, @sethaxen what are your thoughts on it in light of our adjoint-adventures the other day?

I agree in general with your comments. to_vec's current implementation worries me after the adventure in #121.

I need to formalize my thinking here, but I think what we have been doing so far is using to_vec to embed in some real space, and then (co)tangent vectors are elements of a (co)tangent subspace (e.g. mapping to points on a sphere with tangent vectors in a plane). For primals, I think we should prefer to_vec implementations that are charts, that is, maps to a real Euclidean space whose dimension is equivalent to the dimension of the point. So to_vec(::Diagonal) should extract the diagonal, and to_vec(::Hermitian) should extract a strict triangle and the real part of the diagonal. It might require more overloads, but this will at least ensure the right thing is being done, and the decreased dimensionality should make it faster. And of course, it can be recursive.

We need a different function to handle (co)tangents. Because we randomize the (co)tangents, the map here doesn't really matter all that much (though maybe it should). The important requirements are 1) it produces a real vector with the same dimension as the primal and 2) all valid equivalent representations for the same (co)tangent vector should map to the same real vector. So e.g. to_vec(primal::Adjoint, vector::Adjoint), to_vec(primal::Adjoint, vector::Composite{Adjoint}), to_vec(primal::Adjoint, vector::Matrix) all should produce the same output. I think this latter requirement is the harder one to fulfill, for all the same reasons that it's hard to support arbitrary cotangent vectors in pullbacks. That's why it's been easier so far to embed everything in the Matrixes through calling Matrix and then vectorize.

I'm nervous that if I start changing things, things will start breaking in strange ways.

We could always set up a ChainRules integration test, since arguably ChainRules's test suite puts this package through its paces more than our test suite can.

willtebbutt · 2021-01-10T01:05:37Z

We could always set up a ChainRules integration test, since arguably ChainRules's test suite puts this package through its paces more than our test suite can.

I'm also still trying to figure out what I think about this whole thing, but I agree that we should really do this at the very least.

sethaxen · 2021-01-10T01:11:44Z

Also, regarding the above plan, @sethaxen what are your thoughts on it in light of our adjoint-adventures the other day? I'm nervous that if I start changing things, things will start breaking in strange ways.

I'll get on it

willtebbutt changed the title ~~Some ambiguity in to_vec~~ to_vec doesn't preserve equality Jan 3, 2021

willtebbutt mentioned this issue Jan 5, 2021

Bug when using Matrix as differential for Adjoint #133

Closed

sethaxen mentioned this issue Jan 10, 2021

Add ChainRules integration test #136

Merged

willtebbutt mentioned this issue Feb 16, 2021

Only use the diagonal values in Diag, not the whole Matrix #146

Closed

willtebbutt mentioned this issue May 28, 2021

Symmetric double counts #164

Open

willtebbutt mentioned this issue Aug 1, 2021

WIP: Change matrix to_vec semantics #187

Closed

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

to_vec doesn't preserve equality #132

to_vec doesn't preserve equality #132

willtebbutt commented Jan 3, 2021 •

edited

Loading

willtebbutt commented Jan 3, 2021 •

edited

Loading

oxinabox commented Jan 3, 2021 •

edited

Loading

willtebbutt commented Jan 7, 2021

willtebbutt commented Jan 7, 2021 •

edited

Loading

sethaxen commented Jan 10, 2021

willtebbutt commented Jan 10, 2021

sethaxen commented Jan 10, 2021

to_vec doesn't preserve equality #132

to_vec doesn't preserve equality #132

Comments

willtebbutt commented Jan 3, 2021 • edited Loading

willtebbutt commented Jan 3, 2021 • edited Loading

oxinabox commented Jan 3, 2021 • edited Loading

willtebbutt commented Jan 7, 2021

willtebbutt commented Jan 7, 2021 • edited Loading

sethaxen commented Jan 10, 2021

willtebbutt commented Jan 10, 2021

sethaxen commented Jan 10, 2021

willtebbutt commented Jan 3, 2021 •

edited

Loading

willtebbutt commented Jan 3, 2021 •

edited

Loading

oxinabox commented Jan 3, 2021 •

edited

Loading

willtebbutt commented Jan 7, 2021 •

edited

Loading