use ArrayInterface.restructure in update! #1613

CarloLucibello · 2021-06-10T08:06:28Z

Suggestion coming from @ChrisRackauckas in FluxML/Zygote.jl#989.
Now update! handles basically any gradient Zygote emits, e.g. FillArrays and Zygote.OneElement.

Fix #1510

ChrisRackauckas · 2021-06-10T09:08:19Z

It would be interesting to test this on some odd case like where Params contains a ComponentArray and the user just assumes that the gradient will be able to do grad.a == grad[1].

DhairyaLGandhi · 2021-06-10T09:25:33Z

Better handling is present in Optimisers.jl.

mcabbott · 2021-06-10T09:44:00Z

src/optimise/train.jl

 function update!(opt, x, x̄)
-  x .-= apply!(opt, x, x̄)
+  x̄r = ArrayInterface.restructure(x, x̄) # address some cases where Zygote's


Are there solutions without ArrayInterface? Since this is (I think) intended as a quick fix pending a better design.

Perhaps it could just test ismutable, and then use broadcasting to make a similar array (since x is already assumed mutable)?

function update!(opt, x, x̄)
  # quick fix for #1510, Zygote's output may not be mutable:
  x̄r = ismutable(x̄) ? x̄ : x̄ .+ false .* x
  x .-= apply!(opt, x, x̄r)
end

Or perhaps better to test x̄ isa DenseArray or something, maybe there are mutable structs which aren't writeable...

function update!(opt, x, x̄::DenseArray)
  x .-= apply!(opt, x, x̄) # as before
end
function update!(opt, x, x̄)
  x .-= apply!(opt, x, x̄ .+ false .* x) # fix for #1510, Zygote's output may not be mutable
end
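
As a rough illustration of the broadcasting trick in the two snippets above, here is a sketch using only Base Julia. The range below is an assumption made for illustration: it merely stands in for a read-only gradient type (such as one from FillArrays) and is not what Zygote actually returns. The point is that `x̄ .+ false .* x` materialises a plain `Array` shaped like `x`, since `false` multiplies everything to zero.

```julia
# Sketch only: a range stands in for a non-writable gradient type.
x = Float32[1, 2, 3]                 # parameter, an ordinary mutable Vector
x̄ = range(0f0, 1f0, length=3)        # read-only AbstractVector (stand-in gradient)

# Writing into the range fails, so it could not be updated in place.
write_failed = try
    x̄[1] = 5f0
    false
catch
    true
end

# Broadcasting against x allocates a dense, writable copy with the same values.
x̄r = x̄ .+ false .* x
x̄r[1] = 5f0                          # now allowed
```

The cost is one extra allocation per call, which is the "fail to the safe side" trade-off mentioned later in this thread.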

ismutable is wrong: it doesn't tell you if an array is mutable. It tells you if a type is a mutable type, which is different.

julia> ismutable(Adjoint(rand(4)))
false

Base Julia does not have the right verbiage to answer these kinds of questions in an adequately generic way, which is the reason for ArrayInterface.jl becoming a standard trait interface to cover these kinds of issues.
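
To make the distinction concrete, here is a small sketch that uses only Base and the LinearAlgebra standard library. It deliberately avoids ArrayInterface, whose exact behaviour is not shown in this thread. Base's `ismutable` reports whether the wrapper struct is a mutable type, not whether `setindex!` works on it.

```julia
using LinearAlgebra

v = [1.0, 2.0, 3.0]
a = v'                               # Adjoint wrapper around v, an immutable struct

wrapper_is_mutable = ismutable(a)    # false: the Adjoint struct itself is immutable
a[2] = 20.0                          # yet writing works and goes through to the parent
parent_updated = v[2] == 20.0

r = 1:3                              # a range is also an immutable struct...
range_is_mutable = ismutable(r)      # false as well
range_write_failed = try             # ...and here writing really does fail
    r[2] = 20
    false
catch
    true
end
```

Both `a` and `r` give the same answer from `ismutable`, although only one of them can be written to, which is why that check alone cannot decide whether a gradient needs copying.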

ArrayInterface.ismutable(x) does signify "array is mutable"

I'm not worried about taking on the ArrayInterface dependency, but I do want to get the semantics right.
Is ArrayInterface.ismutable(x) more appropriate than ArrayInterface.restructure?

Yes regarding Adjoint, but that would fail on the safe side, by making one copy too many.

ArrayInterface alone adds about 0.5 seconds of loading time; Flux takes about 8.5 s without it.

Ok! I guess I see this PR as a band-aid for the current structure of things.

I'd forgotten about that weird issue caused by Zeros. However, honest parameters expressed as x::SArray might not be crazy, and a design that allowed them might be nice. Perhaps a smaller step would be a design that didn't need to mutate dx to do x .+= ε .* dx. If the former happened before the latter then some variants of this PR would need re-visiting... which seems OK.

I don't think it's a tough fix. Essentially, https://github.com/FluxML/Flux.jl/blob/v0.12.4/src/optimise/train.jl#L29 needs to be rewritten to something like:

xs[x] = update!(opt, x, gs[x])

where update! always returns the transformed parameter and can optionally mutate it to save an allocation. This would require implementing setindex! on Params, but it's the only solution I can think of that preserves both the API and semantics.

If you replace what object is stored in the IdDict, this won't be picked up by the model. I agree that you want some mutate-where-possible-but-always-return scheme, but maybe it has to take & return the whole model? IIRC this was being discussed at some point, Functors?

Yeah, the path we're walking down, where dx is not mutated and neither is x when it is immutable, is Optimisers.jl (based on Functors.jl to facilitate taking & returning the whole model). I would say it is not something we need to address in this PR.

Pretty much, you'd still need to reconstruct afterwards. It's the inherent limitation of implicit params, which is why Optimisers.jl focuses on explicit params and being able to traverse through structures (à la JAX's pytrees) instead. I would love it if we could nuke implicit params overnight, but there are still a few UX things to figure out with explicit params before then.
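
The "mutate where possible, but always return" idea discussed above might look roughly like the following sketch. The function name `descent_step` and the step size `η` are hypothetical, chosen only for illustration; this is not the Flux or Optimisers.jl API, and a tuple stands in for an immutable parameter such as an SArray.

```julia
# Hypothetical helper, not part of Flux or Optimisers.jl.
# Mutable arrays are updated in place and then returned.
function descent_step(x::Array, dx, η)
    x .-= η .* dx
    return x
end

# Fallback for anything else: leave x alone and return a new value.
descent_step(x, dx, η) = x .- η .* dx

w = [1.0, 2.0]
w2 = descent_step(w, [1.0, 1.0], 0.5)    # same object as w, updated in place

t = (1.0, 2.0)                           # immutable stand-in parameter
t2 = descent_step(t, (1.0, 1.0), 0.5)    # a new tuple; t is unchanged
```

Callers have to use the returned value in every case. As noted in the thread, with implicit params the model would still have to be reconstructed around any newly returned object.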

CarloLucibello · 2021-06-10T17:23:23Z

> It would be interesting to test this on some odd case like where Params contains a ComponentArray and the user just assumes that the gradient will be able to do grad.a == grad[1].

@ChrisRackauckas is the test I added what you meant here? Although probably that was already working, the returned gradient from Zygote is a ComponentArray

ChrisRackauckas · 2021-06-10T17:37:33Z

> @ChrisRackauckas is the test I added what you meant here? Although probably that was already working, the returned gradient from Zygote is a ComponentArray

Yup, that's the test. It was not working because, if you did that on the sum example, you'd get back a FillArray in previous versions. So it's not that it never worked; Zygote got a little fussy and chose when it wanted to obey those rules 😆

ToucheSir

Since this has come up before and Optimisers.jl needs some more time to bake, shall we get this merged?

CarloLucibello · 2021-06-17T06:25:55Z

@darsnack or @DhairyaLGandhi can I get an approval here?

DhairyaLGandhi · 2021-06-17T06:37:05Z

What needs to bake in Optimisers.jl?

We don't have a timeline for the switch to Optimisers.jl yet. It will probably take months, so considerations about Optimisers.jl shouldn't stop us from fixing issues here and now.

EDIT: Instead of replying to Dhairya's comment, I inadvertently edited it and replied inside his own comment. Sorry for that. (Carlo Lucibello)

darsnack

Agreed, let's merge and not wait around for Optimisers.jl.

CarloLucibello · 2021-06-17T12:20:13Z

bors r+

@ChrisRackauckas

1613: use ArrayInterface.restructure in update! r=CarloLucibello a=CarloLucibello

Suggestion coming from @ChrisRackauckas in FluxML/Zygote.jl#989.
Now `update!` handles basically any gradient Zygote emits, e.g. FillArrays and Zygote.OneElement.

Fix #1510

Co-authored-by: CarloLucibello <[email protected]>

CarloLucibello · 2021-06-17T14:05:17Z

bors r-

bors · 2021-06-17T14:05:18Z

Canceled.

CarloLucibello · 2021-06-17T14:05:26Z

bors r+

DhairyaLGandhi · 2021-06-17T14:12:20Z

Seeing as we have come around to switching out the zeros and have momentum now, I would say we should focus on moving ahead with Optimisers.jl sooner rather than later.

bors · 2021-06-17T16:23:15Z

Build succeeded:

buildkite/flux-dot-jl

CarloLucibello added 3 commits June 10, 2021 09:45

use ArrayInterface.restructure in update!

4f67a9b

more tests

f96fe93

fix typo

28bd53e

CarloLucibello requested a review from DhairyaLGandhi June 10, 2021 08:08

mcabbott reviewed Jun 10, 2021

CarloLucibello mentioned this pull request Jun 10, 2021

Tied weights using Flux layers #1592

Open

add test for ComponentArrays

6bae1e7

CarloLucibello added 2 commits June 10, 2021 19:24

add another ComponentArray test

1fe64dc

cleanup project

a77b32f

ToucheSir approved these changes Jun 14, 2021

darsnack mentioned this pull request Jun 14, 2021

add namedparams #1144

Open

darsnack approved these changes Jun 17, 2021

darsnack mentioned this pull request Jun 17, 2021

Use Optimisers.jl #1481

Open

bors bot merged commit de76e08 into master Jun 17, 2021

DhairyaLGandhi deleted the cl/opt3 branch June 17, 2021 16:58

darsnack mentioned this pull request Jun 23, 2021

ERROR: setindex! not defined for Zygote.OneElement{...} #1626

Closed

ToucheSir mentioned this pull request Jul 15, 2021

It seems a breaking change was introduced after v0.6.14 FluxML/Zygote.jl#1029

Closed

mcabbott mentioned this pull request Nov 2, 2022

Safer gradients (by copying before mutating) & less piracy (by removing ArrayInterface) #2098

Merged
