Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_restructure as part of the public API? #1624

Closed
ericphanson opened this issue Jun 21, 2021 · 9 comments
Closed

_restructure as part of the public API? #1624

ericphanson opened this issue Jun 21, 2021 · 9 comments

Comments

@ericphanson
Copy link
Contributor

destructure seems like a convienent way to extract model weights to serialize out, but it's unfortunate the mechanism to reload them is a callback (which is not easy to serialize). Would it be possible to make _restructure part of the public API so it can be relied on to reload weights?

@DhairyaLGandhi
Copy link
Member

There's some limitations and potential work to fix them eg #1353

There's also some issues with holding on to some stateful fields in RNNs

@ericphanson
Copy link
Contributor Author

Ah, good to know. Do you think _restructure should hold off being part of the public API until these issues are solved?

@ToucheSir
Copy link
Member

Yes, destructure currently has the same problem as params/loadparams!. Namely that the mechanism is not robust to model structure or parameter size changes. Ideally we'd provide an API for getting a more structured parameter set instead, something like what gradient returns when you use explicit params.

@DhairyaLGandhi
Copy link
Member

DhairyaLGandhi commented Jun 21, 2021

There are reasons to want that, esp when your recipient models isn't the same structure as the model which generated the params, which is useful for loading into backbones and such, but that's separate.

@ToucheSir
Copy link
Member

Right, my point is that making restructure part of the public API should come with a strongly-worded warning to not use it for loading saved parameters.

@ericphanson
Copy link
Contributor Author

Makes sense; in my case we aren't using stateful layers AFAIK so the current behavior is fine; even

not robust to model structure or parameter size changes

is ok. Just want a simple (de)-serialization path for loading weights back into same model without scary non-local bugs like BSON's JuliaIO/BSON.jl#75 😄 . Currently using the Serialization stdlib for some models but that is very finnicky with presence/absence of anonymous functions in the (de)-serialization environment, and of course not robust to Julia minor version bumps.

@DhairyaLGandhi
Copy link
Member

DhairyaLGandhi commented Jun 21, 2021

As an aside, seems bson has been buggy recently with anonymous functions.

Right, my point is that making restructure part of the public API should come with a strongly-worded warning to not use it for loading saved parameters.

Not sure about this? Should be about equivalent to loadparams if the saved params were in a vector for some reason.

@ToucheSir
Copy link
Member

Just want a simple (de)-serialization path for loading weights back into same model without scary non-local bugs like BSON's JuliaIO/BSON.jl#75 smile .

This is what loadparams! was designed for :). You can collect(params(m)) and serialize it with most anything. That said, BSON should really have as little spooky action at a distance as possible.

What I'm trying to avoid by making restructure public without a big honking warning is users falling into the issues outlined in #1408 and related threads. As FluxML/Zygote.jl#941 shows, trying to migrate people off of something incorrect is much harder than stopping them from using it in the first place. Flux is also the only DL framework I know of that requires loading parameters in as a flattened list instead of a structured dict/namedtuple. It's not like we lack the machinery for it either, I just think it took the recent reckoning around implicit params to get the ball rolling again.

@ericphanson
Copy link
Contributor Author

ericphanson commented Jun 21, 2021

This is what loadparams! was designed for :). You can collect(params(m)) and serialize it with most anything.

Ah cool, I didn’t realize this. I think that covers this use case, so I’ll close this. Thanks for the quick feedback from both of you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants