
Don't rely on Enzyme.gradient, use Enzyme.autodiff directly #512

Closed
gdalle opened this issue Sep 30, 2024 · 5 comments
Labels
backend Related to one or more autodiff backends

Comments


gdalle commented Sep 30, 2024

See EnzymeAD/Enzyme.jl#1895


wsmoses commented Sep 30, 2024

Before you go ahead and tackle this, mind walking me through what your intended design is?

I'd like to help you avoid unnecessary work in case the design needs changing, and also to walk you through the pros and cons of the different choices.


gdalle commented Sep 30, 2024

The only goal of the intended design is to make DI.gradient as fast as possible. In the past few days we've had several examples where you showed that DI.gradient was slower than Enzyme.autodiff, but it often turned out that the culprit was Enzyme.gradient itself being slow, and DI.gradient calls Enzyme.gradient.
That is why I'd like to bypass Enzyme.gradient: the high-level utility is worthless to me if it causes a 10x slowdown. To be fair, most of the slowdown may come from a slow Enzyme.make_zero, as was the case in TuringLang/AdvancedVI.jl#98. And I don't think we'd see much of a slowdown with plain arrays, but I'm open to being proven wrong.
In any case, pull request #515 errors because of a bug I still don't fully understand, so it's not usable right now.
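For concreteness, here is a minimal sketch (not DI's actual code) of what bypassing the wrapper could look like for a plain-array input: allocate and zero a shadow buffer, then call Enzyme.autodiff directly in reverse mode with a Duplicated argument. `f` and `x` are placeholders.

```julia
using Enzyme

f(x) = sum(abs2, x)   # toy scalar-valued function; ∇f(x) = 2x

x = rand(3)
grad = zero(x)        # shadow buffer: Enzyme accumulates the gradient here

# Reverse-mode AD of f at x; Active marks the scalar return as differentiated
Enzyme.autodiff(Reverse, f, Active, Duplicated(x, grad))

grad                  # now holds ∇f(x), with no high-level wrapper overhead
```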


wsmoses commented Sep 30, 2024 via email


gdalle commented Sep 30, 2024

Right now my test suite only involves arrays, for which I just use Duplicated(x, grad). I don't think this is very different from what Enzyme.gradient would do, but I need autodiff anyway for the in-place version DI.gradient!, where grad is provided by the caller.
In the future, to support more input types, my idea was to have the preparation step DI.prepare_gradient call some form of Enzyme.guess_activity (I seem to remember this exists somewhere) and adapt to the result.
Preparation also lets me amortize the cost of the type-unstable make_zero by paying it only once and calling make_zero! before each gradient computation (which is probably faster?), as in the sketch below.
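As a hedged illustration of this amortization idea (the names `GradientPrep`, `prep_gradient`, and `gradient_with_prep` are illustrative stand-ins, not the actual DI internals): pay Enzyme.make_zero once during preparation, then reset the shadow in place with Enzyme.make_zero! before every call.

```julia
using Enzyme

# Illustrative prep object holding a pre-allocated shadow of the input;
# assumes the input contains mutable storage (e.g. arrays) so Duplicated applies
struct GradientPrep{S}
    shadow::S
end

# One-time, possibly type-unstable allocation of the shadow
prep_gradient(x) = GradientPrep(Enzyme.make_zero(x))

function gradient_with_prep(f, prep::GradientPrep, x)
    Enzyme.make_zero!(prep.shadow)   # cheap in-place reset between calls
    Enzyme.autodiff(Reverse, f, Active, Duplicated(x, prep.shadow))
    return prep.shadow
end
```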

gdalle removed a link to a pull request Sep 30, 2024

gdalle commented Oct 10, 2024

#557 reverted to Enzyme.gradient for better static array support
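(For context, a hedged sketch of why the high-level API fits here: an SVector is immutable, so there is no shadow buffer to mutate, and the out-of-place Enzyme.gradient is the natural interface. Note that the return convention varies across Enzyme.jl versions; recent releases return a tuple with one entry per differentiated argument.)

```julia
using Enzyme, StaticArrays

f(x) = sum(abs2, x)

x = SVector(1.0, 2.0, 3.0)
g = Enzyme.gradient(Reverse, f, x)   # gradient without any mutable shadow
```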

gdalle closed this as completed Oct 10, 2024