
Don't rely on Enzyme.gradient, use Enzyme.autodiff directly #512

Closed
gdalle opened this issue Sep 30, 2024 · 5 comments
Labels
backend Related to one or more autodiff backends

Comments


gdalle commented Sep 30, 2024

See EnzymeAD/Enzyme.jl#1895


wsmoses commented Sep 30, 2024

Before you go ahead and tackle this, mind walking me through what your intended design is?

I'd like to help you avoid unnecessary work in case the design needs changing, and also to walk you through the pros and cons of the different choices.


gdalle commented Sep 30, 2024

The only goal of the intended design is to make DI.gradient as fast as possible. In the past few days we've had several examples where you showed that DI.gradient was slower than Enzyme.autodiff, but it often turned out that the culprit was Enzyme.gradient itself being slow, and DI.gradient calls Enzyme.gradient.
That is why I'd like to bypass Enzyme.gradient: the high-level utility is worthless to me if it causes a 10x slowdown. To be fair, most of the slowdown may come from a slow Enzyme.make_zero, as was the case in TuringLang/AdvancedVI.jl#98. And I don't think we'd see much of a slowdown with plain arrays, but I'm open to being proven wrong.
In any case, pull request #515 errors because of a bug I still don't fully understand, so it's not usable right now.
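For concreteness, here is a minimal sketch (not DI's actual code) of what bypassing the wrapper could look like for a plain-array input: allocate and zero a shadow buffer, then call Enzyme.autodiff directly in reverse mode with a Duplicated argument. `f` and `x` are placeholders.

```julia
using Enzyme

f(x) = sum(abs2, x)   # toy scalar-valued function; ∇f(x) = 2x

x = rand(3)
grad = zero(x)        # shadow buffer: Enzyme accumulates the gradient here

# Reverse-mode AD of f at x; Active marks the scalar return as differentiated
Enzyme.autodiff(Reverse, f, Active, Duplicated(x, grad))

grad                  # now holds ∇f(x), with no high-level wrapper overhead
```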


wsmoses commented Sep 30, 2024 via email


gdalle commented Sep 30, 2024

Right now my test suite only involves arrays, for which I just use Duplicated(x, grad). I don't think this is very different from what Enzyme.gradient would do, but I need autodiff anyway for the in-place version DI.gradient!, where grad is provided by the caller.
In the future, to support more input types, my idea was to have the preparation step DI.prepare_gradient call some form of Enzyme.guess_activity (I seem to remember this exists somewhere) and adapt to the result.
Preparation also lets me amortize the cost of the type-unstable make_zero by paying it only once and calling make_zero! before each gradient computation (which is probably faster?), as in the sketch below.
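As a hedged illustration of this amortization idea (the names `GradientPrep`, `prep_gradient`, and `gradient_with_prep` are illustrative stand-ins, not the actual DI internals): pay Enzyme.make_zero once during preparation, then reset the shadow in place with Enzyme.make_zero! before every call.

```julia
using Enzyme

# Illustrative prep object holding a pre-allocated shadow of the input;
# assumes the input contains mutable storage (e.g. arrays) so Duplicated applies
struct GradientPrep{S}
    shadow::S
end

# One-time, possibly type-unstable allocation of the shadow
prep_gradient(x) = GradientPrep(Enzyme.make_zero(x))

function gradient_with_prep(f, prep::GradientPrep, x)
    Enzyme.make_zero!(prep.shadow)   # cheap in-place reset between calls
    Enzyme.autodiff(Reverse, f, Active, Duplicated(x, prep.shadow))
    return prep.shadow
end
```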

gdalle removed a link to a pull request Sep 30, 2024

gdalle commented Oct 10, 2024

#557 reverted to Enzyme.gradient for better static array support
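(For context, a hedged sketch of why the high-level API fits here: an SVector is immutable, so there is no shadow buffer to mutate, and the out-of-place Enzyme.gradient is the natural interface. Note that the return convention varies across Enzyme.jl versions; recent releases return a tuple with one entry per differentiated argument.)

```julia
using Enzyme, StaticArrays

f(x) = sum(abs2, x)

x = SVector(1.0, 2.0, 3.0)
g = Enzyme.gradient(Reverse, f, x)   # gradient without any mutable shadow
```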

gdalle closed this as completed Oct 10, 2024