NOTE: I am aware of the twice-VJP trick (https://j-towns.github.io/2017/06/12/A-new-trick.html). Unfortunately, none of the NN-optimized AD tools in the Julia ecosystem are well suited to nested AD. Enzyme and ReverseDiff might work here, but their rules are a different story.
Quite a few SciML applications require JVPs (e.g. Deep Equilibrium Models when using JFNK). Currently, however, ForwardDiff tries to differentiate through most of the DL kernels, which is terrible for performance because it ends up:
- Hitting the CPU / non-specialized dispatches on different architectures
- Not using cuDNN / MIOpen at all
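To make the idea of a forward-mode rule concrete, here is a minimal sketch under stated assumptions. It uses plain Python and a toy 1-D cross-correlation rather than any NNlib or ForwardDiff API, and the names `xcorr1d` and `xcorr1d_jvp` are hypothetical. Because the operation is bilinear in the input and the filter, its JVP can be written as two more calls to the same primal kernel, so an optimized kernel would be reused instead of being traced through with dual numbers.

```python
def xcorr1d(x, w):
    """Valid 1-D cross-correlation; a stand-in for an optimized primal kernel."""
    n, k = len(x), len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(n - k + 1)]


def xcorr1d_jvp(x, dx, w, dw):
    """Hypothetical forward-mode rule for xcorr1d.

    The op is bilinear in (x, w), so the tangent of the output is
    xcorr1d(dx, w) + xcorr1d(x, dw). Only primal kernel calls are needed.
    """
    y = xcorr1d(x, w)
    dy_from_x = xcorr1d(dx, w)   # contribution of the input tangent
    dy_from_w = xcorr1d(x, dw)   # contribution of the filter tangent
    dy = [a + b for a, b in zip(dy_from_x, dy_from_w)]
    return y, dy


def central_difference(x, dx, w, dw, eps=1e-6):
    """Numerical check of the directional derivative along (dx, dw)."""
    xp = [a + eps * b for a, b in zip(x, dx)]
    xm = [a - eps * b for a, b in zip(x, dx)]
    wp = [a + eps * b for a, b in zip(w, dw)]
    wm = [a - eps * b for a, b in zip(w, dw)]
    fp, fm = xcorr1d(xp, wp), xcorr1d(xm, wm)
    return [(p - m) / (2 * eps) for p, m in zip(fp, fm)]


x, w = [1.0, 2.0, 3.0, 4.0], [1.0, -1.0]
dx, dw = [1.0, 0.0, 0.0, 0.0], [0.0, 1.0]
y, dy = xcorr1d_jvp(x, dx, w, dw)
fd = central_difference(x, dx, w, dw)
```

The same pattern would apply to the other linear or bilinear kernels: the rule only calls the primal routine on tangent arrays, so whatever backend serves the primal call also serves the JVP.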
List of operations we should be supporting at the bare minimum:
- Convolutions
  - Regular
  - Depthwise
  - Grouped Convolution
  - Cross-Correlation
- Pooling
  - Max Pooling
  - Mean Pooling
- Conv Transpose
- Rules for ∇conv_data and ∇conv_filter
- Normalization
  - GroupNorm
  - BatchNorm
- Softmax
- LogSoftmax
Another benefit of writing forward-mode AD rules is batched Jacobian construction. Since ForwardDiff chunks the Jacobian computation, we can construct multiple columns of the Jacobian in one go.
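The chunked construction can be sketched as follows, again in plain Python with hypothetical names (`softmax_jvp_batched`, `jacobian_chunked`) that are not part of ForwardDiff or NNlib. The assumption is that a rule receives a whole chunk of tangent vectors at once, evaluates the primal a single time, and returns one Jacobian column per tangent. Softmax is used because its JVP has a simple closed form, `dy = y * (dx - sum(y * dx))`.

```python
import math


def softmax(x):
    """Numerically stable softmax of a 1-D list."""
    m = max(x)
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]


def softmax_jvp_batched(x, tangents):
    """Hypothetical batched forward-mode rule.

    The primal is evaluated once and shared by every tangent in the chunk.
    """
    y = softmax(x)
    dys = []
    for dx in tangents:
        inner = sum(yi * di for yi, di in zip(y, dx))
        dys.append([yi * (di - inner) for yi, di in zip(y, dx)])
    return y, dys


def jacobian_chunked(x, chunk_size):
    """Build the full Jacobian, chunk_size columns per rule call."""
    n = len(x)
    columns, calls = [], 0
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        # One-hot seed vectors select Jacobian columns start..stop-1.
        seeds = [[1.0 if i == j else 0.0 for i in range(n)]
                 for j in range(start, stop)]
        _, dys = softmax_jvp_batched(x, seeds)
        columns.extend(dys)
        calls += 1
    # columns[j][i] holds d y_i / d x_j; transpose into row-major form.
    jac = [[columns[j][i] for j in range(n)] for i in range(n)]
    return jac, calls


x = [0.0, 1.0, 2.0]
jac, calls = jacobian_chunked(x, chunk_size=2)
```

With three inputs and a chunk size of 2, the rule is called twice instead of three times. In a real kernel the per-tangent loop would presumably become an extra batch dimension, which is where the savings would come from.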