Feature description

Add a per-parameter learning rate implementation of gradient descent.

Motivation

The current gradient descent implementation applies a single step size, the minimum `sigma0`, to all parameters. This is problematic when gradient magnitudes differ widely across parameters, and it makes learning-rate calibration for fine-tuning convergence difficult.
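As a rough sketch of what per-parameter learning rates could look like (illustrative only — the function and parameter names below are hypothetical and not from the existing implementation), a poorly conditioned quadratic shows why a single rate pinned to the stiffest parameter slows the others down:

```python
import numpy as np

def gradient_descent_per_param(grad_fn, x0, lrs, steps=200):
    """Gradient descent with one learning rate per parameter.

    grad_fn: callable returning the gradient at x.
    x0:      initial parameter vector.
    lrs:     per-parameter learning rates, same shape as x0.

    Names here are illustrative, not part of the library under discussion.
    """
    x = np.asarray(x0, dtype=float).copy()
    lrs = np.asarray(lrs, dtype=float)
    for _ in range(steps):
        # Element-wise scaling replaces a single scalar rate.
        x -= lrs * grad_fn(x)
    return x

# Quadratic with very different curvatures per coordinate:
# f(x) = 0.5 * (1000 * x[0]**2 + x[1]**2), so grad(x) = [1000*x[0], x[1]].
grad = lambda x: np.array([1000.0, 1.0]) * x

# A stable single rate would be capped near 1e-3 by the stiff coordinate,
# leaving the flat coordinate to crawl; per-parameter rates free both.
x = gradient_descent_per_param(grad, x0=[1.0, 1.0], lrs=[1e-4, 0.5])
```

With a single learning rate the flat coordinate would need roughly a thousand times more steps; here each coordinate converges at its own pace.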
Possible implementation

No response

Additional context

No response