Adding Adadelta optimization operator #4576
Conversation
paddle/operators/adadelta_op.h (Outdated)
framework::EigenVector<T>::Flatten(*avg_squared_update_out);
auto place = ctx.GetEigenDevice<Place>();
g_acc_out.device(place) = rho * g_acc + (1 - rho) * g.square();
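For context, the line under review is Adadelta's accumulated-squared-gradient update, E[g^2]_t = rho * E[g^2]_{t-1} + (1 - rho) * g^2. A minimal standalone sketch of the full Adadelta step is below; it uses plain C++ over flat arrays rather than the Paddle Eigen kernel, and the rho/epsilon defaults are assumptions, not values taken from this PR.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Standalone sketch of one Adadelta step over plain arrays (not the Paddle
// kernel). Names mirror the operator's tensors; the rho/epsilon defaults are
// assumptions, not values from this PR.
void AdadeltaStep(std::vector<float>& param,
                  const std::vector<float>& grad,
                  std::vector<float>& avg_squared_grad,    // E[g^2]
                  std::vector<float>& avg_squared_update,  // E[dx^2]
                  float rho = 0.95f, float epsilon = 1e-6f) {
  for (std::size_t i = 0; i < param.size(); ++i) {
    const float g = grad[i];
    // E[g^2]_t = rho * E[g^2]_{t-1} + (1 - rho) * g^2  (the line under review)
    avg_squared_grad[i] = rho * avg_squared_grad[i] + (1 - rho) * g * g;
    // dx = -sqrt((E[dx^2]_{t-1} + eps) / (E[g^2]_t + eps)) * g
    const float dx = -std::sqrt((avg_squared_update[i] + epsilon) /
                                (avg_squared_grad[i] + epsilon)) * g;
    // E[dx^2]_t = rho * E[dx^2]_{t-1} + (1 - rho) * dx^2
    avg_squared_update[i] = rho * avg_squared_update[i] + (1 - rho) * dx * dx;
    param[i] += dx;
  }
}
```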
Maybe we can name g_acc_out as avg_squared_grad_eigen, to be consistent with the formula written in the doc; that would make it easier to read and understand.
Sure, I will make the change.
Since avg_squared_grad_eigen is a little too long, we can use avg_squared_grad here and rename the value fetched from the tensor from avg_squared_grad to avg_squared_grad_t, or something like that.
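Putting the suggestion together, the renamed lines from the diff above might read roughly as follows; this is a sketch only, with avg_squared_grad_t and avg_squared_grad_out assumed from context rather than taken from the merged code.

```cpp
// Hypothetical result of the rename, mirroring the diff lines above;
// avg_squared_grad_t (raw tensor) and avg_squared_grad_out are assumed names.
auto avg_squared_grad =
    framework::EigenVector<T>::Flatten(*avg_squared_grad_t);
auto place = ctx.GetEigenDevice<Place>();
avg_squared_grad_out.device(place) =
    rho * avg_squared_grad + (1 - rho) * g.square();
```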
Great job, LGTM!