We should implement some of the popular stochastic gradient optimisation techniques, such as SGD, SGD+momentum, Adagrad, Adadelta and Adam. These methods find a local optimum (the global optimum for convex problems) of a differentiable objective; see the nice surveys in this arXiv preprint and this blog post. A minimal sketch of the basic update is given below.
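For illustration, here is a minimal sketch of SGD with momentum, assuming the objective exposes a (possibly stochastic) gradient callable and that iterates are plain NumPy arrays; the function name and parameters are hypothetical, not an existing API in this repository. The other methods (Adagrad, Adadelta, Adam) follow the same loop structure but additionally keep running estimates of gradient moments to rescale the step.

```python
import numpy as np

def sgd_momentum(grad, x0, lr=0.01, momentum=0.9, n_iter=100):
    """Minimal SGD + momentum loop (hypothetical sketch).

    grad : callable returning a (possibly stochastic) gradient at x
    x0   : initial point as a numpy array
    """
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)           # velocity (exponentially weighted gradient sum)
    for _ in range(n_iter):
        g = grad(x)                # stochastic gradient estimate
        v = momentum * v - lr * g  # accumulate momentum
        x = x + v                  # take the step
    return x

# Example: minimise f(x) = ||x||^2 / 2, whose gradient is x.
x_opt = sgd_momentum(lambda x: x, x0=np.array([5.0, -3.0]))
```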
Furthermore, this arXiv preprint suggests a gradient descent variant in which the classical squared two-norm metric in the descent step is replaced by a generalised Bregman distance induced by a more general proper, convex and lower semi-continuous functional; a small illustrative instance follows.
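To make the idea concrete, below is a sketch of one standard instance of replacing the Euclidean proximity term with a Bregman distance: mirror descent on the probability simplex with the entropy functional (exponentiated gradient). This is not the preprint's exact algorithm, only an assumed illustrative example; all names here are hypothetical.

```python
import numpy as np

def exponentiated_gradient(grad, x0, step=0.1, n_iter=100):
    """Mirror descent on the probability simplex using the entropy Bregman
    distance instead of the squared two-norm (illustrative sketch only)."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iter):
        g = grad(x)
        x = x * np.exp(-step * g)  # gradient step taken in the mirror (dual) space
        x = x / x.sum()            # Bregman projection back onto the simplex
    return x

# Example: minimise the linear cost <c, x> over the simplex.
c = np.array([0.3, 0.1, 0.5])
x_opt = exponentiated_gradient(lambda x: c, x0=np.ones(3) / 3)
```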