
Introduce optimizer_base_type in support of different optimizers #116

Merged

Conversation

milancurcic
Member

This is the first step toward decoupling the optimizer logic from the concrete layers.

This PR only introduces the abstract optimizer_base_type and a concrete sgd type (see the sketch below).

The weight update is still hard-coded in network % train and in the concrete layer implementations; decoupling it remains a TODO.
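
For concreteness, here is a minimal sketch of what such an abstract type in nf_optimizers.f90 might look like; the interface name update_interface, the learning_rate component, and the exact argument list of update are assumptions for illustration, not the code added in this PR:

```fortran
module nf_optimizers

  ! Sketch only: an abstract optimizer type with a deferred update procedure
  ! that concrete optimizers (sgd, adam, ...) would implement.
  implicit none

  private
  public :: optimizer_base_type

  type, abstract :: optimizer_base_type
    real :: learning_rate = 0.01
  contains
    procedure(update_interface), deferred :: update
  end type optimizer_base_type

  abstract interface
    subroutine update_interface(self, weights, biases, dw, db)
      ! Applies one optimization step given the gradients dw and db.
      import :: optimizer_base_type
      class(optimizer_base_type), intent(inout) :: self
      real, intent(inout) :: weights(:,:), biases(:)
      real, intent(in) :: dw(:,:), db(:)
    end subroutine update_interface
  end interface

end module nf_optimizers
```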

In a nutshell, the idea is to:

  • Define concrete optimizer types such as sgd, adam, etc. in nf_optimizers.f90 (and eventually its submodules);
  • Have the type constructors take the optimizer hyperparameters from the user (e.g. adam(learning_rate, beta1, beta2, epsilon, ...));
  • Have each concrete type define an update subroutine that takes the needed gradients (dw, db) as input, along with the weights and biases arrays to update in place (see the sketch after this list).
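
A hedged sketch of a concrete sgd type following this plan, building on the abstract type sketched above; the separate nf_optimizer_sgd module, the constructor name, and the argument intents are illustrative assumptions rather than the final API:

```fortran
module nf_optimizer_sgd

  use nf_optimizers, only: optimizer_base_type
  implicit none

  private
  public :: sgd

  ! Concrete optimizer extending the abstract base type.
  type, extends(optimizer_base_type) :: sgd
  contains
    procedure :: update => sgd_update
  end type sgd

  ! Custom constructor with the same name as the type.
  interface sgd
    module procedure sgd_constructor
  end interface sgd

contains

  function sgd_constructor(learning_rate) result(res)
    ! The constructor collects the optimizer hyperparameters from the user.
    real, intent(in) :: learning_rate
    type(sgd) :: res
    res % learning_rate = learning_rate
  end function sgd_constructor

  subroutine sgd_update(self, weights, biases, dw, db)
    ! Plain stochastic gradient descent: params := params - learning_rate * gradients
    class(sgd), intent(inout) :: self
    real, intent(inout) :: weights(:,:), biases(:)
    real, intent(in) :: dw(:,:), db(:)
    weights = weights - self % learning_rate * dw
    biases = biases - self % learning_rate * db
  end subroutine sgd_update

end module nf_optimizer_sgd
```

A training loop could then construct an optimizer as `sgd(learning_rate=0.01)` and call `optimizer % update(weights, biases, dw, db)` for each layer, without knowing which concrete optimizer it holds.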

@rweed let me know if this approach seems reasonable to you.

@milancurcic milancurcic merged commit edd3f70 into modern-fortran:main Jan 19, 2023
@milancurcic milancurcic deleted the refactor-optimizer-stub branch January 19, 2023 15:30
wilsonify pushed a commit to wilsonify/modern-fortran that referenced this pull request Jan 27, 2023