SOAP

This is the official (preliminary) implementation of the SOAP optimizer from SOAP: Improving and Stabilizing Shampoo using Adam. To use it, copy the soap.py file into your codebase and use the SOAP optimizer in the following fashion:

from soap import SOAP

# `model` is your torch.nn.Module; like other PyTorch optimizers, SOAP takes the parameters as its first argument.
optim = SOAP(model.parameters(), lr=3e-3, betas=(.95, .95), weight_decay=.01, precondition_frequency=10)
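The `precondition_frequency` argument amortizes the expensive part of the update: the preconditioner's eigenbasis is refreshed only every `precondition_frequency` steps, while the cheap Adam-style update runs every step. A toy sketch of that scheduling pattern (plain Python for illustration, not the repository's code):

```python
def run_steps(num_steps, precondition_frequency):
    """Count how often the expensive preconditioner refresh fires when it
    is performed only every `precondition_frequency` optimizer steps."""
    refreshes = 0
    for step in range(num_steps):
        if step % precondition_frequency == 0:
            refreshes += 1  # expensive eigendecomposition would go here
        # the cheap Adam-style parameter update happens every step
    return refreshes

# With 100 steps and precondition_frequency=10, the refresh fires 10 times.
print(run_steps(100, 10))
```

Larger values of `precondition_frequency` reduce per-step overhead at the cost of a staler eigenbasis.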

We recommend trying it with as large a batch size as possible; as expected of second-order optimizers, the benefits are larger at larger batch sizes.

While the experiments in the paper are restricted to Transformers, which have only 2D layers, the code also supports nD layers. If you are using the optimizer for nD layers with n > 2, please see the additional hyperparameters in soap.py.
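For intuition on the nD case: a common way to apply 2D (Shampoo-style) preconditioners to an nD parameter such as a convolution kernel is to view it as a matrix by grouping dimensions; the exact handling and its hyperparameters live in soap.py. A hypothetical illustration of such a reshape (the helper below is not part of the repository):

```python
from math import prod

def fold_to_2d(shape, split=1):
    """Fold an nD tensor shape into a 2D matrix shape by grouping the
    first `split` dimensions into rows and the rest into columns.
    Hypothetical helper for illustration only."""
    rows = prod(shape[:split])
    cols = prod(shape[split:])
    return (rows, cols)

# A conv kernel of shape (64, 3, 3, 3) viewed as a 64 x 27 matrix,
# so two preconditioners of sizes 64x64 and 27x27 would suffice:
print(fold_to_2d((64, 3, 3, 3)))
```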

We will release an improved version of the optimizer with support for lower precision and distributed training.

Haydn Jones has implemented a JAX version at https://github.com/haydn-jones/SOAP_JAX, though we have not yet verified that implementation.
