Implementation of the gMLP model introduced in "Pay Attention to MLPs".
The authors propose a simple, attention-free architecture, gMLP, built solely from MLPs with multiplicative gating, and show that it can match Transformers in key language and vision applications.
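As a rough illustration of the gating idea, the sketch below implements one gMLP block with its Spatial Gating Unit in plain NumPy. All names, shapes, and the GELU approximation here are my own assumptions for illustration, not this repo's API: the input is a single sequence of shape `(n_tokens, d_model)`, the block splits the expanded channels in half, and gates one half with a learned token-mixing (spatial) projection of the other.

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU (assumed activation, as in the paper)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def spatial_gating_unit(z, w_spatial, b_spatial):
    # Split channels in half; gate one half elementwise with a spatial
    # (token-mixing) linear projection of the other half.
    z1, z2 = np.split(z, 2, axis=-1)        # each (n_tokens, d_ffn // 2)
    z2 = w_spatial @ z2 + b_spatial         # mixes information across tokens
    return z1 * z2

def gmlp_block(x, w1, w2, w_spatial, b_spatial):
    # x: (n_tokens, d_model); w1: (d_model, d_ffn); w2: (d_ffn // 2, d_model)
    z = gelu(x @ w1)                        # channel expansion
    z = spatial_gating_unit(z, w_spatial, b_spatial)
    return x + z @ w2                       # channel projection back + residual
```

The paper initializes the spatial weights near zero and the bias at one, so each block starts out close to a plain channel MLP and learns token mixing gradually; layer normalization is omitted here for brevity.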