MoE-LLM

Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
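
As a rough illustration of the routing idea in the paper (not this repository's code), a Mixture-of-Depths block scores every token with a router, sends only the top-scoring tokens through the block, and lets the rest skip it along the residual path. The sketch below uses plain Python lists so it runs without dependencies; `mod_route`, `router_weights`, `block`, and `capacity` are hypothetical names chosen for this example, and a real implementation would use learned weights and tensor operations.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def mod_route(
    tokens: Sequence[Vector],
    router_weights: Vector,
    block: Callable[[Vector], Vector],
    capacity: int,
) -> List[Vector]:
    """Apply `block` to the `capacity` highest-scoring tokens; pass the rest through."""
    # One scalar router score per token: dot product with a weight vector
    # (assumed fixed here; it would be learned in a real model).
    scores = [sum(w * x for w, x in zip(router_weights, tok)) for tok in tokens]

    # Indices of the top-`capacity` tokens by router score.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    selected = set(ranked[:capacity])

    out: List[Vector] = []
    for i, tok in enumerate(tokens):
        if i in selected:
            # Routed token: residual update, scaled by the router score.
            update = block(tok)
            out.append([x + scores[i] * u for x, u in zip(tok, update)])
        else:
            # Skipped token: identity (residual connection only).
            out.append(list(tok))
    return out


if __name__ == "__main__":
    toy_tokens = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
    toy_block = lambda v: [1.0 for _ in v]  # stand-in for attention + MLP
    print(mod_route(toy_tokens, [1.0, 0.0], toy_block, capacity=2))
```

With `capacity=2`, only the two tokens with the highest router scores are transformed; the remaining token is returned unchanged, which is where the compute saving comes from.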