Distributed Training
Sherlock edited this page Mar 12, 2021 · 1 revision
- Good read: https://mpitutorial.com/tutorials/
- Understand NCCLAllReduce
- Get familiar with DDP usage/setup
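To make the AllReduce item above concrete, here is a minimal pure-Python sketch of what the collective computes. It does not call NCCL, MPI, or ORT: ranks are modeled as entries of a list, and `all_reduce_sum` is a hypothetical helper written only for this illustration. The final averaging step mirrors what data-parallel training typically does with gradients.

```python
from typing import List


def all_reduce_sum(per_rank: List[List[float]]) -> List[List[float]]:
    """Simulate AllReduce(sum): every rank ends up with the element-wise sum."""
    total = [sum(values) for values in zip(*per_rank)]
    # Each rank receives its own copy of the full reduced buffer.
    return [list(total) for _ in per_rank]


# Hypothetical gradients held by 3 ranks (one inner list per rank).
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

reduced = all_reduce_sum(grads)

# Data-parallel training usually averages the summed gradients
# so that every replica applies the same update.
world_size = len(grads)
averaged = [[g / world_size for g in rank_buf] for rank_buf in reduced]

print(reduced[0])   # summed gradient, identical on every rank
print(averaged[0])  # averaged gradient, identical on every rank
```

A real setup would run one process per GPU and let the communication library perform the reduction; the simulation only shows the input/output contract of the collective.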
Zero-1
- Understand ReduceScatter/AllGather
- Understand how optimizer state is partitioned
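The two items above can be sketched together in plain Python. This is a simulation under stated assumptions, not ORT's implementation: ranks are list entries, `reduce_scatter_sum` and `all_gather` are hypothetical helpers, and the optimizer is SGD with momentum chosen only because it has a single state buffer. The idea illustrated is that each rank keeps optimizer state only for the parameter shard it owns, receives the reduced gradients for that shard via ReduceScatter, updates the shard, and then AllGather restores the full parameters on every rank.

```python
from typing import List


def reduce_scatter_sum(per_rank: List[List[float]]) -> List[List[float]]:
    """Simulate ReduceScatter(sum): rank r receives only shard r of the sum."""
    world = len(per_rank)
    n = len(per_rank[0])
    assert n % world == 0, "buffer length must divide evenly across ranks"
    shard = n // world
    total = [sum(values) for values in zip(*per_rank)]
    return [total[r * shard:(r + 1) * shard] for r in range(world)]


def all_gather(shards: List[List[float]]) -> List[List[float]]:
    """Simulate AllGather: every rank receives the concatenation of all shards."""
    full = [x for shard in shards for x in shard]
    return [list(full) for _ in shards]


world = 2
params = [1.0, 2.0, 3.0, 4.0]            # replicated on every rank
grads = [[0.5, 0.5, 0.5, 0.5],           # local gradients on rank 0
         [1.5, 1.5, 1.5, 1.5]]           # local gradients on rank 1
lr, beta = 0.1, 0.9                      # illustrative hyperparameters
shard = len(params) // world

# Partitioned optimizer state: each rank stores momentum for its shard only.
momentum = [[0.0] * shard for _ in range(world)]

grad_shards = reduce_scatter_sum(grads)  # rank r gets summed grads for shard r

updated_shards = []
for r in range(world):
    owned = params[r * shard:(r + 1) * shard]
    for i in range(shard):
        g = grad_shards[r][i] / world            # average over ranks
        momentum[r][i] = beta * momentum[r][i] + g
        owned[i] -= lr * momentum[r][i]
    updated_shards.append(owned)

# AllGather rebuilds the full, updated parameters on every rank.
params_per_rank = all_gather(updated_shards)
print(params_per_rank[0])
```

Note that ReduceScatter followed by AllGather produces the same result as a single AllReduce; the split is what lets the optimizer step run on one shard per rank in between.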
Zero-2
Zero-3
- Understand All2All
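As a rough illustration of the All2All collective named above, the following pure-Python sketch models ranks as list entries. `all_to_all` is a hypothetical helper, not an ORT or NCCL API: each rank prepares one chunk per destination rank, and after the exchange each rank holds the chunks that every other rank addressed to it. In effect the collective transposes the (source, destination) layout.

```python
from typing import List, TypeVar

T = TypeVar("T")


def all_to_all(send: List[List[T]]) -> List[List[T]]:
    """Simulate All2All: send[src][dst] is delivered to recv[dst][src]."""
    world = len(send)
    assert all(len(row) == world for row in send), "one chunk per destination rank"
    return [[send[src][dst] for src in range(world)] for dst in range(world)]


# Hypothetical chunks: rank 0 sends a0, a1, a2 to ranks 0, 1, 2, and so on.
send = [
    ["a0", "a1", "a2"],  # rank 0's outgoing chunks
    ["b0", "b1", "b2"],  # rank 1's outgoing chunks
    ["c0", "c1", "c2"],  # rank 2's outgoing chunks
]

recv = all_to_all(send)
for rank, chunks in enumerate(recv):
    print(rank, chunks)
```

Applying the exchange twice returns the original layout, which is a quick sanity check when reasoning about where each chunk ends up.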
Please use the learning roadmap on the home wiki page to build a general understanding of ORT.