Add OptimType.NONE in SplitTBE (defuse bwd and optim) (#1819)
Summary:
Pull Request resolved: #1819

This diff is the **backend** part.

This diff introduces `OptimType.NONE`. Unlike other `OptimType`s, `OptimType.NONE` does not perform the optimizer step during SplitTBE's backward pass. With `OptimType.NONE`, SplitTBE deduplicates output gradients in the backward pass and generates a sparse gradient tensor (PyTorch's `sparse_coo_tensor`) for the device's weight (FQN: `weights_dev`).

Currently, `OptimType.NONE` only supports the case where the embedding dimensions of all embedding tables are identical.

Differential Revision: D44392172

fbshipit-source-id: b1264e5a5032ebad051d5c5b739dd9ffec1d8a92
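A minimal usage sketch of what this looks like from the Python side, assuming the `fbgemm_gpu` module path, class, and enum names below match the version this diff targets; how the sparse gradient is surfaced (here via `weights_dev.grad`) is an assumption based on the FQN quoted in the summary, not a confirmed API.

```python
import torch

# Assumed import path; newer fbgemm_gpu releases may expose these under a
# different module (e.g. a *_training variant).
from fbgemm_gpu.split_table_batched_embeddings_ops import (
    ComputeDevice,
    EmbeddingLocation,
    OptimType,
    SplitTableBatchedEmbeddingBagsCodegen,
)

# OptimType.NONE currently requires all tables to share the same embedding dim.
E, D = 1000, 128
emb_op = SplitTableBatchedEmbeddingBagsCodegen(
    embedding_specs=[
        (E, D, EmbeddingLocation.DEVICE, ComputeDevice.CUDA),
        (E, D, EmbeddingLocation.DEVICE, ComputeDevice.CUDA),
    ],
    optimizer=OptimType.NONE,  # skip the fused optimizer step in backward
)

# Two tables, one bag each: indices are the flattened lookups, offsets mark
# bag boundaries per (table, sample).
indices = torch.randint(0, E, (8,), dtype=torch.int64, device="cuda")
offsets = torch.tensor([0, 4, 8], dtype=torch.int64, device="cuda")

out = emb_op(indices=indices, offsets=offsets)
out.sum().backward()

# With OptimType.NONE, the backward pass deduplicates output gradients and
# produces a sparse gradient (sparse_coo_tensor) for the device weights
# instead of applying an in-place optimizer update, so an external optimizer
# can consume it. The attribute access below is hypothetical.
sparse_grad = emb_op.weights_dev.grad
print(sparse_grad.is_sparse)
```

The design intent, per the summary, is to "defuse" the backward and optimizer stages: the TBE backward only materializes a deduplicated sparse gradient for `weights_dev`, and the weight update becomes the caller's responsibility.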