Commit
Add OptimType.NONE in SplitTBE (defuse bwd and optim)
Summary:
This diff is the **backend** part.

This diff introduces `OptimType.NONE`. Unlike other `OptimType`s, `OptimType.NONE` does not perform the optimizer step during SplitTBE's backward pass. With `OptimType.NONE`, SplitTBE deduplicates output gradients in the backward pass and generates a sparse gradient tensor (PyTorch's `sparse_coo_tensor`) for the device's weight (FQN: `weights_dev`).

Currently, `OptimType.NONE` only supports the case where the embedding dimensions of all embedding tables are identical.

Differential Revision: D44392172

fbshipit-source-id: 7e6df6857bc9d4dc1666aef855f21572c4fd35fc
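To illustrate what "deduplicates output gradients" could mean here, the following is a minimal pure-Python sketch, not the SplitTBE kernel itself. It assumes each looked-up row index arrives with one gradient row, and that gradients targeting the same embedding row are summed. The function name `dedup_row_gradients` and the list-based layout are hypothetical and chosen only for illustration.

```python
def dedup_row_gradients(row_indices, output_grads):
    """Sum gradient rows that target the same embedding row.

    row_indices: list of int, one entry per lookup (may contain repeats).
    output_grads: list of equal-length lists of float, one per lookup.
    Returns (unique_rows, values) with unique_rows sorted ascending.
    """
    if len(row_indices) != len(output_grads):
        raise ValueError("row_indices and output_grads must have equal length")

    accumulated = {}
    for row, grad in zip(row_indices, output_grads):
        if row in accumulated:
            # Repeated row: accumulate element-wise into the existing entry.
            accumulated[row] = [a + g for a, g in zip(accumulated[row], grad)]
        else:
            # First occurrence: copy so the caller's list is not mutated.
            accumulated[row] = list(grad)

    unique_rows = sorted(accumulated)
    values = [accumulated[row] for row in unique_rows]
    return unique_rows, values


# Rows 2 and 2 collide, so their gradients are summed into a single entry.
rows, vals = dedup_row_gradients(
    [2, 0, 2],
    [[1.0, 2.0], [0.5, 0.5], [3.0, 4.0]],
)
```

Under these assumptions, the `(unique_rows, values)` pair has the same shape as the index/value inputs that `torch.sparse_coo_tensor(indices, values, size)` accepts, with `indices` holding the unique row ids and `values` holding one summed gradient row per id. The requirement that all tables share one embedding dimension fits this layout, since every value row must have the same length.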