Commit
Allocate a big output tensor and split in group_index_select_dim0_backward

Summary: Before this diff, the `group_index_select_dim0` backward pass called `at::zeros` `group_size` times, launching `group_size` elementwise kernels. Since `group_size` can be large (up to 55), this can be costly. This diff fixes the problem by allocating one big tensor and splitting it into smaller tensors, so only one elementwise kernel is launched per group. However, this can cause higher overhead on the host side.

Differential Revision: D45823864

fbshipit-source-id: f127b82bea6e49d4373bedf6c7307635161db87a
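The idea in the summary can be illustrated with a minimal sketch. This is not the code from the diff: it uses plain standard-library C++ instead of ATen, and the names `View`, `GroupedZeros`, and `allocate_and_split` are hypothetical. It shows one zero-filled allocation being carved into per-group views, as opposed to one zero-filled allocation per group member.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical non-owning view into a shared buffer. It stands in for one of
// the smaller tensors obtained by splitting the big output tensor.
struct View {
  float* data;
  std::size_t size;
};

// One zero-filled allocation plus the views carved out of it.
// The views point into `storage`, so they are valid only while this object
// is alive and has not been copied (moving it keeps the buffer in place).
struct GroupedZeros {
  std::vector<float> storage;
  std::vector<View> views;
};

// Allocate a single zeroed buffer covering all requested sizes, then split it
// by running offset. The zero fill happens once, not once per group member.
GroupedZeros allocate_and_split(const std::vector<std::size_t>& sizes) {
  GroupedZeros out;
  const std::size_t total =
      std::accumulate(sizes.begin(), sizes.end(), std::size_t{0});
  out.storage.assign(total, 0.0f);  // the only fill operation
  out.views.reserve(sizes.size());

  std::size_t offset = 0;
  for (std::size_t n : sizes) {
    out.views.push_back(View{out.storage.data() + offset, n});
    offset += n;
  }
  assert(offset == total);  // every element belongs to exactly one view
  return out;
}
```

Calling `allocate_and_split({2, 3, 1})` yields three views over a single six-element buffer. The loop that builds the views runs on the host, which mirrors the trade-off the summary mentions: fewer fill operations in exchange for extra host-side bookkeeping.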
1 parent 36b0d18, commit e477fe9
Showing 1 changed file with 33 additions and 11 deletions.