Use int32_t for kThreadGroupSize in warp_per_row (#1332)
Summary:
Pull Request resolved: #1332

Use int32_t instead of size_t for the kThreadGroupSize template
parameter. Using size_t causes a slowdown in some cases.
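
The commit message does not say why size_t is slower. One plausible explanation, offered here only as an assumption, is that arithmetic mixing a size_t template constant with 32-bit indices is promoted to the wider unsigned type, which can cost more on GPUs than 32-bit integer math. The host-side sketch below only illustrates that type promotion; the helpers `lane_size_t` and `lane_int32` are hypothetical and are not part of FBGEMM.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Assumption for illustration: a 32-lane warp, mirroring the kWarpSize default in the diff.
constexpr int32_t kWarpSize = 32;

// Hypothetical helper: with a size_t parameter, the usual arithmetic
// conversions turn the int32_t operand into size_t before the modulo.
template <size_t kThreadGroupSize = kWarpSize>
constexpr auto lane_size_t(int32_t thread_idx) {
  return thread_idx % kThreadGroupSize;
}

// Hypothetical helper: with an int32_t parameter, the modulo stays in int32_t.
template <int32_t kThreadGroupSize = kWarpSize>
constexpr auto lane_int32(int32_t thread_idx) {
  return thread_idx % kThreadGroupSize;
}

// The result types differ even though the function bodies are identical.
static_assert(std::is_same<decltype(lane_size_t<>(0)), size_t>::value,
              "size_t parameter widens the expression to size_t");
static_assert(std::is_same<decltype(lane_int32<>(0)), int32_t>::value,
              "int32_t parameter keeps the expression in int32_t");
```

Whether this promotion is what caused the slowdown in `warp_per_row` is not confirmed by the source; profiling the generated kernel would be needed to tell.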

Reviewed By: jianyuh

Differential Revision: D39568569

fbshipit-source-id: 7a1e56f96a0e9929fb9600156f22fe3c58a3c59e
sryap committed Sep 16, 2022
1 parent 54594d0 commit bb74666
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion fbgemm_gpu/codegen/embedding_backward_split_template.cu
@@ -518,7 +518,7 @@ template <
 typename grad_t,
 typename cache_t,
 size_t kMaxVecsPerThread,
-size_t kThreadGroupSize = kWarpSize>
+int32_t kThreadGroupSize = kWarpSize>
 __global__
 __launch_bounds__(kBackwardMaxThreads)
 void
