Hello,
I'm trying to use the grouped GEMM code in my project, and I've noticed that the workspace is allocated again on every iteration via `torch::Tensor workspace = torch::empty(workspace_size, options);`. That seems unnecessary, because a CUTLASS workspace is reusable. The repeated allocation also seems to hurt performance when the kernel is called frequently, such as across many MoE layers, or when M x N x K is large. Has anyone measured the effect of this?
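To make the idea concrete, here is a minimal sketch of the kind of reuse I have in mind. It is not the project's code: `WorkspaceCache` and its methods are hypothetical names, and a plain `std::vector` stands in for the GPU tensor so the example compiles without libtorch. In the real code the cached object would presumably be a `torch::Tensor` kept alive between calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-call workspace tensor: a grow-only
// byte buffer that is allocated once and then reused across iterations.
class WorkspaceCache {
public:
    // Returns a pointer to at least `bytes` bytes of scratch space.
    // A new allocation happens only when the request exceeds the
    // current capacity; smaller or equal requests reuse the buffer.
    unsigned char* get(std::size_t bytes) {
        if (bytes > buffer_.size()) {
            buffer_.resize(bytes);
            ++allocations_;
        }
        assert(buffer_.size() >= bytes);
        return buffer_.data();
    }

    // Current size of the cached buffer in bytes.
    std::size_t capacity() const { return buffer_.size(); }

    // Number of times the buffer had to grow (for illustration only).
    std::size_t allocations() const { return allocations_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t allocations_ = 0;
};
```

With something like this, a call site would ask the cache for `workspace_size` bytes instead of creating a fresh tensor each time, so repeated calls with the same or a smaller size trigger no new allocation. Whether this helps in practice would still need measuring, since the cost of the existing `torch::empty` call depends on how the allocator behind it behaves.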