[FEA] Support SM90 Grouped GEMM #1280

Closed
imoneoi opened this issue Dec 26, 2023 · 2 comments
Labels
feature request New feature or request

imoneoi commented Dec 26, 2023

Is your feature request related to a problem? Please describe.
Grouped GEMM using CUTLASS is ~30% slower than a for-loop over cuBLAS GEMM calls on SM90 (H100). Implementations of grouped GEMM using both CUTLASS and cuBLAS can be found here: https://github.com/tgale96/grouped_gemm/blob/main/csrc/grouped_gemm.cu

Describe the solution you'd like
Consider adding SM90 support to the Grouped GEMM kernel in CUTLASS, which currently targets SM80. Grouped GEMM is important for training MoE models.
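
For context, a grouped GEMM computes several independent matrix products, each with its own shape, as one batch of work; the for-loop baseline mentioned above performs the same computation one problem at a time. The following is a minimal CPU-only sketch of that semantics, assuming row-major storage. The names `GemmProblem`, `gemm_reference`, and `grouped_gemm_reference` are hypothetical and are not part of the CUTLASS or cuBLAS APIs; the sketch says nothing about how either library schedules the work on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one problem in the group:
// C (m x n) = A (m x k) * B (k x n), all matrices row-major.
struct GemmProblem {
    std::size_t m, n, k;
    std::vector<float> a;  // m * k elements
    std::vector<float> b;  // k * n elements
};

// Plain triple-loop GEMM for a single problem (CPU reference only).
std::vector<float> gemm_reference(const GemmProblem& p) {
    assert(p.a.size() == p.m * p.k);
    assert(p.b.size() == p.k * p.n);
    std::vector<float> c(p.m * p.n, 0.0f);
    for (std::size_t i = 0; i < p.m; ++i) {
        for (std::size_t j = 0; j < p.n; ++j) {
            float acc = 0.0f;
            for (std::size_t l = 0; l < p.k; ++l) {
                acc += p.a[i * p.k + l] * p.b[l * p.n + j];
            }
            c[i * p.n + j] = acc;
        }
    }
    return c;
}

// Grouped GEMM semantics: every problem may have a different (m, n, k).
// This loop mirrors the "for-loop over GEMM calls" baseline; a fused GPU
// kernel would produce the same outputs from a single launch.
std::vector<std::vector<float>> grouped_gemm_reference(
    const std::vector<GemmProblem>& problems) {
    std::vector<std::vector<float>> outputs;
    outputs.reserve(problems.size());
    for (const GemmProblem& p : problems) {
        outputs.push_back(gemm_reference(p));
    }
    return outputs;
}
```

In an MoE layer, each expert would typically contribute one problem to the group, with `m` equal to the number of tokens routed to that expert, which is why the shapes differ from problem to problem.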

imoneoi added the "? - Needs Triage" and "feature request" labels Dec 26, 2023
@thakkarV
Collaborator

Initial grouped GEMM support for Hopper will be released with CUTLASS 3.4 in the next week or two.

mnicely added this to the CUTLASS 3.4 milestone Jan 2, 2024
@mnicely
Collaborator

mnicely commented Jan 2, 2024

Grouped GEMM for Hopper was added last week. We will be tagging v3.4 soon.
