[DOC] CUTLASS 3.3 changelog change request #1170

Closed
manishucsd opened this issue Nov 2, 2023 · 2 comments
Contributor

manishucsd commented Nov 2, 2023

Suggesting a change to the changelog for CUTLASS 3.3

Current ChangeLog

New [Mixed Precision Hopper GEMMs](https://github.com/NVIDIA/cutlass/blob/main/examples/55_hopper_mixed_dtype_gemm) support covering 16-bit x 8-bit input types with optimal performance.
New [Mixed Precision Ampere GEMMs](https://github.com/NVIDIA/cutlass/commit/7d8317a63e0a978a8dbb3c1fb7af4dbe4f286616) with support for canonical layouts (TN) and {fp16, bf16} x {s8/u8}. They also include fast numeric conversion recipes and warp level shuffles to achieve optimal performance.

Suggested change

New [Mixed-Input Hopper GEMMs](https://github.com/NVIDIA/cutlass/blob/main/examples/55_hopper_mixed_dtype_gemm) support covering 16-bit x 8-bit input types with optimal performance.
New [Mixed-Input Ampere GEMMs](https://github.com/NVIDIA/cutlass/pull/1084) with support for canonical layouts (TN) and {fp16, bf16} x {s8/u8}. They also include fast numeric conversion recipes and warp level shuffles to achieve optimal performance.

Summary and rationale for the suggested changes

  1. Rename Mixed-Precision to Mixed-Input. The term Mixed-Precision is already taken by GEMMs whose inputs share a data type (DataType(operandA) == DataType(operandB)) but accumulate in a different data type (F16*F16+F32 and BF16*BF16+F32). The code uses the cutlass::arch::OpMultiplyAddMixedInputUpcast tag to indicate that the input data types are mixed. It would be good to establish a consistent nomenclature that distinguishes between the Mixed-Precision and Mixed-Input use cases.

  2. Update the hyperlink for Mixed Precision Ampere GEMMs to point to PR #1084, which has a detailed description, steps to compile only the Ampere mixed-input GEMMs and reproduce the performance results, and a performance graph.
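
To make the distinction in point 1 concrete, here is a minimal C++ sketch of the two terms as compile-time predicates. It is only an illustration: `half_t`, `is_mixed_input`, and `is_mixed_precision` are hypothetical stand-ins defined here for the example and are not CUTLASS types or APIs.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a 16-bit floating-point element type
// (assumption for this sketch; real code would use the library's own type).
struct half_t {};

// Mixed-input: the data types of operand A and operand B differ.
template <class ElementA, class ElementB>
constexpr bool is_mixed_input = !std::is_same<ElementA, ElementB>::value;

// Mixed-precision: both operands share a data type,
// but the accumulator uses a different one.
template <class ElementA, class ElementB, class ElementAccumulator>
constexpr bool is_mixed_precision =
    std::is_same<ElementA, ElementB>::value &&
    !std::is_same<ElementA, ElementAccumulator>::value;

// F16 * F16 + F32: mixed-precision, not mixed-input.
static_assert(is_mixed_precision<half_t, half_t, float>, "same inputs, wider accumulator");
static_assert(!is_mixed_input<half_t, half_t>, "inputs share a type");

// F16 * S8 + F32: mixed-input, not mixed-precision under this definition.
static_assert(is_mixed_input<half_t, std::int8_t>, "inputs differ");
static_assert(!is_mixed_precision<half_t, std::int8_t, float>, "inputs differ");
```

Under these assumed definitions the two categories never overlap, which is the property the proposed nomenclature is meant to capture.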


github-actions bot commented Dec 3, 2023

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

Collaborator

mnicely commented Jan 2, 2024

We've moved to Mixed "Input"

@mnicely mnicely closed this as completed Jan 2, 2024