Describe the bug
Results produced by the EVT-based `DeviceGemmStreamK` GEMM in example `47_ampere_gemm_universal_streamk/ampere_gemm_universal_streamk_broadcast.cu` seem to be wrong when `m` and `n` in the problem size differ.
Steps/Code to reproduce bug
Run the example, for instance as follows:
A large relative error will be reported.
To further verify that the calculated results are wrong, I set all inputs to 1 by adding the following block after the inputs initialization:
and then printed the reference results and the results produced by the `DeviceGemmStreamK` GEMM, just before the end of the `main()` function:
All values in the reference results are 67, as expected, while the right half of the result matrix produced by the `DeviceGemmStreamK` GEMM is all zero.
Expected behavior
Zero or near-zero relative error is expected for the given problem size.
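
For anyone who wants to check this expectation outside the example, here is a minimal host-side sketch. It is not the example's own verification code: the helper names `relative_error` and `first_bad_column` are hypothetical, the metric is assumed to be a Frobenius-norm relative error, and both matrices are assumed to be copied to the host as row-major `m x n` buffers.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: ||computed - reference||_2 / ||reference||_2 over all
// elements. Assumes both buffers hold the same number of elements.
double relative_error(const std::vector<float>& computed,
                      const std::vector<float>& reference) {
    double diff_sq = 0.0;
    double ref_sq = 0.0;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        const double r = static_cast<double>(reference[i]);
        const double d = static_cast<double>(computed[i]) - r;
        diff_sq += d * d;
        ref_sq += r * r;
    }
    // Fall back to the absolute error norm if the reference is all zero.
    return ref_sq == 0.0 ? std::sqrt(diff_sq) : std::sqrt(diff_sq / ref_sq);
}

// Hypothetical helper: index of the first column that contains a mismatch,
// or n if every element agrees within tol. Assumes row-major m x n storage.
std::size_t first_bad_column(const std::vector<float>& computed,
                             const std::vector<float>& reference,
                             std::size_t m, std::size_t n,
                             float tol = 1e-3f) {
    for (std::size_t col = 0; col < n; ++col) {
        for (std::size_t row = 0; row < m; ++row) {
            const std::size_t idx = row * n + col;
            if (std::fabs(computed[idx] - reference[idx]) > tol) {
                return col;
            }
        }
    }
    return n;
}
```

With a reference that is 67 everywhere and a result whose right half is zero, `first_bad_column` would return `n / 2`, which makes it easy to see whether the failure boundary follows a tile or problem-size boundary.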
Environment details (please complete the following information):
Tested with latest CUTLASS main, on a machine with A100.
Additional context
Note that the problem most likely lies with StreamK itself, as all the other StreamK GEMMs in the example report the same relative error. I found the issue while trying to mimic this example in order to use EVT in my code, so I focused on `DeviceGemmStreamK` in this example.