[BUG] EVT-based GEMM in example 47 produces wrong results when m!=n in the problem size #1145

Closed
alexsamardzic opened this issue Oct 16, 2023 · 2 comments
Labels
? - Needs Triage bug Something isn't working

Comments

@alexsamardzic
Contributor

Describe the bug

Results produced by the EVT-based DeviceGemmStreamK GEMM in example 47_ampere_gemm_universal_streamk/ampere_gemm_universal_streamk_broadcast.cu seem to be wrong when m and n in the problem size differ.

Steps/Code to reproduce bug

Run the example, for instance as follows:

./47_ampere_gemm_universal_streamk_broadcast --m=128 --n=256 --k=64

A large relative error will be reported.

To further verify that the calculated results are wrong, I set all inputs to 1 by adding the following block after the inputs are initialized:

  cutlass::reference::host::TensorFill(options.tensor_a.host_view(), ElementA(1));
  cutlass::reference::host::TensorFill(options.tensor_b.host_view(), ElementB(1));
  cutlass::reference::host::TensorFill(options.tensor_c1.host_view(), ElementC(1));
  cutlass::reference::host::TensorFill(options.tensor_c2.host_view(), ElementC(1));
  cutlass::reference::host::TensorFill(options.tensor_Vector.host_view(), ElementC(1));

and then printed the reference results and the results produced by the DeviceGemmStreamK GEMM, just before the end of the main() function:

  std::cout << options.tensor_ref_d.host_view() << "\n\n\n";
  std::cout << options.tensor_d.host_view() << "\n\n\n";

All values in the reference results are 67, as expected, while the right half of the result matrix produced by the DeviceGemmStreamK GEMM is all zero.
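The value 67 is consistent with k = 64 products of ones plus one contribution each from c1, c2 and the broadcast vector. Below is a minimal host-side sketch of that arithmetic, together with a scan that reports where a result matrix first departs from the expected value. It assumes the epilogue amounts to D = A*B + C1 + C2 + Vector with unit scaling, which is inferred from the reported value and not taken from the example source. The helper names `expected_all_ones` and `first_bad_column` are hypothetical, and the scan works on a plain row-major buffer, not on a CUTLASS tensor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: expected value of every output element when A, B,
// C1, C2 and the broadcast vector are all filled with 1.
// Assumption: D = A*B + C1 + C2 + Vector with unit scaling.
inline float expected_all_ones(int k) {
  float acc = 0.0f;
  for (int i = 0; i < k; ++i) {
    acc += 1.0f * 1.0f;  // dot product of two all-ones vectors of length k
  }
  return acc + 1.0f + 1.0f + 1.0f;  // + C1 + C2 + broadcast vector
}

// Hypothetical helper: index of the first column of a row-major m x n
// buffer that contains an element different from `expected`, or n if
// every element matches.
inline int first_bad_column(const std::vector<float>& d, int m, int n,
                            float expected) {
  assert(d.size() == static_cast<std::size_t>(m) * static_cast<std::size_t>(n));
  for (int col = 0; col < n; ++col) {
    for (int row = 0; row < m; ++row) {
      if (d[static_cast<std::size_t>(row) * n + col] != expected) {
        return col;
      }
    }
  }
  return n;
}
```

For the problem size above (--m=128 --n=256 --k=64), `expected_all_ones(64)` gives 67, and a buffer whose right half is zero makes `first_bad_column` return 128, that is, the first column of the right half.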

Expected behavior

Zero or near-zero relative error is expected for the given problem size.

Environment details (please complete the following information):

Tested with the latest CUTLASS main, on a machine with an A100.

Additional context

Note that the problem is quite probably in StreamK itself, since all the other StreamK GEMMs in the example report the same relative error. I found the issue while trying to mimic this example in order to use EVT in my own code, so I focused on DeviceGemmStreamK in this example.

@alexsamardzic alexsamardzic added ? - Needs Triage bug Something isn't working labels Oct 16, 2023
@hwu36
Collaborator

hwu36 commented Oct 16, 2023

Have you tried #1120?

@alexsamardzic
Contributor Author

Thanks, that change fixes it.
