
[BUG] Sparse GEMM seems to produce wrong results for F32 datatype and specific inputs #1270

Closed
alexsamardzic opened this issue Dec 13, 2023 · 3 comments
Labels: ? - Needs Triage, bug (Something isn't working)
Comments

@alexsamardzic (Contributor) commented Dec 13, 2023

I tried to change the 15_ampere_sparse_tensorop_gemm example in order to test sparse GEMM with F32 inputs and specific m, n, k values. Here is the diff of my changes; it is very simple and everything works fine: throughout several runs, the example code reports that the results of the sparse GEMM and the reference dense GEMM match.

However, when tensor_a, tensor_b and tensor_e are filled with particular values instead of the ones generated by the example, the comparison with the reference result fails. Here is the full changed example source file (please rename it to .cu). The file is rather big because the specific values for the mentioned tensors are placed inline, but besides that the changes are again minimal on top of the above-mentioned diff: the specific values are applied to the corresponding tensors through std::copy, and I'm also double-checking that the 16-bit values provided for the meta tensor contain only 0x4 and 0xE quad-bits, as this should be the only restriction on the values provided for these tensors (a sketch of such a check is shown after this paragraph). The example fails in this case, and I've also added a printout showing that the difference between a specific element of the sparse GEMM result and the reference result is quite large: -1.26617 vs. -0.67898.

@hwu36 (Collaborator) commented Dec 14, 2023

Could you try to use small integers like [-3, 3] as the input? We don't have true fp32 tensor cores; we use tf32 to compute. tf32 has only 10 explicit mantissa bits, 13 fewer than the 23 of true fp32. fp32 values are converted to tf32 before the tensor cores are called.
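For illustration, a rough host-side emulation of the fp32-to-tf32 conversion described above; it simply truncates the low 13 mantissa bits, whereas the hardware rounds, so this is only an approximation of the precision loss:

```cpp
#include <cstdint>
#include <cstring>
#include <cstdio>

// Rough host-side emulation of the fp32 -> tf32 conversion: tf32 keeps the
// fp32 exponent but only 10 explicit mantissa bits, so here the 13 least
// significant mantissa bits are zeroed (the hardware rounds rather than
// truncates, so this is only an approximation).
float to_tf32_approx(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  bits &= 0xFFFFE000u;  // clear the 13 least significant mantissa bits
  float y;
  std::memcpy(&y, &bits, sizeof(y));
  return y;
}

int main() {
  float a = -1.26617f;
  std::printf("%.6f -> %.6f\n", a, to_tf32_approx(a));
  return 0;
}
```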

@hwu36 (Collaborator) commented Dec 14, 2023

Moreover, try to use a 64x32x32 warp tile size if you use a 128x64x32 threadblock tile size.
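A minimal sketch of how this suggestion might look in the example, reusing the tile-shape typedef names from 15_ampere_sparse_tensorop_gemm; the rest of the kernel configuration (instruction shape, data types, layouts) is assumed to stay as in the example:

```cpp
#include "cutlass/gemm/gemm.h"

// Tile shapes per hwu36's suggestion: threadblock tile 128x64x32 paired with
// warp tile 64x32x32 (M, N, K).
using ShapeMMAThreadBlock = cutlass::gemm::GemmShape<128, 64, 32>;  // threadblock tile
using ShapeMMAWarp        = cutlass::gemm::GemmShape<64, 32, 32>;   // warp tile
```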

@alexsamardzic (Contributor, Author) commented

Thanks, if the dense reference GEMM is forced to tf32 then the results indeed match.
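For completeness, one hypothetical way to "force" the dense reference path to tf32 on the host is to round the fp32 input buffers to tf32 precision before computing the reference; the helper and buffer names here are placeholders, not the example's actual identifiers:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: round a host-side fp32 buffer to tf32 precision (10
// explicit mantissa bits) before running the dense reference GEMM, so that
// the reference matches what the tensor cores actually compute.
void round_buffer_to_tf32(float* data, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    std::uint32_t bits;
    std::memcpy(&bits, &data[i], sizeof(bits));
    bits &= 0xFFFFE000u;  // drop the 13 least significant mantissa bits
    std::memcpy(&data[i], &bits, sizeof(bits));
  }
}
```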
