A STRICT build using sm_86 with MULTIGRID on fails with:
Building CUDA object lib/CMakeFiles/quda.dir/dslash_mdw_fused_ls20.cu.o
ptxas error : Value of threads per SM for entry ZN4quda10raw_kernelINS_18mobius_tensor_core17FusedMobiusDslashENS1_14FusedDslashArgIsLi3EL21QudaReconstructType_s8ELi20ELNS_19MdwfFusedDslashTypeE4ELi32ELi3ELb0EEELb0EEEvT0 is out of range. .minnctapersm will be ignored
@jcosborn This happens because SM 86, 87 and 89 only allow a maximum of 1536 threads per SM (as opposed to 2048). I will open a PR to fix this. In the meantime you can disable this part of the code by passing -D QUDA_MDW_FUSED_LS_LIST="" as one of the cmake parameters, which I expect would also decrease your compile time by quite a bit.
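To illustrate the limit described above, here is a minimal sketch of how a requested minimum-blocks-per-SM value could be clamped so that block size times block count never exceeds the per-SM thread limit. The helper names `max_threads_per_sm` and `clamp_min_blocks`, the 2048 default, and the block size of 512 are all hypothetical and chosen for illustration. This is not the QUDA fix, and the actual block size of the failing kernel is not stated in this thread.

```cpp
// Sketch only: helper names and the default value are assumptions, not QUDA code.

// Maximum resident threads per SM, keyed on a __CUDA_ARCH__-style value.
// 860, 870 and 890 are the architectures named in the comment above.
// The 2048 fallback is an assumption for this sketch and does not hold
// for every architecture; check the CUDA documentation for each target.
constexpr int max_threads_per_sm(int arch)
{
  return (arch == 860 || arch == 870 || arch == 890) ? 1536 : 2048;
}

// Number of blocks of the given size that fit on one SM (at least 1).
constexpr int blocks_that_fit(int arch, int block_size)
{
  return max_threads_per_sm(arch) / block_size > 0
           ? max_threads_per_sm(arch) / block_size
           : 1;
}

// Clamp a requested min-blocks-per-SM value to what the SM can hold.
constexpr int clamp_min_blocks(int arch, int block_size, int requested)
{
  return requested < blocks_that_fit(arch, block_size)
           ? requested
           : blocks_that_fit(arch, block_size);
}

// Hypothetical numbers: 4 blocks of 512 threads need 2048 threads per SM.
static_assert(clamp_min_blocks(800, 512, 4) == 4, "fits within 2048 threads");
static_assert(clamp_min_blocks(860, 512, 4) == 3, "clamped to 1536 / 512");
```

In a kernel, a value like this could be supplied as the second argument of `__launch_bounds__(block_size, min_blocks)`, selected per architecture at compile time, so that ptxas no longer sees a request above the SM's thread limit.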