Filter out fwd igemm kernel where split number is larger than input channel (#2187)

* fix a bug in fwd where split is larger than input channel

* format code

* use proper variable
carlushuang authored Jun 8, 2023
1 parent a705ea5 commit deeec67
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion src/solver/conv_asm_implicit_gemm_gtc_fwd_nhwc.cpp
@@ -700,7 +700,8 @@ bool PerformanceConfigAsmImplicitGemmGTCFwdXdlopsNHWC::IsValid(
 if(!(tensor_a_thread_lengths[1] == 1 && tensor_b_thread_lengths[1] == 1))
 {
     // if both 1, indicate padded c support
-    if(((c >> gemm_k_global_split) / group) % gemm_k_per_block != 0)
+    if((c >> gemm_k_global_split == 0) ||
+       (((c >> gemm_k_global_split) / group) % gemm_k_per_block != 0))
         return false;
     // also, add this restriction to k, for vector write out
     if(problem.IsFp16())
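
Why the extra `== 0` test matters can be shown in isolation. The sketch below is not MIOpen code: `rejected_before` and `rejected_after` are hypothetical helpers that copy only the two conditions from this hunk, and it assumes (from the shift in the condition) that `gemm_k_global_split` is the log2 of the split count. When the split count exceeds `c`, the shift yields 0, and `0 % gemm_k_per_block` is 0, so the old condition let the config through.

```cpp
// Hypothetical stand-ins for the condition before and after this commit.
// Each returns true when IsValid() would reject the config (return false).
// Assumes 0 <= gemm_k_global_split < 31 so the shift is well defined.
constexpr bool rejected_before(int c, int group, int gemm_k_global_split, int gemm_k_per_block)
{
    return ((c >> gemm_k_global_split) / group) % gemm_k_per_block != 0;
}

constexpr bool rejected_after(int c, int group, int gemm_k_global_split, int gemm_k_per_block)
{
    // New guard: nothing is left of c once the split count exceeds it.
    return (c >> gemm_k_global_split) == 0 ||
           ((c >> gemm_k_global_split) / group) % gemm_k_per_block != 0;
}

// c = 8 with 2^4 = 16 splits: 8 >> 4 == 0 and 0 % 16 == 0,
// so the old check accepted the config; the new one rejects it.
static_assert(!rejected_before(8, 1, 4, 16), "old check lets the config through");
static_assert(rejected_after(8, 1, 4, 16), "new check filters it out");

// An ordinary case is unaffected: 64 >> 1 == 32, and 32 % 32 == 0.
static_assert(!rejected_before(64, 1, 1, 32), "accepted before");
static_assert(!rejected_after(64, 1, 1, 32), "still accepted");
```

The values (`c = 8`, `group = 1`, and so on) are picked only for illustration; the real function takes them from the problem description and the performance config.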