Commit

xe: jit: gemm: workaround slow OOB check
Simonsays095 authored and karturov committed Nov 18, 2024
1 parent 280bd28 commit 3dd4f43
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions src/gpu/intel/jit/gemm/generator/pieces/layout_setup.cxx
@@ -680,6 +680,16 @@ bool BLASKernelGenerator<hw>::getBlockInfo(Type T, const MatrixAddressing &atype
                 block.byteGlue = true;
                 block.crosspack /= T.perByte();
             }
+
+            // Xe2: manually mask in the height dimension to work around slow LSC
+            // out-of-bounds checks.
+            bool remainderH = memCM ? remainderC : remainderR;
+            if (hw >= HW::Xe2 && remainderH) {
+                auto &vymask = memCM ? block.colMask.variable : block.rowMask.variable;
+                vymask.isFixed = false;
+                vymask.bitRep = vymask.maskRep = vymask.rsize = 1;
+                vymask.rshift = 0;
+            }
             break;
         }
         case AccessType::CacheLine: {
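
For readers unfamiliar with the idea behind the comment in this hunk, the following is a minimal host-side sketch of what "manually masking in the height dimension" can mean in general: compute a per-row bit mask for the remainder block and skip out-of-range rows explicitly, rather than relying on the memory unit's bounds check. It is not the oneDNN implementation. `heightMask` and `maskedLoad` are hypothetical names introduced only for illustration, and the sketch makes no claim about the semantics of `bitRep`, `maskRep`, `rsize` or `rshift`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of oneDNN): bit i of the result is set
// when row (h0 + i) lies inside a matrix with totalRows rows.
inline uint32_t heightMask(int h0, int blockRows, int totalRows) {
    assert(h0 >= 0 && blockRows > 0 && blockRows <= 32);
    int valid = totalRows - h0;              // rows remaining from h0 onward
    if (valid <= 0) return 0u;               // block is entirely out of bounds
    if (valid > blockRows) valid = blockRows;
    return (valid == 32) ? 0xFFFFFFFFu : ((1u << valid) - 1u);
}

// Hypothetical masked load of one block from a single column. Rows whose
// mask bit is clear are never read and are returned as zero.
inline std::vector<float> maskedLoad(const std::vector<float> &col,
                                     int h0, int blockRows) {
    uint32_t mask = heightMask(h0, blockRows, static_cast<int>(col.size()));
    std::vector<float> out(static_cast<std::size_t>(blockRows), 0.0f);
    for (int i = 0; i < blockRows; i++) {
        if (mask & (1u << i)) out[i] = col[static_cast<std::size_t>(h0 + i)];
    }
    return out;
}
```

In this sketch, a 4-row block starting at row 4 of a 5-row column gets mask `0x1`, so only the first row is read and the other three come back as zero. The explicit mask replaces any dependence on a hardware out-of-bounds check for those rows.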