Hi, @ccecka

From cutlass/test/unit/gemm/device/default_gemm_configuration.hpp, lines 166 to 172 in a75b4ac:

```cpp
Layout<Shape<_1,_2,_1>>>; // 1x2x1 value group for 16x16x16 MMA and LDSM
```
For a kernel with TileShape 128x128x32, the TiledMMA uses a 2x2x1 AtomLayout and a 1x2x1 ValLayout, which gives a 32x32x16 TiledShape_MNK. I believe CuTe then uses this TiledMMA to tile the 128x128x32 tile.
Can we just set the ValLayout to 4x8x1 to make the TiledShape_MNK 128x128x16? Then the M- and N-extents would each be processed "once".
And could you please give some suggestions on how to choose a proper ValLayout?
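For reference, here is roughly what the two configurations being compared look like. This is a sketch, not the exact code from the file above; in particular, the MMA_Atom name `SM80_16x8x16_F16F16F16F16_TN` is an assumption, inferred from the 16x8x16 atom shape implied by the 32x32x16 tiled shape:

```cpp
#include <cute/tensor.hpp>
#include <cute/atom/mma_atom.hpp>

using namespace cute;

// Current configuration: 2x2x1 atoms, 1x2x1 values per atom.
// TiledShape_MNK = (16*2*1) x (8*2*2) x (16*1*1) = 32x32x16
using TiledMma32 = TiledMMA<
    MMA_Atom<SM80_16x8x16_F16F16F16F16_TN>,  // assumed atom, 16x8x16
    Layout<Shape<_2,_2,_1>>,   // AtomLayout: 2x2x1 thread group
    Layout<Shape<_1,_2,_1>>>;  // ValLayout:  1x2x1 value group

// Proposed configuration: same atoms, 4x8x1 values per atom.
// TiledShape_MNK = (16*2*4) x (8*2*8) x (16*1*1) = 128x128x16
using TiledMma128 = TiledMMA<
    MMA_Atom<SM80_16x8x16_F16F16F16F16_TN>,
    Layout<Shape<_2,_2,_1>>,   // AtomLayout: 2x2x1 thread group
    Layout<Shape<_4,_8,_1>>>;  // ValLayout:  4x8x1 value group
```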
You can absolutely set the ValLayout to 4x8x1 to get a 128x128x16 MMA. print_latex(tiled_mma) can be very helpful for visualizing the result.
However, the MMA partitioner will still partition any MxN tensor into a (MMA, MMA_M, MMA_N) tensor, where the MMA mode is for a single instruction and the MMA_M and MMA_N modes are the number of instructions in M and N, respectively. Thus, ValLayout won't actually affect the partitioning. It is being removed as a parameter in the next revision because it only interacts with the Permutation parameter in an unintuitive way and is difficult to explain. Feel free to ignore it.
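To make that partitioning behavior concrete, here is a minimal device-side sketch, assuming the hypothetical TiledMma32 type from the sketch above and a 128x128 C tile:

```cpp
#include <cute/tensor.hpp>

using namespace cute;

template <class CTensor>
__device__ void partition_demo(CTensor gC)  // gC: a 128x128 tile of C
{
  TiledMma32 tiled_mma;
  auto thr_mma = tiled_mma.get_thread_slice(threadIdx.x);

  // tCgC has shape (MMA, MMA_M, MMA_N): the MMA mode holds this thread's
  // values for one 16x8x16 instruction, and MMA_M / MMA_N count instructions
  // along M and N. With the 2x2x1 AtomLayout above, MMA_M = 128/(16*2) = 4
  // and MMA_N = 128/(8*2) = 8 -- the same regardless of the ValLayout.
  Tensor tCgC = thr_mma.partition_C(gC);
}
```

On the host, `print_latex(TiledMma32{})` renders the thread/value assignment as a LaTeX picture, which is the visualization mentioned above.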
Thank you. Looking forward to the new TiledMMA interface and docs.