
[DispatchCreation] Collapse iree_linalg_ext.attention #19012

Merged · 4 commits into iree-org:main on Nov 12, 2024

Conversation

@IanWood1 (Contributor) commented Nov 4, 2024

This change adds support for attention in `CollapseDimensionsPass` so that the attention op is collapsed as much as possible. This is motivated by reducing the number of attention variants that the SDXL attention spec has to handle.

Changes to `LinalgExt/Transforms/ReshapeFusion.cpp` are mostly taken directly from https://github.com/llvm/llvm-project/blob/002a0a27bc4702d6f34434c1838cb1698a0b0098/mlir/lib/Dialect/Linalg/Transforms/ElementwiseOpFusion.cpp (attributed at the top of the file). I attempted to modify the original logic as little as possible, keeping it general in case it needs to be reused for other `LinalgExt` ops.
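
To illustrate the collapsing criterion, here is a minimal sketch under a simplified model: each operand's indexing map is treated as a projected permutation, represented as the ordered list of iteration-dimension ids it uses. The names `DimList`, `canCollapseAdjacentDims`, and `getCollapsedGroups` are illustrative and not from this patch, and the real pass applies additional constraints (e.g., matching iterator types) that are omitted here. The core idea is that adjacent iteration dimensions can only be folded together when every indexing map uses them contiguously and in order (or uses neither), so collapsing amounts to a plain reshape on every operand.

```cpp
// Minimal sketch of grouping collapsible iteration dimensions.
// Not the IREE implementation; names and the simplified map model are
// illustrative only.
#include <cstdint>
#include <vector>

// Each operand's indexing map, modeled as the ordered list of iteration
// dimension ids it indexes (a projected permutation).
using DimList = std::vector<int64_t>;

// True if dims `d` and `d + 1` may be merged: every map must either use both
// dims adjacently and in order, or use neither of them.
static bool canCollapseAdjacentDims(int64_t d,
                                    const std::vector<DimList> &maps) {
  for (const DimList &map : maps) {
    int64_t posD = -1, posD1 = -1;
    for (int64_t i = 0, e = static_cast<int64_t>(map.size()); i < e; ++i) {
      if (map[i] == d)
        posD = i;
      if (map[i] == d + 1)
        posD1 = i;
    }
    bool hasD = posD >= 0, hasD1 = posD1 >= 0;
    if (hasD != hasD1)
      return false; // one dim used without the other
    if (hasD && posD1 != posD + 1)
      return false; // both used, but not adjacent/in order
  }
  return true;
}

// Greedily partition the iteration space into maximal collapsible runs.
std::vector<DimList> getCollapsedGroups(int64_t numDims,
                                        const std::vector<DimList> &maps) {
  std::vector<DimList> groups;
  for (int64_t d = 0; d < numDims; ++d) {
    if (!groups.empty() && canCollapseAdjacentDims(d - 1, maps))
      groups.back().push_back(d); // extend the current run
    else
      groups.push_back({d});      // start a new run
  }
  return groups;
}
```

For example, if every indexing map of an attention-like op accesses its two leading batch dimensions as an adjacent in-order pair, those two dimensions fall into one group and can be collapsed into a single batch dimension.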

@IanWood1 changed the title from [DispatchCreation] Collapse LinalgExt::AttentionOp to [DispatchCreation] Collapse iree_linalg_ext.attention on Nov 4, 2024
@IanWood1 marked this pull request as ready for review November 4, 2024 20:51
Add support for attention in `CollapseDimensionsPass` so that the
attention op gets collapsed as much as possible. This is motivated by
reducing the different variants of attention that the sdxl attention
spec has to handle.

Signed-off-by: Ian Wood <[email protected]>
Since this pass now handles more than just `linalg.generic` ops, fix up the comments and drop references to `linalg.generic` ops.

Signed-off-by: Ian Wood <[email protected]>
@MaheshRavishankar (Contributor) left a comment

Looks mostly good. Let's chat offline so I can get better context on this. Left a few minor comments.


/// Map from iteration domain index in the original op to the iteration domain
/// index in the collapsed op.
SmallVector<std::pair<int64_t, unsigned>> origOpToCollapsedOpIterationDim;
Why `int64_t` and `unsigned`?

@hanhanW requested a review from Groverkss November 7, 2024 21:43
@MaheshRavishankar (Contributor) left a comment

Spoke to Ian offline to get more context. Looks good.

Signed-off-by: Ian Wood <[email protected]>
Signed-off-by: Ian Wood <[email protected]>
@IanWood1 merged commit 2bfc639 into iree-org:main Nov 12, 2024
36 checks passed
Groverkss pushed a commit to Groverkss/iree that referenced this pull request Dec 1, 2024
giacs-epic pushed a commit to giacs-epic/iree that referenced this pull request Dec 4, 2024
IanWood1 added a commit that referenced this pull request Jan 8, 2025
Reland after fixing SDXL int8 regressions via #19012.

Running CI revealed further performance regressions that have pending patches: #19325 and #19326.

This reverts commit 8d3faf8.

---------

Signed-off-by: Ian Wood <[email protected]>