Reapply "Propagate reshapes through generics with reduction… (#18968) #19113

Open
IanWood1 wants to merge 3 commits into main from reland_prop_through_reduction

Conversation

@IanWood1 (Contributor) commented Nov 12, 2024

Reland after fixing sdxl int8 regressions via #19012.

Running CI revealed further performance regressions that have pending patches: #19325 and #19326.

This reverts commit 8d3faf8.

@IanWood1 IanWood1 marked this pull request as ready for review November 12, 2024 19:29
@IanWood1 IanWood1 requested review from nithinsubbiah and removed request for ScottTodd November 12, 2024 19:29
hanhanW previously approved these changes Nov 12, 2024
@IanWood1 (Contributor, Author):

Waiting to rebase on top of #19088, then I will land this.

@MaheshRavishankar (Contributor):

Am I reading this right? The dispatch count seems to have regressed?

@hanhanW (Contributor) commented Nov 14, 2024

Am I reading this right? The dispatch count seems to have regressed?

246 -> 245 looks like an improvement?


@IanWood1 (Contributor, Author):

Am I reading this right? The dispatch count seems to have regressed?

246 -> 245 looks like an improvement?

But the new benchmarks seem to be regressing. I'm not sure why, since there was no effect on the old unet benchmarks.

@MaheshRavishankar (Contributor):

Try rebasing

IanWood1 force-pushed the reland_prop_through_reduction branch from 6383ab6 to 4765737 on November 14, 2024 03:39
@IanWood1 (Contributor, Author):

Try rebasing

I'll try that. I was getting some weird dispatch count results when testing locally against main.

@IanWood1 (Contributor, Author):

I think it is all coming from:

%extracted_slice_237 = tensor.extract_slice %152[0, 0, 0] [2, 4096, 2560] [1, 1, 1] : tensor<2x4096x5120xf16> to tensor<2x4096x2560xf16>
%extracted_slice_238 = tensor.extract_slice %152[0, 0, 2560] [2, 4096, 2560] [1, 1, 1] : tensor<2x4096x5120xf16> to tensor<2x4096x2560xf16>
%expanded_239 = tensor.expand_shape %extracted_slice_237 [[0], [1, 2], [3]] output_shape [2, 64, 64, 2560] : tensor<2x4096x2560xf16> into tensor<2x64x64x2560xf16>
%expanded_240 = tensor.expand_shape %extracted_slice_238 [[0], [1, 2], [3]] output_shape [2, 64, 64, 2560] : tensor<2x4096x2560xf16> into tensor<2x64x64x2560xf16>

These extract_slice ops could be cloned into the dispatch if the expand_shape ops weren't blocking them.
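
For illustration, a hedged sketch (with made-up SSA names) of the same IR after bubbling the expand_shape above the extract_slice ops, which leaves clonable extract_slice ops at the bottom:

%expanded = tensor.expand_shape %152 [[0], [1, 2], [3]] output_shape [2, 64, 64, 5120] : tensor<2x4096x5120xf16> into tensor<2x64x64x5120xf16>
%slice_a = tensor.extract_slice %expanded[0, 0, 0, 0] [2, 64, 64, 2560] [1, 1, 1, 1] : tensor<2x64x64x5120xf16> to tensor<2x64x64x2560xf16>
%slice_b = tensor.extract_slice %expanded[0, 0, 0, 2560] [2, 64, 64, 2560] [1, 1, 1, 1] : tensor<2x64x64x5120xf16> to tensor<2x64x64x2560xf16>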

IanWood1 force-pushed the reland_prop_through_reduction branch 2 times, most recently from 62bdaef to 98aa2a5 on November 18, 2024 20:32
@IanWood1 (Contributor, Author) commented Nov 19, 2024

Looks like there is some bad codegen when reductions aren't collapsed:

  1. Slow dispatch (expanded): ~164us
  2. Fast dispatch (collapsed): ~79us

@MaheshRavishankar (Contributor):

That's interesting, but shouldn't the collapse dimensions pass fold those back?

IanWood1 force-pushed the reland_prop_through_reduction branch from 8409122 to ef99f8d on November 19, 2024 03:36
@IanWood1 (Contributor, Author) commented Nov 19, 2024

That's interesting, but shouldn't the collapse dimensions pass fold those back?

Yeah, I had to change it a bit to make isEligibleForCollapse less restrictive, and to modify getCollapsibleLoops so that it considers the results of all indexing_maps (not just the first one). Before force-pushing, it seemed to regain the lost perf (I just fixed the failing lit tests). Hopefully this resolves all the perf problems 🤞

IanWood1 requested a review from hanhanW November 27, 2024 18:19
@IanWood1 (Contributor, Author):

@hanhanW would you be able to look at this again? Only the latest commit is new since the last review.

@hanhanW (Contributor) left a comment:

@hanhanW would you be able to look at this again? Only the latest commit is new since the last review.

The PR title and description need to be updated; it is not just reapplying a reverted commit. By the way, should we create a separate PR for the last commit?

Comment on lines 64 to 72
if (llvm::any_of(origType.getShape(), ShapedType::isDynamic) ||
    llvm::any_of(extractedType.getShape(), ShapedType::isDynamic) ||
    llvm::any_of(expandedType.getShape(), ShapedType::isDynamic)) {
  return failure();
}

I think you can use !xxxType.hasStaticShape().

Comment on lines 652 to 653



nit: remove one blank line

Comment on lines 208 to 221
    genericOp.getDpsInputOperands(), [&](OpOperand *operand) -> bool {
      auto genericOperand =
          operand->get().getDefiningOp<linalg::GenericOp>();
      if (!genericOperand) {
        return false;
      }

      if (genericOperand.getNumReductionLoops() == 0) {
        return false;
      }

      auto map = genericOp.getMatchingIndexingMap(operand);
      return !map.isPermutation() && map.isProjectedPermutation();
    })) {

[optional]: It'd be easier if you declare a lambda/function for this.

hanhanW dismissed their stale review November 27, 2024 18:37

remove my approval because of new changes.

@IanWood1 (Contributor, Author):

@hanhanW would you be able to look at this again? Only the latest commit is new since the last review.

The PR title and description need to be updated; it is not just reapplying a reverted commit. By the way, should we create a separate PR for the last commit?

That makes sense; this change had several perf regressions that needed to be fixed. Would you like me to create a new PR with the changes (with your comments applied), and then, after it lands, rebase and force-push so that this PR only contains the revert commit?

@hanhanW (Contributor) commented Nov 27, 2024

That makes sense; this change had several perf regressions that needed to be fixed. Would you like me to create a new PR with the changes (with your comments applied), and then, after it lands, rebase and force-push so that this PR only contains the revert commit?

Yes, that'd be great; it makes the codebase state and commit tracking better!

@IanWood1 (Contributor, Author) commented Nov 27, 2024

I created two new PRs, one for each of the separate fixes, and updated the description of this PR. I'll rebase this branch after those have landed so that it only serves as a revert.

Before force-pushing, the previous HEAD was ef99f8d.

IanWood1 added a commit that referenced this pull request Dec 3, 2024
This is 1/2 of the changes needed to reland #18857 (with an open PR, #19113).


Adds a pattern to bubble up expand_shape through extract_slice, i.e. rewrite
`expand(extract)` to `extract(expand)`. This only supports the case where the
expanded dimensions are not modified by the extract_slice and there are no
dynamic dimensions.

This is important because `tensor.expand_shape` ops _cannot be cloned_ while
`tensor.extract_slice` ops _can be cloned_. So, if the `expand_shape` gets
stuck below the `extract_slice`, it blocks the `extract_slice` from being
cloned, and the `extract_slice` has to be put into its own dispatch.

---------

Signed-off-by: Ian Wood <[email protected]>
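
A minimal before/after sketch of this rewrite, using hypothetical shapes and value names:

// before: the expand_shape sits below the extract_slice and blocks cloning
%slice = tensor.extract_slice %src[0, 0] [4, 8] [1, 1] : tensor<4x16xf32> to tensor<4x8xf32>
%exp = tensor.expand_shape %slice [[0, 1], [2]] output_shape [2, 2, 8] : tensor<4x8xf32> into tensor<2x2x8xf32>

// after: the extract_slice sits below the expand_shape and can be cloned into its consumer dispatch
%exp2 = tensor.expand_shape %src [[0, 1], [2]] output_shape [2, 2, 16] : tensor<4x16xf32> into tensor<2x2x16xf32>
%slice2 = tensor.extract_slice %exp2[0, 0, 0] [2, 2, 8] [1, 1, 1] : tensor<2x2x16xf32> to tensor<2x2x8xf32>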
giacs-epic pushed a commit to giacs-epic/iree that referenced this pull request Dec 4, 2024 (…#19325); the commit message is the same as above, with an additional sign-off from Giacomo Serafini.
IanWood1 added a commit that referenced this pull request Dec 5, 2024
This is 2/2 of the changes needed to reland #18857 (open PR #19113).


There are 2 small subchanges in this PR:
- Refactor the check into a lambda `isPossiblySoftmax` and tweak the condition
to look for operand indexing maps that are projected permutations but not
permutations.
- Extend `getCollapsibleLoops` to look at all operations when computing
contiguous loops.

Also, I added a test case that needed both of these changes to pass (without
being collapsed, it was causing regressions on the GPU backend); see the sketch
after this commit message.

---------

Signed-off-by: Ian Wood <[email protected]>
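
For context, a hypothetical reduced example (shapes and names are illustrative, not taken from the PR) of the producer/consumer structure that an `isPossiblySoftmax`-style check would recognize: a reduction producer whose result feeds the consumer through an indexing map that is a projected permutation but not a permutation, as in the max/sum stages of softmax:

#id    = affine_map<(d0, d1) -> (d0, d1)>
#bcast = affine_map<(d0, d1) -> (d0)>

func.func @softmax_like(%in: tensor<2x4096xf16>, %acc: tensor<2xf16>,
                        %out: tensor<2x4096xf16>) -> tensor<2x4096xf16> {
  // Row-wise max (reduction producer); %acc is assumed to be pre-filled
  // with -inf.
  %max = linalg.generic
      {indexing_maps = [#id, #bcast],
       iterator_types = ["parallel", "reduction"]}
      ins(%in : tensor<2x4096xf16>) outs(%acc : tensor<2xf16>) {
  ^bb0(%a: f16, %b: f16):
    %m = arith.maximumf %a, %b : f16
    linalg.yield %m : f16
  } -> tensor<2xf16>
  // Consumer re-broadcasts the reduced result: the %max operand's map #bcast
  // is a projected permutation but not a permutation.
  %sub = linalg.generic
      {indexing_maps = [#id, #bcast, #id],
       iterator_types = ["parallel", "parallel"]}
      ins(%in, %max : tensor<2x4096xf16>, tensor<2xf16>)
      outs(%out : tensor<2x4096xf16>) {
  ^bb0(%a: f16, %m: f16, %o: f16):
    %d = arith.subf %a, %m : f16
    linalg.yield %d : f16
  } -> tensor<2x4096xf16>
  return %sub : tensor<2x4096xf16>
}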
IanWood1 force-pushed the reland_prop_through_reduction branch 2 times, most recently from 044da91 to 1a747d5 on December 5, 2024 21:27
IanWood1 force-pushed the reland_prop_through_reduction branch from 1a747d5 to e609b15 on December 7, 2024 00:15