
Add pass to bubble-up extract_slice operations. #18332

Merged
merged 10 commits into from
Aug 28, 2024

Conversation

MaheshRavishankar
Contributor

@MaheshRavishankar MaheshRavishankar commented Aug 22, 2024

This adds a pass to replace a `tensor.extract_slice` operation with a
slice of the producer. In general there might be more opportunities to
apply this pass more aggressively (e.g. when an operation has a single
use which is a slice), but for now it is applied only to bit-extend
operations.

Fixes #18254
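A minimal sketch of the intended rewrite on a bit-extend op (shapes and SSA names are illustrative, not taken from the pass output):

```mlir
// Before: the f32 result is materialized in full, then sliced.
%ext = linalg.generic {
    indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
                     affine_map<(d0, d1) -> (d0, d1)>],
    iterator_types = ["parallel", "parallel"]}
    ins(%src : tensor<24x128xf16>) outs(%init : tensor<24x128xf32>) {
^bb0(%in: f16, %out: f32):
  %e = arith.extf %in : f16 to f32
  linalg.yield %e : f32
} -> tensor<24x128xf32>
%res = tensor.extract_slice %ext[0, 0] [24, 64] [1, 1]
    : tensor<24x128xf32> to tensor<24x64xf32>

// After: the slice is bubbled up to the f16 operand, so only the
// needed elements are extended.
%src_slice = tensor.extract_slice %src[0, 0] [24, 64] [1, 1]
    : tensor<24x128xf16> to tensor<24x64xf16>
%res = linalg.generic {
    indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
                     affine_map<(d0, d1) -> (d0, d1)>],
    iterator_types = ["parallel", "parallel"]}
    ins(%src_slice : tensor<24x64xf16>) outs(%init2 : tensor<24x64xf32>) {
^bb0(%in: f16, %out: f32):
  %e = arith.extf %in : f16 to f32
  linalg.yield %e : f32
} -> tensor<24x64xf32>
```

Sliced this way, the intermediate f16-to-f32 extension only materializes the elements the consumer actually reads.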

Signed-off-by: MaheshRavishankar [email protected]

IanWood1 and others added 3 commits August 22, 2024 18:15
@IanWood1
Contributor

IanWood1 commented Aug 23, 2024

I tried https://gist.github.com/monorimet/3a0a4310c1ed09265353ce747599d502 but it seems like there is a collapse_shape in the way:

  %4 = linalg.generic {indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d1, d2)>, affine_map<(d0, d1, d2) -> (d0, d1, d2)>], iterator_types = ["parallel", "parallel", "parallel"]} ins(%collapsed : tensor<24x4608x128xf16>) outs(%3 : tensor<24x4608x128xf32>) {
  ^bb0(%in: f16, %out: f32):
    %21 = arith.extf %in : f16 to f32
    linalg.yield %21 : f32
  } -> tensor<24x4608x128xf32>
  %expanded = tensor.expand_shape %4 [[0, 1], [2], [3, 4, 5]] output_shape [1, 24, 4608, 64, 1, 2] : tensor<24x4608x128xf32> into tensor<1x24x4608x64x1x2xf32>
  %extracted_slice_7 = tensor.extract_slice %expanded[0, 0, 0, 0, 0, 1] [1, 24, 4608, 64, 1, 1] [1, 1, 1, 1, 1, 1] : tensor<1x24x4608x64x1x2xf32> to tensor<24x4608x64xf32>
  %expanded_8 = tensor.expand_shape %extracted_slice_7 [[0, 1], [2], [3, 4]] output_shape [1, 24, 4608, 64, 1] : tensor<24x4608x64xf32> into tensor<1x24x4608x64x1xf32>

also, iree/compiler/Dialect/Flow/Transforms/test/bubble_up_extract_slice.mlir just needs to be updated to use --iree-flow-bubble-up-extract-slices

Edit:

It seems like the expand/extracts should be foldable
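If that fold applies, the `expand_shape` + `extract_slice` pair above should reduce to a single strided slice on the unexpanded tensor (offsets/sizes/strides hand-derived for this example; illustrative, not actual pass output):

```mlir
// Extracting index 1 along the trailing group [64, 1, 2] picks every
// second element of the original 128-wide dimension, starting at 1.
%slice = tensor.extract_slice %4[0, 0, 1] [24, 4608, 64] [1, 1, 2]
    : tensor<24x4608x128xf32> to tensor<24x4608x64xf32>
%expanded_8 = tensor.expand_shape %slice [[0, 1], [2], [3, 4]]
    output_shape [1, 24, 4608, 64, 1]
    : tensor<24x4608x64xf32> into tensor<1x24x4608x64x1xf32>
```

With the slice expressed directly on `%4`, it could then be bubbled past the `arith.extf` generic as well.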

@MaheshRavishankar
Contributor Author

I tried https://gist.github.com/monorimet/3a0a4310c1ed09265353ce747599d502 but it seems like there is a collapse_shape in the way:

  %4 = linalg.generic {indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d1, d2)>, affine_map<(d0, d1, d2) -> (d0, d1, d2)>], iterator_types = ["parallel", "parallel", "parallel"]} ins(%collapsed : tensor<24x4608x128xf16>) outs(%3 : tensor<24x4608x128xf32>) {
  ^bb0(%in: f16, %out: f32):
    %21 = arith.extf %in : f16 to f32
    linalg.yield %21 : f32
  } -> tensor<24x4608x128xf32>
  %expanded = tensor.expand_shape %4 [[0, 1], [2], [3, 4, 5]] output_shape [1, 24, 4608, 64, 1, 2] : tensor<24x4608x128xf32> into tensor<1x24x4608x64x1x2xf32>
  %extracted_slice_7 = tensor.extract_slice %expanded[0, 0, 0, 0, 0, 1] [1, 24, 4608, 64, 1, 1] [1, 1, 1, 1, 1, 1] : tensor<1x24x4608x64x1x2xf32> to tensor<24x4608x64xf32>
  %expanded_8 = tensor.expand_shape %extracted_slice_7 [[0, 1], [2], [3, 4]] output_shape [1, 24, 4608, 64, 1] : tensor<24x4608x64xf32> into tensor<1x24x4608x64x1xf32>

also, iree/compiler/Dialect/Flow/Transforms/test/bubble_up_extract_slice.mlir just needs to be updated to use --iree-flow-bubble-up-extract-slices

That makes sense. I should just run this after bubble up expand shapes.

@MaheshRavishankar
Contributor Author

Probably needs more tests. I added a rank-reduced slice test; we also want one without rank reduction.

@MaheshRavishankar
Contributor Author

Someone else should review this since I co-authored this commit. Probably needs more tests. I added a rank-reduced slice test; we always want a slice test without rank reduction as well.

Signed-off-by: Ian Wood <[email protected]>
@IanWood1 IanWood1 force-pushed the extract_slice_prop branch from 9e1d446 to 5748e9f Compare August 27, 2024 16:17
Contributor

@qedawkins qedawkins left a comment


Overall looks good, though I have some concerns about the ordering of some of the failure cases within the pattern.

if (tilingResult->tiledOps.size() != 1 ||
    !isa<linalg::GenericOp>(tilingResult->tiledOps[0])) {
  return rewriter.notifyMatchFailure(
      linalgOp, "expected extract_slice to generate a `linalg.generic`");
Contributor


Failure after generating IR in a pattern is problematic. At a minimum I would restrict to generics like I said above, but it would also be worth adding checks for `isProjectedPermutation(indexingMaps)` and `!hasIndexSemantics`.

Contributor


Good point. I removed this check and added the checks you suggested before IR mutation. I also changed the failed checks to asserts, since there is no graceful way to exit at that point.
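A sketch of what such up-front checks could look like in the pattern (helper names follow upstream MLIR, e.g. `LinalgOp::hasIndexSemantics` and `AffineMap::isProjectedPermutation`; surrounding variable names are illustrative, not the actual patch):

```cpp
// Bail out before mutating any IR: only handle projected-permutation
// linalg.generic producers without index semantics.
auto genericOp = dyn_cast<linalg::GenericOp>(producer);
if (!genericOp)
  return rewriter.notifyMatchFailure(producer, "expected linalg.generic");
if (genericOp.hasIndexSemantics())
  return rewriter.notifyMatchFailure(
      genericOp, "ops with index semantics are unsupported");
if (!llvm::all_of(genericOp.getIndexingMapsArray(), [](AffineMap map) {
      return map.isProjectedPermutation();
    }))
  return rewriter.notifyMatchFailure(
      genericOp, "expected projected permutation indexing maps");
```

Running all checks before any `rewriter` mutation keeps the pattern safely restartable on match failure.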

Signed-off-by: Ian Wood <[email protected]>
Contributor

@qedawkins qedawkins left a comment


Nice, this looks good to me!

@IanWood1
Contributor

@ScottTodd test_einsum_inner_prod is timing out in the onnx regression tests (https://github.com/iree-org/iree/actions/runs/10585615045/job/29333076426?pr=18332). The message is:

FAILED iree-test-suites/onnx_ops/onnx/node/generated/test_einsum_inner_prod/run_module_io_flags.txt::model.mlir::gpu_rocm_rdna3 - Failed: Timeout >30.0s

I don't think it is related to this PR, because I was getting a segfault during torch conversion. Should the rdna3 config be updated similar to what was done in #18357?

@MaheshRavishankar
Contributor Author

@ScottTodd test_einsum_inner_prod is timing out in the onnx regression tests (https://github.com/iree-org/iree/actions/runs/10585615045/job/29333076426?pr=18332). The message is:

FAILED iree-test-suites/onnx_ops/onnx/node/generated/test_einsum_inner_prod/run_module_io_flags.txt::model.mlir::gpu_rocm_rdna3 - Failed: Timeout >30.0s

I don't think it is related to this PR, because I was getting a segfault during torch conversion. Should the rdna3 config be updated similar to what was done in #18357?

This is an existing error; we've been hitting it all week.

@MaheshRavishankar
Contributor Author

Oh, you can merge even with the failure

Contributor Author

@MaheshRavishankar MaheshRavishankar left a comment


Thanks @IanWood1 !

@IanWood1 IanWood1 merged commit d6762d4 into iree-org:main Aug 28, 2024
36 of 37 checks passed
@IanWood1 IanWood1 deleted the extract_slice_prop branch August 28, 2024 01:29
@IanWood1
Contributor

Oh, you can merge even with the failure

Great, I wasn't sure!

josemonsalve2 pushed a commit to josemonsalve2/iree that referenced this pull request Sep 14, 2024

Co-authored-by: Ian Wood <[email protected]>
Successfully merging this pull request may close these issues.

[ROCM][gfx942] shared memory limit exceeded on elemwise broadcast (bf16) (flux-dev)