
Attach Fusion interface to linalg.softmax #18550

Merged · 3 commits merged into iree-org:main on Sep 25, 2024

Conversation

@IanWood1 (Contributor) commented Sep 18, 2024

There are 86 softmax ops in clip (accounting for the 1225 -> 1139 decrease in dispatches), each followed by a 'bit truncate' op. This patch allows the linalg.softmax ops to fuse with the truncation op.
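For illustration, a minimal sketch of the pattern being fused (hypothetical shapes and element types; the model truncates the f32 softmax result to f16):

```mlir
func.func @softmax_trunc(%in: tensor<2x16x64xf32>) -> tensor<2x16x64xf16> {
  %empty_f32 = tensor.empty() : tensor<2x16x64xf32>
  %softmax = linalg.softmax dimension(2)
      ins(%in : tensor<2x16x64xf32>)
      outs(%empty_f32 : tensor<2x16x64xf32>) -> tensor<2x16x64xf32>
  // The 'bit truncate' op that follows each softmax in the model.
  %empty_f16 = tensor.empty() : tensor<2x16x64xf16>
  %trunc = linalg.generic {
      indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d1, d2)>,
                       affine_map<(d0, d1, d2) -> (d0, d1, d2)>],
      iterator_types = ["parallel", "parallel", "parallel"]}
      ins(%softmax : tensor<2x16x64xf32>)
      outs(%empty_f16 : tensor<2x16x64xf16>) {
  ^bb0(%a: f32, %b: f16):
    %t = arith.truncf %a : f32 to f16
    linalg.yield %t : f16
  } -> tensor<2x16x64xf16>
  return %trunc : tensor<2x16x64xf16>
}
```

Before this patch, the softmax and the truncation would end up in separate dispatches; with the fusion interface attached, they can be placed in the same dispatch region.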

@IanWood1 marked this pull request as ready for review September 19, 2024 15:41
@ScottTodd removed their request for review September 19, 2024 16:03
@hanhanW (Contributor) left a comment

Are we able to add a transform op in LinalgExt and test the fusion implementation with it? This mechanism is similar to what we have for tiling. Once this interface matures, we can smoothly upstream the transform op and tests to the MLIR repo.

@IanWood1 (Contributor, Author) commented

@hanhanW What do you mean by a transform op?

@hanhanW (Contributor) commented Sep 23, 2024

> @hanhanW What do you mean by a transform op?

I wonder if we could have a testing setup like:

```mlir
// RUN: iree-opt --iree-transform-dialect-interpreter --split-input-file --verify-diagnostics -canonicalize -cse %s | FileCheck %s
func.func @scatter_tiling(
    %original: tensor<?x?xf32>, %indices: tensor<?x1xi32>,
    %update: tensor<?x?xf32>) -> tensor<?x?xf32> {
  %0 = iree_linalg_ext.scatter
    dimension_map = [0]
    unique_indices(true)
    ins(%update, %indices : tensor<?x?xf32>, tensor<?x1xi32>)
    outs(%original : tensor<?x?xf32>) {
  ^bb0(%arg1: f32, %arg2: f32):
    %1 = arith.addf %arg1, %arg2 : f32
    iree_linalg_ext.yield %1 : f32
  } -> tensor<?x?xf32>
  return %0 : tensor<?x?xf32>
}

module attributes { transform.with_named_sequence } {
  transform.named_sequence @__transform_main(%module_op: !transform.any_op {transform.readonly}) {
    %0 = transform.structured.match ops{["iree_linalg_ext.scatter"]} in %module_op : (!transform.any_op) -> !transform.any_op
    %1, %loops:2 = transform.structured.tile_using_for %0 tile_sizes [10, 20] : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op)
    transform.yield
  }
}
```

@hanhanW (Contributor) commented Sep 23, 2024

I think the testing currently relies on one of the tests in https://github.com/iree-org/iree/tree/main/compiler/src/iree/compiler/DispatchCreation/test (please correct me if I'm wrong). With the transform op setup, we could better decouple the dependencies between IREE and the interface, which would help with upstreaming in the future.

@IanWood1 (Contributor, Author) commented

I see. Yes, the testing relies on DispatchCreation's tests, so that makes sense. One issue I see is that the fusion interface supplies information to FormDispatchRegionsPass, which uses heuristics tailored to codegen to fuse the ops (e.g. "clonable" ops and aggressive fusion). I don't think much of the logic in the FormDispatchRegions pass could be moved out, because it is quite specific to IREE. So I'm not sure what the transform op would do, since "fusing" ops essentially just means putting them in the same flow.dispatch.region, as sketched below (and I'm not sure how we would represent "fusing" ops if this were moved to LinalgExt).
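For concreteness, a rough sketch of what "fusing" means at this level: both ops land in a single flow.dispatch.region (hypothetical IR, simplified relative to what FormDispatchRegions actually produces; shapes reuse the example above):

```mlir
func.func @softmax_trunc_fused(%in: tensor<2x16x64xf32>) -> tensor<2x16x64xf16> {
  // Both ops are placed in one dispatch region, so codegen sees them together.
  %result = flow.dispatch.region -> (tensor<2x16x64xf16>) {
    %0 = tensor.empty() : tensor<2x16x64xf32>
    %1 = linalg.softmax dimension(2)
        ins(%in : tensor<2x16x64xf32>)
        outs(%0 : tensor<2x16x64xf32>) -> tensor<2x16x64xf32>
    %2 = tensor.empty() : tensor<2x16x64xf16>
    %3 = linalg.generic {
        indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d1, d2)>,
                         affine_map<(d0, d1, d2) -> (d0, d1, d2)>],
        iterator_types = ["parallel", "parallel", "parallel"]}
        ins(%1 : tensor<2x16x64xf32>)
        outs(%2 : tensor<2x16x64xf16>) {
    ^bb0(%a: f32, %b: f16):
      %t = arith.truncf %a : f32 to f16
      linalg.yield %t : f16
    } -> tensor<2x16x64xf16>
    flow.return %3 : tensor<2x16x64xf16>
  }
  return %result : tensor<2x16x64xf16>
}
```

Everything inside the region compiles into one dispatch, which is what the 1225 -> 1139 drop in dispatch count reflects.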

@MaheshRavishankar (Contributor) left a comment

Could you add a test for this?

@hanhanW (Contributor) commented Sep 25, 2024

> I see. Yes, the testing relies on DispatchCreation's tests, so that makes sense. […] I'm not sure how we would represent "fusing" ops if this were moved to LinalgExt.

I see; then I think it is fine for now. What I'm looking for is something similar to the upstream transform-op-fuse.mlir: we don't really care about the heuristics used in IREE, just the functionality. I realize we would need an op that has a region to test the fusion; upstream we could create a test op for that, but it looks tricky in IREE. So I think it is fine for now.
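For reference, the upstream tests being referred to exercise fusion through a transform op roughly like this (a sketch in the style of MLIR's transform-op-fuse.mlir; exact op names and syntax vary across MLIR revisions):

```mlir
func.func @fuse_unary(%arg0: tensor<?x?xf32>, %arg1: tensor<?x?xf32>) -> tensor<?x?xf32> {
  // Producer with a region...
  %0 = linalg.elemwise_unary ins(%arg0 : tensor<?x?xf32>)
      outs(%arg1 : tensor<?x?xf32>) -> tensor<?x?xf32>
  // ...fused into the tiled loops of its consumer below.
  %1 = linalg.elemwise_binary ins(%0, %arg0 : tensor<?x?xf32>, tensor<?x?xf32>)
      outs(%arg1 : tensor<?x?xf32>) -> tensor<?x?xf32>
  return %1 : tensor<?x?xf32>
}

module attributes { transform.with_named_sequence } {
  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
    %0 = transform.structured.match ops{["linalg.elemwise_binary"]} in %arg1 : (!transform.any_op) -> !transform.any_op
    // Tile the consumer and fuse the producer into the resulting loops.
    %1, %loops:2 = transform.structured.fuse %0 {tile_sizes = [32, 32], tile_interchange = [0, 1]}
      : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op)
    transform.yield
  }
}
```

Testing the dispatch-region fusion this way would need an analogous transform op in LinalgExt, which is what the earlier suggestion was about.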

Signed-off-by: Ian Wood <[email protected]>
@IanWood1 enabled auto-merge (squash) September 25, 2024 17:54
@IanWood1 merged commit 672ae82 into iree-org:main Sep 25, 2024
35 checks passed