[Global opt] add flag to generalize batch matmul ops #17877

Merged: 3 commits into iree-org:main, Jul 12, 2024

Conversation

@IanWood1 (Contributor) commented Jul 11, 2024

After adding the flag to the sdxl scripts, I saw a decent decrease in the number of dispatches.

Initially, I was trying to manually generalize+fuse broadcasts ([branch here](https://github.com/IanWood1/iree/tree/broadcast_matmul)), but Quinn saw good results with just this.
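For reference, this is roughly how such a flag is passed on the iree-compile command line. The flag spelling here assumes the rename suggested in review below was adopted, and the file names are placeholders, so treat the invocation as illustrative:

```sh
# Illustrative invocation; the flag name follows the review suggestion below
# and the input/output file names are placeholders.
iree-compile --iree-hal-target-backends=llvm-cpu \
  --iree-global-opt-generalize-linalg-matmul-ops-experimental=true \
  model.mlir -o model.vmfb
```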

@IanWood1 closed this Jul 11, 2024
@IanWood1 reopened this Jul 11, 2024
@IanWood1 marked this pull request as ready for review July 11, 2024 22:10
@IanWood1 requested a review from hanhanW as a code owner July 11, 2024 22:10
@qedawkins (Contributor) left a comment:

LGTM with one comment on flag naming

```diff
@@ -23,6 +23,10 @@ static llvm::cl::opt<bool> clEnableQuantizedMatmulReassociation(
     llvm::cl::desc(
         "Enables reassociation of quantized matmul ops (experimental)."),
     llvm::cl::init(false));
+static llvm::cl::opt<bool>
+    clGeneralizeLinalgMatmulOps("enable-generalize-linalg-matmul-ops",
```
@qedawkins (Contributor) commented on the diff:

This needs to be codified somewhere, but there have been some thoughts about adding -experimental to flag names that are not really intended to be user facing (i.e., papering over backend issues that are WIP), per this discussion: #17788 (comment). Maybe something like this, but open to suggestions.

Suggested change:
```suggestion
-clGeneralizeLinalgMatmulOps("enable-generalize-linalg-matmul-ops",
+clGeneralizeLinalgMatmulOps("iree-global-opt-generalize-linalg-matmul-ops-experimental",
```

@IanWood1 (Contributor, Author) replied:

That's a good idea, added that to the name.
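For context, here is a minimal sketch of how a flag like this can gate which named ops get rewritten to linalg.generic. This is illustrative rather than the exact pass code from this PR: the helper (linalg::generalizeNamedOp) and the op names are upstream MLIR, while the function wiring is assumed.

```cpp
// Illustrative sketch only, not the exact IREE pass code.
#include "llvm/Support/CommandLine.h"
#include "mlir/Dialect/Linalg/IR/Linalg.h"
#include "mlir/Dialect/Linalg/Transforms/Transforms.h"
#include "mlir/IR/PatternMatch.h"

using namespace mlir;

static llvm::cl::opt<bool> clGeneralizeLinalgMatmulOps(
    "iree-global-opt-generalize-linalg-matmul-ops-experimental",
    llvm::cl::desc(
        "Convert linalg matmul ops to linalg.generic ops (experimental)."),
    llvm::cl::init(false));

// Collect matmul-like named ops, then rewrite each one into an equivalent
// linalg.generic. The flag gates whether matmuls participate at all.
static void generalizeMatmulOps(Operation *root) {
  if (!clGeneralizeLinalgMatmulOps)
    return;
  SmallVector<linalg::LinalgOp> candidates;
  root->walk([&](linalg::LinalgOp op) {
    if (isa<linalg::MatmulOp, linalg::BatchMatmulOp>(op.getOperation()))
      candidates.push_back(op);
  });
  IRRewriter rewriter(root->getContext());
  for (linalg::LinalgOp op : candidates) {
    rewriter.setInsertionPoint(op);
    // generalizeNamedOp replaces the named op with a linalg.generic that
    // carries the same indexing maps and arithmetic body.
    if (failed(linalg::generalizeNamedOp(rewriter, op)))
      op->emitOpError("failed to generalize");
  }
}
```

Once the batch matmul is a linalg.generic, a broadcast producer can fuse into the same dispatch, which is consistent with the dispatch-count reduction reported above.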

@hanhanW (Contributor) left a comment:

Can't we do this at the preprocessing stage?

@hanhanW (Contributor) commented Jul 11, 2024

My concern is that we are making stable passes into experimental passes. The GeneralizeLinalgNamedOps pass is indeed what we want in GlobalOpt because it lowers linalg.add/sub/etc. to generic ops. Now we are adding a new option to the pass, which makes people start thinking: okay, it might be an experimental pass, so I can unconditionally add more ad-hoc patterns to it. That sounds bad to me.

@IanWood1 requested review from hanhanW and qedawkins July 12, 2024 00:04
@IanWood1 (Contributor, Author) commented:

@hanhanW Good point, this is definitely more of a preprocessing pass than anything else.

@MaheshRavishankar (Contributor) commented:

Agree with @hanhanW, for now this is a one-off preprocessing step. If all the backends can handle these appropriately, then we can do this unconditionally in the generalize named ops pass.

@qedawkins (Contributor) commented:

OK, I will defer to you both then, but if this is the only way to enable fusion of broadcasts, I would not consider this preprocessing, but rather a bug in the backend that is being worked around with an experimental flag we intend to turn on by default later.

@MaheshRavishankar (Contributor) commented:

> OK, I will defer to you both then, but if this is the only way to enable fusion of broadcasts, I would not consider this preprocessing, but rather a bug in the backend that is being worked around with an experimental flag we intend to turn on by default later.

It seems like premature lowering to me, and the real bug is not having a linalg.contraction op (I wonder who suggested that :P) that allows such broadcasting semantics. I find this pattern of using isContractionOp... all through the stack kind of strange. Using a named op and doing isa<..> seems like a more robust way. The backends should be able to handle any einsum-like op, but I would rather not drop information very early.
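To make the contrast concrete, here is a small illustrative sketch of the two detection styles using upstream MLIR helpers (not code from this PR):

```cpp
// Illustrative sketch of the two detection styles discussed above.
#include "mlir/Dialect/Linalg/IR/Linalg.h"

using namespace mlir;

// Interface-based check: infers "this is a contraction" from the op's
// indexing maps and region body, so it also matches linalg.generic ops
// whose named-op identity was erased by early generalization.
static bool isContractionLike(linalg::LinalgOp op) {
  return linalg::isaContractionOpInterface(op);
}

// Named-op check: the op itself carries the information, so a plain isa<>
// suffices and nothing has to be re-inferred later in the pipeline.
static bool isNamedBatchMatmul(Operation *op) {
  return isa<linalg::BatchMatmulOp>(op);
}
```

The interface-based check still works after generalization, but every consumer has to re-infer the structure from indexing maps; the named-op check keeps that information in the op until a backend explicitly drops it, which is the information-preservation point being made here.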

@qedawkins (Contributor) commented:

Oooo, if the solution is linalg.contraction, then that sounds good to me.

@MaheshRavishankar (Contributor) left a comment:

Looks good to me. Thanks!

@IanWood1 merged commit f0d24cd into iree-org:main Jul 12, 2024
53 checks passed
qedawkins added a commit to nod-ai/sdxl-scripts that referenced this pull request Jul 12, 2024
LLITCHEV pushed a commit to LLITCHEV/iree that referenced this pull request Jul 30, 2024
Helps when the producer is a broadcast op. After adding the flag to the sdxl scripts, I saw a decent decrease in the number of dispatches.

Initially, I was trying to manually generalize+fuse broadcasts ([branch here](https://github.com/IanWood1/iree/tree/broadcast_matmul)), but Quinn saw good results with just this.

---------

Signed-off-by: Ian Wood <[email protected]>
Signed-off-by: Lubo Litchev <[email protected]>