
[Dispatch] extend CollapseDimensionsPass to more cases 2/2 #19326

Merged
4 commits merged into iree-org:main on Dec 5, 2024

Conversation

IanWood1
Contributor

This is part 2/2 of the changes needed to reland #18857 (open PR #19113).

There are two small subchanges in this PR:

  • Refactor isPossiblySoftmax into a lambda and tweak its condition to check for operand indexing maps that are projected permutations but not full permutations (a rough sketch follows below).
  • Extend getCollapsibleLoops to look at all operations when computing contiguous collapsible loops.

I also added a test case that needs both of these changes to pass (without being collapsed, it was causing regressions on the GPU backend).
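A minimal sketch of the condition described in the first bullet, assuming an MLIR linalg-style op that exposes getIndexingMapsArray(); the helper name is hypothetical and this is not the code from this PR:

```cpp
#include "mlir/Dialect/Linalg/IR/Linalg.h"
#include "mlir/IR/AffineMap.h"

// Hypothetical helper: returns true if some operand's indexing map drops a
// loop dimension, i.e. it is a projected permutation but not a full
// permutation. This is roughly the shape of the condition described above.
static bool hasProjectedNonPermutationOperand(mlir::linalg::LinalgOp op) {
  for (mlir::AffineMap map : op.getIndexingMapsArray()) {
    // isProjectedPermutation() admits maps that drop dimensions;
    // isPermutation() requires every loop dimension to appear exactly once.
    if (map.isProjectedPermutation() && !map.isPermutation())
      return true;
  }
  return false;
}
```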

@IanWood1
Contributor Author

IanWood1 commented Dec 2, 2024

I also cleaned up the logic a bit in getCollapsibleLoops, based on @Groverkss's PR #19336.

```cpp
for (auto map : fusionInterfaceOp.getIndexingMapsArray()) {
  ReassociationIndices range;
  AffineExpr preExpr;

  auto appendAndClearRange = [&]() {
```
Contributor

I was going to suggest a similar lambda, but I gave up because I couldn't come up with a good name, and then I totally forgot about it. :p
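For context, a hedged sketch of the general append-and-clear pattern such a lambda implements; the function, its parameter, and the size cutoff are illustrative assumptions, not the exact logic of getCollapsibleLoops in this PR:

```cpp
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/SmallVector.h"
#include "mlir/Dialect/Utils/ReshapeOpsUtils.h" // mlir::ReassociationIndices

// Hypothetical example: accumulate contiguous loop indices in `range` and
// flush the run into the list of collapsible groups whenever contiguity
// breaks. The real pass derives contiguity from the ops' indexing maps
// rather than from a flat index list.
static llvm::SmallVector<mlir::ReassociationIndices>
collectContiguousRuns(llvm::ArrayRef<int64_t> loopIndices) {
  llvm::SmallVector<mlir::ReassociationIndices> collapsibleGroups;
  mlir::ReassociationIndices range;

  auto appendAndClearRange = [&]() {
    // Only runs of two or more contiguous loops are worth collapsing.
    if (range.size() > 1)
      collapsibleGroups.push_back(range);
    range.clear();
  };

  for (int64_t idx : loopIndices) {
    if (!range.empty() && idx != range.back() + 1)
      appendAndClearRange(); // contiguity broken: flush the current run
    range.push_back(idx);
  }
  appendAndClearRange(); // flush any trailing run
  return collapsibleGroups;
}
```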

@IanWood1
Contributor Author

IanWood1 commented Dec 2, 2024

The only changes I'm seeing for punet are collapsed matmuls (for example, comparing dispatches before and after the changes). Does this require an update to https://github.com/iree-org/iree/blob/main/build_tools/pkgci/external_test_suite/attention_and_matmul_spec_punet.mlir (it doesn't have any specs for regular matmuls)?

cc @raikonenfnu since you recently made changes there for your attention work.

IanWood1 requested a review from ScottTodd as a code owner on December 4, 2024 at 05:55
Signed-off-by: Ian Wood <[email protected]>
IanWood1 force-pushed the extend_collapse_dimensions branch from dfa11be to 0b13107 on December 4, 2024 at 08:37
ScottTodd removed their request for review on December 4, 2024 at 17:01
@IanWood1
Contributor Author

IanWood1 commented Dec 4, 2024

Okay, the spec seems to get it back into the "passing" range for int8. The three specs I added cover the three instances of a batch mmt getting collapsed into an mmt, and tracing confirmed that these dispatches are no longer regressing. However, there also appears to be a batch-norm-like dispatch (https://gist.github.com/IanWood1/80d0e6668694ae38f022f260381fbf80) that got 1.5x slower.

I checked again and the dispatches were exactly the same, so it was probably noise in the benchmark.

IanWood1 merged commit a1664e3 into iree-org:main on Dec 5, 2024
38 checks passed
@IanWood1
Contributor Author

IanWood1 commented Dec 5, 2024

> I was going to suggest a similar lambda, but I gave up because I couldn't come up with a good name, and then I totally forgot about it. :p

Thank you @Groverkss for the lambda idea and the other cleanup. Unfortunately, I forgot to add you as a co-author :(

@raikonenfnu
Collaborator

@IanWood1 @nithinsubbiah FYI, this regresses the punet-fp16 benchmark on pkgCI to a consistent 52.1 ms (based on the original check, a manual re-run on my end, and the post-commit check), where it used to be consistently ~51.1 ms. Funnily enough, punet-fp8 stays the same, consistent at 52.1 ms.

I am not saying that it will definitely cause perf degradation on our canonical perf checker, but it is something to look out for just in case.

IanWood1 added a commit that referenced this pull request Dec 10, 2024
IanWood1 added a commit that referenced this pull request Dec 10, 2024
Regresses int8 sdxl with f8 attention. I suspect this is mostly a tuning
issue since this change only affects the shape of dispatches. I'll
reland this after investigating further (and most likely fixing tuning).

Reverts #19326
IanWood1 added a commit that referenced this pull request Dec 11, 2024
Re-lands #19326 after adding a few
more matmul specs that were missed. I left `@match_mmt_8192x640x2560`
commented out because the config was causing a slight regression in perf
but the overall time spent was very low.

This reverts commit 7177c29.

---------

Signed-off-by: Ian Wood <[email protected]>
IanWood1 added a commit that referenced this pull request Jan 8, 2025
Reland after fixing sdxl int8 regressions via
#19012.

Running CI revealed further performance regressions that have pending
patches: #19325 and
#19326.

This reverts commit 8d3faf8.

---------

Signed-off-by: Ian Wood <[email protected]>
3 participants