[Dispatch] extend CollapseDimensionsPass to more cases 2/2 #19326
Conversation
I also cleaned up the logic a bit in:

```cpp
for (auto map : fusionInterfaceOp.getIndexingMapsArray()) {
  ReassociationIndices range;
  AffineExpr preExpr;

  auto appendAndClearRange = [&]() {
```
I was going to suggest a similar lambda, but I gave up because I couldn't come up with a good name, and then I totally forgot about it. :p
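(For readers outside the review thread: a minimal sketch of the kind of helper being discussed, not the actual pass code. It assumes the surrounding loop from the snippet above; `collapsibleRuns` and the contiguity check are illustrative, the `preExpr` tracking from the snippet is omitted, and the real logic in `getCollapsibleLoops` also has to combine the runs across all operands, which this glosses over.)

```cpp
// Sketch only: gather runs of loop dimensions that appear contiguously in
// each indexing map, flushing the current run whenever contiguity breaks.
SmallVector<ReassociationIndices> collapsibleRuns; // illustrative name
ReassociationIndices range;

// Push the accumulated run (only runs of 2+ dims are worth collapsing)
// and start a fresh one.
auto appendAndClearRange = [&]() {
  if (range.size() > 1)
    collapsibleRuns.push_back(range);
  range.clear();
};

for (AffineMap map : fusionInterfaceOp.getIndexingMapsArray()) {
  appendAndClearRange();
  for (AffineExpr expr : map.getResults()) {
    auto dimExpr = dyn_cast<AffineDimExpr>(expr);
    // Non-dim results (constants, compound expressions) break the run.
    if (!dimExpr) {
      appendAndClearRange();
      continue;
    }
    // A dimension extends the run only if it directly follows the previous one.
    if (!range.empty() && dimExpr.getPosition() != range.back() + 1)
      appendAndClearRange();
    range.push_back(dimExpr.getPosition());
  }
  appendAndClearRange();
}
```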
The only changes I'm seeing for punet are collapsed matmuls. For example, compare the dispatches before the changes and after the changes. Does this require an update to https://github.com/iree-org/iree/blob/main/build_tools/pkgci/external_test_suite/attention_and_matmul_spec_punet.mlir (it doesn't have any specs for regular matmuls)? cc @raikonenfnu since you recently made changes there for your attention work.
Okay, the spec seems to get it back into the "passing" range for int8. The three specs I added cover the three instances of a batch mmt getting collapsed into an mmt, and tracing confirmed that these dispatches are no longer regressing. I checked again and the dispatches were exactly the same, so the remaining difference was probably noise in the benchmark.
Thank you @Groverkss for the lambda idea + other cleanup. But I forgot to add you as a co-author :(
@IanWood1 @nithinsubbiah FYI, this regresses the punet-fp16 benchmark on pkgCI to a consistent 52.1 ms (based on the original check, a manual re-run on my end, and the post-commit check), where it used to be consistently ~51.1 ms. Funny enough, punet-fp8 stays the same/consistent at 52.1 ms. I am not saying that it will definitely cause a perf degradation on our canonical perf checker, but it is something to look out for just in case.
Regresses int8 sdxl with f8 attention. I suspect this is mostly a tuning issue since this change only affects the shape of dispatches. I'll reland this after investigating further (and most likely fixing tuning). Reverts #19326
Re-lands #19326 after adding a few more matmul specs that were missed. I left `@match_mmt_8192x640x2560` commented out because the config was causing a slight regression in perf but the overall time spent was very low. This reverts commit 7177c29. --------- Signed-off-by: Ian Wood <[email protected]>
Reland after fixing sdxl int8 regressions via #19012. Running CI revealed further performance regressions that have pending patches: #19325 and #19326. This reverts commit 8d3faf8. --------- Signed-off-by: Ian Wood <[email protected]>
This is the second of the two changes needed to reland #18857 (open PR #19113).

There are two small subchanges in this PR:

- `isPossiblySoftmax`: tweak the condition to check for operand indexing maps that are projected permutations but not permutations (see the sketch below).
- `getCollapsibleLoops`: look at all operations to get contiguous loops.

Also, I added a test case that needed both these changes to pass (without being collapsed, it was causing regressions on the GPU backend).
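As a rough illustration of the `isPossiblySoftmax` bullet above: a map that is a projected permutation but not a full permutation drops at least one loop dimension, i.e. the operand does not index some of the loops and is effectively broadcast along them. Below is a minimal sketch of that check, assuming a `linalg.generic` op and using only standard MLIR queries; the surrounding heuristic in the pass is more involved than this, so treat it as an illustration rather than the pass's logic.

```cpp
#include "mlir/Dialect/Linalg/IR/Linalg.h"
#include "mlir/IR/AffineMap.h"

using namespace mlir;

// Sketch only: detect input operands that are broadcast along some loops.
// A projected permutation that is not a permutation omits at least one
// loop dimension from the operand's indexing map.
static bool hasBroadcastLikeInput(linalg::GenericOp genericOp) {
  for (OpOperand *input : genericOp.getDpsInputOperands()) {
    AffineMap map = genericOp.getMatchingIndexingMap(input);
    if (map.isProjectedPermutation() && !map.isPermutation())
      return true;
  }
  return false;
}
```

In a softmax-style producer/consumer chain, the reduced value fed back into the elementwise op has exactly this kind of map, which is presumably why the condition keys on it.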