-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements to loop hoisting #71504
Conversation
- optHoistLoopNest(): (1) no need to check if a pre-header has already been added; pre-headers are unique to a loop if they exist. (2) simplify the "append to list" code. - optHoistThisLoop(): (1) allow adding multiple pre-headers after every "IDom" addition, if there are any in the right order. (2) Simplify addition of remaining pre-headers. - optLoopEntry(): technically, you shouldn't look at `bbJumpDest` unless you know the branch kind is one that sets it. So change the condition. Also, fix the output of why something wasn't hoisted. The concept of "hoistable" implies "invariant" (but not the reverse). So print the least specific failure reason. These were backwards before. For instance, if we couldn't hoist a top node, but it was invariant, we would, oddly, print "not invariant" instead of "not hoistable". In the case where it isn't hoistable, and also isn't invariant, it makes more sense to print "not invariant", which also implies "not hoistable". Add dumping the OpName for the Pre/PostOrderVisit output, to help avoid cross-referencing the node number all the time. No asm diffs.
While the change (1) to optHoistThisLoop() to allow adding multiple pre-headers after every "IDom" addition doesn't lead to any spmi asm diffs, it does lead to a different "Considering hoisting in blocks that either dominate exit block BB09 or preheaders of nested loops, if any" order in the following test case:
|
@kunalspathak @dotnet/jit-contrib PTAL |
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue Details
Also, fix the output of why something wasn't hoisted. The concept of "hoistable" implies Add dumping the OpName for the Pre/PostOrderVisit output, to help avoid cross-referencing No asm diffs.
|
Do you mind sharing before vs. after Jitdump? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the cleanup. Add few suggestions.
while (cur != nullptr && pLoopDsc->lpContains(cur) && cur != pLoopDsc->lpEntry) | ||
{ | ||
JITDUMP(" -- " FMT_BB " (dominate exit block)\n", cur->bbNum); | ||
defExec.Push(cur); | ||
cur = cur->bbIDom; | ||
|
||
if (preHeadersList != nullptr) | ||
for (; preHeadersList != nullptr; preHeadersList = preHeadersList->next) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you find out an example where we didn't add a preheader in this loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you find out an example where we didn't add a preheader in this loop?
Do you mean where we added more than one preheader? Zero or one would happen the same as before. (And the answer is yes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok and the example is #71504 (comment)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct.
Here's the relevant fragment for the test case I included above:
Note how BB04 and BB12 are swapped. In the "diff" case, we loop and add both BB11/BB12 after BB09. |
Can you paste the Block table just before we do the hoisting? |
|
Thanks. It is interesting to see that despite the reordering, we don't see any spmi-diff. How many methods have such cases? |
Here are the 15 functions in SPMI collections that have 2 or more pre-headers to be added in the inner loop (that is, the reason why the inner loop should be a
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
pre-headers are unique to a loop if they exist. (2) simplify the "append to list" code.
if there are any in the right order. (2) Simplify addition of remaining pre-headers.
bbJumpDest
unless you know the branch kind is one that sets it. So change the condition.
Also, fix the output of why something wasn't hoisted. The concept of "hoistable" implies
"invariant" (but not the reverse). So print the least specific failure reason. These were
backwards before. For instance, if we couldn't hoist a top node, but it was invariant, we
would, oddly, print "not invariant" instead of "not hoistable". In the case where it isn't
hoistable, and also isn't invariant, it makes more sense to print "not invariant", which
also implies "not hoistable".
Add dumping the OpName for the Pre/PostOrderVisit output, to help avoid cross-referencing
the node number all the time.
No asm diffs.