-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Move more casts to the late cast expansion phase #98273
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsAnother step towards removing early cast expansion. Brings a couple of diffs, e.g.: static void Test(object o)
{
if (o is Program)
{
Consume((Program)o);
}
} ; Method Program:Test(System.Object) (FullOpts)
G_M3485_IG01:
push rbx
sub rsp, 32
mov rbx, rcx
G_M3485_IG02:
mov rdx, rbx
mov rcx, 0x7FF8E7F3E180 ; Program
call CORINFO_HELP_ISINSTANCEOFCLASS
test rax, rax
je SHORT G_M3485_IG04
G_M3485_IG03:
mov rcx, rbx
- mov rax, 0x7FF8E7F3E180 ; Program
- cmp qword ptr [rcx], rax
- cmovne rcx, rbx
call [Program:Consume(Program)]
G_M3485_IG04:
nop
G_M3485_IG05:
add rsp, 32
pop rbx
ret
-; Total bytes of code: 64
+; Total bytes of code: 47
|
# Conflicts: # src/coreclr/jit/helperexpansion.cpp
@jakobbotsch PTAL (since you reviewed my previous cast-related PRs) @dotnet/jit-contrib This PR:
I inspected a lot of size regressions, some of them are unfortunate because we run the expansion too late and we're not longer can benefit from branch opts, etc. But mostly, it's a size improvement anyway - a lot of casts are now easily hoisted and CSE'd while previously there were expanded into a complex workflow too early. I have some ideas how to fix some of the regressions via assertions, etc. A lot of missing contexts in SPMI diffs mainly due to two EE calls: |
src/coreclr/jit/helperexpansion.cpp
Outdated
@@ -1792,7 +1792,7 @@ bool Compiler::fgVNBasedIntrinsicExpansionForCall_ReadUtf8(BasicBlock** pBlock, | |||
// | |||
PhaseStatus Compiler::fgLateCastExpansion() | |||
{ | |||
if (!doesMethodHaveExpandableCasts() || opts.OptimizationDisabled()) | |||
if (!doesMethodHaveExpandableCasts() || opts.OptimizationDisabled() || lvaHaveManyLocals()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd generally be opposed to this check, which makes it quite easy to hit performance cliffs (change your code slightly -> more locals created for various reasons -> casts are no longer expanded globally in the function). What was the motivation for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakobbotsch repeats the existing check in importer, although, I should probably follow the pattern and only reject non-local objects (before tree split) -
runtime/src/coreclr/jit/importer.cpp
Lines 5459 to 5462 in f582c0b
else if ((op1->gtFlags & GTF_GLOB_EFFECT) && lvaHaveManyLocals()) | |
{ | |
// not worth creating an untracked local variable | |
shouldExpandInline = false; |
This was responsible for size regressions in some tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, worth experimenting how increased threshold of locals we can track affects diffs and TP, but it's irrelevant to this PR 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the importer this check corresponds much more directly to user locals. But this late the correspondence is much less direct since the JIT can have created a number of new locals in many more places (e.g. promotion, call args morphing, CSE, ...). I wonder if instead we can add some bailout for pathological cases where we've expanded tons of casts already, or maybe start reusing locals if we've already created a large amount.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lvaHaveManyLocals
just looks at lvaCount
, so even w/o recycling we can periodically keep track of how many useless locals there are and subtract that part off. Or just use the number of tracked locals instead.
Renumbering or reusing locals seems trickier to pull off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakobbotsch I've changed to logic in 935f6e2 a bit to match importer's behavior, although, I see your point - do you mind if I think about the best way to handle it separately? I'd love to get rid of the importer's expansion as soon as possible so then I can focus on polishing the late one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
Another step towards removing early cast expansion.
Brings a couple of diffs, e.g.: