Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Move more casts to the late cast expansion phase #98273

Merged
merged 13 commits into from
Feb 12, 2024

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Feb 11, 2024

Another step towards removing early cast expansion.

Brings a couple of diffs, e.g.:

static void Test(object o)
{
    if (o is Program)
    {
        Consume((Program)o);
    }
}
; Method Program:Test(System.Object) (FullOpts)
G_M3485_IG01:
       push     rbx
       sub      rsp, 32
       mov      rbx, rcx
G_M3485_IG02:
       mov      rdx, rbx
       mov      rcx, 0x7FF8E7F3E180      ; Program
       call     CORINFO_HELP_ISINSTANCEOFCLASS
       test     rax, rax
       je       SHORT G_M3485_IG04
G_M3485_IG03:
       mov      rcx, rbx
-      mov      rax, 0x7FF8E7F3E180      ; Program
-      cmp      qword ptr [rcx], rax
-      cmovne   rcx, rbx
       call     [Program:Consume(Program)]
G_M3485_IG04:
       nop      
G_M3485_IG05:
       add      rsp, 32
       pop      rbx
       ret      
-; Total bytes of code: 64
+; Total bytes of code: 47

@ghost ghost assigned EgorBo Feb 11, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 11, 2024
@ghost
Copy link

ghost commented Feb 11, 2024

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Another step towards removing early cast expansion.

Brings a couple of diffs, e.g.:

static void Test(object o)
{
    if (o is Program)
    {
        Consume((Program)o);
    }
}
; Method Program:Test(System.Object) (FullOpts)
G_M3485_IG01:
       push     rbx
       sub      rsp, 32
       mov      rbx, rcx
G_M3485_IG02:
       mov      rdx, rbx
       mov      rcx, 0x7FF8E7F3E180      ; Program
       call     CORINFO_HELP_ISINSTANCEOFCLASS
       test     rax, rax
       je       SHORT G_M3485_IG04
G_M3485_IG03:
       mov      rcx, rbx
-      mov      rax, 0x7FF8E7F3E180      ; Program
-      cmp      qword ptr [rcx], rax
-      cmovne   rcx, rbx
       call     [Program:Consume(Program)]
G_M3485_IG04:
       nop      
G_M3485_IG05:
       add      rsp, 32
       pop      rbx
       ret      
-; Total bytes of code: 64
+; Total bytes of code: 47
Author: EgorBo
Assignees: EgorBo
Labels:

area-CodeGen-coreclr

Milestone: -

@EgorBo EgorBo marked this pull request as ready for review February 11, 2024 16:27
@EgorBo
Copy link
Member Author

EgorBo commented Feb 12, 2024

@MihuBot

@EgorBo
Copy link
Member Author

EgorBo commented Feb 12, 2024

@jakobbotsch PTAL (since you reviewed my previous cast-related PRs) @dotnet/jit-contrib

This PR:

  1. Moves all casts from early expansion to the late except a few cases I'll try to handle this week (runtime lookup is not supported, and isinst had too many size regressions)

  2. I forgot to enable speculative expansion for (T[])obj where T is not exact -- importer did expand for it too - this is responsible for some size regressions, but it's expected.

  3. Late expansion is now optimized for cases when castclass definitely fails (marks such blocks as always-throw) - importer did it too

  4. Manual CSE for class handle if it's used for both fast type check and as an argument for the helper call - importer did it too

  5. and 4) were done to reflect what importer did to mitigate size regressions.

I inspected a lot of size regressions, some of them are unfortunate because we run the expansion too late and we're not longer can benefit from branch opts, etc. But mostly, it's a size improvement anyway - a lot of casts are now easily hoisted and CSE'd while previously there were expanded into a complex workflow too early. I have some ideas how to fix some of the regressions via assertions, etc.

A lot of missing contexts in SPMI diffs mainly due to two EE calls: EmbedClassHandle (it just returns its argument in CoreCLR, so we can make SPMI less fragile for it if we want) and GetClassAttribs. But real jit-diff doesn't look bad (regressions due to 2nd point ^)

@@ -1792,7 +1792,7 @@ bool Compiler::fgVNBasedIntrinsicExpansionForCall_ReadUtf8(BasicBlock** pBlock,
//
PhaseStatus Compiler::fgLateCastExpansion()
{
if (!doesMethodHaveExpandableCasts() || opts.OptimizationDisabled())
if (!doesMethodHaveExpandableCasts() || opts.OptimizationDisabled() || lvaHaveManyLocals())
Copy link
Member

@jakobbotsch jakobbotsch Feb 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd generally be opposed to this check, which makes it quite easy to hit performance cliffs (change your code slightly -> more locals created for various reasons -> casts are no longer expanded globally in the function). What was the motivation for it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakobbotsch repeats the existing check in importer, although, I should probably follow the pattern and only reject non-local objects (before tree split) -

else if ((op1->gtFlags & GTF_GLOB_EFFECT) && lvaHaveManyLocals())
{
// not worth creating an untracked local variable
shouldExpandInline = false;

This was responsible for size regressions in some tests

Copy link
Member Author

@EgorBo EgorBo Feb 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, worth experimenting how increased threshold of locals we can track affects diffs and TP, but it's irrelevant to this PR 🙂

Copy link
Member

@jakobbotsch jakobbotsch Feb 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the importer this check corresponds much more directly to user locals. But this late the correspondence is much less direct since the JIT can have created a number of new locals in many more places (e.g. promotion, call args morphing, CSE, ...). I wonder if instead we can add some bailout for pathological cases where we've expanded tons of casts already, or maybe start reusing locals if we've already created a large amount.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lvaHaveManyLocals just looks at lvaCount, so even w/o recycling we can periodically keep track of how many useless locals there are and subtract that part off. Or just use the number of tracked locals instead.

Renumbering or reusing locals seems trickier to pull off.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakobbotsch I've changed to logic in 935f6e2 a bit to match importer's behavior, although, I see your point - do you mind if I think about the best way to handle it separately? I'd love to get rid of the importer's expansion as soon as possible so then I can focus on polishing the late one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

@EgorBo EgorBo merged commit 6f6bfc3 into dotnet:main Feb 12, 2024
129 checks passed
@EgorBo EgorBo deleted the more-casts branch February 12, 2024 22:53
@cincuranet
Copy link
Contributor

@github-actions github-actions bot locked and limited conversation to collaborators Mar 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants