Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: update non-null assertion prop to destructure VNs #49238

Merged
merged 1 commit into from
Mar 6, 2021

Conversation

AndyAyersMS
Copy link
Member

In addition to checking for assertions based on the VN of an address, try and
destructure the VN to find the "base" address, and check its VNs as well.

This lets us get rid of some extra null checks, typically ones that are at
an offset from an existing non-null pointer.

Closes #49180.

In addition to checking for assertions based on the VN of an address, try and
destructure the VN to find the "base" address, and check its VNs as well.

This lets us get rid of some extra null checks, typically ones that are at
an offset from an existing non-null pointer.

Closes dotnet#49180.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 5, 2021
@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Mar 5, 2021

cc @dotnet/jit-contrib

About 900 methods (some dups, so total number of impacted methods is less) with diffs across the SPMI bundle. Mostly small improvements.

Bench            : 105 total methods with Code Size differences (102 improved, 3 regressed), 1 unchanged.
Libs -crossgen   : 174 total methods with Code Size differences (171 improved, 3 regressed), 3 unchanged.
Libs -crossgen2  : 138 total methods with Code Size differences (135 improved, 3 regressed), 1 unchanged.
Libs -pmi        : 528 total methods with Code Size differences (523 improved, 5 regressed), 2 unchanged.
Tests            :  33 total methods with Code Size differences ( 33 improved, 0 regressed), 0 unchanged.

For instance, in benchmarks:

Top method regressions (bytes):
           6 ( 2.14% of base) : 6766.dasm - System.Text.Json.Utf8JsonReader:EndArray():this
           6 ( 2.14% of base) : 6857.dasm - System.Text.Json.Utf8JsonReader:EndObject():this
           6 ( 4.38% of base) : 6896.dasm - System.Text.Json.Utf8JsonWriter:ValidateEnd(ubyte):this

Top method improvements (bytes):
         -70 (-1.50% of base) : 19790.dasm - System.Reflection.PortableExecutable.PEBuilder:WritePEHeader(System.Reflection.Metadata.BlobBuilder,System.Reflection.PortableExecutable.PEDirectoriesBuilder,System.Collections.Immutable.ImmutableArray`1[SerializedSection]):this
         -56 (-1.52% of base) : 17261.dasm - Microsoft.CodeAnalysis.CommonReferenceManager`2[__Canon,__Canon][System.__Canon,System.__Canon]:ResolveMetadataReferences(System.__Canon,System.Collections.Generic.Dictionary`2[__Canon,__Canon],byref,byref,byref,byref,byref,Microsoft.CodeAnalysis.DiagnosticBag):System.Collections.Immutable.ImmutableArray`1[ResolvedReference]:this
         -38 (-3.05% of base) : 2515.dasm - Utf8Json.Resolvers.Internal.DynamicObjectTypeBuilder:IsSideEffectFreeConstructorType(System.Reflection.ConstructorInfo):bool
         -32 (-1.19% of base) : 9052.dasm - System.Uri:GetCanonicalPath(byref,int):this
         -21 (-1.39% of base) : 5136.dasm - System.Memory.ReadOnlySequence:GlobalSetup():this
         -19 (-2.59% of base) : 2443.dasm - System.Reflection.Emit.ILGenerator:Emit(System.Reflection.Emit.OpCode,System.Reflection.ConstructorInfo):this
         -18 (-1.75% of base) : 2146.dasm - System.Reflection.Emit.DynamicILGenerator:Emit(System.Reflection.Emit.OpCode,System.Reflection.MethodInfo):this
         -18 (-3.01% of base) : 2461.dasm - System.Reflection.Emit.ILGenerator:Emit(System.Reflection.Emit.OpCode,System.Reflection.MethodInfo):this
         -18 (-1.27% of base) : 5991.dasm - <WriteAsyncInternal>d__183`1[AsyncReadWriteAdapter][System.Net.Security.AsyncReadWriteAdapter]:MoveNext():this
         -17 (-7.36% of base) : 6733.dasm - Newtonsoft.Json.JsonReader:Pop():int:this
         -16 (-2.30% of base) : 10520.dasm - System.Buffers.ReadOnlySequence`1[Byte][System.Byte]:Slice(long,long):System.Buffers.ReadOnlySequence`1[Byte]:this
         -16 (-8.70% of base) : 9544.dasm - Sigil.Impl.ExtensionMethods:IsPrefix(System.Reflection.Emit.OpCode):bool
         -16 (-3.89% of base) : 18816.dasm - Microsoft.CodeAnalysis.CSharp.CodeGen.CodeGenerator:EmitConversionExpression(Microsoft.CodeAnalysis.CSharp.BoundConversion,bool):this
         -15 (-0.89% of base) : 9989.dasm - Microsoft.Extensions.Caching.Memory.MemoryCache:SetEntry(Microsoft.Extensions.Caching.Memory.CacheEntry):this
         -15 (-0.52% of base) : 17036.dasm - Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.LanguageParser:ParseNamespaceBody(byref,byref,byref,ushort):this
         -15 (-9.04% of base) : 12390.dasm - System.Text.Json.Utf8JsonReader:GetBoolean():bool:this
         -14 (-1.23% of base) : 1193.dasm - System.Reflection.Emit.ILGenerator:Emit(System.Reflection.Emit.OpCode,int):this
         -14 (-0.32% of base) : 9539.dasm - Sigil.Emit`1[__Canon][System.__Canon]:ValidateTryCatchFinallyBranches():this
         -12 (-2.74% of base) : 19610.dasm - Microsoft.Cci.FullMetadataWriter:PopulatePropertyMapTableRows():this
         -12 (-1.72% of base) : 2462.dasm - System.Reflection.Emit.ILGenerator:EmitCall(System.Reflection.Emit.OpCode,System.Reflection.MethodInfo,System.Type[]):this

Sample diff. In this method there were several sets of these.

;; SourceLocalSymbol:GetDeclaratorSyntax():Microsoft.CodeAnalysis.SyntaxNode:this

 G_M46873_IG04:        ; gcrefRegs=00000040 {rsi}, byrefRegs=00000000 {}, byref
        ; gcrRegs -[rax] +[rsi]
        ; byrRegs -[rsi]
        lea      rax, bword ptr [rsi+40]
        ; byrRegs +[rax]
-       mov      rcx, gword ptr [rax]
-       ; gcrRegs +[rcx]
-       mov      edx, dword ptr [rax+8]
-       mov      edx, dword ptr [rax+16]
-       mov      edx, dword ptr [rax+20]
-       mov      rax, rcx
+       mov      rax, gword ptr [rax]
        ; gcrRegs +[rax]
        ; byrRegs -[rax]
        jmp      G_M46873_IG24

A handful of regressions where the extra morphing this engenders causes a computation to be narrowed so that it can no longer share the value from a wider computation done earlier.

;; BEFORE

G_M14663_IG08:        ; gcrefRegs=00000000 {}, byrefRegs=00000042 {rcx rsi}, byref, isz
       ; byrRegs -[rax rdx]
       shr      qword ptr [rcx+8], 1
       test     byte  ptr [rcx+8], 1

;; AFTER
G_M14663_IG08:        ; gcrefRegs=00000000 {}, byrefRegs=00000042 {rcx rsi}, byref, isz
       shr      qword ptr [rcx+8], 1
       mov      rcx, qword ptr [rcx+8]
       ; byrRegs -[rcx]
       test     cl, 1

@AndyAyersMS AndyAyersMS self-assigned this Mar 5, 2021
@AndyAyersMS AndyAyersMS added this to the 6.0.0 milestone Mar 5, 2021

// Check each assertion to find if we have a vn == or != null assertion.
while (vnStore->GetVNFunc(vnBase, &funcAttr) && (funcAttr.m_func == (VNFunc)GT_ADD))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally perhaps this while loop would be unnecessary, and VN creation would fold chains of constant adds into a single add.

Also note there is a similar loop on the gen side in optCreateAssertion which given a dereference through a byref, tries to walk the VN back to the inspiring gc ref.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we should be able to get rid of the IR destructuring (see first part of optAssertionProp_Ind for example) for global prop as the VN destructuring should be equivalent, but when I did that I lost some of the diffs, and it creates more divergence between local prop and global prop.

@GrabYourPitchforks
Copy link
Member

I pulled in this private against my local branch and confirmed that the codegen appears optimized now. Thanks! :)

@AndyAyersMS
Copy link
Member Author

Thanks!

Sure, glad this was one we could fix without too much trouble. Diff from your example was over in the issue, I'll copy it here...

;  V01 arg1         [V01,T01] (  4,  3.50)     int  ->  rdx
 ;# V02 OutArgs      [V02    ] (  1,  1   )  lclBlk ( 0) [rsp+0x00]   "OutgoingArgSpace"
 ;  V03 tmp1         [V03,T03] (  3,  2   )   ubyte  ->  rcx         "Inline return value spill temp"
-;  V04 tmp2         [V04,T02] (  3,  4   )   byref  ->  rcx         "Inlining Arg"
+;  V04 tmp2         [V04,T02] (  2,  3   )   byref  ->  rcx         "Inlining Arg"
 ;
 ; Lcl frame size = 0

@@ -22,11 +22,10 @@ G_M44576_IG02:
        jae      SHORT G_M44576_IG04
                                                ;; bbWeight=1    PerfScore 3.50
 G_M44576_IG03:
-       cmp      dword ptr [rcx], ecx
        mov      eax, edx
        movzx    rcx, byte  ptr [rcx+rax]
        jmp      SHORT G_M44576_IG05
-                                               ;; bbWeight=0.50 PerfScore 3.12
+                                               ;; bbWeight=0.50 PerfScore 2.12
 G_M44576_IG04:
        xor      ecx, ecx
                                                ;; bbWeight=0.50 PerfScore 0.12
@@ -34,7 +33,7 @@ G_M44576_IG05:
        jmp      Console:WriteLine(int)
                                                ;; bbWeight=1    PerfScore 2.00

-; Total bytes of code 29, prolog size 0, PerfScore 11.65, instruction count 10, allocated bytes for code 29 (MethodHash=7cd251df) for method C:M(int):this
+; Total bytes of code 27, prolog size 0, PerfScore 10.45, instruction count 9, allocated bytes for code 27 (MethodHash=7cd251df) for method C:M(int):this

Copy link
Contributor

@briansull briansull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good

@AndyAyersMS AndyAyersMS merged commit 1d36a90 into dotnet:main Mar 6, 2021
@AndyAyersMS AndyAyersMS deleted the NullcheckDestructureVN branch March 6, 2021 19:15
@ghost ghost locked as resolved and limited conversation to collaborators Apr 5, 2021
@AndyAyersMS
Copy link
Member Author

Improvement: DrewScoggins/performance-2#4223

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

JIT doesn't elide some "this != null" checks when accessing fixed-size buffers in structs
3 participants