JIT: update non-null assertion prop to destructure VNs #49238
In addition to checking for assertions based on the VN of an address, try and destructure the VN to find the "base" address, and check its VNs as well. This lets us get rid of some extra null checks, typically ones that are at an offset from an existing non-null pointer. Closes dotnet#49180.
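The base-address walk described above can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyStore`, `ToyNode`, `AddConst`, `IsNonNull`, and the `maxOffset` cap are all hypothetical names and choices, not the JIT's `ValueNumStore` API or the code in this PR. The sketch peels constant adds off a value number and checks each VN along the way for a non-null assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical toy value-number store; NOT the JIT's ValueNumStore.
using ToyVN = std::size_t;

enum class ToyKind
{
    Leaf,    // an opaque value, e.g. an incoming pointer
    AddConst // base VN plus a constant offset
};

struct ToyNode
{
    ToyKind      kind;
    ToyVN        base;   // operand VN (meaningful for AddConst only)
    std::int64_t offset; // constant addend (meaningful for AddConst only)
};

struct ToyStore
{
    std::vector<ToyNode>      nodes;
    std::unordered_set<ToyVN> knownNonNull; // VNs that carry a "!= null" assertion

    ToyVN Leaf()
    {
        nodes.push_back({ToyKind::Leaf, 0, 0});
        return nodes.size() - 1;
    }

    ToyVN AddConst(ToyVN base, std::int64_t offset)
    {
        assert(base < nodes.size());
        nodes.push_back({ToyKind::AddConst, base, offset});
        return nodes.size() - 1;
    }

    // Returns true if `vn` itself, or a base reached by peeling constant adds,
    // has a non-null assertion. The walk gives up when the accumulated offset
    // is negative or exceeds `maxOffset`; that cap is an assumption of this sketch.
    bool IsNonNull(ToyVN vn, std::int64_t maxOffset = 4096) const
    {
        std::int64_t total = 0;
        while (true)
        {
            if (knownNonNull.count(vn) != 0)
            {
                return true;
            }
            const ToyNode& node = nodes[vn];
            if (node.kind != ToyKind::AddConst)
            {
                return false; // reached a leaf with no assertion
            }
            total += node.offset;
            if ((total < 0) || (total > maxOffset))
            {
                return false; // too far from the base to reason about
            }
            vn = node.base;
        }
    }
};
```

In this model, an address built as `AddConst(AddConst(p, 8), 16)` is reported non-null once `p` is in `knownNonNull`, which mirrors the "offset from an existing non-null pointer" case.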
cc @dotnet/jit-contrib
About 900 methods with diffs across the SPMI bundle (some are duplicates, so the number of distinct impacted methods is lower). Mostly small improvements.
For instance, in benchmarks:
Sample diff. In this method there were several sets of these.
;; SourceLocalSymbol:GetDeclaratorSyntax():Microsoft.CodeAnalysis.SyntaxNode:this
G_M46873_IG04: ; gcrefRegs=00000040 {rsi}, byrefRegs=00000000 {}, byref
; gcrRegs -[rax] +[rsi]
; byrRegs -[rsi]
lea rax, bword ptr [rsi+40]
; byrRegs +[rax]
- mov rcx, gword ptr [rax]
- ; gcrRegs +[rcx]
- mov edx, dword ptr [rax+8]
- mov edx, dword ptr [rax+16]
- mov edx, dword ptr [rax+20]
- mov rax, rcx
+ mov rax, gword ptr [rax]
; gcrRegs +[rax]
; byrRegs -[rax]
jmp G_M46873_IG24
There are a handful of regressions, where the extra morphing this enables narrows a computation so that it can no longer share the value from a wider computation done earlier.
;; BEFORE
G_M14663_IG08: ; gcrefRegs=00000000 {}, byrefRegs=00000042 {rcx rsi}, byref, isz
; byrRegs -[rax rdx]
shr qword ptr [rcx+8], 1
test byte ptr [rcx+8], 1
;; AFTER
G_M14663_IG08: ; gcrefRegs=00000000 {}, byrefRegs=00000042 {rcx rsi}, byref, isz
shr qword ptr [rcx+8], 1
mov rcx, qword ptr [rcx+8]
; byrRegs -[rcx]
test cl, 1
// Check each assertion to find if we have a vn == or != null assertion.
while (vnStore->GetVNFunc(vnBase, &funcAttr) && (funcAttr.m_func == (VNFunc)GT_ADD))
Ideally perhaps this while loop would be unnecessary, and VN creation would fold chains of constant adds into a single add.
Also note there is a similar loop on the gen side in optCreateAssertion which, given a dereference through a byref, tries to walk the VN back to the inspiring gc ref.
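As a rough illustration of the "fold at creation" idea in this comment, here is a minimal sketch in which adding a constant to an existing constant add produces a single add over the original base. Everything here is hypothetical (`FoldingStore`, `FoldNode`, `AddConst`); it is not how `ValueNumStore` is implemented, and it ignores overflow of the combined offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical toy store that folds chains of constant adds when a VN is created.
using FoldVN = std::size_t;

struct FoldNode
{
    bool         isAdd;  // true for "base + offset", false for a leaf
    FoldVN       base;   // operand VN (adds only)
    std::int64_t offset; // constant addend (adds only)
};

struct FoldingStore
{
    std::vector<FoldNode>                             nodes;
    std::map<std::pair<FoldVN, std::int64_t>, FoldVN> addCache; // (base, offset) -> VN

    FoldVN Leaf()
    {
        nodes.push_back({false, 0, 0});
        return nodes.size() - 1;
    }

    FoldVN AddConst(FoldVN base, std::int64_t offset)
    {
        assert(base < nodes.size());

        // Every add is folded when created, so the base of an existing add is
        // never itself an add; a single folding step is enough here.
        if (nodes[base].isAdd)
        {
            offset += nodes[base].offset; // overflow is not handled in this sketch
            base = nodes[base].base;
        }

        if (offset == 0)
        {
            return base; // x + 0 is just x
        }

        // Reuse the VN if this (base, offset) pair was seen before.
        const auto key = std::make_pair(base, offset);
        const auto it  = addCache.find(key);
        if (it != addCache.end())
        {
            return it->second;
        }

        nodes.push_back({true, base, offset});
        const FoldVN vn = nodes.size() - 1;
        addCache.emplace(key, vn);
        return vn;
    }
};
```

With folding like this, `AddConst(AddConst(p, 8), 16)` and `AddConst(p, 24)` yield the same VN, so a consumer would only need to look one level down to find the base rather than loop.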
Also we should be able to get rid of the IR destructuring (see first part of optAssertionProp_Ind for example) for global prop as the VN destructuring should be equivalent, but when I did that I lost some of the diffs, and it creates more divergence between local prop and global prop.
I pulled in this private against my local branch and confirmed that the codegen appears optimized now. Thanks! :)
Sure, glad this was one we could fix without too much trouble. Diff from your example was over in the issue, I'll copy it here...
; V01 arg1 [V01,T01] ( 4, 3.50) int -> rdx
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [rsp+0x00] "OutgoingArgSpace"
; V03 tmp1 [V03,T03] ( 3, 2 ) ubyte -> rcx "Inline return value spill temp"
-; V04 tmp2 [V04,T02] ( 3, 4 ) byref -> rcx "Inlining Arg"
+; V04 tmp2 [V04,T02] ( 2, 3 ) byref -> rcx "Inlining Arg"
;
; Lcl frame size = 0
@@ -22,11 +22,10 @@ G_M44576_IG02:
jae SHORT G_M44576_IG04
;; bbWeight=1 PerfScore 3.50
G_M44576_IG03:
- cmp dword ptr [rcx], ecx
mov eax, edx
movzx rcx, byte ptr [rcx+rax]
jmp SHORT G_M44576_IG05
- ;; bbWeight=0.50 PerfScore 3.12
+ ;; bbWeight=0.50 PerfScore 2.12
G_M44576_IG04:
xor ecx, ecx
;; bbWeight=0.50 PerfScore 0.12
@@ -34,7 +33,7 @@ G_M44576_IG05:
jmp Console:WriteLine(int)
;; bbWeight=1 PerfScore 2.00
-; Total bytes of code 29, prolog size 0, PerfScore 11.65, instruction count 10, allocated bytes for code 29 (MethodHash=7cd251df) for method C:M(int):this
+; Total bytes of code 27, prolog size 0, PerfScore 10.45, instruction count 9, allocated bytes for code 27 (MethodHash=7cd251df) for method C:M(int):this
Looks Good
Improvement: DrewScoggins/performance-2#4223