Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

propagate empty check to foreach enumeration over Span? #93708

Open
danmoseley opened this issue Oct 18, 2023 · 4 comments
Open

propagate empty check to foreach enumeration over Span? #93708

danmoseley opened this issue Oct 18, 2023 · 4 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@danmoseley
Copy link
Member

danmoseley commented Oct 18, 2023

should these ideally generate the same code?

static void A(Span<char> chars)
{
    if (!chars.IsEmpty)
    {
        foreach (char c in chars)  
        {
            Console.Write(c);
        }
    }
}

static void B(Span<char> chars)
{
    foreach (char c in chars)  
    {
        Console.Write(c);
    }
}

currently (well, whatever sharplab uses) the 1st one generates an extra test.

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgAYBYAKDIEYabi6kACYlVgQQAoBlAA7YAdgB4wAC2xQAfK0nTcAShoBvGq02sAlgDNWPAIQKouAHQBJXAFF8AjAE8V1La3UvXW3dBjZJBk3kdYXkpUyVNDU9Nd2jogGEIYVwIABsYMwB1KG0MGB4wJQBuKOiAX1LWCupqphZ2TgAhfiExEzkTZTVK71g/CQCwoO0QzojWSti4xOS0jOzc/MKSj01q2tXWIA===

category:cq
theme:redundant-branches
skill-level:expert
cost:medium
impact:low

@danmoseley danmoseley added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 18, 2023
@ghost
Copy link

ghost commented Oct 18, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

should these ideally generate the same code?

static void A(Span<char> chars)
{
    if (!chars.IsEmpty)
    {
        foreach (char c in chars)  
        {
            Console.Write(c);
        }
    }
}

static void B(Span<char> chars)
{
    foreach (char c in chars)  
    {
        Console.Write(c);
    }
}

currently (well, whatever sharplab uses) the 1st one generates an extra test.

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgAYBYAKDIEYabi6kACYlVgQQAoBlAA7YAdgB4wAC2xQAfK0nTcAShoBvGq02sAlgDNWPAIQKouAHQBJXAFF8AjAE8V1La3UvXW3dBjZJBk3kdYXkpUyVNDU9Nd2jogGEIYVwIABsYMwB1KG0MGB4wJQBuKOiAX1LWCupqphZ2TgAhfiExEzkTZTVK71g/CQCwoO0QzojWSti4xOS0jOzc/MKSj01q2tXWIA===

Author: danmoseley
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Oct 18, 2023
@EgorBo
Copy link
Member

EgorBo commented Oct 19, 2023

Smells like "implied relop" problem in redundant branch optimizer, cc @AndyAyersMS

@AndyAyersMS
Copy link
Member

Code is similar with latest

; Assembly listing for method X:A(System.Span`1[ushort]) (FullOpts)
; Emitting BLENDED_CODE for X64 with AVX - Windows
; FullOpts code
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; 0 inlinees with PGO data; 1 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
;  V00 arg0         [V00,T01] (  4,  7   )   byref  ->  rcx         ld-addr-op single-def
;* V01 loc0         [V01    ] (  0,  0   )  struct (16) zero-ref    ld-addr-op <System.Span`1[ushort]>
;  V02 loc1         [V02,T00] (  5, 16.50)     int  ->  rdi
;  V03 OutArgs      [V03    ] (  1,  1   )  struct (32) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;* V04 tmp1         [V04    ] (  0,  0   )   byref  ->  zero-ref    "field V00._reference (fldOffset=0x0)" P-INDEP
;* V05 tmp2         [V05    ] (  0,  0   )     int  ->  zero-ref    "field V00._length (fldOffset=0x8)" P-INDEP
;  V06 tmp3         [V06,T03] (  2,  4.50)   byref  ->  rsi         single-def "field V01._reference (fldOffset=0x0)" P-INDEP
;  V07 tmp4         [V07,T02] (  3,  5   )     int  ->  rbx         single-def "field V01._length (fldOffset=0x8)" P-INDEP
;* V08 tmp5         [V08    ] (  0,  0   )  struct (16) zero-ref    "Promoted implicit byref" <System.Span`1[ushort]>
;  V09 cse0         [V09,T04] (  3,  2.50)     int  ->  rbx         "CSE - aggressive"
;
; Lcl frame size = 32

G_M40143_IG01:  ;; offset=0x0000
       push     rdi
       push     rsi
       push     rbx
       sub      rsp, 32
                                                ;; size=7 bbWeight=1 PerfScore 3.25
G_M40143_IG02:  ;; offset=0x0007
       mov      ebx, dword ptr [rcx+0x08]
       test     ebx, ebx
       je       SHORT G_M40143_IG05
                                                ;; size=7 bbWeight=1 PerfScore 3.25
G_M40143_IG03:  ;; offset=0x000E
       mov      rsi, bword ptr [rcx]
       xor      edi, edi
       test     ebx, ebx
       jle      SHORT G_M40143_IG05
                                                ;; size=9 bbWeight=0.50 PerfScore 1.75
G_M40143_IG04:  ;; offset=0x0017
       mov      ecx, edi
       movzx    rcx, word  ptr [rsi+2*rcx]
       call     [System.Console:Write(ushort)]
       inc      edi
       cmp      edi, ebx
       jl       SHORT G_M40143_IG04
                                                ;; size=18 bbWeight=4 PerfScore 27.00
G_M40143_IG05:  ;; offset=0x0029
       add      rsp, 32
       pop      rbx
       pop      rsi
       pop      rdi
       ret

This is another case where by modifying the dominating relop we can remove a dominated relop,

RBO sees these relops are related but doesn't know how to do this sort of transformation; here the GE is reachable from the false branch, and we don't consider "weakening" the predicate from GE to GT to be worth the trouble.

Can infer GE from [true] dominating EQ

Dominator BB01 of BB02 has same VN operands but different relop
N006 (  6,  6) [000041] J---G--N---                         *  EQ        int    <l:$240, c:$241>
N004 (  4,  4) [000045] n---G------                         +--*  IND       int    <l:$1c0, c:$200>
N003 (  2,  2) [000044] -------N---                         |  \--*  ADD       byref  $180
N001 (  1,  1) [000039] -----------                         |     +--*  LCL_VAR   byref  V00 arg0         u:1 $80
N002 (  1,  1) [000043] -----------                         |     \--*  CNS_INT   long   8 $140
N005 (  1,  1) [000040] -----------                         \--*  CNS_INT   int    0 $40
 Redundant compare; current relop:
N003 (  3,  3) [000057] J---G--N---                         *  GE        int    <l:$242, c:$243>
N001 (  1,  1) [000058] -----------                         +--*  LCL_VAR   int    V02 loc1         u:2 $40
N002 (  1,  1) [000059] -----------                         \--*  LCL_VAR   int    V07 tmp4         u:2 <l:$1c0, c:$201>
inference failed -- will keep looking higher

Also there are side effects in the GE block when RBO runs that would be problematic. The side effected locals are dead in the common join block and the possible exception that the side effecting tree raises is anticipated by the dominating branch, so it turns out to be ok to evaluate these side effects eagerly.

This is similar to the situation in #81220 (comment) and it may be the same approach suggested there would work here.

@AndyAyersMS
Copy link
Member

@danmoseley wish this sort of thing were simpler but doesn't seem to be the case....

@AndyAyersMS AndyAyersMS removed the untriaged New issue has not been triaged by the area owner label Oct 20, 2023
@AndyAyersMS AndyAyersMS added this to the Future milestone Oct 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

No branches or pull requests

3 participants