Add check for sched_yield in librt #31

Merged 1 commit on Sep 22, 2014

worr (Contributor) commented Sep 20, 2014

On Solaris, sched_yield lives in librt rather than libc. This patch adds a
check that links in librt if necessary.
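
For context, here is a minimal sketch of the kind of caller that hits this link problem. It is not taken from the patch: `SpinUntilReady` and its parameters are hypothetical names used only for illustration. The one real dependency is POSIX `sched_yield()` from `<sched.h>`, which, per the description above, needs `-lrt` at link time on Solaris. In Autoconf such a check is commonly written with the `AC_SEARCH_LIBS` macro; whether the patch uses exactly that form is not shown on this page.

```cpp
#include <sched.h>  // POSIX declaration of sched_yield()

#include <atomic>

// Hypothetical helper (not from the patch): polls `ready` up to `max_spins`
// times and gives up the CPU between polls. Returns true once the flag is set.
// On platforms where sched_yield() lives outside libc, linking this function
// fails unless the extra library (librt on Solaris) is on the link line.
bool SpinUntilReady(const std::atomic<bool>& ready, int max_spins) {
  for (int i = 0; i < max_spins; ++i) {
    if (ready.load(std::memory_order_acquire)) return true;
    sched_yield();  // let another runnable thread make progress
  }
  // One last look after the final yield.
  return ready.load(std::memory_order_acquire);
}
```

The spin count is bounded so the sketch cannot hang when nothing ever sets the flag.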


cbsmith commented Sep 21, 2014

This looks perfect. Small/compact/goes from "not work" to "work".

xfxyjwf (Contributor) commented Sep 22, 2014

Thanks!

xfxyjwf added a commit that referenced this pull request Sep 22, 2014
Add check for sched_yield in librt
@xfxyjwf xfxyjwf merged commit a48c08a into protocolbuffers:master Sep 22, 2014
TeBoring pushed a commit to TeBoring/protobuf that referenced this pull request Jan 19, 2019
Moved DynASM to third_party to comply with Google policy.
copybara-service bot pushed a commit that referenced this pull request Nov 28, 2024
Loop body before:
```
.LBB0_2:
        add     w8, w12, #1
        cmp     w8, w11
        b.gt    .LBB0_6 // Predictable branch, ends the loop
.LBB0_3:
        add     w12, w8, w11
        add     w12, w12, w12, lsr #31
        asr     w12, w12, #1
        smaddl  x0, w12, w10, x9
        ldr     w13, [x0]
        cmp     w13, w1
        b.lo    .LBB0_2 // Unpredictable branch here! Will be hit 50/50 in prod
        b.ls    .LBB0_7 // Predictable branch - ends the loop
        sub     w11, w12, #1
        cmp     w8, w11
        b.le    .LBB0_3 // Predictable branch - continues the loop
```

Loop body after:
```
.LBB7_1:
        cmp     w9, w11
        b.hi    .LBB7_4 // Predictable branch - ends the loop
        add     w12, w9, w11
        lsr     w12, w12, #1
        umaddl  x0, w12, w8, x10
        sub     w14, w12, #1
        ldr     w13, [x0]
        cmp     w13, w1
        csel    w11, w14, w11, hs
        csinc   w9, w9, w12, hs
        b.ne    .LBB7_1 // Predictable branch - continues the loop
```

PiperOrigin-RevId: 700864625
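
The two listings above differ mainly in how the search direction is handled: the "before" loop branches on it (`b.lo`), while the "after" loop updates both bounds with conditional selects (`csel`/`csinc`). The source for that loop is not included on this page, so the following is only a sketch of the general technique in C++. `Entry`, `FindEntry`, and the half-open bounds are assumptions made for illustration, and whether a given compiler actually emits conditional selects for it is not guaranteed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record type; only `number` takes part in the comparison.
struct Entry {
  uint32_t number;
  int payload;
};

// Returns the entry whose `number` equals `key`, or nullptr if there is none.
// `entries` must be sorted by ascending `number`.
const Entry* FindEntry(const Entry* entries, size_t count, uint32_t key) {
  size_t low = 0;
  size_t high = count;  // search the half-open range [low, high)
  while (low < high) {
    // Unsigned midpoint: a plain shift, no sign correction needed.
    size_t mid = low + (high - low) / 2;
    uint32_t value = entries[mid].number;
    // Taken at most once per search, so it is easy to predict.
    if (value == key) return &entries[mid];
    // The roughly 50/50 "which half?" decision is expressed as two value
    // selects instead of an if/else, so the compiler is free to lower it
    // without a data-dependent branch.
    bool go_right = value < key;
    low = go_right ? mid + 1 : low;
    high = go_right ? high : mid;
  }
  return nullptr;
}
```

Using unsigned indices also mirrors the change from `asr` with a sign fix-up to a single `lsr` when halving in the listings.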
copybara-service bot pushed a commit that referenced this pull request Dec 4, 2024
On a Cortex-A55 this resulted in a 28.30% reduction in CPU and wall time for the binary search path.

PiperOrigin-RevId: 700864625
copybara-service bot pushed a commit that referenced this pull request Dec 5, 2024
On a Cortex-A55 this resulted in a 28.30% reduction in CPU and wall time for the binary search path.

PiperOrigin-RevId: 703213921
copybara-service bot pushed a commit that referenced this pull request Dec 5, 2024
On a Cortex-A55 this resulted in a 28.30% reduction in CPU and wall time for the binary search path.

PiperOrigin-RevId: 703214356