Optimize Math.Pow(x, c) where c is 2, 1, -1 or 0 #31978

EgorBo · 2020-02-08T14:09:59Z

Resurrects dotnet/coreclr#26552
Optimizes:

Math.Pow(x,  2) --> x*x
Math.Pow(x,  1) --> x

(same for MathF and float)

This time it's done in importer.cpp and handles any kind of first argument (a temp variable is introduced when needed, e.g. for GT_CALL).

Example:

static double Pow2(double x)  => Math.Pow(x, 2);
static double Pow1(double x)  => Math.Pow(x, 1);

Current codegen:

; Method Tests:Pow2(double):double
       vzeroupper
       vmovsd   xmm1, qword ptr [reloc @RWD00]
       jmp      System.Math:Pow(double,double):double


; Method Tests:Pow1(double):double
       vzeroupper
       vmovsd   xmm1, qword ptr [reloc @RWD00]
       jmp      System.Math:Pow(double,double):double

New codegen:

; Method Tests:Pow2(double):double
       vzeroupper
       vmulsd   xmm0, xmm0, xmm0
       ret


; Method Tests:Pow1(double):double
       vzeroupper
       ret  ; just return xmm0
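
A minimal source-level sketch of the rewrite described above, for readers who prefer C# to JIT terminology. `PowRewriteSketch`, `PowConst` and `NextValue` are hypothetical names used only for illustration; the PR performs the transformation on JIT trees, not in C#. The point it illustrates is why a temp is needed: the first argument must be evaluated exactly once, even when it is a call with side effects.

```csharp
using System;

public static class PowRewriteSketch
{
    // Counts how many times the side-effecting argument is evaluated.
    public static int Calls;

    public static double NextValue()
    {
        Calls++;
        return 3.0;
    }

    // Hypothetical mirror of the rewrite: the parameter x plays the role of
    // the temp, so the argument expression is evaluated once and then reused.
    public static double PowConst(double x, double c)
    {
        if (c == 2.0) return x * x; // Pow(x, 2) --> x*x
        if (c == 1.0) return x;     // Pow(x, 1) --> x
        return Math.Pow(x, c);      // every other exponent is left alone
    }
}
```

With this sketch, `PowRewriteSketch.PowConst(PowRewriteSketch.NextValue(), 2.0)` calls `NextValue` once, whereas a naive textual expansion to `NextValue() * NextValue()` would call it twice.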

This pattern seems to be common in gamedev, e.g. in Xenko (a game engine): https://github.com/xenko3d/xenko/search?q=Math.Pow&unscoped_q=Math.Pow
The dotnet/performance benchmarks also use it: https://github.com/dotnet/performance/blob/8aed638c9ee65c034fe0cca4ea2bdc3a68d2a6b5/src/benchmarks/micro/runtime/Burgers/Burgers.cs

Jitdiff for the BCL:

Total bytes of delta: -40 (-0.00% of base)
    diff is an improvement.

Top file improvements (bytes):
         -40 : System.Private.CoreLib.dasm (-0.00% of base)

1 total files with Code Size differences (1 improved, 0 regressed), 267 unchanged.

Top method improvements (bytes):
         -30 (-5.08% of base) : System.Private.CoreLib.dasm - CalendricalCalculationsHelper:EquationOfTime(double):double
         -10 (-3.12% of base) : System.Private.CoreLib.dasm - CalendricalCalculationsHelper:DefaultEphemerisCorrection(int):double

Top method improvements (percentages):
         -30 (-5.08% of base) : System.Private.CoreLib.dasm - CalendricalCalculationsHelper:EquationOfTime(double):double
         -10 (-3.12% of base) : System.Private.CoreLib.dasm - CalendricalCalculationsHelper:DefaultEphemerisCorrection(int):double

2 total methods with Code Size differences (2 improved, 0 regressed), 196451 unchanged.
Completed analysis in 15.23s

The optimization can be extended to handle more cases once some sort of fast-math mode appears in .NET Core.

benaadams · 2020-02-08T15:04:56Z

If you can do it at import with consts, would it be worth going higher, e.g. up to 5? In your linked example, 3 and 4 crop up.

For 5 I was going to suggest smootherstep

However, you'd probably write it like

x * x * x * (x * (x * 6 - 15) + 10)

EgorBo · 2020-02-08T15:10:26Z

@benaadams If I understand you correctly I can't optimize other constants in "safe math" mode, e.g.
Math.Pow(x, 4) can be optimized to

        vmulsd  xmm0, xmm0, xmm0
        vmulsd  xmm0, xmm0, xmm0

(a single xmm0 register!)

but it might return a slightly different value (and violate the IEEE 754 spec); see https://godbolt.org/z/R78Ev-
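
The rounding concern can be checked without relying on any particular `pow` implementation. The sketch below is an illustration only: `Pow4Rounding`, `SquareTwice` and `IsNearestDouble` are hypothetical names, and it assumes .NET Core 3.0 or later (for `Math.BitIncrement` and `Math.BitDecrement`) with strict double arithmetic. It squares twice, as in the two `vmulsd` instructions above, and uses `BigInteger` to test whether the result is the double nearest to the exact fourth power.

```csharp
using System;
using System.Numerics;

public static class Pow4Rounding
{
    // Two multiplications, mirroring the two vmulsd instructions.
    public static double SquareTwice(double x)
    {
        double squared = x * x;   // first rounding
        return squared * squared; // second rounding
    }

    // For an integer n, checks whether SquareTwice(n) is the double nearest
    // to the exact value n^4 by comparing it with its two neighbouring
    // doubles. Only meaningful when n^4 is large enough (>= 2^53) that the
    // neighbours are themselves integers.
    public static bool IsNearestDouble(long n)
    {
        double approx = SquareTwice(n);
        BigInteger exact = BigInteger.Pow(n, 4);
        BigInteger err = BigInteger.Abs(exact - new BigInteger(approx));
        BigInteger errAbove = BigInteger.Abs(exact - new BigInteger(Math.BitIncrement(approx)));
        BigInteger errBelow = BigInteger.Abs(exact - new BigInteger(Math.BitDecrement(approx)));
        return err <= errAbove && err <= errBelow;
    }
}
```

For n = 134217729 (2^27 + 1) the first multiplication already drops the low bit of n², and the final result ends up one ulp below the correctly rounded value, so `IsNearestDouble(134217729)` returns false.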

stephentoub · 2020-02-08T21:42:10Z

cc: @tannergooding

EgorBo · 2020-02-09T13:21:55Z

CI failures are unrelated (#31985)

carlossanlop · 2020-03-09T20:48:08Z

What about:
x^0 = 1

EgorBo · 2020-03-09T21:25:55Z

What about:
x^0 = 1

It can be added, I guess, but we should be careful with side effects. I mainly wanted to optimize Pow(x, 2) since it's quite popular.

tannergooding · 2020-03-09T21:50:25Z

For reference, the IEEE spec defines the following behavior for pow:

pow (x, ±0) is 1 if x is not a signaling NaN
pow (±0, y) is ±∞ and signals the divideByZero exception for y an odd integer < 0
pow (±0, −∞) is +∞ with no exception
pow (±0, +∞) is +0 with no exception
pow (±0, y) is ±0 for finite y > 0 an odd integer
pow (−1, ±∞) is 1 with no exception
pow (+1, y) is 1 for any y (even a quiet NaN)
pow (x, +∞) is +0 for −1 < x < 1
pow (x, +∞) is +∞ for x < −1 or for 1 < x (including ±∞)
pow (x, −∞) is +∞ for −1 < x < 1
pow (x, −∞) is +0 for x < −1 or for 1 < x (including ±∞)
pow (+∞, y) is +0 for a number y < 0
pow (+∞, y) is +∞ for a number y > 0
pow (−∞, y) is −0 for finite y < 0 an odd integer
pow (−∞, y) is −∞ for finite y > 0 an odd integer
pow (−∞, y) is +0 for finite y < 0 and not an odd integer
pow (−∞, y) is +∞ for finite y > 0 and not an odd integer
pow (±0, y) is +∞ and signals the divideByZero exception for finite y < 0 and not an odd integer
pow(±0, y) is +0 for finite y > 0 and not an odd integer
pow(x, y) signals the invalid operation exception for finite x < 0 and finite non-integer y.

A couple of the conditions aren't valid because we don't support signalling NaN nor do we support floating-point exceptions.

The C Language Standard also matches this behavior in Annex F - IEC 60559 floating-point arithmetic, and I believe .NET Core also matches this behavior and has tests validating it for these special inputs.
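
A few of the listed cases can be probed from C# as follows. This is a sketch, not the runtime's own test suite: `PowSpecialCases` and its members are hypothetical names, the table holds only a handful of rows that do not depend on signaling NaNs or floating-point exceptions, and whether every row matches on a given platform is exactly what the helper is meant to report.

```csharp
using System;

public static class PowSpecialCases
{
    // (x, y, expected) triples taken from the list above.
    public static readonly (double X, double Y, double Expected)[] Cases =
    {
        (double.NaN, 0.0, 1.0),                                   // pow(x, ±0) is 1
        (1.0, double.NaN, 1.0),                                   // pow(+1, y) is 1 for any y
        (-1.0, double.PositiveInfinity, 1.0),                     // pow(-1, ±∞) is 1
        (0.5, double.PositiveInfinity, 0.0),                      // pow(x, +∞) is +0 for -1 < x < 1
        (double.PositiveInfinity, -2.0, 0.0),                     // pow(+∞, y) is +0 for y < 0
        (double.NegativeInfinity, 3.0, double.NegativeInfinity),  // odd integer y > 0
        (double.NegativeInfinity, 2.0, double.PositiveInfinity),  // y > 0, not an odd integer
    };

    // Bitwise comparison, so that +0 and -0 are told apart.
    public static bool SameBits(double a, double b) =>
        BitConverter.DoubleToInt64Bits(a) == BitConverter.DoubleToInt64Bits(b);

    // Returns how many rows of the table Math.Pow disagrees with.
    public static int CountMismatches()
    {
        int mismatches = 0;
        foreach (var (x, y, expected) in Cases)
        {
            if (!SameBits(Math.Pow(x, y), expected)) mismatches++;
        }
        return mismatches;
    }
}
```

A result of 0 from `PowSpecialCases.CountMismatches()` would mean the platform's `Math.Pow` agrees with every row in this small table.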

EgorBo · 2020-03-09T22:16:10Z

So I guess we better skip pow(x, 0) -> 1 case

tannergooding · 2020-03-09T22:18:33Z

No, that is fine to optimize. The point of my comment is that we don't support SNaN and it is treated identically to the QNaN case.
So, pow (x, ±0) returns 1 for all inputs in .NET Core.
If we were to ever start supporting SNaN or floating-point exceptions in the future, that would need to change; but there is a lot more work that would need to be done to support that.

EgorBo · 2020-03-17T16:09:00Z

@tannergooding any idea why
Math.Pow(x, -1) != 1/x for x = double.MinValue on arm64? 🙁

On arm64, Math.Pow(double.MinValue, -1) (without the optimization) returns just -0.0 (bits: 0x8000000000000000), whereas 1/double.MinValue returns -5.562684646268003E-309 (0x8004000000000000).

Should I remove the pow(x, -1) optimization?

tannergooding · 2020-03-17T16:21:11Z

any idea why

The result is subnormal (the exponent is 0). Depending on the platform (such as ARM32) or hardware configuration (x86, x64, ARM64), subnormal values may be flushed to zero.
Likewise, the implementation (https://github.com/ARM-software/optimized-routines/blob/master/math/pow.c) may end up not special casing the handling or it may have a small bug in this area (I've not investigated to determine which).

EgorBo · 2020-03-17T16:50:01Z

Thanks for the explanation. So should I give up on the (x, -1) optimization, work around this case for pal_pow, or just ignore this corner case?

tannergooding · 2020-03-17T16:58:08Z

Thanks for the explanation. So should I give up on the (x, -1) optimization, work around this case for pal_pow, or just ignore this corner case?

You're likely to hit the same kinds of issues with pow(x, 2) if x is subnormal.

However, 1/double.MinValue returns -5.562684646268003E-309 (0x8004000000000000)

It would be good to make sure this isn't C# or the JIT doing constant folding on x / y and to check what the result is for 1 / x where x isn't a constant.
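
One way to run that check is sketched below. `ReciprocalProbe`, `Reciprocal` and `Describe` are hypothetical names, and the sketch assumes .NET Core 3.0 or later (for `double.IsSubnormal`). Marking the method `NoInlining` is meant to keep the division from being folded with a constant argument, so it should execute at run time.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ReciprocalProbe
{
    // x arrives as a real argument, so 1.0 / x is not a compile-time constant.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static double Reciprocal(double x) => 1.0 / x;

    // Formats the raw bits together with a coarse classification.
    public static string Describe(double value)
    {
        long bits = BitConverter.DoubleToInt64Bits(value);
        string kind = double.IsSubnormal(value) ? "subnormal"
                    : value == 0 ? "zero"
                    : "other";
        return $"0x{bits:X16} ({kind})";
    }
}
```

Printing `ReciprocalProbe.Describe(ReciprocalProbe.Reciprocal(double.MinValue))` next to `ReciprocalProbe.Describe(Math.Pow(double.MinValue, -1))` shows whether each path keeps the subnormal result or flushes it to zero on the machine at hand.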

EgorBo · 2020-04-20T18:09:51Z

Will get back to it later (to keep the number of active PRs smaller).

EgorBo · 2020-10-24T16:17:22Z

🤔 hm... looks like I have to do this optimization in a later phase, since LICM is not fgMakeMultiUse-friendly, e.g.:

for (int i = 0; i < 1000; i++)
{
    Console.WriteLine(MathF.Pow(x + 2, 2));
}

Without this PR's optimization, this Pow() call is hoisted.

AndyAyersMS · 2020-10-24T16:38:06Z

Right, it can't hoist assignments (see #35735 for example).
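
At source level, the situation looks roughly like the sketch below. `HoistSketch` and both method names are hypothetical, and the first method is only an approximation of what the loop body becomes once the rewrite introduces a temp; the JIT's actual trees are not shown in this thread. The second method is the hand-hoisted form that loop-invariant code motion would otherwise aim for.

```csharp
using System;

public static class HoistSketch
{
    // Approximate shape after the rewrite: the temp assignment sits inside
    // the loop, and an assignment is what LICM cannot hoist.
    public static float SumWithTempInLoop(float x, int n)
    {
        float sum = 0;
        for (int i = 0; i < n; i++)
        {
            float tmp = x + 2; // loop-invariant, but recomputed every iteration
            sum += tmp * tmp;
        }
        return sum;
    }

    // The same computation with the invariant part hoisted by hand.
    public static float SumHoistedByHand(float x, int n)
    {
        float tmp = x + 2;
        float squared = tmp * tmp;
        float sum = 0;
        for (int i = 0; i < n; i++)
        {
            sum += squared;
        }
        return sum;
    }
}
```

Both methods return the same value; they differ only in how much work is repeated per iteration.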

EgorBo added 3 commits February 8, 2020 01:32

Optimize Math(F).Pow(X, C) when C is 0, 1, -1, 2

1c1ca76

Fix pow_cns.cs test

ad05394

Don't introduce a tmp for GT_CNS_DBL

dc2b301

jkotas added the area-CodeGen-coreclr and optimization labels Feb 8, 2020

EgorBo added 2 commits February 8, 2020 17:42

remove Pow(x, 0), add a test for floats passed by ref

ee22b6c

Add tests for side-effects

54378fa

EgorBo added 3 commits February 8, 2020 18:35

Fix tests

267280c

tmp commit (CI test)

6c4307f

Math.Pow(NaN, 2) might return a "different" NaN than NaN*NaN

316ad79

tannergooding reviewed Feb 11, 2020

src/coreclr/src/jit/importer.cpp (outdated)

EgorBo commented Feb 14, 2020

src/coreclr/src/jit/importer.cpp (outdated)

EgorBo added 3 commits February 20, 2020 20:50

Merge branch 'master' of github.com:dotnet/runtime into optimize-sqrt…

6fadafe

…-const

Use fgMakeMultiUse

76013a8

Update pow_cns.cs

a9f5835

EgorBo added 2 commits March 15, 2020 23:17

Merge branch 'master' of github.com:dotnet/runtime into optimize-sqrt…

219bcca

…-const

Optimize pow(x,0) to 1

452ff63

Merge branch 'master' of github.com:dotnet/runtime into optimize-sqrt…

8bc1fd2

…-const

EgorBo closed this Apr 20, 2020

EgorBo added 2 commits October 24, 2020 18:27

Merge branch 'master' of github.com:dotnet/runtime into optimize-sqrt…

148e208

…-const

Remove power=1 and power=0 cases to make it simple

a6d308d

EgorBo reopened this Oct 24, 2020

clean up

caed8c0

EgorBo added 3 commits October 24, 2020 22:36

move to morph

ef785f8

clean up

96ac774

Formatting

26f4be3

EgorBo closed this Oct 27, 2020

ghost locked as resolved and limited conversation to collaborators Dec 10, 2020
