Remove sqrt from the volatile list #43786

Keno · 2022-01-12T22:50:55Z

The LLVM IR spec now explicitly says that this intrinsic is required
to be rounded correctly. That means that without fast-math flags
(which we do not set here), this intrinsic must have the bitwise
correctly rounded answer and should thus not differ between compile
and runtime. If there is still a case where it does differ, that is
likely some other underlying bug that we should fix instead.

Before:

julia> f() = sqrt(2)
f (generic function with 1 method)

julia> @code_typed f()
CodeInfo(
1 ─ %1 = Base.Math.sqrt_llvm(2.0)::Float64
└──      return %1
) => Float64

After:

julia> @code_typed f()
CodeInfo(
1 ─     return 1.4142135623730951
) => Float64
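
For anyone who wants to check the "should not differ between compile and runtime" claim on their own machine, here is a minimal sketch. The names `folded` and `at_runtime` are made up for illustration, and using `parse` to hide the constant from the compiler is an assumption about what is sufficient to defeat constant propagation.

```julia
# `folded` takes no arguments, so the compiler is free to evaluate sqrt(2.0) ahead of time.
folded() = sqrt(2.0)

# `at_runtime` only learns its argument when it is called.
at_runtime(x::Float64) = sqrt(x)

# Parsing from a string keeps the value out of reach of constant propagation.
x = parse(Float64, "2.0")

# `===` on Float64 compares bit patterns, so this is true only if both paths round identically.
println(folded() === at_runtime(x))
```

A single input proves nothing about the platform in general; it only shows the shape of the comparison.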

oscardssmith · 2022-01-12T23:21:46Z

Nice to get this fixed!

Keno · 2022-01-13T00:12:43Z

My guess is that the common way this could fail is if a libm uses fsqrt without accounting for double rounding. glibc has test vectors for this case at https://sourceware.org/bugzilla/show_bug.cgi?id=14032, so we could add those as a test to see if anybody reports issues.
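
To make the double rounding concern concrete, here is a rough emulation in pure Julia. It is a sketch under assumptions: a `BigFloat` with 64 bits of precision stands in for the x87 extended format (ignoring its exponent range), and the helper names `sqrt_via_extended`, `sqrt_reference`, and `count_mismatches` are hypothetical.

```julia
# Emulate an x87-style sqrt: round to a 64-bit significand first, then to Float64.
function sqrt_via_extended(x::Float64)
    setprecision(BigFloat, 64) do
        Float64(sqrt(BigFloat(x)))
    end
end

# Reference path: with 256 bits of intermediate precision the second rounding
# is expected to be harmless, so this should give the correctly rounded result.
function sqrt_reference(x::Float64)
    setprecision(BigFloat, 256) do
        Float64(sqrt(BigFloat(x)))
    end
end

# Count random finite inputs for which the two paths disagree bitwise.
function count_mismatches(n::Integer)
    mismatches = 0
    for _ in 1:n
        x = abs(reinterpret(Float64, rand(UInt64)))
        isfinite(x) || continue
        sqrt_via_extended(x) === sqrt_reference(x) || (mismatches += 1)
    end
    return mismatches
end

println(count_mismatches(100_000))
```

Random sampling may or may not hit a problematic input, which is why the targeted glibc test vectors are the more interesting test.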

oscardssmith · 2022-01-13T01:11:20Z

If anyone fails this, it would probably be Windows. Their libm isn't great in general.

N5N3 · 2022-01-13T03:18:16Z

Random test passed locally on win64:

using Test

# Compare sqrt against the sqrt_llvm intrinsic on random bit patterns.
for _ in 1:10_000_000
    r = reinterpret(Float64, rand(UInt64))
    @test sqrt(abs(r)) === Base.sqrt_llvm(abs(r))
end

Edit: glibc test also passed.
So it should be OK on modern CPUs (with vsqrt support)?

Keno · 2022-01-13T03:26:58Z

Base.sqrt is defined as sqrt_llvm, so unfortunately that test does not say anything. As I said, the interesting cases are probably the test vectors in that glibc issue.
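
A check that does not compare sqrt against itself could look like the sketch below. It never calls a second sqrt: it tests whether the exact square root lies between the two rounding boundaries around the candidate result, using `BigFloat` arithmetic that is exact at this precision. The helper name `is_correctly_rounded_sqrt` is hypothetical, and the check assumes round-to-nearest and a finite, strictly positive input.

```julia
# True if `r` is the Float64 nearest to the exact square root of `x`.
# Assumes x is finite and x > 0.
function is_correctly_rounded_sqrt(x::Float64, r::Float64)
    setprecision(BigFloat, 256) do
        lo = (big(prevfloat(r)) + big(r)) / 2   # midpoint between r and its lower neighbour
        hi = (big(r) + big(nextfloat(r))) / 2   # midpoint between r and its upper neighbour
        # Squares of 54-bit values fit in 256 bits, so these comparisons are exact.
        lo^2 <= big(x) <= hi^2
    end
end

# Random bit patterns, restricted to finite positive values.
xs = filter(x -> isfinite(x) && x > 0,
            abs.(reinterpret.(Float64, rand(UInt64, 10_000))))
println(all(x -> is_correctly_rounded_sqrt(x, sqrt(x)), xs))
```

This only samples random inputs, so it complements the glibc vectors rather than replacing them.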

N5N3 · 2022-01-13T03:34:16Z

IIUC, Base.sqrt_llvm always calls the runtime sqrt, like Base.fma_float. (#43530 uses this feature to test julia_fma(f)'s precision.)
Just as an example, on official 1.7.1 Win64:

julia> fma(-1.9369631f13, 2.1513551f-7, -1.7354427f-24)
-4.1670958f6

julia> Base.fma_float(-1.9369631f13, 2.1513551f-7, -1.7354427f-24)
-4.1670955f6
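
One way to tell which of two such results is the correctly rounded one is to compare against an exact computation. This is a sketch only: `fma_reference` is a hypothetical helper, and the 1024-bit precision is chosen on the assumption that it is wide enough to hold the exact product and sum for any finite Float32 inputs.

```julia
# Compute a*b + c exactly in BigFloat, then round once to Float32.
function fma_reference(a::Float32, b::Float32, c::Float32)
    setprecision(BigFloat, 1024) do
        Float32(big(a) * big(b) + big(c))
    end
end

a, b, c = -1.9369631f13, 2.1513551f-7, -1.7354427f-24

# Bitwise comparison of each implementation against the single-rounding reference.
println(fma(a, b, c) === fma_reference(a, b, c))
println(Base.fma_float(a, b, c) === fma_reference(a, b, c))
```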

Keno · 2022-01-17T00:47:47Z

Alright, I have added a test to detect if there's any platform with the fp80 double rounding problem. Otherwise, I think we should try this and see what happens.

Keno · 2022-01-17T22:34:49Z

Sigh, the sqrt test failed on Buildkite i686 Linux, so it looks like whatever we're using there is buggy.

Keno · 2022-01-19T07:10:56Z

This might be our fault, because it looks like openlibm has the double rounding bug on 32-bit: https://github.com/JuliaMath/openlibm/blob/master/i387/e_sqrt.S#L12.

@oscardssmith wanna have a go at fixing this in openlibm?

That said, we assume a minimum of SSE2, so our minimum supported ISA has sqrtsd - we should probably just use that.
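
As an illustration of going through the LLVM intrinsic rather than libm, a sketch along these lines calls `llvm.sqrt.f64` directly. The wrapper name `sqrt_via_intrinsic` is hypothetical, and which instruction the backend emits for it (e.g. sqrtsd on x86 with SSE2) depends on the target, so treat that part as an assumption to verify with `@code_native`.

```julia
# Call the LLVM sqrt intrinsic for Float64 directly, bypassing any libm symbol.
# Only meant for non-negative inputs in this sketch.
sqrt_via_intrinsic(x::Float64) = ccall("llvm.sqrt.f64", llvmcall, Float64, (Float64,), x)

# Bitwise comparison against Base's sqrt for one input.
println(sqrt_via_intrinsic(2.0) === sqrt(2.0))
```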

Keno · 2022-01-19T07:12:04Z

> IIUC, Base.sqrt_llvm always calls runtime sqrt like Base.fma_float. (#43530 use such feature to test julia_fma(f)'s precision).

There are three cases that all need to be consistent:

1. The code generated by LLVM for the intrinsic
2. The code LLVM uses for constant folding the intrinsic
3. Our own runtime intrinsic used by the interpreter

Depending on how things are built, these can all pick up different versions of the intrinsics.

oscardssmith · 2022-01-19T13:03:05Z

I'd honestly rather we move away from openlibm. Can we just make LLVM use the intrinsic?

Keno · 2022-01-19T19:01:28Z

Actually, looks like in this case LLVM doesn't actually constant fold sqrt, so we'll be ok once we fix runtime intrinsics (and since we require SSE2, it should be fine to do that using the SSE intrinsic). However, just for completeness, I do think we should also fix openlibm. Should be a 5 line change to set the precision flag in the control word appropriately.

oscardssmith · 2022-01-19T19:17:32Z

That makes sense. (I am, however, unlikely to touch the openlibm version.)

Keno · 2022-01-19T19:23:14Z

Alright, do you want to change the runtime intrinsics here and I'll make the patch to openlibm (but we won't block on an openlibm upgrade)?

Keno · 2022-01-19T22:29:54Z

Openlibm fix is JuliaMath/openlibm#256.

As discussed in JuliaLang/julia#43786, openlibm's sqrt function is incorrectly rounded for i387. IEEE requires correct rounding for these functions and LLVM relies on it. Fix that by setting the precision in the FPU control word (see e.g. e_ceil.S for similar FPU modifications).

Keno · 2022-01-19T23:58:08Z

As discussed, for now we won't touch runtime_intrinsic and just pull in the openlibm update to enable this. #43869 tracks just getting rid of the intrinsic entirely to reduce the complexity.

This is more of a "Do we want to move in this direction RFC". As mentioned in #43786, we currently have three implementations of these intrinsics:

1. The code generated by LLVM for the intrinsic
2. The code LLVM uses for constant folding the intrinsic
3. Our own runtime intrinsic used by the interpreter

This basically removes the third one, which will be required if we want to do something about #26434 because we just forward these to libm. Of course we'll still have to do something to teach LLVM how to constant fold these in a manner compatible with what will actually end up running, but that's a separate issue.

oscardssmith added the performance Must go faster label Jan 12, 2022

oscardssmith approved these changes Jan 12, 2022

Keno force-pushed the kf/sqrtnovolatile branch from 7922ce9 to b7a9051 Compare January 17, 2022 00:47

Keno changed the title ~~RFC: Remove sqrt from the volatile list~~ Remove sqrt from the volatile list Jan 17, 2022

Keno mentioned this pull request Jan 19, 2022

Correctly round double precision sqrt JuliaMath/openlibm#256

Merged

Keno mentioned this pull request Jan 19, 2022

RFC: Remove sqrt_llvm intrinsic #43869

Closed

Keno mentioned this pull request Jan 20, 2022

Bump openlibm to 0.8.1 #43870

Merged

Keno force-pushed the kf/sqrtnovolatile branch from b7a9051 to bb2e1fb Compare January 20, 2022 15:18

Keno merged commit cc96240 into master Jan 20, 2022

Keno deleted the kf/sqrtnovolatile branch January 20, 2022 22:34

simeonschaub mentioned this pull request Jan 31, 2022

revert #43796, disable test failing on linux32 #43997

Closed

simeonschaub added a commit that referenced this pull request Jan 31, 2022

Revert "Remove sqrt from the volatile list (#43786)"

3520059

This reverts commit cc96240.

simeonschaub added a commit that referenced this pull request Feb 4, 2022

Revert "Remove sqrt from the volatile list (#43786)"

e52d912

This reverts commit cc96240.

simeonschaub added a commit that referenced this pull request Feb 4, 2022

Revert "Remove sqrt from the volatile list (#43786)"

70d8e4a

This reverts commit cc96240.

Keno mentioned this pull request Dec 31, 2022

Allow constant-folding intrinsics that are non-pure for inference only #31193

Closed
