Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement rem(float, float, RoundNearest) in Julia #42380

Merged
merged 2 commits into from
Sep 29, 2021
Merged

Conversation

ararslan
Copy link
Member

One of the few remaining uses of openlibm is implementing rem with RoundNearest for Float32 and Float64. This commit translates the msun libm implementations of remainder and remainderf to Julia to avoid relying on openlibm or the local system libm.

The implementations of the Float32 and Float64 methods are quite similar and could be combined with a bit more work. I'd be happy to do that but would need some guidance for some of the less obvious magic numbers.

Checks off the rem box in #26434.

@ararslan ararslan added the maths Mathematical functions label Sep 25, 2021
@KristofferC
Copy link
Member

Out if curiosity, any performance numbers to share?

@oscardssmith oscardssmith self-requested a review September 25, 2021 20:47
@ararslan
Copy link
Member Author

any performance numbers to share?

Good idea! Seems to do pretty well (first here is this implementation):

julia> @benchmark rem(x, p, RoundNearest) setup=((x, p) = rand(Float64, 2) .* rand(Bool, 2))
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  3.696 ns … 84.758 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     5.093 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   8.287 ns ±  4.779 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇█▄▄▅                 ▂▂▅▇▆▅▃▂▂▂▁▂▂▃▃  ▁▁▂▁▁   ▁▁      ▁   ▂
  █████▄▁▁▄▅▁▁▁▁▁▁▁▁▁▁▁▇███████████████████████▇▇███▆▇▇█▇██▆ █
  3.7 ns       Histogram: log(frequency) by time     20.5 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark Base.rem(x, p, RoundNearest) setup=((x, p) = rand(Float64, 2) .* rand(Bool, 2))
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
 Range (min … max):   9.454 ns … 108.064 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     35.748 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   31.370 ns ±   9.344 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇▃                               ▇▃▆▃▁▂   █▇▅▁▄▃ ▁▁       ▃  ▂
  ███▇█▇▄▄▄▁▁▁▁▁▃▁▃▁▁▁▁▁▁▁▄▃▁▃▄▁▁▄▃██████▇█▅█████████▇▆█▆██▅█▇ █
  9.45 ns       Histogram: log(frequency) by time      46.5 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

@oscardssmith
Copy link
Member

Any idea why this is faster? Does a ccall have noticable overhead?

@ararslan
Copy link
Member Author

Any idea why this is faster?

None whatsoever

@simonbyrne
Copy link
Contributor

This is perhaps one case where it would be fine to use the llvm intrinsic (since all implementations should return the same value), but having our own probably makes sense. Thanks @ararslan!

@ararslan
Copy link
Member Author

I was looking into doing this directly with LLVM initially but AFAICT the available LLVM intrinsics don't have the RoundNearest behavior.

@ViralBShah
Copy link
Member

Only modf to go after this!

@ararslan
Copy link
Member Author

Only modf to go after this!

I have a local implementation of that one too, will be doing a bit more testing then making a PR. 🙂

@oscardssmith
Copy link
Member

Don't we still have fma to go?

@simonbyrne
Copy link
Contributor

I was looking into doing this directly with LLVM initially but AFAICT the available LLVM intrinsics don't have the RoundNearest behavior.

They don't? What do they use.

@ararslan
Copy link
Member Author

ararslan commented Sep 26, 2021

It's not specified in the LLVM documentation but based on experimentation LLVM's frem seems to give the same result as our rem with RoundToZero. EDIT: They say it's the same as fmod, which explains that.

@simonbyrne
Copy link
Contributor

Oh right, LLVM doesn't offer a remainder intrinsic 😞

@simonbyrne
Copy link
Contributor

you may need some sort of copyright acknowledgement?

@stevengj
Copy link
Member

Needs tests?

@ararslan
Copy link
Member Author

There are existing tests for these methods which I assumed were sufficient but I can always add more.

@ararslan
Copy link
Member Author

Test failures:

  • Linux x64: segfault in linear algebra tridiag tests, existing issue
  • FreeBSD x64: unwinding-related segfault, existing issue
  • Windows x86: timeout

All of these platforms successfully ran the numbers tests, which is where rem tests happen.

@simonbyrne
Copy link
Contributor

It would be nice if we could get rid of some of the bit twiddling and use plain Julia functions (isnan, isfinite, etc), and combine them into one function

@simonbyrne
Copy link
Contributor

I don't quite understand why it does rem(x, 2p) instead of rem(x, p). Any idea?

@simonbyrne
Copy link
Contributor

Ah, because of the ties-to-even.

One of the few remaining uses of openlibm is implementing `rem` with
`RoundNearest` for `Float32` and `Float64`. This commit translates the
msun libm implementations of `__ieee754_remainder` and
`___ieee754_remainderf` to Julia to avoid relying on openlibm.
@ararslan
Copy link
Member Author

ararslan commented Sep 28, 2021

Made some changes, current state:

  • No more bit twiddling or reinterpreting, it's plain functions and direct comparisons of floats all the way down
  • Just one method definition, no longer separated by Float32 vs. Float64
  • Float16 now goes through the same code path as for Float32 and Float64 without needing to first promote to Float32
  • Added some extra tests for the edge cases
  • Performance still seems to be slightly better than current master

Summary of test failures:

  • Linux: Segfault from OpenBLAS
  • macOS: Segfault from OpenBLAS
  • FreeBSD: Segfault from libunwind

base/math.jl Outdated Show resolved Hide resolved
base/math.jl Outdated Show resolved Hide resolved
base/math.jl Outdated Show resolved Hide resolved
base/math.jl Outdated Show resolved Hide resolved
base/math.jl Outdated Show resolved Hide resolved
Also make sure that signed zero is checked in the test
@ararslan
Copy link
Member Author

Simon's feedback was incorporated and the only CI failure is FreeBSD, which is a known issue, so I'll go ahead and merge. Thanks!

@ararslan ararslan merged commit c8b5904 into master Sep 29, 2021
@ararslan ararslan deleted the aa/remainder branch September 29, 2021 21:40
@ViralBShah ViralBShah mentioned this pull request Sep 29, 2021
17 tasks
@KristofferC
Copy link
Member

Ended up very clean and nice!

@barucden barucden mentioned this pull request Oct 6, 2021
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022
One of the few remaining uses of openlibm is implementing `rem` with
`RoundNearest` for `Float32` and `Float64`. This commit translates the
msun libm implementations of `__ieee754_remainder` and
`___ieee754_remainderf` to Julia to avoid relying on openlibm.
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
One of the few remaining uses of openlibm is implementing `rem` with
`RoundNearest` for `Float32` and `Float64`. This commit translates the
msun libm implementations of `__ieee754_remainder` and
`___ieee754_remainderf` to Julia to avoid relying on openlibm.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maths Mathematical functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants