
Improve accuracy of conversions from floating-point numbers (Fixes #102) #131

Merged
merged 2 commits into master from issue102
Nov 2, 2019

Conversation

kimikage
Collaborator

@kimikage kimikage commented Oct 28, 2019

This fixes a problem with the input range checking (#102).
Since this PR includes low-level operations, it needs careful review. The additional tests cover range checking, not accuracy or integrity.
This PR has little effect on the conversions to Normed{UInt8} and Normed{UInt16}.

@kimikage
Collaborator Author

kimikage commented Oct 28, 2019

On 32-bit builds, some pre-calculated constants do not agree with the dynamically calculated values.

julia> versioninfo()
Julia Version 1.4.0-DEV.378
Commit dbaa6ff660 (2019-10-28 12:55 UTC)
Platform Info:
  OS: Windows (i686-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
  WORD_SIZE: 32
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

julia> a(x) = bitstring(x*(typemax(UInt64)/(FixedPointNumbers.rawone(N11f53)))); # pre-calculated constant

julia> b(x) = bitstring((x*typemax(UInt64))/(FixedPointNumbers.rawone(N11f53))); # dynamically calculated

julia> a(1) # 1 is an identity element. On 64-bit systems, `a(1)` agrees with `b(1)`.
"0100000010100000000000000000000000000000000000000000000000000000"

julia> b(1)
"0100000010100000000000000000000000000000000000000000000000000001"

julia> Float64(BigFloat(typemax(UInt64))/BigFloat(FixedPointNumbers.rawone(N11f53))) # ideal result, but usually not obtained in Float64 calculations
2048.0

julia> Float64(typemax(UInt64))/Float64(FixedPointNumbers.rawone(N11f53)) # usual result in Float64 calculations.
2048.0000000000005

julia> BigFloat(typemax(UInt64))/BigFloat(FixedPointNumbers.rawone(N11f53)) # this should be rounded to `2048.0`, so the ideal result is correct.
2048.000000000000227262653140769569055940417886493182912978948283864490173983107

Is this a problem that should be solved by modifying the tests?

@kimikage
Collaborator Author

To avoid impacting performance, this PR does not change rem.

rem(x::T, ::Type{T}) where {T <: Normed} = x
rem(x::Normed, ::Type{T}) where {T <: Normed} = reinterpret(T, _unsafe_trunc(rawtype(T), round((rawone(T)/rawone(x))*reinterpret(x))))
rem(x::Real, ::Type{T}) where {T <: Normed} = reinterpret(T, _unsafe_trunc(rawtype(T), round(rawone(T)*x)))
rem(x::Float16, ::Type{T}) where {T <: Normed} = rem(Float32(x), T) # avoid overflow

Therefore, this does not fix problems such as:

julia> using Colors

julia> white = parse(RGB{Float32}, "white")
RGB{Float32}(1.0f0,1.0f0,1.0f0)

julia> convert(RGB{N0f32}, white)
RGB{N0f32}(0.0,0.0,0.0)

Comment on lines +70 to +74
if T == UInt128 && f == 53
0 <= x <= Tf(3.777893186295717e22) || throw_converterror(U, x)
else
0 <= x <= Tf((typemax(T)-rawone(U))/rawone(U)+1) || throw_converterror(U, x)
end
Collaborator Author

(typemax(T)-rawone(U))/rawone(U)+1 may have slightly higher accuracy than typemax(T)/rawone(U) because the number of 1-bits in the dividend is reduced. However, this has no effect for UInt128, so the specialization for Normed{UInt128, 53} is needed (f == 53 is one of the singular points). FYI, using BigFloat hinders constant pre-calculation and inline expansion.

Although this can serve as a temporary measure, I think the problem with pre-calculating constants is potentially troublesome.
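The hard-coded bound can be double-checked offline with BigFloat (my own sketch, not part of the PR; the variable names are just illustrative). Note that (typemax(T)-rawone(U))/rawone(U)+1 simplifies algebraically to typemax(T)/rawone(U), so BigFloat gives the correctly rounded value:

```julia
# Offline sanity check of the pre-computed constant for Normed{UInt128, 53}.
r = big(2)^53 - 1                            # rawone of a Normed with f = 53
bound = (big(typemax(UInt128)) - r) / r + 1  # == typemax(UInt128)/r exactly
Float64(bound)                               # the constant used in the range check
```

This is only viable offline because, as noted above, BigFloat in the conversion path would prevent constant folding and inlining.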

@codecov-io

Codecov Report

Merging #131 into master will increase coverage by 2.67%.
The diff coverage is 100%.


@@            Coverage Diff            @@
##           master    #131      +/-   ##
=========================================
+ Coverage   79.72%   82.4%   +2.67%     
=========================================
  Files           3       3              
  Lines         217     233      +16     
=========================================
+ Hits          173     192      +19     
+ Misses         44      41       -3
Impacted Files             Coverage   Δ
src/FixedPointNumbers.jl   76.92%     <ø> (ø) ⬆️
src/normed.jl              86.23%     <100%> (+5.8%) ⬆️
src/fixed.jl               82.6%      <0%> (-0.38%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update da39318...d98d60c.

Member

@timholy timholy left a comment

Wow, that's an impressive conversion. Does this have negative consequences for performance?

src/normed.jl Outdated
@@ -41,27 +41,61 @@ rawone(v) = reinterpret(one(v))
function Normed{T,f}(x::Normed{T2}) where {T <: Unsigned,T2 <: Unsigned,f}
U = Normed{T,f}
y = round((rawone(U)/rawone(x))*reinterpret(x))
-    (0 <= y) & (y <= typemax(T)) || throw_converterror(U, x)
+    0 <= y <= typemax(T) || throw_converterror(U, x)
Member

These are actually subtly different, and at least at one point there was a performance difference (there doesn't seem to be one now, at least not on my current machine):

julia> function f1(x)
           0.25 <= x <= 0.75 || throw(DomainError(x))
           return x
       end
f1 (generic function with 1 method)

julia> function f2(x)
           ((0.25 <= x) & (x <= 0.75)) || throw(DomainError(x))
           return x
       end
f2 (generic function with 1 method)

julia> @code_lowered f1(0.5)
CodeInfo(
1 ─ %1 = 0.25 <= x
└──      goto #3 if not %1
2 ─ @_3 = x <= 0.75
└──      goto #4
3 ─ @_3 = false
4 ┄ %6 = @_3
└──      goto #6 if not %6
5 ─      goto #7
6 ─ %9 = Main.DomainError(x)
└──      Main.throw(%9)
7 ─      return x
)

julia> @code_lowered f2(0.5)
CodeInfo(
1 ─ %1 = 0.25 <= x
│   %2 = x <= 0.75
│   %3 = %1 & %2
└──      goto #3 if not %3
2 ─      goto #4
3 ─ %6 = Main.DomainError(x)
└──      Main.throw(%6)
4 ─      return x
)

julia> using BenchmarkTools

julia> @btime f1(x) setup=(x=rand()/10 + 0.3)
  1.980 ns (0 allocations: 0 bytes)
0.34079130696459065

julia> @btime f2(x) setup=(x=rand()/10 + 0.3)
  1.977 ns (0 allocations: 0 bytes)
0.30934306198706746

Collaborator Author

@kimikage kimikage Oct 31, 2019

Yes. I guess using & instead of && may reduce conditional branches. However, throw_converterror(U, x) is annotated with @noinline and throws an exception. This means that we cannot avoid conditional branches entirely, and this code is not SIMD-friendly. Depending on the type (and CPU), the comparison with zero can be specialized (which is why I did not use zero()).

Of course, I follow the law of "if it works, don't touch it". :smile:

Member

Yeah, you definitely need one branch, but one is usually better than two. For example, when Matt and I were developing the array infrastructure, we noticed it was better to check bounds on all indexes, combine the results with &, and then throw a BoundsError if necessary, rather than having one branch per dimension.

But this is definitely something that depends on the CPU and its ability to predict branches, so it can even be hard to benchmark. I would generally suspect that fewer branches is a good thing.
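As a concrete illustration of that pattern (a hypothetical sketch, not code from this package; checked_getindex is a made-up name), all per-dimension checks are combined with & so there is a single branch and a single throw site:

```julia
# Combine every per-dimension bounds check with `&`, then branch once.
function checked_getindex(A::AbstractMatrix, i::Int, j::Int)
    inb = (1 <= i) & (i <= size(A, 1)) & (1 <= j) & (j <= size(A, 2))
    inb || throw(BoundsError(A, (i, j)))
    return @inbounds A[i, j]
end
```

For instance, checked_getindex([1 2; 3 4], 1, 2) returns 2, while any out-of-range index hits the one BoundsError branch.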

Collaborator Author

That's right. Nevertheless, I think it is also good practice to avoid anything tricky and leave it to the compiler (LLVM) for now, because this is usually not a pair of integer comparisons.

However, I never intend to dispute the comparison style, which has no definitive answer. Instead, I would rather worry about the /, but that is another matter.

@kimikage
Collaborator Author

kimikage commented Oct 31, 2019

Does this have negative consequences for performance?

Unfortunately, my software floating-point calculation is ~2.5x slower.
The following benchmark results are based on the (inverse) Vec4 version of #129 (comment):

x86_64            before      after
Float32->N0f8     15.999 μs   14.099 μs
Float32->N8f24    20.000 μs   48.199 μs
Float32->N0f32    20.100 μs   49.100 μs
Float64->N0f8     15.999 μs   13.999 μs
Float64->N0f32    16.099 μs   14.299 μs
Float64->N11f53   41.899 μs   49.500 μs
Float64->N0f64    40.999 μs   68.900 μs

However, we cannot measure the conversion speed of values that cannot be converted at all. As implied above, in applications where speed matters, rem is used instead, and the conversions with range checking should not be used.

Edit:
As I mentioned in #125 (comment), * and / already have performance problems.

*(x::T, y::T) where {T <: Normed} = convert(T,convert(floattype(T), x)*convert(floattype(T), y))
/(x::T, y::T) where {T <: Normed} = convert(T,convert(floattype(T), x)/convert(floattype(T), y))

At least for *, I want to avoid floating-point arithmetic. Although the discussion of overflow (#41) is unsettled, since +, - and Fixed's * currently have wraparound semantics, even breaking changes are reasonable.
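For N0f8, for instance, the multiplication could be done on the raw codes with a widening multiply and a rounding integer division (a sketch of the idea only, not this package's implementation; mul_raw is a made-up name). Since the value is raw/255, the exact raw product is round(rx*ry/255), and (n + 127) ÷ 255 computes round(n/255) for unsigned n because 255 is odd, so no ties occur:

```julia
# Integer-only multiply of two N0f8-style raw codes (value = raw/255):
# the exact nearest raw result round(rx*ry/255) via rounding division.
mul_raw(rx::UInt8, ry::UInt8) = UInt8((widemul(rx, ry) + 0x007f) ÷ 0x00ff)
```

For example, mul_raw(0xff, 0xff) == 0xff (1.0 * 1.0 == 1.0), and mul_raw(0x80, 0x80) == 0x40, matching round(128*128/255) == 64.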

The order of the bug fixes is not so important, but this problem (#102) disturbs the checks and benchmarks for the modifications to mitigate issue #129, and the modifications for #129 may be helpful for PR #123 (issue #120).

@timholy
Member

timholy commented Oct 31, 2019

Thanks for posting those numbers. Correctness is infinitely more important than speed, so I would support merging this.

@kimikage
Collaborator Author

kimikage commented Nov 1, 2019

For the lines based on the current source code, I reverted the comparison style. (cf. #131 (comment))

In the new conversion method, I used the a <= x <= b style. This decision is based on my policy and a rough benchmark. Honestly, the real reason is that it is shorter in the source code. 😃
Although the inconsistency makes me somewhat itchy, I think it will also make more people notice this comparison issue.

@timholy
Member

timholy commented Nov 1, 2019

I'm fine with whatever you choose here. Very grateful for this change!

@kimikage
Collaborator Author

kimikage commented Nov 1, 2019

I am ready to rebase and merge this PR, but I am not yet ready for the other issues. When will you bump the version number?

@timholy
Member

timholy commented Nov 1, 2019

Version numbers are cheap: any time it's potentially breaking, we should bump the minor version number. If the other fixes will follow quickly I'd say let's consolidate the churn and wait to tag a release until it's all done. If not, let's get a release out for the fixes we have.

As the community has moved to putting upper version bounds on packages, it's also more likely that breaking changes won't break things for users.

@kimikage kimikage merged commit 70ae1d6 into JuliaMath:master Nov 2, 2019
@kimikage kimikage deleted the issue102 branch November 2, 2019 03:22
@kimikage
Collaborator Author

kimikage commented Nov 2, 2019

If this fix causes problems, consider using rem (% Nxfy) instead.
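For example (a sketch assuming the package is loaded; the exact wrapped values depend on the rounding, so they are omitted):

```julia
using FixedPointNumbers

N0f8(0.5)      # checked conversion: in range, fine; N0f8(1.5) would throw
1.5 % N0f8     # rem: no range check, the raw value simply wraps modulo 2^8
```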
