Optimizing Fixed --> Float conversions #171

Closed
kimikage opened this issue Jan 22, 2020 · 2 comments · Fixed by #172

@kimikage

See: #129 (comment)

Although this has been postponed, this is an issue about conversion, not arithmetic, so it might be better to include it in the next release (v0.8.0).

@kimikage

(::Type{Tf})(x::Fixed{T,f}) where {Tf <: AbstractFloat, T, f} = Tf(Tf(x.i) * Tf(@exp2(-f)))
Base.Float16(x::Fixed{T,f}) where {T, f} = Float16(Float32(x))
Base.Float32(x::Fixed{T,f}) where {T, f} = Float32(x.i) * Float32(@exp2(-f))
Base.Float64(x::Fixed{T,f}) where {T, f} = Float64(x.i) * @exp2(-f)
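
For readers without the package source at hand, here is a minimal sketch of the idea behind these methods: convert the raw integer to the target float type, then multiply by the exactly representable constant 2^(-f). `ToyFixed`, `scale`, `to_float_mul` and `to_float_div` are hypothetical names made up for this illustration, and `ldexp(1.0, -f)` only stands in for the package-internal `@exp2` macro. The divide-based variant is there purely as a comparison baseline, not as a claim about what master does.

```julia
# Hypothetical stand-in for a fixed-point type (NOT the FixedPointNumbers.jl definition):
# `i` holds the raw integer, `f` is the number of fractional bits.
struct ToyFixed{T<:Signed,f}
    i::T
end

# 2^(-f) as a float of type Tf. ldexp(1.0, -f) builds the power of two exactly.
scale(::Type{Tf}, f::Int) where {Tf<:AbstractFloat} = Tf(ldexp(1.0, -f))

# Multiply-based conversion, mirroring the shape of the methods above.
to_float_mul(::Type{Tf}, x::ToyFixed{T,f}) where {Tf<:AbstractFloat,T,f} =
    Tf(x.i) * scale(Tf, f)

# Divide-based conversion, used here only as a baseline for comparison.
to_float_div(::Type{Tf}, x::ToyFixed{T,f}) where {Tf<:AbstractFloat,T,f} =
    Tf(x.i) / Tf(ldexp(1.0, f))

x = ToyFixed{Int16,15}(Int16(16384))        # raw 2^14 with 15 fractional bits
println(to_float_mul(Float32, x))           # 0.5
println(to_float_mul(Float64, ToyFixed{Int8,3}(Int8(-20))))  # -2.5
```

Because the scale factor is a power of two, the multiply and the divide round identically as long as nothing overflows or underflows, so the two forms should agree bit for bit on inputs like these.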

@kimikage

kimikage commented Jan 22, 2020

Benchmark

There seems to be no significant difference between Julia versions or between operating systems.
Converting Vec3{Fixed{Int16}} arrays (Q0f15, Q12f3) to Vec3{Float32} arrays gets slower (from about 4.0 μs to about 4.9-5.1 μs), but this is mainly a problem with the LLVM backend.

Script

using BenchmarkTools
using FixedPointNumbers
struct Vec3{T <: Real}
    x::T; y::T; z::T
end
struct Vec4{T <: Real}
    x::T; y::T; z::T; w::T
end
Vec3{T}(v::Vec3{T}) where {T} = v
Vec3{T}(v::Vec3{U}) where {T, U} = Vec3{T}(v.x, v.y, v.z) 
Vec4{T}(v::Vec4{T}) where {T} = v
Vec4{T}(v::Vec4{U}) where {T, U} = Vec4{T}(v.x, v.y, v.z, v.w) 
Base.rand(::Type{Vec3{T}}) where {T} = Vec3{T}(rand(T), rand(T), rand(T))
Base.rand(::Type{Vec4{T}}) where {T} = Vec4{T}(rand(T), rand(T), rand(T), rand(T))
function Base.rand(::Type{T}, sz::Dims) where {T <: Union{Vec3, Vec4}}
    A = Array{T}(undef, sz)
    for i in eachindex(A); A[i] = rand(T); end
    return A
end
Ts = (Q0f7, Q4f3, Q0f15, Q12f3, Q0f31, Q28f3, Q0f63, Q60f3)

mat3s = [rand(Vec3{T}, 64, 64) for T in Ts];
mat4s = [rand(Vec4{T}, 64, 64) for T in Ts];

for mat in mat3s
    println(eltype(mat), "-> Float32")
    @btime Vec3{Float32}.(view($mat,:,:))
end

for mat in mat3s
    println(eltype(mat), "-> Float64")
    @btime Vec3{Float64}.(view($mat,:,:))
end

for mat in mat4s
    println(eltype(mat), "-> Float32")
    @btime Vec4{Float32}.(view($mat,:,:))
end

for mat in mat4s
    println(eltype(mat), "-> Float64")
    @btime Vec4{Float64}.(view($mat,:,:))
end

Julia v1.3.1 x86_64-w64-mingw32

julia> versioninfo()
Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

Matrix of Vec3 (unit: μs)

| w64 | Float32 master | Float32 optimized | Float64 master | Float64 optimized |
|---|---|---|---|---|
| Q0f7 | 3.943 | 3.350 | 10.700 | 5.575 |
| Q4f3 | 3.886 | 3.362 | 10.900 | 5.420 |
| Q0f15 | 4.043 | 4.850 | 11.100 | 5.480 |
| Q12f3 | 4.057 | 4.850 | 11.001 | 5.840 |
| Q0f31 | 5.960 | 5.000 | 9.199 | 5.520 |
| Q28f3 | 6.000 | 5.000 | 9.199 | 5.640 |
| Q0f63 | 30.600 | 5.183 | 48.299 | 6.150 |
| Q60f3 | 25.799 | 5.617 | 47.401 | 6.150 |

Matrix of Vec4 (unit: μs)

| w64 | Float32 master | Float32 optimized | Float64 master | Float64 optimized |
|---|---|---|---|---|
| Q0f7 | 16.399 | 3.643 | 17.199 | 5.700 |
| Q4f3 | 15.500 | 3.614 | 17.200 | 6.100 |
| Q0f15 | 15.100 | 3.643 | 17.100 | 5.801 |
| Q12f3 | 15.000 | 3.750 | 20.699 | 6.200 |
| Q0f31 | 13.899 | 3.783 | 17.001 | 5.499 |
| Q28f3 | 13.800 | 3.743 | 17.199 | 7.133 |
| Q0f63 | 57.300 | 13.499 | 65.000 | 10.800 |
| Q60f3 | 60.800 | 13.699 | 66.699 | 10.600 |

Julia v1.0.5 x86_64-pc-linux-gnu on WSL

julia> versioninfo()
Julia Version 1.0.5
Commit 3af96bcefc (2019-09-09 19:06 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)

Matrix of Vec3 (unit: μs)

| linux | Float32 master | Float32 optimized | Float64 master | Float64 optimized |
|---|---|---|---|---|
| Q0f7 | 3.914 | 3.362 | 10.900 | 5.400 |
| Q4f3 | 4.000 | 3.425 | 11.300 | 5.620 |
| Q0f15 | 4.057 | 5.033 | 11.800 | 5.800 |
| Q12f3 | 4.057 | 5.050 | 11.700 | 5.875 |
| Q0f31 | 6.100 | 5.133 | 9.800 | 5.640 |
| Q28f3 | 6.120 | 5.150 | 9.600 | 5.700 |
| Q0f63 | 48.700 | 5.450 | 51.900 | 6.180 |
| Q60f3 | 44.400 | 5.417 | 46.900 | 6.375 |

Matrix of Vec4 (unit: μs)

| linux | Float32 master | Float32 optimized | Float64 master | Float64 optimized |
|---|---|---|---|---|
| Q0f7 | 14.200 | 3.471 | 16.500 | 6.467 |
| Q4f3 | 14.700 | 3.414 | 16.700 | 6.633 |
| Q0f15 | 14.000 | 3.614 | 16.100 | 6.667 |
| Q12f3 | 13.500 | 3.557 | 16.300 | 5.967 |
| Q0f31 | 12.300 | 3.486 | 14.400 | 7.000 |
| Q28f3 | 12.200 | 3.543 | 14.300 | 7.300 |
| Q0f63 | 62.900 | 14.300 | 64.400 | 11.400 |
| Q60f3 | 56.500 | 14.200 | 56.500 | 11.600 |
