
exp2(i::Int64) is slower than 2.0^(i::Int64) #17412

Closed
oxinabox opened this issue Jul 14, 2016 · 6 comments
Labels
maths Mathematical functions performance Must go faster

Comments

@oxinabox
Contributor

I noticed today that exp2(i) is slower than 2.0^(i) for i an Int64.
(exp2(f) for f a Float64 is still faster than 2.0^f.)

function t1()
    @inbounds for jj in 1:10_000
        @inbounds for ii in -1_000:1_000
            exp2(ii)
        end
    end
end

function t2()
    @inbounds for jj in 1:10_000
        @inbounds for ii in -1_000:1_000
            2.0^(ii)
        end
    end
end

Results are:

  • @time t1() 0.280422 seconds (4 allocations: 160 bytes)
  • @time t2() 0.016330 seconds (4 allocations: 160 bytes)

The integer case is the easy case for finding the value,
since it can be accomplished by some bit masking on the binary representation.
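To make the bit-masking claim concrete, here is a minimal sketch (mine, not from the issue): for x in the normal range, 2.0^x has a zero significand and a biased exponent of x + 1023, so the whole Float64 can be assembled directly and reinterpreted.

```julia
# Sketch: build 2.0^x bitwise for a Float64 in the normal range.
# The biased exponent (x + 1023) goes in bits 52..62; the significand is zero.
x = 10
bits = (x + 1023) << 52
@assert reinterpret(Float64, bits) == 1024.0   # 2^10
```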


For comparison, the same test cycling ii through -1_000.0:1.0:1_000.0:

  • @time t1() 0.362359 seconds (4 allocations: 160 bytes)
  • @time t2() 3.474701 seconds (4 allocations: 160 bytes)

This doesn't change much whether or not I enable @fastmath.
@oxinabox
Contributor Author

oxinabox commented Jul 14, 2016

Because I was having trouble describing the bitmasking operation, I ended up implementing it, to satisfy myself that I understood it.

The final result is from some back and forth with @ScottPJones

@inline function myexp2_direct_subnormals(x::Int64)
    if x > 1023 
        Inf 
    elseif x < -1074
        0.0
    else 
        reinterpret(Float64,x > -1023 ? ((1<<62)+(x-1)<<52) : 1<<(x+1074))
    end
end
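As a quick sanity check (mine, not from the issue), the function can be compared against ldexp(1.0, x), which scales by a power of two exactly, including overflow, underflow, and subnormals. The definition is restated so the snippet runs on its own:

```julia
@inline function myexp2_direct_subnormals(x::Int64)
    if x > 1023
        Inf
    elseif x < -1074
        0.0
    else
        reinterpret(Float64, x > -1023 ? ((1 << 62) + (x - 1) << 52) : 1 << (x + 1074))
    end
end

# ldexp(1.0, x) is exactly 2^x as a Float64, so every branch is exercised:
# overflow above 1023, normals, subnormals, and underflow below -1074.
for x in -1080:1030
    @assert myexp2_direct_subnormals(x) == ldexp(1.0, x)
end
```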

The more extensive testing is at: https://gist.github.com/oxinabox/6e72640222357fa10ca94bdd8c2140a6

This is just kinda cool: a significant speedup, about 6x
(I suspect more; there is a lot of overhead in the test function).

Shall I make a PR?

@simonbyrne
Contributor

Thanks, this would be great! However, before you do: how does it compare to exp2(x) = ldexp(1.0, x)? (It should be possible to be slightly faster, as we can play a bit fast and loose with subnormals.)
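For reference, the ldexp-based version mentioned above is a one-liner (a sketch; myexp2_ldexp is a name of my choosing, not from the issue):

```julia
# ldexp(1.0, x) computes 1.0 * 2^x, handling overflow, underflow,
# and subnormals in a single call.
myexp2_ldexp(x::Integer) = ldexp(1.0, x)

@assert myexp2_ldexp(10) == 1024.0
@assert myexp2_ldexp(-1074) == 5.0e-324   # smallest positive subnormal
@assert myexp2_ldexp(1024) == Inf          # overflow
```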

Can I suggest, though, that to make it a bit clearer, you use the (non-exported) functions exponent_bias and significand_bits defined in base/float.jl?

@kshyatt kshyatt added performance Must go faster maths Mathematical functions labels Jul 14, 2016
@oxinabox
Contributor Author

oxinabox commented Jul 15, 2016

I've added a few more comparisons to the gist:
https://gist.github.com/oxinabox/6e72640222357fa10ca94bdd8c2140a6

The timings are:

  • t_power(-2048:2048): 1.577959 seconds
  • t_exp2(-2048:2048): 1.320970 seconds
  • t_ldexp(-2048:2048): 0.416793 seconds
  • t_myexp2_convert_inv(-2048:2048): 0.199010 seconds
  • t_myexp2_direct_subnormals(-2048:2048): 0.195087 seconds

The earlier, higher comparative performance of the power-based method (relative to exp2) was, I think, related to the particular range, and/or to its flushing subnormals to 0.

@simonbyrne I will take a look at those functions. Thanks.

@eschnett
Contributor

If you are interested, you can also produce a fast_exp2 (to be used by @fastmath) that ignores subnormals, infinities, and NaNs.
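A fast variant along these lines might look like the following (a sketch under the assumption that the caller guarantees a normal-range argument; fast_exp2 is a hypothetical name, not code from the issue):

```julia
# No checks for overflow, subnormals, or NaN: valid only for -1022 <= x <= 1023.
fast_exp2(x::Int64) = reinterpret(Float64, (x + 1023) << 52)

@assert fast_exp2(0) == 1.0
@assert fast_exp2(10) == 1024.0
@assert fast_exp2(-1022) == 2.0^-1022   # smallest normal
```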

@oxinabox
Contributor Author

oxinabox commented Jul 16, 2016

@eschnett I was planning to, but it is not on the list of allowed transformations in fastmath.jl, so I'm not sure now.

@simonbyrne I put in the nicer way with exponent_bias and significand_bits. It is much nicer.
It incurs a notable slowdown, because exponent_bias does not inline (even though it is in fact a constant). I'll add an @inline around it when I make the PR; just waiting for Julia to rebuild on my machine.

I updated the gist
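The exponent_bias/significand_bits version described above might look roughly like this (a sketch for the normal range only, not the actual gist code; myexp2_bias is a name of my choosing):

```julia
# Base.exponent_bias(Float64) == 1023 and Base.significand_bits(Float64) == 52
# are the non-exported helpers from base/float.jl that simonbyrne suggested.
@inline function myexp2_bias(x::Int64)
    reinterpret(Float64,
                (x + Base.exponent_bias(Float64)) << Base.significand_bits(Float64))
end

@assert myexp2_bias(10) == 1024.0
@assert myexp2_bias(-1022) == 2.0^-1022
```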

@eschnett
Contributor

@oxinabox Indeed, LLVM's fastmath flag doesn't allow flushing subnormals to zero. Apparently this is not a compile-time but rather a run-time setting, similar to a rounding mode. I think that this is an optimization well worth considering, but as you say, it's not in scope here.

Well, you can still avoid handling inf.

simonbyrne added a commit that referenced this issue Aug 1, 2016