Float16+Integer does extra conversions #29889

Closed
PeterJacko opened this issue Nov 1, 2018 · 6 comments
Labels
performance Must go faster

Comments

@PeterJacko

These performance times make a lot of sense:
@timed value_16 = zeros( Float16 , 10^8 ) takes about 0.1 sec

@timed value_32 = zeros( Float32 , 10^8 ) takes about 0.2 sec

@timed value_64 = zeros( Float64 , 10^8 ) takes about 0.4 sec

But these ones don't (Float16 is extremely slow while it should be faster than Float32):

mult = 1
@timed value_16 .*= mult takes about 2.1 sec

@timed value_32 .*= mult takes about 0.06 sec

@timed value_64 .*= mult takes about 0.08 sec

Float16 gets faster, but Float32 gets slower, when defining mult as Float64:

mult = 1.0
@timed value_16 .*= mult takes about 1.2 sec

@timed value_32 .*= mult takes about 0.09 sec

@timed value_64 .*= mult takes about 0.08 sec

I have tried mult = Float16( 1.0 ), but it makes it even slower...

The same happens with .+= .
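
For anyone who wants to reproduce the comparison, the measurements above can be gathered into one script. This is only a sketch: the helper name `time_scale` is made up for illustration, `n` is reduced from the `10^8` used above so it runs quickly, and a warm-up call is added so that compilation time is not counted. Absolute timings will differ by machine and Julia version.

```julia
# Reproduction sketch; n is smaller than the 10^8 used in the report.
n = 10^6

# Hypothetical helper: time an in-place broadcast multiply of a
# zero-filled vector with element type T by the scalar `mult`.
function time_scale(::Type{T}, mult, n) where {T}
    v = zeros(T, n)
    v .*= mult                  # warm-up so compilation is not timed
    return @elapsed v .*= mult  # seconds for the second, compiled run
end

for T in (Float16, Float32, Float64), mult in (1, 1.0, Float16(1.0))
    t = time_scale(T, mult, n)
    println(rpad(string(T), 8), " .*= ", rpad(repr(mult), 14), " ", t, " s")
end
```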

julia> versioninfo()
Julia Version 1.0.1
Commit 0d713926f8 (2018-09-29 19:05 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = "C:\JuliaPro-1.0.1.1\app-1.29.0\atom.exe" -a
  JULIA_NUM_THREADS = 2
  JULIA_PKG_SERVER = https://pkg.juliacomputing.com/
  JULIA_PKG_TOKEN_PATH = C:\Users\Peter\.julia\token.toml
@KristofferC
Member

> But these ones don't (Float16 is extremely slow while it should be faster than Float32):

Why?

@JeffBezanson
Member

Float16 lacks hardware support, so a lot of time is spent in software routines that convert between Float16 and other types. Each multiply converts from Float16 to another type to do the arithmetic, then converts the result back to Float16 for storage.
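
A minimal sketch of that round trip, assuming Float32 is the intermediate type (as the later comments in this thread suggest). The values are chosen so the product is exactly representable in Float16, which means the manual widen, multiply, narrow sequence should agree with the built-in operator.

```julia
x = Float16(1.5)
y = Float16(2.25)

# Spell out the conversions by hand:
# widen both operands, multiply in Float32, narrow the result for storage.
manual = Float16(Float32(x) * Float32(y))

# The built-in Float16 multiply gives the same value for these inputs.
builtin = x * y

println(manual == builtin)
println(typeof(builtin))
```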

@PeterJacko
Author

> But these ones don't (Float16 is extremely slow while it should be faster than Float32):

> Why?

I guess if the performance doesn't deteriorate by changing 64 to 32, it should be possible to achieve a similar performance (at least not to make it 20x slower) when moving from 32 to 16.

For me, at least, the reason to use Float16 is to save memory and time at the cost of precision.

@timholy
Member

timholy commented Nov 1, 2018

You can only save memory, not time. Because of the lack of hardware support, it will perforce be slower. zeros is a special case because there's a generic bitwise representation for floating-point 0.

See https://en.wikipedia.org/wiki/Floating-point_unit: we don't get to pick which precisions FPUs support. Float64 and Float32 are all we've got.
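
A small check of both points, as a sketch: positive zero is the all-zero bit pattern at every width, so filling an array with zeros needs no floating-point conversion, and a Float16 array does occupy half the bytes of a Float32 array of the same length.

```julia
# Positive zero has every bit clear, regardless of precision.
bits16 = reinterpret(UInt16, zero(Float16))
bits32 = reinterpret(UInt32, zero(Float32))
bits64 = reinterpret(UInt64, zero(Float64))
println((bits16, bits32, bits64))

# The memory saving is real: sizeof on an Array reports the bytes of its data.
n = 1000
println(sizeof(zeros(Float16, n)), " bytes for Float16")
println(sizeof(zeros(Float32, n)), " bytes for Float32")
println(sizeof(zeros(Float64, n)), " bytes for Float64")
```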

@KristofferC
Member

KristofferC commented Nov 1, 2018

> I guess if the performance doesn't deteriorate by changing 64 to 32, it should be possible to achieve a similar performance (at least not to make it 20x slower) when moving from 32 to 16.

That is a rather optimistic generalization.

#26381 might be interesting to look at.

@JeffBezanson changed the title from "Very slow operations on Float16" to "Float16+Integer does extra conversions" on Nov 6, 2018
@JeffBezanson added the "performance" (Must go faster) label on Nov 6, 2018
@JeffBezanson
Member

Renamed to reflect what I believe is the remaining actionable item: combining Float16s with integers is slower than necessary because the integer is first converted to Float16 and then to Float32 for the computation. We'll need specialized methods to fix that.
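
To make the extra conversion concrete, here is a rough sketch of the two paths. Both function names are hypothetical and neither is the actual method in Base; `via_promotion` mimics the path described above (integer to Float16, then the Float16 multiply), while `direct` shows the kind of specialization that would skip the intermediate Float16.

```julia
# Hypothetical: the integer is first rounded to Float16, and the Float16
# multiply then widens both operands again to do the arithmetic.
via_promotion(x::Float16, n::Integer) = x * Float16(n)

# Hypothetical specialization: convert the integer straight to Float32,
# multiply, and narrow once at the end.
direct(x::Float16, n::Integer) = Float16(Float32(x) * Float32(n))

x = Float16(1.5)
println(via_promotion(x, 3))
println(direct(x, 3))
```

For small integers that Float16 represents exactly, the two paths should agree. For integers outside that range the results can differ, because the first path rounds the integer to Float16 before multiplying, so a real fix would have to decide which behavior is intended.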


5 participants