speed up Float16 conversions a bit #29891

JeffBezanson · 2018-11-01T17:21:40Z

For me this speeds up the first benchmark in #29889 by about 20%, by removing all the unnecessary error checks. Hardly game-changing, but might as well.

JeffBezanson · 2018-11-01T17:23:42Z

Before:

julia> @time value_16 .*= mult;
  2.315576 seconds (6 allocations: 208 bytes)

After:

julia> @time value_16 .*= mult;
  1.801816 seconds (6 allocations: 208 bytes)

JeffBezanson · 2018-11-01T18:15:16Z

The tables are quite small, and can be made immutable and inlined, leading to

julia> @time value_16 .*= mult;
  1.206085 seconds (6 allocations: 208 bytes)

Almost 2x speedup. Now we're cooking!
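
To illustrate the idea of an immutable, inlinable table, here is a minimal sketch. It assumes a 512-entry table indexed by the sign and exponent bits of a Float32. The names `table_vec`, `table_tup`, `lookup_vec` and `lookup_tup` are hypothetical, and the table contents are placeholders, not the real conversion constants.

```julia
# Hypothetical 512-entry lookup table with placeholder contents
# (NOT the actual Float16 conversion constants).
const table_vec = UInt16[UInt16(i) for i in 0:511]  # mutable, heap-allocated Vector
const table_tup = Tuple(table_vec)                   # immutable NTuple{512,UInt16}

# Index with the top 9 bits (sign + exponent) of a Float32 bit pattern.
@inline lookup_vec(bits::UInt32) = @inbounds table_vec[(bits >> 23) + 1]
@inline lookup_tup(bits::UInt32) = @inbounds table_tup[(bits >> 23) + 1]

bits = reinterpret(UInt32, 1.0f0)
lookup_vec(bits) == lookup_tup(bits)  # both lookups return the same entry
```

Because the tuple's contents can never change, the compiler is free to treat the table as constant data, whereas a `Vector` lookup has to go through a heap-allocated array. Whether that accounts for the whole difference measured above is not something this sketch shows.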

PeterJacko · 2018-11-02T09:07:07Z

> The tables are quite small, and can be made immutable and inlined, leading to
>
> julia> @time value_16 .*= mult;
>   1.206085 seconds (6 allocations: 208 bytes)
>
> Almost 2x speedup. Now we're cooking!

Many thanks! Are you achieving the same speed-up also for mult = 1.0 rather than mult = 1? (See the second benchmark in #29889.) It seems that it might make sense to convert an integer multiplier to float to perform the multiplication.

StefanKarpinski · 2018-11-02T13:38:46Z

> It seems that it might make sense to convert an integer multiplier to float to perform the multiplication.

This has to happen internally anyway—there's no way to directly multiply an int and a float (except for special cases with constants like 2x which can get implemented as x+x instead of a multiplication). The thing to avoid is converting the integer to a Float16 first via promotion and then back to Float32 for the multiplication. It will be really nice when LLVM supports Float16 directly.
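
The two conversion paths described here can be sketched as follows. `mul_via_float32` is a hypothetical helper written for illustration, not a method in Base, and the sketch assumes Float16 arithmetic is carried out by widening to Float32.

```julia
x = Float16(1.5)
m = 3

# Generic promotion converts the integer to Float16 first, so the multiply
# then has to widen both operands to Float32 again.
promote(x, m)  # both values come back as Float16

# Hypothetical helper: convert each operand to Float32 once, multiply there,
# and round to Float16 only at the end.
mul_via_float32(x::Float16, m::Integer) = Float16(Float32(x) * Float32(m))

mul_via_float32(x, m) == x * m  # the two paths agree for these inputs
```

The two paths are not interchangeable in general. When the integer does not fit in a Float16 (for example 100000, which overflows to `Inf16` under promotion), the promoted path rounds it before multiplying, while the direct path can still produce a finite result. Any special method along these lines would have to decide which behavior is wanted.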

JeffBezanson · 2018-11-02T17:44:43Z

There should also be some speedup for mult = 1.0 since it also requires Float16 conversions. But it needs fewer (which is why it's faster than the integer case), since:

> The thing to avoid is converting the integer to a Float16 first via promotion and then back to Float32 for the multiplication.

Indeed we seem to be doing that. I guess we'll need a bunch of special methods to handle it better.

JeffBezanson added the performance Must go faster label Nov 1, 2018

speed up Float16 conversions a bit

817de1d

JeffBezanson force-pushed the jb/float16conv branch from 99e58b2 to 817de1d Compare November 1, 2018 18:14

JeffBezanson merged commit 0d4edb3 into master Nov 6, 2018

JeffBezanson deleted the jb/float16conv branch November 6, 2018 23:15

tkf pushed a commit to tkf/julia that referenced this pull request Nov 21, 2018

speed up Float16 conversions a bit (JuliaLang#29891)

f313d98
