Float16+Integer does extra conversions #29889
Comments
Why?
Float16 lacks hardware support, so a lot of time is spent in software routines that convert between Float16 and other types. Each multiply involves a conversion from Float16 to another type to do the arithmetic, and then a conversion back to Float16 for storage.
I would guess that, since performance does not deteriorate when going from 64 to 32 bits, it should be possible to get similar performance (or at least not a 20x slowdown) when going from 32 to 16. For me, at least, the reason to use Float16 is to save memory and time at the cost of precision.
You can only save memory, not time. Because of the lack of hardware support, it will perforce be slower. See https://en.wikipedia.org/wiki/Floating-point_unit. We don't get to pick which precisions the hardware supports.
That is a bit of an optimistic generalization. #26381 might be interesting to look at.
Renamed to reflect what I believe is the remaining actionable item, which is that combining Float16s with integers is slower than necessary due to first converting the integer to Float16, then to Float32 to compute. We'll need specialized methods to fix that.
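To make the extra conversion concrete, here is a minimal sketch of what a specialized method might look like. The helper name mul_via_float32 is hypothetical and not part of Base; it only illustrates widening both operands straight to Float32, so the integer is never converted to Float16 first. Note that this is not bit-identical to the default path: an integer that Float16 cannot represent exactly (or that overflows it) is rounded by the default promotion but kept exact here.

```julia
# Default promotion rule: Float16 combined with an Int promotes to Float16,
# so the integer is converted to Float16 before any arithmetic happens.
promoted = promote_type(Float16, Int)

# Hypothetical helper (an assumption for illustration, not Base API):
# widen each operand to Float32 once, multiply, and round back to Float16.
function mul_via_float32(x::Float16, n::Integer)
    return Float16(Float32(x) * Float32(n))
end

values16 = zeros(Float16, 10^6)
mult = 1

# Broadcast the helper over the array in place, instead of `values16 .*= mult`.
values16 .= mul_via_float32.(values16, mult)
```

Whether this is actually faster would need to be measured on the Julia version in question; the sketch only shows where the Integer to Float16 step could be skipped.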
These performance times make a lot of sense:
@timed value_16 = zeros( Float16 , 10^8 )
takes about 0.1 sec
@timed value_32 = zeros( Float32 , 10^8 )
takes about 0.2 sec
@timed value_64 = zeros( Float64 , 10^8 )
takes about 0.4 sec
But these ones don't (Float16 is extremely slow, while it should be faster than Float32):
mult = 1
@timed value_16 .*= mult
takes about 2.1 sec
@timed value_32 .*= mult
takes about 0.06 sec
@timed value_64 .*= mult
takes about 0.08 sec
Float16 gets faster, but Float32 gets slower, when mult is defined as a Float64:
mult = 1.0
@timed value_16 .*= mult
takes about 1.2 sec
@timed value_32 .*= mult
takes about 0.09 sec
@timed value_64 .*= mult
takes about 0.08 sec
I have tried
mult = Float16( 1.0 )
but it makes it even slower. The same happens with the .+= operator.