A number format that you can count with your fingers.

JuliaMath/Float8s.jl

Latest commit: Milan K, Feb 13, 2020 (0eac15f) · 26 commits

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

Float8s.jl

Finally a number type that you can count with your fingers. Super Mario and Zelda would be proud.

Comes in two flavours: Float8 has 3 exponent bits and 4 fraction bits; Float8_4 has 4 exponent bits and 3 fraction bits. Each has 1 sign bit in addition. Both rely on conversion to Float32 to perform any arithmetic operation, similar to Float16.
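
As a rough illustration of what these two bit layouts mean, the sketch below decodes an 8-bit pattern into a Float32 using only Base Julia. It assumes an IEEE-style layout: 1 sign bit, then the exponent bits, then the fraction bits, with exponent bias 2^(nexp-1)-1 and the all-ones exponent reserved for Inf and NaN. `decode8` is a hypothetical helper, not part of Float8s.jl, and the package's actual implementation may differ.

```julia
# Hypothetical helper (not part of Float8s.jl): decode an 8-bit pattern
# under an ASSUMED IEEE-style layout with `nexp` exponent bits.
function decode8(bits::UInt8; nexp::Int=3)
    nfrac = 7 - nexp                        # bits left for the fraction
    bias  = 2^(nexp - 1) - 1                # assumed IEEE-style bias
    sign  = (bits >> 7) == 0x01 ? -1f0 : 1f0
    e     = Int((bits >> nfrac) & (0xff >> (8 - nexp)))  # exponent field
    f     = Int(bits & (0xff >> (8 - nfrac)))            # fraction field
    if e == 2^nexp - 1                      # all-ones exponent: Inf or NaN
        return f == 0 ? sign * Inf32 : NaN32
    elseif e == 0                           # subnormal: no implicit leading 1
        return sign * Float32(f) / 2f0^nfrac * 2f0^(1 - bias)
    else                                    # normal number
        return sign * (1 + Float32(f) / 2f0^nfrac) * 2f0^(e - bias)
    end
end

decode8(0x50)            # 3 exponent bits: 0 101 0000
decode8(0x49)            # 3 exponent bits: 0 100 1001
decode8(0x38; nexp=4)    # 4 exponent bits: 0 0111 000
```

Under these assumptions the three calls give 4.0, 3.125 and 1.0, and the largest finite value of the 3-exponent-bit layout is `decode8(0x6f)`, which is 15.5.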

Example use

julia> using Float8s

julia> a = Float8(4)
Float8(4.0)

julia> b = Float8(3.14159)
Float8(3.125)

julia> a+b
Float8(7.0)

julia> sqrt(a)
Float8(2.0)

julia> a^2
Inf8
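
The results above are consistent with rounding a Float32 result to the nearest value that has 4 fraction bits, and overflowing to infinity past the largest finite value. The sketch below imitates that on plain Float32 values. The function name `quantize`, the round-to-nearest (ties-to-even) mode, and the largest finite value 15.5 are assumptions made for illustration, not taken from Float8s.jl; subnormals are ignored.

```julia
# Hypothetical helper (not part of Float8s.jl): round x to the nearest value
# with `nfrac` fraction bits; overflow to Inf above an ASSUMED maximum `fmax`.
function quantize(x::Float32; nfrac::Int=4, fmax::Float32=15.5f0)
    (isnan(x) || isinf(x) || x == 0) && return x
    e   = exponent(x)                 # x = m * 2^e with 1 <= |m| < 2
    ulp = 2f0^(e - nfrac)             # spacing of representable values near x
    y   = round(x / ulp) * ulp        # round to nearest, ties to even
    return abs(y) > fmax ? copysign(Inf32, x) : y
end

b = quantize(3.14159f0)               # 3.125
s = quantize(4f0 + b)                 # 7.125 is a tie between 7.0 and 7.25
p = quantize(4f0^2)                   # 16 exceeds the assumed maximum
```

Here `s` comes out as 7.0 and `p` as Inf32, matching `a+b` and `a^2` in the example.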

Most arithmetic operations are implemented. If you would like to have an additional feature, raise an issue.

Installation

Float8s.jl is not yet registered. For the time being, install it directly from the repository URL:

(v1.3) pkg> add https://github.com/milankl/Float8s.jl

Benchmarking

julia> using BenchmarkTools

julia> A = Float8.(randn(300,300));

julia> @btime Float32.($A);
  413.303 μs (2 allocations: 351.64 KiB)

julia> 413.303/300^2*1000
4.592255555555555

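Conversions from Float8 to Float32 take about 4.6 ns per element. Conversions in the other direction, from Float32 to Float8, take about 10.6 ns per element, which is roughly 2.3x slower, and about 1.4x slower than converting Float32 to Float16 (about 7.5 ns per element):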

julia> A = Float32.(randn(300,300));

julia> @btime Float16.($A);
  674.123 μs (2 allocations: 175.89 KiB)

julia> @btime Float8.($A);
  955.196 μs (2 allocations: 88.02 KiB)
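
The per-element arithmetic can be applied to all three timings in the same way. A small sketch that uses only the numbers printed by `@btime` above (the helper name `per_element_ns` is made up for illustration):

```julia
# Timings in microseconds, copied from the @btime output above
t_f8_to_f32  = 413.303
t_f32_to_f16 = 674.123
t_f32_to_f8  = 955.196

n = 300^2                                   # elements in a 300x300 array
per_element_ns(t_us) = t_us / n * 1000      # microseconds total -> ns per element

ns = per_element_ns.((t_f8_to_f32, t_f32_to_f16, t_f32_to_f8))
ratio_vs_f8_to_f32  = t_f32_to_f8 / t_f8_to_f32    # Float32->Float8 vs Float8->Float32
ratio_vs_f32_to_f16 = t_f32_to_f8 / t_f32_to_f16   # Float32->Float8 vs Float32->Float16
```

This gives roughly 4.6, 7.5 and 10.6 ns per element, with ratios of about 2.3 and 1.4. These are single measurements on one machine, so the ratios are indicative only.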