Floating Point Precision Issues
Julia Sloan edited this page Oct 19, 2023 · 8 revisions
We aim to use both Float64 and Float32 precision in CliMA's ESM (Earth system model). Here is a summary of some challenges and things to be aware of:
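- Float32 (single precision): a 32-bit float can represent about 7 significant decimal digits (log10(2^24) ≈ 7.2)
  - 1 sign bit, 8 exponent bits, 23 mantissa/fraction bits (plus one implicit leading bit, giving 24 bits of precision)
- Float64 (double precision): about 16 significant decimal digits (log10(2^53) ≈ 15.95)
  - 1 sign bit, 11 exponent bits, 52 mantissa/fraction bits (plus one implicit leading bit, giving 53 bits of precision)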
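The digit counts in the list above can be checked directly in Julia. The following is a minimal sketch using only Base functions (`precision`, `bitstring`); the variable names are illustrative and do not come from any CliMA package.

```julia
# Significand precision in bits, including the implicit leading bit
p32 = precision(Float32)
p64 = precision(Float64)

# Equivalent number of significant decimal digits: bits * log10(2)
digits32 = p32 * log10(2)   # roughly 7.2
digits64 = p64 * log10(2)   # roughly 15.95

# Raw bit layout of a Float32: sign | exponent | fraction
bits = bitstring(1.0f0)
sign_bit      = bits[1:1]    # 1 bit
exponent_bits = bits[2:9]    # 8 bits
fraction_bits = bits[10:32]  # 23 bits
```

Note that `precision` reports 24 and 53 bits rather than 23 and 52, because it counts the implicit leading bit that is not stored.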
julia> eps(zero(Float64))
5.0e-324
julia> eps(one(Float64))
2.220446049250313e-16
julia> eps(zero(Float32))
1.0f-45
julia> eps(one(Float32))
1.1920929f-7
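The REPL output above shows that `eps` depends on the value it is evaluated at: the gap between neighbouring representable numbers grows with magnitude. A small sketch of the practical consequence, again with illustrative variable names only:

```julia
# eps(x) is the spacing between x and the next larger representable value
gap_at_one = eps(1.0f0)   # 2^-23
gap_at_1e6 = eps(1.0f6)   # 2^-4
gap_at_1e8 = eps(1.0f8)   # 2^3

# An increment smaller than half the local gap is lost entirely
lost_in_f32 = (1.0f8 + 1.0f0) == 1.0f8   # the Float32 sum rounds back to 1.0f8
lost_in_f64 = (1.0e8 + 1.0) == 1.0e8     # the Float64 sum is still distinguishable
```

In Float32 the addition at 1e8 is absorbed, while the same operation in Float64 is not.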
https://github.com/CliMA/ClimaCoupler.jl/issues/271
- (discovered using the dss! callback)
- we tried setting all time variables to Float32, but `integrator.t` gets converted somewhere to Float64 during step!. For now, `t` always needs to be stored as a `Float64` because `Float32` does not have enough bits to accurately track time without roundoff error.
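To illustrate why a Float32 time variable is problematic, here is a minimal sketch that accumulates time by repeated addition. The helper `accumulate_time`, the step size, and the step count are hypothetical choices for illustration; they are not taken from ClimaCoupler or from the integrator it uses.

```julia
# Hypothetical helper (not part of ClimaCoupler): advance time by repeated addition
function accumulate_time(::Type{FT}, dt, nsteps) where {FT}
    t = zero(FT)
    step = FT(dt)
    for _ in 1:nsteps
        t += step
    end
    return t
end

nsteps = 1_000_000
t_exact = 100_000.0   # 1e6 steps of 0.1 s

# Absolute drift of the accumulated time from the exact value
err32 = abs(Float64(accumulate_time(Float32, 0.1, nsteps)) - t_exact)
err64 = abs(accumulate_time(Float64, 0.1, nsteps) - t_exact)

# Once t reaches 2^24 s (about 194 days), a 1 s step no longer changes a Float32 time
t_stall = Float32(2^24)
stalled = (t_stall + 1.0f0) == t_stall
```

Under these assumptions the Float32 drift is many orders of magnitude larger than the Float64 drift, and for long simulations the Float32 clock can stop advancing altogether.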
Refs
- see the Julia docs for `eps()` for more info on how the spacing between representable values depends on the magnitude of its argument
- mixed precision: https://blogs.nvidia.com/blog/2019/11/15/whats-the-difference-between-single-double-multi-and-mixed-precision-computing/