Could we have a RoundingMode that doesn't throw InexactError #51069
Comments
https://docs.julialang.org/en/v1/base/math/#Base.unsafe_trunc |
So perhaps this functionality should be incorporated into |
I think it would be appropriate to define this in a package with

    struct NoThrow{R} end
    Base.round(::Type{T}, x, ::RoundingMode{NoThrow{R}}) where {T, R} =
        convert(T, clamp(round(x, RoundingMode{R}()), typemin(T), typemax(T)))
    const Nearest = RoundingMode{NoThrow{:Nearest}}()

    using Test
    @test round(UInt8, +Inf, Nearest) === 0xff
    @test round(UInt8, +1000.0, Nearest) === 0xff
    @test round(UInt8, -1000.0, Nearest) === 0x00
    @test round(UInt8, -Inf, Nearest) === 0x00
The behavior of
|
In regards to |
With the implementation I gave, NaN still throws an InexactError. In signal processing and control systems, you may not want to have to manually deal with values that are too big or too small, but a NaN should probably still be explicitly dealt with. With graphics, 0 is probably okay.
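To make the context dependence concrete, here is a minimal sketch in which the NaN policy is an explicit choice of the caller. The function name `saturating_round` and its `nan` keyword are hypothetical, not anything in Base or in a package I know of. It is deliberately a standalone function rather than a new `Base.round` method. I have only thought it through for small targets such as UInt8; for 64-bit integer targets `typemax(T)` is not exactly representable as a Float64, so the clamp-then-convert step would need more care.

```julia
# Hypothetical helper (not part of Base): round, clamp to T's range, and
# let the caller decide what a NaN input should become.
function saturating_round(::Type{T}, x::AbstractFloat,
                          r::RoundingMode=RoundNearest; nan=nothing) where {T<:Integer}
    if isnan(x)
        # Default policy: refuse NaN, as the convert-based version does.
        nan === nothing && throw(DomainError(x, "NaN has no nearest integer"))
        # Caller-supplied replacement, e.g. nan=0 for graphics.
        return convert(T, nan)
    end
    # After clamping, the value is an integer-valued float inside T's range.
    return convert(T, clamp(round(x, r), typemin(T), typemax(T)))
end

pixel  = saturating_round(UInt8, NaN; nan=0)   # graphics-style: NaN becomes 0
sample = saturating_round(UInt8, 1000.0)       # saturates at the top of the range
```

A control-systems caller would simply leave `nan` unset and handle the resulting exception.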
So it's context dependent - then I'd vote for not having this in Base, since picking a preference when Base doesn't use it internally seems odd. |
That's a good argument, but isn't it also an argument for not having rounding in Base at all? :) [In the sense: I didn't check carefully, but I'm pretty sure Julia doesn't use round itself, with maybe one exception, except to implement functions it also doesn't use.]
Don't we have packages for graphics, or fixed-point, where this belongs then? For the former also, if not the same package.
Base primarily uses rounding to implement mathematical functions like integer division, trig, exp, and rem_pio2. If I were doing it all over, I would probably not have any of those in base. |
The problem here is that this causes duplicate overflow/clamp test branches: one in clamp and a redundant one in convert. Are we confident the compiler is able to reliably eliminate these in all cases? [P.S.: another problem of this kind of construct, where rounding and converting are separate operations, is that it doesn't provide control over the rounding method used when converting e.g. a I still believe it makes a lot of sense to treat rounding, clamping and converting as one single basic operation, both in terms of performance, flexibility and accuracy. (Treatment of |
Treating all three as a single operation significantly expands the cross-product of possible inputs. Whenever possible, it is better to provide simple, generic implementations that can compose to be as efficient and precise as optimized integrated implementations. If we swap out convert for unsafe_trunc:

    using BenchmarkTools  # provides @btime

    struct Clamped{R} end
    Base.round(::Type{T}, x, ::RoundingMode{Clamped{R}}) where {T, R} =
        unsafe_trunc(T, clamp(round(x, RoundingMode{R}()), typemin(T), typemax(T)))
    const Nearest = RoundingMode{Clamped{:Nearest}}()

    @btime round(UInt8, x, Nearest) setup=(x=rand()*1000-500)
    # 1.417 ns (0 allocations: 0 bytes)
    @btime Base.fptoui(UInt8, x) setup=(x=rand()*100)
    # 1.416 ns (0 allocations: 0 bytes)
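One way to sanity-check that dropping the range check is harmless is to compare the checked and unchecked conversions on inputs around the edges of the UInt8 range. The sketch below uses plain helper functions with made-up names (`via_convert`, `via_unsafe_trunc`) rather than new `round` methods, and it only covers non-NaN inputs. For NaN the two differ: `convert` throws, while the result of `unsafe_trunc` is unspecified.

```julia
# Checked conversion after clamping (the convert-based variant).
via_convert(x) =
    convert(UInt8, clamp(round(x), typemin(UInt8), typemax(UInt8)))

# Unchecked conversion after clamping (the unsafe_trunc-based variant).
via_unsafe_trunc(x) =
    unsafe_trunc(UInt8, clamp(round(x), typemin(UInt8), typemax(UInt8)))

# Values below, inside, at the edges of, and above the UInt8 range.
xs = [-Inf, -1000.0, -0.5, 0.0, 0.5, 1.5, 254.5, 255.0, 255.5, 1000.0, Inf]

# Once the value has been clamped, the range check in convert can never fail,
# so both variants should return identical results.
agree = all(via_convert(x) === via_unsafe_trunc(x) for x in xs)
```

Whether the compiler really removes the redundant branch in the checked variant can be inspected with `@code_llvm` from InteractiveUtils on each function.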
Throwing on NaN would hurt performance, but returning zero on NaN inputs would be possible without performance overhead with

    Base.round(::Type{UInt8}, x::Float64, ::RoundingMode{Clamped{R}}) where R =
        ccall("llvm.fptoui.sat.i8.f64", llvmcall, UInt8, (Float64,), round(x, RoundingMode{R}()))

integrating truncation and clamping (but still not needing to integrate rounding). All this can still be implemented in a package.
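For readers who don't know the saturating LLVM conversion, here is a plain-Julia model of what it is documented to do for a Float64 to UInt8 conversion: NaN maps to zero, out-of-range values saturate, and everything else is truncated toward zero. The name `fptoui_sat_model` is made up for illustration, and the sketch does not call the intrinsic itself, so it says nothing about performance.

```julia
# Reference model (hypothetical name) of a saturating Float64 -> UInt8
# conversion: NaN -> 0, below range -> 0, above range -> 255, else truncate.
function fptoui_sat_model(x::Float64)
    isnan(x) && return 0x00          # NaN maps to zero
    x <= 0   && return 0x00          # negative values saturate at typemin
    x >= 255 && return 0xff          # large values saturate at typemax
    return unsafe_trunc(UInt8, x)    # in range: truncate toward zero
end

nan_case  = fptoui_sat_model(NaN)
low_case  = fptoui_sat_model(-Inf)
high_case = fptoui_sat_model(1000.0)
mid_case  = fptoui_sat_model(12.9)
```

In the method above the input has already been rounded, so the truncation step never changes the value; only the saturation and the NaN handling matter.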
Could we also have a mode that doesn't throw InexactError, and always simply gives us the nearest element of the output type, as in the case where 0xff and 0x00 are the nearest UInt8 values to +Inf, 1000.0, -1000.0 and -Inf, respectively? I can think of a lot of applications (signal processing, control systems, graphics) where you never want to have to manually deal with an InexactError, and just want to use whatever value is nearest. (That's how MATLAB converts floats to integers.)

Originally posted by @mgkuhn in #50812 (comment)
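For reference, a small sketch of the behaviour this request is about, using only Base functions plus one hypothetical helper (`throws_inexact`). It shows that rounding an out-of-range or infinite float to UInt8 throws, and that clamping by hand first yields the nearest representable value.

```julia
# Hypothetical helper: true if calling f() raises an InexactError.
function throws_inexact(f)
    try
        f()
        return false
    catch err
        return err isa InexactError
    end
end

# Out-of-range and infinite inputs throw when rounded to UInt8.
out_of_range = throws_inexact(() -> round(UInt8, 1000.0))
infinite     = throws_inexact(() -> round(UInt8, Inf))

# Clamping manually before rounding gives the nearest UInt8 instead.
nearest_hi = round(UInt8, clamp(1000.0, 0, 255))
nearest_lo = round(UInt8, clamp(-Inf, 0, 255))
```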