In PIConGPU we discussed the influence of fast math on important functions such as trigonometric functions in ComputationalRadiationPhysics/picongpu#1489. The main problem is that some restrictions, like
cosf == __cosf(x): For x in [-π,π], the maximum absolute error is 2^-21.19, and larger otherwise.
are only known to the implementer writing that cosine call (since he/she knows the range of the input). The cleanest way to expose that to the user, without using the shotgun approach of compiler flags, could be:
An additional template argument for math functions like sqrt and sin that explicitly enables fast-math when the argument range and the expected value range are guaranteed to tolerate the reduced precision. Fast-math stays off by default: PMacc::math::sin<T_FastMath=False>().
Of course, the absolute fast-math errors are hardware dependent, but they usually follow at least a similar scheme (such as an expected input range).
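To make the proposal concrete, here is a minimal host-only C++ sketch of such an opt-in template flag. Everything in it is an assumption for illustration: the namespace `sketch`, the helper `fastSinApprox` (a degree-7 Taylor polynomial standing in for a hardware intrinsic such as `__sinf`) and the dispatch struct `SinImpl` are hypothetical and not part of PMacc.

```cpp
#include <cmath>

namespace sketch
{
    // Hypothetical stand-in for a hardware fast-math intrinsic:
    // degree-7 Taylor polynomial of sin, only accurate for small |x|.
    inline float fastSinApprox(float x)
    {
        float const x2 = x * x;
        return x * (1.0f - x2 / 6.0f * (1.0f - x2 / 20.0f * (1.0f - x2 / 42.0f)));
    }

    // Default: precise implementation.
    template<bool T_FastMath>
    struct SinImpl
    {
        static float apply(float x)
        {
            return std::sin(x);
        }
    };

    // Opt-in: reduced precision, valid only on a restricted input range.
    template<>
    struct SinImpl<true>
    {
        static float apply(float x)
        {
            return fastSinApprox(x);
        }
    };

    // Fast math is off unless the caller asks for it at the call site.
    template<bool T_FastMath = false>
    inline float sin(float x)
    {
        return SinImpl<T_FastMath>::apply(x);
    }
} // namespace sketch
```

With this shape, `sketch::sin(x)` keeps the precise path, while `sketch::sin<true>(x)` documents at the call site that the author has checked the input range. In this sketch the fast path is close to the precise one for small arguments and diverges strongly for large ones, which mirrors the "larger otherwise" caveat quoted above.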