
make uniformRM (0, 1) for floats equivalent to uniformDouble01M / uniformFloat01M #166

Closed
Flupp opened this issue Aug 9, 2024 · 19 comments · Fixed by #172

Comments

@Flupp

Flupp commented Aug 9, 2024

uniformRM (l, h) for Float and Double is defined by drawing x from [0, 1] using uniformDouble01M / uniformFloat01M and then returning x * l + (1 - x) * h.

One might expect that uniformRM (0, 1) may produce the same values as uniformDouble01M / uniformFloat01M. However, this is not the case because of rounding errors when calculating 1 - x.

For a visualization of the rounding problem, try:

mapM_ print [ (x == x', x', x)
            | let m = fromIntegral (maxBound :: Word64) :: Double
            , i <- [0, 2 ^ 8 .. 2 ^ 14]
            , let x = fromIntegral i / m  -- resembles `uniformDouble01M`
            , let x' = 1 - (1 - x)
            ]

Note that the x' values have a lot of repetitions while each x is different.

An easy fix would be changing the calculation to (1 - x) * l + x * h. However, then the possible results of uniformRM (-1, 0) are reduced instead.
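
For a mirrored illustration, the same kind of snippet as above can be adapted to the alternative formula and the range (-1, 0); the repetitions then show up near the bound -1:

mapM_ print [ (x' + 1 == x, x', x)
            | let m = fromIntegral (maxBound :: Word64) :: Double
            , i <- [0, 2 ^ 8 .. 2 ^ 14]
            , let x = fromIntegral i / m       -- resembles `uniformDouble01M`
            , let x' = (1 - x) * (-1) + x * 0  -- the alternative formula applied to `uniformRM (-1, 0)`
            ]

Here the x' values repeat in the same way, so the alternative formula merely moves the problem to the other bound.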

A possible solution could be to first (mathematically) transform the two calculations as follows:

  1. x * l + (1 - x) * h = h + x * (l - h)
  2. (1 - x) * l + x * h = l + x * (h - l)

Then use calculation 2 when l is closer to zero than h, and calculation 1 otherwise. This may be implemented as follows:

instance UniformRange Float where  -- analogously for `Double`
  uniformRM (l, h)
    | l == h                       = const $ return l
    | isInfinite l || isInfinite h = const $ return $! h + l
      -- Optimisation exploiting absorption:
      --   (-Infinity) + (anything but +Infinity) = -Infinity
      --   (anything but -Infinity) + (+Infinity) = +Infinity
      --                (-Infinity) + (+Infinity) = NaN
    | let f = if abs l < abs h then (let d = h - l in \ x -> l + x * d)
                               else (let d = l - h in \ x -> h + x * d)
      = fmap f . uniformFloat01M
  {-# INLINE uniformRM #-}

(This also drops the binding for the second parameter of uniformRM in order to allow memoization of the bound preprocessing.)

This approach also slightly reduces the documented floating point number caveats, because this implementation guarantees that the bound that is closer to 0 is not exceeded.

Note: Obviously, this would break backwards compatibility as it changes the result of uniformRM for a given state of the RNG.

@Bodigrim
Contributor

Bodigrim commented Aug 9, 2024

One might expect that uniformRM (0, 1) may produce the same values as uniformDouble01M / uniformFloat01M. However, this is not the case because of rounding errors when calculating 1 - x.

I don't really feel that there are good reasons to expect that, especially if the only mismatches are numeric noise.

Note: Obviously, this would break backwards compatibility as it changes the result of uniformRM for a given state of the RNG.

Which means a major version bump and lots of busy work for everyone to bump their upper bounds. IMHO not worth it.

@chreekat

It would be interesting to see how the change affects statistical quality benchmarks like Diehard (used in splitmix).

@lehins
Contributor

lehins commented Dec 23, 2024

I was not originally involved in implementing floating point number generation; it was mostly @Shimuuar and @curiousleo who handled it when we did the rewrite in 2020. So, I had to do a bit of archaeology to try to figure out why the current implementation is the way it is.

What I found out was that (h - l) * x + l was the original implementation for scaling floating point values. It is common knowledge that it is a bad approach for generating floating point values, but everyone uses it. According to the references in the documentation, it is also what is used in C++, Java and Python, so many languages seem to suffer from it. Furthermore, I discovered @Bodigrim's comment that we should change that to h * x + (1 - x) * l.
What is missing from that comment is what the surprising results were that went away with this change.

Ironically, it was pointed out in that conversation that we should record this decision, but I can't find it being recorded anywhere in the documentation. Here is the PR that introduced the change: https://github.com/idontgetoutmuch/random/pull/169/files. Looking through the changes to the documentation still did not clarify for me which of the problems were solved by changing the formula.

@Bodigrim Could you shed some light on this for my own sanity? What did we get with h * x + (1 - x) * l that we didn't with (h - l) * x + l? We can ignore performance considerations and all of the other problems that both of those calculations have for the purpose of this conversation. I am only interested in why the former was suggested as better than the latter.

Here is a draft PR that implements the change: #172

My response to other comments in this ticket:

It would be interesting to see how the change affects statistical quality benchmarks like Diehard (used in splitmix).

Diehard tests, from what I know, check floating point generation in the (0, 1) range, while this issue is about the rounding error of scaling to a different range once a floating point value in (0, 1) has already been generated. So, I can't see how Diehard tests could be relevant here.

Which means a major version bump and lots of busy work for everyone to bump their upper bounds. IMHO not worth it.

I'll be releasing major version 1.3.0 in the next few days, which will have a bunch of other breaking changes, so this is no longer a problem.

I don't really feel that there are good reasons to expect that, especially if the only mismatches are numeric noise.

I do agree with this argument, however there are other parts of this ticket that I do like. In particular I like the simplification of the scaling formula and avoidance of some corner cases.

This also drops the binding for the second parameter of uniformRM in order to allow memoization of the bound preprocessing.

I dropped this minor optimization for the sake of clarity of the code. We can bring it back at any time if need be.

@lehins
Contributor

lehins commented Dec 23, 2024

I decided to play around a bit with sbv and was able to verify this claim, which is an improvement over what we have today IMHO:

This approach also slightly reduces the documented floating point number caveats, because this implementation guarantees that the bound that is closer to 0 is not exceeded.

Posting here my proof, so others can double check that I hopefully did not make any mistakes:

sat $ do
  let scalingFloat l h x = ite (abs l .< abs h) (l + x * (h - l)) (h + x * (l - h))

  (l, h, x) <- (,,) <$> free "l" <*> free "h" <*> free "x"
  constrain (x .>= 0)
  constrain (x .<= 1)
  let y = scalingFloat l h (x :: SFloat)
  -- Only care about normal values
  constrain (sNot (fpIsInfinite y))
  constrain (sNot (fpIsNaN y))
  constrain $
    ite (abs l .< abs h)
      -- l is closer to zero:
      (ite (l .< h)
        (y .< l) -- l is the true low  (find y that is lower than l)
        (l .< y) -- l is the high      (find y that is higher than l)
        )
      -- h is closer to zero:
      (ite (l .< h)
        (y .> h) -- h is the true high (find y that is higher than h)
        (y .< h) -- h is the low       (find y that is lower than h)
      )

@Flupp
Author

Flupp commented Dec 23, 2024

[…] Furthermore, I discovered @Bodigrim's comment that we should change that to h * x + (1 - x) * l. What is missing from that comment is what the surprising results were that went away with this change.

Very interesting finding. I would be very interested in the details too, because this comment argues exactly in the opposite direction of my proposal.

Ironically, it was pointed out in that conversation that we should record this decision, but I can't find it being recorded anywhere in the documentation. Here is the PR that introduced the change: https://github.com/idontgetoutmuch/random/pull/169/files. Looking through the changes to the documentation still did not clarify for me which of the problems were solved by changing the formula.

The commit message of 0a36e76 argues: “x * a + (1 - x) * b is slightly more numerically stable than the previous formula.” However, I do not understand how it is more stable.

[…]

Here is a draft PR that implements the change: #172

Looks good to me.

My response to other comments in this ticket:

It would be interesting to see how the change affects statistical quality benchmarks like Diehard (used in splitmix).

Diehard tests, from what I know, check floating point generation in the (0, 1) range, while this issue is about the rounding error of scaling to a different range once a floating point value in (0, 1) has already been generated. So, I can't see how Diehard tests could be relevant here.

Actually, I argue that even for the range (0, 1) some bits of randomness might be lost due to rounding.

[…]

I don't really feel that there are good reasons to expect that, especially if the only mismatches are numeric noise.

I do agree with this argument, […].

I could imagine that the “numeric noise” might actually be relevant if a user skews the distribution by applying a function to the drawn numbers. For example applying the n-th root (for large n). Not sure if this is a good example or if there are good examples at all. In general, I would be hesitant to throw away some randomness if there is no good reason.

This also drops the binding for the second parameter of uniformRM in order to allow memoization of the bound preprocessing.

I dropped this minor optimization for the sake of clarity of the code. We can bring it back at any time if need be.

Fair enough. The optimizer might be clever enough to do this anyways.

I decided to play around a bit with sbv and was able to verify this claim […]

Posting here my proof, so others can double check that I hopefully did not make any mistakes: […]

Nice! Unfortunately, I do not know sbv; still, it kind of looks sensible to me. As far as I can tell, you formulated the constraints in a way such that they are unsatisfiable, and I guess that was intentional so that a SAT solver can be used to verify this.

@Bodigrim
Contributor

Furthermore, I discovered @Bodigrim's comment that we should change that to h * x + (1 - x) * l. What is missing from that comment is what the surprising results were that went away with this change.

Ironically, it was pointed out in that conversation that we should record this decision, but I can't find it being recorded anywhere in the documentation. Here is the PR that introduced the change: https://github.com/idontgetoutmuch/random/pull/169/files. Looking through the changes to the documentation still did not clarify for me which of the problems were solved by changing the formula.

@Bodigrim Could you shed some light on this for my own sanity? What did we get with h * x + (1 - x) * l that we didn't with (h - l) * x + l?

One difference I can recall is that when l = -Inf and h = 0 the first formula gives -Inf, but the second gives NaN.
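
For instance, with x = 0.5 in GHCi:

ghci> let x = 0.5 :: Double
ghci> x * (-1/0) + (1 - x) * 0       -- x * l + (1 - x) * h with l = -Infinity, h = 0
-Infinity
ghci> (0 - (-1/0)) * x + (-1/0)      -- (h - l) * x + l with the same bounds
NaN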

@Shimuuar
Contributor

Shimuuar commented Dec 24, 2024

We have a solution for the loss-of-precision problem above. But we still have another numeric problem: uniformRM may go out of range for floats. Here is a possible solution. The idea is to generate a pair of a single bit and y ∈ [0, 0.5] and treat y as either x or 1 - x depending on the bit.

So the code for Float would look like this (omitting finiteness checks):

uniformRM (l,h) g = do
  w <- uniformWord32 g
  let x = fromIntegral (clearBit w 31) / 2^32 :: Float
  if | testBit w 31 -> return $! l + (h-l)*x
     | otherwise -> return $! h + (l-h)*x

This approach should not exceed the range on either end. In exchange we may get a very slight nonuniformity near x = 0.5, but I think that shouldn't be a problem.

P.S. That's vaguely inspired by https://arxiv.org/abs/1704.07949

@Flupp
Author

Flupp commented Dec 25, 2024

@Bodigrim wrote:

One difference I can recall is that when l = -Inf and h = 0 the first formula gives -Inf, but the second gives NaN.

Indeed, when using infinity for the bounds, the proposed implementation behaves differently:

-- scale floating old
sfOld :: Num a => a -> a -> a -> a
sfOld l h x = x * l + (1 - x) * h

-- scale floating as proposed
sfNew :: (Ord a, Num a) => a -> a -> a -> a
sfNew l h x = if abs l < abs h then l + x * (h - l) else h + x * (l - h)

sfOldVsNew
  = [(l, h, x, sfOld l h x, sfNew l h x)
    | let bounds = [(-1) / 0, 0, 1 / 0]
    , l <- bounds
    , h <- bounds
    , x <- [0, 0.5, 1 :: Float]
    ]

-- >>> sfOldVsNew
-- [ (-Infinity, -Infinity, 0.0,       NaN,       NaN)
-- , (-Infinity, -Infinity, 0.5, -Infinity,       NaN)  -- old better
-- , (-Infinity, -Infinity, 1.0,       NaN,       NaN)
-- , (-Infinity,       0.0, 0.0,       NaN,       NaN)
-- , (-Infinity,       0.0, 0.5, -Infinity, -Infinity)
-- , (-Infinity,       0.0, 1.0, -Infinity, -Infinity)
-- , (-Infinity,  Infinity, 0.0,       NaN,       NaN)
-- , (-Infinity,  Infinity, 0.5,       NaN,       NaN)
-- , (-Infinity,  Infinity, 1.0,       NaN,       NaN)
-- , (      0.0, -Infinity, 0.0, -Infinity,       NaN)  -- old better(?)
-- , (      0.0, -Infinity, 0.5, -Infinity, -Infinity)
-- , (      0.0, -Infinity, 1.0,       NaN, -Infinity)  -- new better(?)
-- , (      0.0,       0.0, 0.0,       0.0,       0.0)
-- , (      0.0,       0.0, 0.5,       0.0,       0.0)
-- , (      0.0,       0.0, 1.0,       0.0,       0.0)
-- , (      0.0,  Infinity, 0.0,  Infinity,       NaN)  -- old better(?)
-- , (      0.0,  Infinity, 0.5,  Infinity,  Infinity)
-- , (      0.0,  Infinity, 1.0,       NaN,  Infinity)  -- new better(?)
-- , ( Infinity, -Infinity, 0.0,       NaN,       NaN)
-- , ( Infinity, -Infinity, 0.5,       NaN,       NaN)
-- , ( Infinity, -Infinity, 1.0,       NaN,       NaN)
-- , ( Infinity,       0.0, 0.0,       NaN,       NaN)
-- , ( Infinity,       0.0, 0.5,  Infinity,  Infinity)
-- , ( Infinity,       0.0, 1.0,  Infinity,  Infinity)
-- , ( Infinity,  Infinity, 0.0,       NaN,       NaN)
-- , ( Infinity,  Infinity, 0.5,  Infinity,       NaN)  -- old better
-- , ( Infinity,  Infinity, 1.0,       NaN,       NaN)
-- ]

However, there are a lot of NaNs in both cases.

Also, infinite bounds are handled separately anyways in the current and the proposed implementations. If this special handling is kept, no infinite value will reach the scaling calculation.


@Shimuuar wrote:

We have a solution for the loss-of-precision problem above. But we still have another numeric problem: uniformRM may go out of range for floats. Here is a possible solution. The idea is to generate a pair of a single bit and y ∈ [0, 0.5] and treat y as either x or 1 - x depending on the bit.

So the code for Float would look like this (omitting finiteness checks): […]

This approach should not exceed the range on either end. In exchange we may get a very slight nonuniformity near x = 0.5, but I think that shouldn't be a problem.

P.S. That's vaguely inspired by https://arxiv.org/abs/1704.07949

I do not think that the linked paper is applicable here (or it does not make sense to apply it) because the scaling from [0, 1] to [l, h] is perfectly linear. Note that some loss of precision is already mitigated by choosing the scaling formula depending on which bound is closer to zero.

However, the paper is a good argument why we should keep as much precision (“numeric noise”) as possible.

And there actually are cases where the current as well as the proposed solution loses precision and where the paper does not help: When using negative and positive bounds simultaneously, e.g., uniformRM (-1, 1). In those cases, the problematic area is somewhere in the middle. I think this could be mitigated by splitting such ranges at 0 and randomly choosing the part to draw from (weighted by the size of the parts).
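
For illustration only, a rough sketch of that splitting idea could look like the following (hypothetical code, assuming l <= h; uniformSplitRM and scale are made-up names, and the width computation used for the weighting has its own overflow pitfall):

import System.Random.Stateful

-- Hypothetical sketch: split a mixed-sign range at 0, pick a side with
-- probability proportional to its width, then scale within that side so the
-- chosen scaling formula only ever sees a single-sign range.
uniformSplitRM :: StatefulGen g m => (Double, Double) -> g -> m Double
uniformSplitRM (l, h) g
  | l >= 0 || h <= 0 = scale l h <$> uniformDouble01M g  -- bounds share a sign: no split needed
  | otherwise = do                                       -- here l < 0 < h
      p <- uniformDouble01M g
      -- NB: `negate l + h` can itself overflow to infinity for huge ranges
      if p < negate l / (negate l + h)
        then scale l 0 <$> uniformDouble01M g            -- draw from [l, 0]
        else scale 0 h <$> uniformDouble01M g            -- draw from [0, h]
  where
    scale lo hi x
      | abs lo < abs hi = lo + x * (hi - lo)
      | otherwise       = hi + x * (lo - hi)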

Regarding the problem of exceeding the bound that is farther from zero, it might make sense to simply clamp the result to the bound.

@lehins
Contributor

lehins commented Dec 25, 2024

As far as I can tell, you formulated the constraints in a way such that they are unsatisfiable, and I guess that was intentional so that a SAT solver can be used to verify this.

Correct

One difference I can recall is that when l = -Inf and h = 0 the first formula gives -Inf, but the second gives NaN.

@Bodigrim I see, your suggestion was made before the explicit check on infinity was added. So, there is no longer a need to keep this computation. Thank you, that clarifies things for me.

Here is a possible solution.

@Shimuuar That does look interesting. Question for you: why did you use 2 ^ 32 for the denominator, instead of 2 ^ 32 - 1, i.e. maxBound :: Word32? I don't think it will actually make a difference, since the difference will be lost due to rounding, but I wanted to ask out of curiosity anyways. I did try using the calculation below:

uniformRM (l,h) g = do
  w <- uniformWord32 g
  let x = fromIntegral (clearBit w 31) / fromIntegral (maxBound :: Word32) :: Float
  return $!
    if testBit w 31
      then l + (h - l) * x
      else h + (l - h) * x

I can confirm that it does indeed resolve the floating point caveats.

However, for some mysterious reason it is a bit slower than using the calculation that @Flupp suggested. I also believe it is slightly inferior since we are losing one bit of randomness that otherwise would be used to generate more floating point noise.

I really like @Flupp's suggestion of just clamping the values, which will effectively eliminate those nasty caveats:

scaleFloating l h x =
  if abs l < abs h
    then let !y = l + x * (h - l)
          in if l < h -- `l` is closer to zero
               then min y h -- `l` is the true low, ensure `y` is not higher than `h`
               else max y h -- `l` is the high, ensure `y` is not lower than `h`
    else let !y = h + x * (l - h)
          in if l < h -- `h` is closer to zero
               then max y l -- `h` is the true high, ensure `y` is not lower than `l`
               else min y l -- `h` is the low, ensure `y` is not higher than `l`

This clamping has negligible impact on performance, but removes the strange behavior. Win-win in my book.
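
As a sanity check, a property test along these lines could be used to confirm that claim (a sketch only, assuming scaleFloating from the snippet above is in scope and QuickCheck is available; the case where h - l itself overflows is excluded):

import Test.QuickCheck

-- Sketch: for finite bounds, a non-overflowing difference, and x in [0, 1],
-- the clamped result should never leave [min l h, max l h].
prop_scaleWithinBounds :: Float -> Float -> Property
prop_scaleWithinBounds l h =
  not (isInfinite l || isInfinite h || isInfinite (h - l)) ==>
    forAll (choose (0, 1)) $ \x ->
      let y = scaleFloating l h x
       in min l h <= y && y <= max l h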

Any final thoughts on this? I want to make a release with this today. (A Christmas present, so to speak 😄)

@Flupp
Author

Flupp commented Dec 25, 2024

While investigating some ideas, I noticed an actual problem with the proposed solution: the sub-term h - l (or l - h) might overflow to infinity if l and h have different signs. This could probably also be mitigated by the splitting idea mentioned in my previous post; however, my first implementation attempts make me doubt that this is easy to implement without introducing other subtle problems.

@lehins
Contributor

lehins commented Dec 25, 2024

While investigating some ideas, I noticed an actual problem with the proposed solution: the sub-term h - l (or l - h) might overflow to infinity if l and h have different signs.

You have something like this in mind, right:

ghci> let maxFloat = 3.402823466e+38 :: Float
ghci> maxFloat - negate maxFloat
Infinity

That is a good point. In fact, I don't think that x * l + (1 - x) * h suffers from this problem.
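
For example, a quick check at the extremes of Float:

ghci> let maxFloat = 3.402823466e+38 :: Float
ghci> 0.5 * negate maxFloat + (1 - 0.5) * maxFloat  -- x * l + (1 - x) * h keeps both products finite
0.0
ghci> negate maxFloat + 0.5 * (maxFloat - negate maxFloat)  -- l + x * (h - l) already overflows in h - l
Infinity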

@Shimuuar
Contributor

I do not think that the linked paper is applicable here (or it does not make sense to apply it) because the scaling from [0, 1] to [l, h] is perfectly linear. Note that some loss of precision is already mitigated by choosing the scaling formula depending on which bound is closer to zero.

That's why I said "vaguely inspired". But we're doing addition and that's enough to get into trouble. The root of the problem is that l + (h - l) ≠ h, so the computation may get out of range for x = 1. One way is to clamp the output range. My idea was to work with small numbers: x if x < 0.5 and 1 - x otherwise. That way we won't run into problems at the borders.

Question for you: why did you use 2 ^ 32 for the denominator, instead of 2 ^ 32 - 1, i.e. maxBound :: Word32? I don't think it will actually make a difference, since the difference will be lost due to rounding, but I wanted to ask out of curiosity anyways.

Just general laziness. `2^32 == 2^32 - 1` for `Float` anyway; its mantissa is smaller than 32 bits.

slightly inferior since we are losing one bit of randomness that otherwise would be used to generate more floating point noise.

On the contrary, it's slightly better with respect to randomness. The highest bit is a flag: 0 means x <= 0.5 and 1 means x >= 0.5. When the flag is zero, the algorithm works in the same way as uniformFloat01M. When it's 1, I instead interpret the lower 31 bits as 1 - x, which is small and can utilize the full floating point range. It's not very important in this case but could help for the inversion method.
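
A quick GHCi illustration of the precision point, using a small offset of 2 ^^ (-32):

ghci> let x = 2 ^^ (-32) :: Float
ghci> 1 - x == 1  -- computing 1 - x from x loses the offset entirely
True
ghci> x == 0      -- but the small quantity itself is perfectly representable
False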

As for performance: probably either the codegen doesn't do a good job at mixing bit operations and FP, or the CPU doesn't like to mix them. It looks like a very low-level detail.

@lehins
Contributor

lehins commented Dec 25, 2024

On the contrary, it's slightly better with respect to randomness.

@Shimuuar Oh yeah you are right. I do like that!

However, it also has the same problem of overflowing: #166 (comment)

As for performance: probably either the codegen doesn't do a good job at mixing bit operations and FP, or the CPU doesn't like to mix them. It looks like a very low-level detail.

Probably. The difference is not that big, so if we could get better quality of the generator from it, the penalty is worth paying.

@Shimuuar
Contributor

So we have two algorithms, each with its own problem:

  1. x * l + (1 - x) * h does not overflow but loses randomness for l << h.
  2. l + x * (h - l) when |l| < |h|. It has better numeric properties but may overflow.

Maybe we should pick the 2nd but check whether h - l overflows. If it does, revert to the first one?

@lehins
Contributor

lehins commented Dec 25, 2024

@Shimuuar are you suggesting something along these lines:

uniformFRM :: StatefulGen g m => (Float, Float) -> g -> m Float
uniformFRM (l, h) g
  | l == h = return l
  | isInfinite l || isInfinite h
    -- Optimisation exploiting absorption:
    --    (+Infinity) + (-Infinity) = NaN
    --    (-Infinity) + (+Infinity) = NaN
    --    (+Infinity) + _           = +Infinity
    --    (-Infinity) + _           = -Infinity
    --              _ + (+Infinity) = +Infinity
    --              _ + (-Infinity) = -Infinity
   = return $! h + l
  | otherwise = do
    w <- uniformWord32 g
    let x =
          fromIntegral (clearBit w 31) / fromIntegral (maxBound :: Word32) :: Float
        diff = h - l
    return
      $! if isInfinite diff
           then if testBit w 31
                  then l * (1 - x) + h * x
                  else h * (1 - x) + l * x
           else if testBit w 31
                  then l + diff * x
                  else h + negate diff * x

@lehins
Contributor

lehins commented Dec 25, 2024

We could even further clamp the values for the isInfinite diff case in order to ensure that generated values do not go out of the supplied h and l bounds:

         if isInfinite diff
           then 
             let y =
                  if testBit w 31
                    then l * (1 - x) + h * x
                    else h * (1 - x) + l * x
             in max (min y (max l h)) (min l h)
          ...

@Shimuuar
Contributor

Yes. Although I'm not sure that the trick with testBit will buy us anything when isInfinite diff == True. We still need to calculate 1 - x, which will lose precision. But it's not harmful either.

And clamping is a very good idea. If either h or l is the maximum/minimum representable number, it could be possible to overflow to infinity.

@lehins
Contributor

lehins commented Dec 25, 2024

Although I'm not sure that the trick with testBit will buy us anything when isInfinite diff == True

Good point.

This is the final draft that I have:

  w <- uniformWord32 g
  return $!
    if isInfinite diff
      then let !x = fromIntegral w / m
               !y = x * l + (1 - x) * h
            in max (min y (max l h)) (min l h)
      else let !x = fromIntegral (clearBit w 31) / m
            in if testBit w 31
                 then l + diff * x
                 else h + negate diff * x
  where
    !diff = h - l
    m = fromIntegral (maxBound :: w) :: a

@lehins
Contributor

lehins commented Dec 25, 2024

I've adjusted my PR: #172

Please leave feedback, and if there are no objections, I will make a new release with the fix later today.
