Explicit vs implicit state passing #3
Comments
I should note that this problem is dealt with to some extent by mwc-probability. Given that there is talk of such an interface already being useful for generation of floating point numbers, I think adopting other ideas here would be a good idea. I should say this doesn't answer all the questions about this aspect of an interface. For example, I would like to be able to write something like:

    -- Generate an increasing sequence of numbers bounded above
    -- by 100 and starting from the initial argument.
    generate l acc
      | l >= 100  = pure acc
      | otherwise = do
          d1 <- uniformR (l, 100)
          generate d1 (d1 : acc)

In other words, if I am in a monadic state context I would like the state threading to be done for me! This is one of the purposes of the state monad and do notation after all. |
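A minimal sketch of what the implicit threading asked for above could look like, using only base. Everything here is an assumption made for illustration: the toy generator `Gen`, the helper `uniformRPure`, and the hand-written `State` type are hypothetical and not part of any library discussed in this thread. The range is tightened to `(l + 1, 100)` so that the sequence is strictly increasing and the recursion is guaranteed to stop.

```haskell
import Data.Word (Word64)

-- Hypothetical toy generator (a 64-bit LCG); not part of any library.
newtype Gen = Gen Word64

-- Draw an Int in the inclusive range (lo, hi) and return the next generator.
-- Uses the upper bits of the state; modulo bias is ignored for brevity.
uniformRPure :: (Int, Int) -> Gen -> (Int, Gen)
uniformRPure (lo, hi) (Gen s) = (lo + fromIntegral (x `mod` range), Gen s')
  where
    s'    = s * 6364136223846793005 + 1442695040888963407
    x     = s' `div` 4294967296            -- keep the upper 32 bits
    range = fromIntegral (hi - lo + 1)

-- A minimal state monad, written out so the block needs only base.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad (State s) where
  State g >>= k = State $ \s -> let (a, s') = g s in runState (k a) s'

-- Monadic wrapper: the generator is threaded implicitly by (>>=).
uniformR :: (Int, Int) -> State Gen Int
uniformR = State . uniformRPure

-- Strictly increasing sequence that starts above l and ends at 100.
generate :: Int -> [Int] -> State Gen [Int]
generate l acc
  | l >= 100  = pure (reverse acc)
  | otherwise = do
      d1 <- uniformR (l + 1, 100)
      generate d1 (d1 : acc)
```

Running `fst (runState (generate 0 []) (Gen 42))` yields the sequence; the generator never appears in the body of `generate`.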
The current proposal should cover #3 (comment) - there will no longer be a |
For #3 (comment) the proposal is that the "base" package does not generate floating point numbers (from a uniform distribution). For that you should use another package e.g. |
Edit: The comment below is preserved but in fact I think that you can just directly have a state monad interface:

    class Random a where
      random  :: (RandomGen g) => State g a
      randomR :: (RandomGen g) => (a, a) -> State g a
      -- etc.

I am glad we are moving away from

    (>>=)
      :: (RandomGen g, Random a, Random b)
      => RandomState g a -> (a -> RandomState g b) -> RandomState g b

    -- RandomState is just a newtype wrapper around a state monad:
    newtype RandomState g a = RandomState {runRandomState :: g -> (g, a)}

Does anyone have any ideas for a good design here? |
I think I might have a decent solution for an API that covers both ends of the spectrum: stateful and pure (optionally married with the state monad) RNGs. The changes I will describe below are already in #1. I haven't run any tests or benchmarks on it; I'll try to do at least the latter tonight. The gist of this is: we introduce

    class Monad m => MonadRandom g m where
      type Seed g :: *
      restore :: Seed g -> m g
      save :: g -> m (Seed g)
      -- | Generate `Word32` up to and including the supplied max value
      uniformWord32R :: Word32 -> g -> m Word32
      -- | Generate `Word64` up to and including the supplied max value
      uniformWord64R :: Word64 -> g -> m Word64
      uniformWord8 :: g -> m Word8
      uniformWord8 = fmap fromIntegral . uniformWord32R (fromIntegral (maxBound :: Word8))
      uniformWord16 :: g -> m Word16
      uniformWord16 = fmap fromIntegral . uniformWord32R (fromIntegral (maxBound :: Word16))
      uniformWord32 :: g -> m Word32
      uniformWord32 = uniformWord32R maxBound
      uniformWord64 :: g -> m Word64
      uniformWord64 = uniformWord64R maxBound

Note that we can also provide default implementations for

Cool part about this monad is that it can probably be used for something like a system random generator (e.g. /dev/urandom):

    data SysRandom

    instance MonadIO m => MonadRandom SysRandom m where
      ...

We can create instances for stateful generators like

    instance (s ~ PrimState m, PrimMonad m) => MonadRandom (MWC.Gen s) m where
      type Seed (MWC.Gen s) = MWC.Seed
      restore = MWC.restore
      save = MWC.save
      uniformWord32R u = MWC.uniformR (0, u)
      uniformWord64R u = MWC.uniformR (0, u)
      uniformWord8 = MWC.uniform
      uniformWord16 = MWC.uniform
      uniformWord32 = MWC.uniform
      uniformWord64 = MWC.uniform

But most importantly, I think it can not only solve @Boarders concerns as well as reduce duplication drastically, but it also allows us to use functions that work for both stateful and pure generators by the means of

We can write a general function like

    randomListM :: (Random a, MonadRandom g m, Num a) => g -> Int -> m [a]
    randomListM gen n = replicateM n (randomRM (1, 6) gen)
    rlist :: Int -> ([Word64], [Word64])
    rlist n = (xs, ys)
      where
        xs = runStateGen_ (mkGen 217 :: StdGen) (randomListM GenState n)
        ys = runST $ MWC.create >>= (`randomListM` n)

    -- Potential helper functions that can be added to the API:
    runStateGen :: RandomGen g => g -> State g a -> (a, g)
    runStateGen g = flip runState g

    runStateGen_ :: RandomGen g => g -> State g a -> a
    runStateGen_ g = fst . flip runState g

    runStateTGen :: RandomGen g => g -> StateT g m a -> m (a, g)
    runStateTGen g = flip runStateT g

    runStateTGen_ :: (RandomGen g, Functor f) => g -> StateT g f a -> f a
    runStateTGen_ g = fmap fst . flip runStateT g

Here is a sample:

    λ> rlist 5
    ([1,2,3,5,5],[3,4,3,1,4])

The fun part about

    data GenState g = GenState
    instance (MonadState g m, RandomGen g) => MonadRandom (GenState g) m where
      type Seed (GenState g) = GenSeed g
      restore s = GenState <$ put (mkGen s)
      save _ = saveGen <$> get
      uniformWord32R r _ = state (genWord32R r)
      uniformWord64R r _ = state (genWord64R r)
      uniformWord8 _ = state genWord8
      uniformWord16 _ = state genWord16
      uniformWord32 _ = state genWord32
      uniformWord64 _ = state genWord64

Where the

    class RandomGen g where
      type GenSeed g :: *
      type GenSeed g = Word64
      mkGen :: GenSeed g -> g
      saveGen :: g -> GenSeed g
      next :: g -> (Int, g) -- `next` can be deprecated over time
      genWord8 :: g -> (Word8, g)
      genWord8 = first fromIntegral . genWord32R (fromIntegral (maxBound :: Word8))
      ...

What this means for RNG package maintainers:
to define default implementation:

    class Random a where
      randomM :: MonadRandom g m => g -> m a
      randomRM :: MonadRandom g m => (a, a) -> g -> m a
      randomR :: RandomGen g => (a, a) -> g -> (a, g)
      randomR r g = runStateGen g (genRandomR r)
      random :: RandomGen g => g -> (a, g)
      random g = runStateGen g genRandom

All that is left is checking the performance (which I suspect should be pretty good with strict

Sorry for the long comment, but I feel pretty excited about it, hopefully there won't be too much pushback ;) @idontgetoutmuch @curiousleo @Shimuuar @Boarders @cartazio Please, guys, let me know what you think? PS. I explicitly omitted other issues that can be handled separately (splittable vs non-splittable, bounded vs unbounded numbers etc.) |
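To make the core idea of the comment above concrete, here is a cut-down sketch that needs only base: one class, one primitive, one instance for a pure generator threaded through a state monad, and one for a mutable generator in `IO`. The names `LCG`, `genWord64`, the hand-written `State`, and `diceM` are assumptions made for illustration, and the `Seed` family, `restore`, and `save` from the proposal are left out. It is not the implementation in #1.

```haskell
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}

import Control.Monad (replicateM)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word64)

-- Hypothetical pure generator (64-bit LCG), standing in for StdGen.
newtype LCG = LCG Word64

genWord64 :: LCG -> (Word64, LCG)
genWord64 (LCG s) = (s', LCG s')
  where s' = s * 6364136223846793005 + 1442695040888963407

-- Minimal state monad so that the sketch needs only base.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad (State s) where
  State g >>= k = State $ \s -> let (a, s') = g s in runState (k a) s'

-- Cut-down version of the proposed class: a single primitive.
class Monad m => MonadRandom g m where
  uniformWord64 :: g -> m Word64

-- Pure generator: the handle is a dummy token, the state lives in the monad.
data GenState = GenState

instance MonadRandom GenState (State LCG) where
  uniformWord64 _ = State genWord64

-- Stateful generator: the handle is a mutable reference.
instance MonadRandom (IORef LCG) IO where
  uniformWord64 ref = do
    g <- readIORef ref
    let (w, g') = genWord64 g
    writeIORef ref g'
    pure w

-- Written once, usable with both kinds of generator.
-- Takes the upper 32 bits and maps them to 1..6 (modulo bias ignored).
diceM :: MonadRandom g m => g -> Int -> m [Word64]
diceM gen n = replicateM n (toDie <$> uniformWord64 gen)
  where toDie w = w `div` 4294967296 `mod` 6 + 1

-- Same seed through both instances.
demo :: IO ([Word64], [Word64])
demo = do
  let (xs, _) = runState (diceM GenState 5) (LCG 217)
  ref <- newIORef (LCG 217)
  ys <- diceM ref 5
  pure (xs, ys)
```

Because both instances step the same toy generator, `demo` returns two identical lists; with real generators such as `StdGen` and MWC the two sides would of course differ, as in the `rlist` sample.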
This is along the lines of what I’ve been planning or suggesting. I’ll share my flavor / sibling of this in a span. But along this line, yes. |
It suddenly occurred to me that pure and stateful generators are not that different. They have the same state monad underneath. If we look at

For stateful PRNG:

    nextX :: PRNG (State# s) -> State# s -> (# State# s, X #)

For pure (Proxy is added just to make the shape of the type similar):

    nextX :: Proxy PRNG -> PRNG -> (# PRNG, X #)

We have to thread something along, which is the state token for stateful PRNGs and the full state for pure ones. Stateful ones also need some stateful value. I need to think some more about this. Also, is there something in between? P.S. @lehins I didn't look at your proposal in detail yet |
I feel like the class should just have m Word64. We can have a child class that has stuff like ‘m (vec 64 Bool)’ or whatever for optimal bit complexity. Step zero is a way to write portable RNG-parametric algorithms. |
These are just monads. :) |
The commonality is they are monads. It’s exactly the name for that interface. We can’t write that one normally in Haskell because it conflicts with the reader instance for function types. Plus unboxed tuple trickiness. But write a mock function type or newtype of functions and you could make the bare state monad code pattern an instance. I’m sure there’s a fun bit of historical context for why we don’t have that instance. And perhaps ghc and base should have it. / me wearing clc hat. |
@cartazio I think there was already some agreement on set of primitives in #5
@Shimuuar That's right IO and ST is just a way to pass an opaque state token around. The difference with state monad + pure RNG is that we pass the actual generator (two Another way to simulate the stateful interface for pure RNGs would be to stick the generator values into a small |
@cartazio whether interface should be monadic or not is question that's completely orthogonal to the question which primitives we should use @lehins Yes they're different but they share similarity. Question is whether this could be exploited. After all semantics on high level is same: threading of state. |
@lehins : I think your suggestions here are seriously good and would definitely move us in all the directions we want API-wise:
Thank you for thinking about it, I think this direction of travel will really improve the library and moreover make it modular enough to be a good basis for all RNG libraries. |
I just checked the performance of generating GHC is able to optimize the |
@lehins I finally read your proposal. First thoughts:
With this change your proposal could be written as:

    class Monad m => MonadRandom g m where
      type Seed g :: *
      restore :: Seed g -> m g
      save :: g -> m (Seed g)
      uniformWhatever :: g -> m Whatever

    class RandomGen g where
      genWhatever :: g -> (Whatever, g)

    data GenState g = GenState

    instance (MonadState g m, RandomGen g) => MonadRandom (GenState g) m where
      type Seed (GenState g) = g
      restore s = GenState <$ put s
      save _ = get
      uniformWhatever _ = state genWhatever

I added magic

This API does provide a unified API for both stateful and pure generators. Which is absolutely great! We need that sort of thing.

Now things I don't like:
|
That's a good observation. I concur. Removed.
I don't think we should care about it in MonadRandom, since
I thought about a bit, except
Question on the latter point is that where do we get initial randomness, do we add ability to use I was pretty stoked about it too 🤘
|
But Also I'm not sure that it is impossible to use MonadReader. But I need to experiment before I can say anything
Portability is actually one of the reasons I think such an API should be added. It's not reasonable to expect that PRNG implementors will jump through flaming hoops in order to get good platform-independent initialization. It's a complicated thing and thus should be centralized. |
That's great that there is already a decent solution. I am not that fond of falling back onto time and printing various warnings to the terminal; a better solution would be to return

    hPutStrLn stderr $ "Warning: Couldn't use randomness source " ++ randomSourceName
    hPutStrLn stderr ("Warning: using system clock for seed instead " ++
                      "(quality will be lower)")
If we can agree on reliable solution for this (I haven't looked at the one in mwc-random too closely yet), I agree that |
Fallback on time remains from the time when it was used on Windows instead of a proper cryptographic API. It was contributed only much later
I think it would be better to just throw an exception. After all, there aren't many ways to handle failure except falling down and crying. But yes, that's not very important now |
It came to me that

    instance (MonadState g m, RandomGen g) => MonadRandom (GenState g) m where
      uniformWord64 _ = state genWord64

It could be that programmer wants to use

I'm not arguing against such an instance. It is quite useful. But probably we should add another variant of |
This is a pretty vague description of a problem. Could you provide a concrete example?
If I understand the problem you are describing correctly, which I probably don't, then you could solve it by stacking up multiple |
Say I want to work with some complex state like

On the other hand, this problem could be solved by adding a single instance without incurring any breakage. Something along the lines (names are pretty much arbitrary):

    data RandGenState g = RandGenState

    instance (MonadRandState g m, RandomGen g) => MonadRandom (RandGenState g) m where
      ...
      uniformWord64 _ = randState genWord64
|
I think I understand your concern. What I am saying is that the solution for this problem is for the user to not stick the generator into the complex state like

    data Foo ...

    runComplexState :: StdGen -> Foo -> StateT StdGen (StateT Foo IO) a -> IO (a, Foo)
    runComplexState g foo action = runStateT (runGenStateT_ g action) foo

or, if you wish, the other way around:

    runComplexState' :: StdGen -> Foo -> StateT Foo (StateT StdGen IO) a -> IO (a, Foo)
    runComplexState' g foo action = runGenStateT_ g (runStateT action foo)

I personally don't want to complicate the API much more than needed just to accommodate lens users. For example what this |
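The stacking suggested above can be sketched with the `transformers` package that ships with GHC. This is an illustration under assumptions, not the API from the proposal: `LCG`, `genWord64`, `Foo`, `App`, and `rollDie` are hypothetical names, and the base monad is `Identity` rather than `IO` to keep the example pure.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict
  (State, StateT, evalStateT, modify, runState, state)
import Data.Word (Word64)

-- Hypothetical pure generator (64-bit LCG), standing in for StdGen.
newtype LCG = LCG Word64

genWord64 :: LCG -> (Word64, LCG)
genWord64 (LCG s) = (s', LCG s')
  where s' = s * 6364136223846793005 + 1442695040888963407

-- Hypothetical application state, kept separate from the generator.
newtype Foo = Foo { rollCount :: Int }
  deriving (Eq, Show)

-- Generator in the outer layer, application state in the inner one.
type App = StateT LCG (State Foo)

-- Roll a die (1..6) and count the roll in the application state.
rollDie :: App Word64
rollDie = do
  w <- state genWord64                          -- touches only the generator layer
  lift (modify (\(Foo n) -> Foo (n + 1)))       -- touches only the Foo layer
  pure (w `div` 4294967296 `mod` 6 + 1)

-- Run both layers, discarding the final generator and returning the final Foo.
runComplexState :: LCG -> Foo -> App a -> (a, Foo)
runComplexState g foo action = runState (evalStateT action g) foo
```

The generator never has to live inside `Foo`, so no lens into a larger record is needed; the cost is an explicit `lift` whenever the inner layer is touched.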
To be clear, at the moment I'm running around with a sharp stick and trying to poke holes in the proposed API. |
I believe that this ticket has been addressed through changes made by @lehins. @Boarders, I've made a preview of the current Haddocks here: https://htmlpreview.github.io/?https://raw.githubusercontent.com/idontgetoutmuch/random/haddock-preview/docs/index.html. We're actually using your "rolling a die" example a lot! I'm interested in your thoughts on the explicit / implicit ("pure / monadic" in the docs) APIs and their documentation. Since the implementation has moved on quite a lot since the discussion on this ticket, I'm going to close this ticket for now. @Boarders if you find issues with the API or its documentation or have further suggestions, please create new tickets. Thank you! |
I'm unsure about the specifics of the current random proposal but I am interested in getting a discussion going on the plans for explicit vs implicit state (given that there is talk of pure vs impure RNG I presume this encompasses some of the same issues?). Currently if we wish to generate 1000000 dice rolls (to choose an inconsequential example) we have two options:
In the second we generate state and then explicitly pass it around. I haven't benchmarked these particular functions but upon doing so one will quickly learn that the second is significantly faster than the first! Beyond GHC dealing better with pure code, storing two Int32 pieces of state in an IORef and then reading and writing is a terrible idea!
This is a shame because I feel like the second piece of code is not necessarily as accessible to beginners as possible and it may not appear obvious without knowing the internals of the library that the second should always be preferred (I should also note that the version with explicit recursion typically performs worse than replicateM when used in real code, though I am unsure of what fusion-related details make that the case). I think the library should expose common combinators for things like this or come up with some better plan for not having a state monad interface. On top of this, if the library is going to offer implicit state in the form of an IORef then an unboxed variant like this should probably be used, because chasing pointers to something that should never be a thunk anyway is a bad idea!
I should also say that implicit state is not thread-safe and is quite error prone in such contexts so we might want to take that into account in any such discussion.
Note: I use here the culprit random functions which should probably face some sort of chop in a newer version of the library but it is irrelevant to this particular issue.
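For readers without the original snippets, the two styles contrasted in this issue can be sketched as follows. This is not the code from the issue and does not use the `random` package: `LCG`, `rollDie`, `rollsImplicit`, and `rollsExplicit` are hypothetical names built on a toy generator, and the block makes no claim about relative performance.

```haskell
import Control.Monad (replicateM)
import Data.IORef (IORef, atomicModifyIORef', newIORef)
import Data.List (unfoldr)
import Data.Word (Word64)

-- Hypothetical pure generator (64-bit LCG); not the generator used by `random`.
newtype LCG = LCG Word64

-- One die roll (1..6) plus the next generator; modulo bias ignored.
rollDie :: LCG -> (Int, LCG)
rollDie (LCG s) = (fromIntegral (s' `div` 4294967296 `mod` 6) + 1, LCG s')
  where s' = s * 6364136223846793005 + 1442695040888963407

-- Implicit state: the generator is hidden in an IORef and every roll
-- reads and writes it. atomicModifyIORef' keeps concurrent callers from
-- losing updates, which a plain read followed by a write would not.
rollsImplicit :: IORef LCG -> Int -> IO [Int]
rollsImplicit ref n = replicateM n roll
  where
    roll = atomicModifyIORef' ref $ \g ->
      let (x, g') = rollDie g in (g', x)

-- Explicit state: the generator is passed from one roll to the next.
rollsExplicit :: LCG -> Int -> [Int]
rollsExplicit g n = take n (unfoldr (Just . rollDie) g)

-- Both styles from the same seed.
demo :: IO ([Int], [Int])
demo = do
  ref <- newIORef (LCG 2020)
  xs <- rollsImplicit ref 10
  pure (xs, rollsExplicit (LCG 2020) 10)
```

Both functions step the same generator, so `demo` returns two equal lists; the difference is only in who carries the state.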