-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Threaded RNG should use same stream at different locations not different streams #34386
Comments
I was just wondering what exactly should be done. Shall Should It seems to me that using |
If you look at the citation there, it actually says care must be taken that they are not initialized from a linear recurrence. Yes, one method is call randjump, which is what got published in that paper. I'm not a cryptography, but my interpretation of that information is that also just not using a linear recurrence for initialization should be statistically acceptable. Our implementation therefore lacks the provable guarantee for the other approaches (and corresponding robustness against bad seed design), but I'm thinking it must be statistically immensely improbable for two unrelated seeds to be correlated (the state space of MT is huge). But, to reveal my trick, some care must be taken in my use of the world "unrelated". Simply feeding in sequential numbers, such as you might get from another standard PRNG (including MT) does not give unrelated seeds. But we don't do that. If the current mechanism of using the kernel's randomness is discovered to be insufficient, we might be able to use a secure hash of the seed, such as This was discussed during the implementation of the current work, though I'm not sure any one source covers the entirety of the design rationale. And I've done a bit of hand-waving in my analysis based on what I'm assuming are reasonable assumptions about statistics, but am no expert, or even that well-read of an amateur. |
For your information. Perhaps you should consider alternatives to Mersenne Twister for the parallel PRNG, especially because it uses such a big state.
|
Yes, as described there, we use option 3 by default and assume that you don't normally want reproducible randomness. For the MT we use, the probability this results in the same seed on two threads is indistinguishable from zero (which might actually be too low, since any two true RNG should have a small chance of appearing to be correlated). I think we have other existing issues proposing changing from MT. The "big" state is tiny relative to the rest of the machine, so it's not really a convincing factor. But there may be other better PRNGs now that would be good to look into switching to, if someone provided an implementation of it. |
Quoting what I mentioned in the discourse thread:
There is a Julia implementation of counter-based PRNG: https://github.com/sunoru/Random123.jl |
Just to say it clearly, I think those researches are generally talking about reproducable (seeded) streams. We want something a bit different (unpredictablity) as the default. So these are not contradictory, just different needs. |
In my opinion, if unpredictability of random numbers (or information security in general) is a goal, then only a cryptographic RNG is appropriate and an application should have only one thread-safe instance of the cryptographic RNG for the whole application to access. If reproducibility is important, the API can provide a separate (perhaps parallel) PRNG for that purpose, where the user can supply a seed (see also "Implementing New RNG APIs"). |
My bad. This issue is not about reproducibility. My comment was off-topic.
Isn't this issue about making sure Julia has "independent" streams across threads? IIUC you don't need "unpredictability" in the sense of CSPRNG within a single thread. You just need that different threads produce independent random numbers. (Or maybe my understanding of the word "unpredictability" is not accurate.) |
That was a good comment, I was replying as much for my own benefit as others to clarify why I don’t think that research is applicable to the default (non-seeded) case. Thus, I don’t know what this issue is about—I think it’s about asking your question to request more documentation maybe. It might be just about independence, but it seems to be mistaking independence for “unpredictability”. And yes, I deliberately picked an ambiguous word here to try to separate it from the usual statistical vocabulary and focus instead of the usage. A CSPRNG is not required within an application for statistical purposes (it is required for cryptography of course), but the CSPRGN is necessary to create seeds for the PRGN (else the values on one thread may be statistically predictable from the values seen on another thread). The independence is only broken if you use a flawed seeding approach, such as using sequentially generated numbers, such as 1,2,3 or rand(),rand(),rand(). Using either randjump or a cryptographic seed should be equally safe for the purposes of splitting a stream into uncorrelated streams. The randjump approach is fraught with complexities and difficulties, while using different cryptographic seeds is trivial. (Actually, I think the cryptographic seed is probably actually more “safe” by simple argument that it seems it would severely compromised the integrity of MT if two completely unrelated machines were more likely to be correlated than two threads on one machine). |
Note that the CSPRNG in that case may be initialized from the PRNG, or even from known sequential numbers, so the question regarding the use of a CSPRNG for cryptographic security is fully orthogonal to the question of setting a seed for reproducibility. |
Ok, so reading through this thread, my original question:
can be answered with "yes". I wasn't aware that using different (truly) random numbers as seeds will guarantee uncorrelated streams. Documentation is needed though on how to seed threaded RNGs such that the streams are uncorrelated. |
Although, from a UI point of view, it would be nice if only one seed would need to be set and that all the thread-local seeds would be updated consistently (and predictably). |
@mauro3 the chance of coprime numbers approaches pi^2/6 as the numbers get bigger |
This is related to my first comment. Reproducible computation using thread-local RNG is not possible in general with dynamic scheduling. Since there is no reliable way to turn-off dynamic scheduling, I don't think consistent seeding across threads has to get a high priority. |
At my company, financial/insurance simulations, we typically break up the simulations into predefined fixed "chunks", and assign a seed to each chunk. Then let the dynamic scheduler do its thing. For the most part this makes our simulations repeatable no matter the machine or number of cores. We have been doing this in c# with success. I ported our engine to Julia hoping to get the same type of repeatability, and I did, but the randjump calc is slower than the time it takes to simulate a "chunk". Depending on the size of the model it makes sense to either not use threading at all, or now create a chunk size that is dependent on model size just so the randjump makes threading improve the run time. Just sharing some recent observations. |
Thank you for the feedback, @stephenll. That's very helpful! |
Maybe that lies in the fact that Mersenne Twister (MT19937) has a big state compared to other modern PRNGs (about 2500 bytes compared to 16 or 32 bytes), so that jumpahead is less trivial? (By the way, different PRNGs support different strategies to support "streams" of numbers that behave like independent random numbers. For linear PRNGs, that is jumpahead. For counter-based PRNGs, that's the use of "nearby" seeds.) |
I should of mentioned that the single threaded Julia runs about the same speed as the multithreaded on 4 cores c# code I converted from. So huge win using Julia even without the threading. Less more readable code. I don’t need to rely on the c# devs to experiment or research. Can’t wait to see how the experimental threading develops. |
If I understand correctly, this is fixed/superseded by #40546. |
The new, thread-safe RNG (PR #32407) uses separate RNG streams (link to code) instead of using the "trick" advocated in the docs of using just one chain but sampling from different locations (implemented using
Future.randjump
).Wikipedia (fourth bullet) suggests that at least for Monte-Carlo the randjump strategy should be used. Or are the seeds of the separate streams chosen such that inter-stream numbers are non-correlated too? This line suggest that no such approach is taken.
Is the randjump the right way to go about this? If so, I could do a PR.
X-refs:
Cc: @rfourquet
The text was updated successfully, but these errors were encountered: