-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Make MersenneTwister equivalent in 32- and 64-bit architectures #6018
Conversation
@@ -9,6 +9,12 @@ | |||
@test length(randn(4, 5)) == 20 | |||
@test length(randbool(4, 5)) == 20 | |||
|
|||
@test (rand(MersenneTwister()) - 0.8236475079774124) < 0.01 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like @StefanKarpinski said before, this test doesn't seem to make sense. Where does the .01
value come from? Why would it be ok to get a value different by .001
but not by .1
?
@JeffBezanson I misread that, thanks. Updated |
MersenneTwister(seed) = MersenneTwister(reinterpret(Uint32, [seed])) | ||
MersenneTwister() = MersenneTwister(make_seed(uint32(0))) | ||
MersenneTwister(seed::Uint32) = MersenneTwister(make_seed(seed)) | ||
MersenneTwister(seed) = MersenneTwister(make_seed(seed)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is sufficient with the last line only if you use default value, i.e. MersenneTwister(seed=uint(0))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, you don't even need the uint
because the whole point of this change is that only the value of the seed should matter, not it's type.
As mentioned in issue JuliaLang#5999, reproduce the solution in 67f6fee for the MersenneTwister. Note that users on 64-bit architectures will see their randomness change with no change in their seed.
Julia is indeed quite powerful. Updated. |
Isn't that only the case if they were invoking the MersenneTwister constructor directly? I suspect that is not a terribly common case. Although arguably, we should insist on explicitly choosing an RNG whenever setting an RNG seed. Otherwise we're locked into using MersenneTwister exactly this way forever. |
Add more unit tests for test/random.jl in preparation for refactor
@StefanKarpinski Okay so you're suggesting I put up another diff that requires srand to receive a concrete RNG rather than just an integer seed? If the explicit MersenneTwister use is not common today, is this diff good to go then? |
That's a different design issue, but yes, that's the idea. The principle being that there are two basic ways to use random numbers:
In the former case, the system should be free to choose and RNG for you and generate a seed with good entropy. In the latter case, the user should provide enough information to unambiguously specify the random stream – which entails not only a seed value, but also an RNG to use. As it is, we can't change our default RNG without breaking programs that need reproducibility. If people are required to do that, then we will have future-proofed programs that need reproducibility against changes in the default system RNG. |
If you can set a different RNG globally, then we risk |
Ah, that's a fair point. It's the default output stream problem all over again and we know how that turned out. I guess that means that we're permanently stuck with MT as the default RNG and anyone who wants to use anything else will have to pass the RNG object around explicitly. |
Why are we stuck with MT as default? If LLVM decides to give access to the random instruction on some CPUs, wouldn't it be expected that we enable usage. Lots of benchmarks rely on fast random numbers, and for most applications we do not care if NSA or FSB or China knows something about the numbers. A compile time (make variable) should at least be possible without dynamic dispatch. |
Because if someone uses |
Sorry, I thought the idea was to deprecate |
That approach would leave only two modes of operation:
Although, in some sense this is the "correct" API, I'm not sure that this is acceptable from a UX standpoint. People want to be able to just call |
As a point of reference, MATLAB provides the |
Selecting an RNG at build time might be a viable option for both fast and reproducible numbers. |
If julia is going to be used for cryptography in web applications, we should not have an API that force us to never change RNG implementation. The suggested api is also very similar to the IO API, so I would consider the approach to be Julian. I can see that it might be a hassle to pass a RNG around in a random reproducible walk simulation, but it is fairly easy to do const rng = MT(5)
my_random() = random(rng) I'm often better off when I explicitly use a global, than when that API hides it for me. |
Right, but the issue is that peoples programs will break if they use the wrong build option. The only way to make that future proof is to require them to pass in the type of RNG they want to use and raise an error (or warning?) if it's not what their Julia has as its built-in default. |
The thing is that most users do not know anything about RNGs or MersenneTwister and just want to do |
This argument has nothing to do with crypto, IMO. It has to do with reproducibility of pseudo-random sequences. It's entirely plausible that someone will figure out something really wrong with MT in the next five years and yet we could be stuck with it as a default. I suppose that could be handled by major language version changes (e.g. v1 => v2) and sufficient documentation. |
Yes - that is what I was going to say as well - we should handle it along with a major number change. The tradeoff is burdening every user with the details vs. having reproducibility across major releases. Even now, we can document that if you want guaranteed reproducibility with |
I don't know what is the right solution here, but maybe it can be useful to have MATLAB's unfortunate state in mind. In order to have backward reproducibility |
Interaction between the old MATLAB API and new API:
|
We could put |
In the mean time this PR seems totally reasonable; really just cleaning up a couple silly definitions. Everybody ok with merging it? |
RFC: Make MersenneTwister equivalent in 32- and 64-bit architectures
As mentioned in issue #5999, reproduce the solution in 67f6fee for the MersenneTwister.
Note that users on 64-bit architectures will see their randomness change with no change in their seed after this commit.