srand(123); rand(1:10, 100) produces different random numbers on different machines. #5549

Closed
mschauer opened this issue Jan 26, 2014 · 5 comments · Fixed by #5741

Comments

@mschauer
Contributor

This is not completely unexpected, since rand(1:10, 100) returns an array of Ints, whose size depends on WORD_SIZE, but it will bite users from time to time, e.g. #5548.

But on the other hand, if length(therange) < typemax(Uint32) it is a bit wasteful to

  • generate a number between 1 and length(therange)
  • by generating a Uint64
  • by generating two Uint32s.

Would that justify a switch, given the additional portability between systems?
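A minimal sketch of the cost described above, assuming a toy 32-bit generator. `XorShift32`, `next_u32!`, `next_u64!`, `sample32!` and `sample64!` are hypothetical names for illustration, and xorshift32 is only a stand-in, not the DSFMT generator Julia uses. It counts how many 32-bit words each path consumes; plain modulo reduction is used for brevity, which is slightly biased.

```julia
# Toy 32-bit generator (xorshift32). A stand-in for illustration only.
mutable struct XorShift32
    state::UInt32   # must be nonzero
    count::Int      # number of 32-bit words produced so far
end
XorShift32(seed::Integer) = XorShift32(UInt32(seed), 0)

function next_u32!(g::XorShift32)
    x = g.state
    x ⊻= x << 13
    x ⊻= x >> 17
    x ⊻= x << 5
    g.state = x
    g.count += 1
    return x
end

# A 64-bit draw assembled from two 32-bit words
next_u64!(g::XorShift32) = (UInt64(next_u32!(g)) << 32) | next_u32!(g)

# Sample from 1:n with one 32-bit word per sample
sample32!(g::XorShift32, n::Integer) = Int64(next_u32!(g) % UInt32(n)) + 1

# Sample from 1:n with two 32-bit words per sample
sample64!(g::XorShift32, n::Integer) = Int64(next_u64!(g) % UInt64(n)) + 1

g32 = XorShift32(123)
g64 = XorShift32(123)
a = [sample32!(g32, 10) for _ in 1:100]
b = [sample64!(g64, 10) for _ in 1:100]
println("words used, 32-bit path: ", g32.count)
println("words used, 64-bit path: ", g64.count)
```

With the same seed, the two paths walk through the underlying stream at different rates, so a seeded run need not agree between them.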

@StefanKarpinski
Member

This is pretty deeply problematic and I really don't know how to solve this problem. cc: @ViralBShah

@ViralBShah
Member

The underlying library we use, DSFMT, is designed for generating double precision random numbers, and has only 53 bits of entropy. It is difficult to get random integers from DSFMT.

Perhaps the best thing to do is to document this?

@lindahua
Contributor

A more efficient way to generate random integers from DSFMT:

We keep a cache of 256 random bits, which can be obtained by generating 5 double-precision random numbers and extracting the mantissa bits (in total 53 x 5 = 265 bits > 256 bits). Whenever the cache is used up, we refill it by generating another 5 doubles. An additional C function might be needed to make this efficient.

In this way, we can obtain four 64-bit integers using 5 doubles (currently we need 8 doubles).
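A rough sketch of such a cache, written in Julia rather than C for readability. It assumes the source yields doubles in [1, 2) whose stored fraction bits are all random, and it uses only the 52 explicitly stored fraction bits of each double, so five doubles give 260 bits, which still covers the 256 needed. `BitCache`, `pack4`, `SplitMix64` and `double_in_1_2` are hypothetical names, and SplitMix64 is only a stand-in source, not DSFMT.

```julia
const FRAC_MASK = 0x000fffffffffffff   # low 52 bits: stored fraction of a Float64

fraction_bits(d::Float64) = reinterpret(UInt64, d) & FRAC_MASK

# Pack five doubles into four 64-bit words: each of the first four doubles
# supplies 52 bits, and the fifth supplies 12 extra bits per word.
function pack4(d::NTuple{5,Float64})
    extra = fraction_bits(d[5])
    return ntuple(i -> (fraction_bits(d[i]) << 12) |
                       ((extra >> (12 * (i - 1))) & 0x0fff), 4)
end

mutable struct BitCache{F}
    next_double::F            # hypothetical source returning a Float64 in [1, 2)
    words::NTuple{4,UInt64}   # the 256 cached bits
    used::Int                 # cached words already handed out
end
BitCache(next_double) = BitCache(next_double, ntuple(_ -> UInt64(0), 4), 4)

function next_u64!(c::BitCache)
    if c.used == 4            # cache exhausted: refill from five doubles
        c.words = pack4(ntuple(_ -> c.next_double(), 5))
        c.used = 0
    end
    c.used += 1
    return c.words[c.used]
end

# Stand-in source for the demo (SplitMix64), NOT DSFMT.
mutable struct SplitMix64
    state::UInt64
end
function next_bits!(g::SplitMix64)
    g.state += 0x9e3779b97f4a7c15
    z = g.state
    z = (z ⊻ (z >> 30)) * 0xbf58476d1ce4e5b9
    z = (z ⊻ (z >> 27)) * 0x94d049bb133111eb
    return z ⊻ (z >> 31)
end

# Build a double in [1, 2) by filling the fraction field with random bits
double_in_1_2(bits::UInt64) =
    reinterpret(Float64, 0x3ff0000000000000 | (bits & FRAC_MASK))

g = SplitMix64(UInt64(123))
cache = BitCache(() -> double_in_1_2(next_bits!(g)))
four_words = [next_u64!(cache) for _ in 1:4]   # costs five doubles in total
```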

@StefanKarpinski
Member

I think that what we should do here is only use up 32 bits of entropy, regardless of architecture if n ≤ typemax(Int32) and use up 64 bits of entropy otherwise. Obviously, doing rand(Int) is still going to be platform dependent, but at least this approach allows someone to write code that will work the same on 32-bit and 64-bit machines, if they avoid that. @lindahua's performance improvement is a good idea too, but a bit unrelated, afaict.
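A minimal sketch of that rule, not the code from the pull request. `entropy_type` and `rand_1_to_n` are hypothetical names; `rand(rng, T)` and `MersenneTwister` come from Julia's `Random` standard library. The width requested depends only on n, never on the platform word size. How many raw bits the generator consumes internally for each request is an implementation detail this sketch does not control.

```julia
using Random

# Request 32 bits when n fits in an Int32, 64 bits otherwise.
entropy_type(n::Integer) = n <= typemax(Int32) ? UInt32 : UInt64

# Uniform draw from 1:n for 1 <= n <= typemax(Int64).
# Returns an Int64 so the result type is the same on every platform.
function rand_1_to_n(rng::AbstractRNG, n::Integer)
    1 <= n <= typemax(Int64) || throw(ArgumentError("n out of range"))
    T = entropy_type(n)
    m = T(n)
    # Largest value x may take so that the accepted count is a multiple of m;
    # anything above it is rejected to avoid modulo bias.
    limit = typemax(T) - (typemax(T) % m + one(T)) % m
    while true
        x = rand(rng, T)
        x <= limit && return Int64(x % m) + 1
    end
end

rng = MersenneTwister(123)
xs = [rand_1_to_n(rng, 10) for _ in 1:100]   # 32-bit requests
y = rand_1_to_n(rng, Int64(2)^40)            # 64-bit request
```

As noted above, anything that asks for a full `Int` directly stays platform dependent; the rule only helps code that samples from ranges.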

@mschauer
Contributor Author

I updated my pull request; what do you think?

4 participants