[WIP] Xoshiro: allow any non-negative integer as a seed, via SHA2_256 #41558

Merged on Sep 23, 2021 (1 commit)

Conversation

@rfourquet (Member) commented Jul 12, 2021

This converts any integer seed to a vector, hashes that vector with SHA2_256, and uses the resulting 256 bits to initialize the RNG state.
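The pipeline described above can be sketched roughly as follows. This is only an illustration, not the code from the PR: `seed_limbs` and `state_words` are hypothetical helper names, the limb order (least significant first) is an assumption, and the bytes are read in host byte order.

```julia
using SHA  # standard library

# Hypothetical helper: split a non-negative integer into 32-bit limbs,
# least significant limb first (an assumption made for this sketch).
function seed_limbs(n::Integer)
    n < 0 && throw(DomainError(n, "seed must be non-negative"))
    limbs = UInt32[]
    while true
        push!(limbs, UInt32(n & 0xffffffff))
        n >>= 32
        n == 0 && return limbs
    end
end

# Hypothetical helper: hash the limbs with SHA-256 and view the 32-byte
# digest as four 64-bit words, which could serve as the 256-bit RNG state.
function state_words(n::Integer)
    digest = sha256(collect(reinterpret(UInt8, seed_limbs(n))))
    return Tuple(reinterpret(UInt64, digest))
end

s0, s1, s2, s3 = state_words(123)
```

Because the integer is reduced to limbs before hashing, `state_words(123)` and `state_words(big(123))` give the same four words in this sketch, and arbitrarily large seeds are accepted.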

This also fixes a TODO in the code:

TODO: Consider a less ad-hoc construction

Of course, seeding becomes much slower: for example, seed!(123) goes from 18 ns to 959 ns (which can easily be reduced to 800 ns with a couple of tricks). This also means that all the random streams change again.

@rfourquet added the "needs decision" (A decision on this change is needed) and "randomness" (Random number generation and the Random stdlib) labels on Jul 12, 2021

function seed!(rng::Union{TaskLocalRNG,Xoshiro}, seed::Vector{UInt32})
    c = SHA.SHA2_256_CTX()
    SHA.update!(c, reinterpret(UInt8, seed))
Member

This is endian-dependent, which I don't think we want. Maybe do hton.(seed)?
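A minimal sketch of the suggestion, assuming the goal is for the hashed bytes to be the same on every host: `hton` converts each word to network (big-endian) order before the vector is viewed as bytes. `portable_bytes` is a hypothetical helper name, not something from the PR.

```julia
using SHA  # standard library

# Hypothetical helper: serialize the seed words as big-endian bytes,
# regardless of the byte order of the machine running the code.
portable_bytes(seed::Vector{UInt32}) = collect(reinterpret(UInt8, hton.(seed)))

bytes = portable_bytes(UInt32[0x01020304])  # most significant byte comes first
digest = sha256(bytes)                      # 32-byte digest, identical on any host
```

Without `hton`, a little-endian machine would hash the bytes of each word in reverse order, so the same seed would produce a different digest on a big-endian machine.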

@rfourquet (Member, Author), Jul 22, 2021


Good point, thanks. I will update if this PR gets support.

Member


I don't think we've ever run on a big endian system, and probably never will.

function seed!(rng::Union{TaskLocalRNG,Xoshiro}, seed::Vector{UInt32})
    c = SHA.SHA2_256_CTX()
    SHA.update!(c, reinterpret(UInt8, seed))
    s0, s1, s2, s3 = reinterpret(UInt64, SHA.digest!(c))
Member

Similarly, use ntoh.(reinterpret(...))?
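As an illustration of the output side, the sketch below reads a digest as four big-endian 64-bit words with `ntoh`, so the resulting state words do not depend on the host byte order. It hashes the empty input only because that digest is a well-known SHA-256 test vector; the variable names are chosen for this example.

```julia
using SHA  # standard library

# SHA-256 of the empty input, a standard test vector.
digest = sha256(UInt8[])

# Interpret the 32 digest bytes as four big-endian 64-bit words.
s0, s1, s2, s3 = ntoh.(reinterpret(UInt64, digest))

println(string(s0, base = 16))  # e3b0c44298fc1c14
```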

@StefanKarpinski added the "triage" (This should be discussed on a triage call) label on Jul 22, 2021

@KristofferC (Member)

If we want to get this in, it should probably be done for 1.7, since the RNG stream had to change again due to #42150.

@StefanKarpinski (Member)

I'm in favor. I think this is a significantly safer and better way to do seeding.

One question: it struck me that the SHA hashing should maybe go in the make_seed function instead, but I'm assuming you didn't change that because it's used for seeding MersenneTwister and you want to keep that RNG stream unchanged? Otherwise I was thinking of changing make_seed so that it can take an arbitrary set of input values and turn them into a seed vector of whatever size is wanted using SHA256 hashing. Of course, that also makes the from_seed function impossible to implement, since it would require inverting that one-way hash.
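One way to picture the idea of a generalized make_seed is sketched below. Everything here is an assumption made for illustration and not the Random stdlib's API: the name `hashed_seed`, the use of `repr` to serialize the inputs, and the counter-based expansion used to reach an arbitrary output length.

```julia
using SHA  # standard library

# Hypothetical sketch, not the stdlib's make_seed: derive `nwords` UInt32
# values from arbitrary inputs. The inputs are serialized with `repr`
# (an assumption), then hashed together with a one-byte block counter
# until enough words have been produced.
function hashed_seed(nwords::Integer, inputs...)
    material = Vector{UInt8}(codeunits(repr(inputs)))
    words = UInt32[]
    counter = 0
    while length(words) < nwords
        # UInt8(counter) limits this sketch to 256 blocks (2048 words).
        block = sha256(vcat(material, UInt8(counter)))
        append!(words, ntoh.(reinterpret(UInt32, block)))
        counter += 1
    end
    return resize!(words, nwords)
end

seed4 = hashed_seed(4, 123)             # enough for a 128-bit state
seed20 = hashed_seed(20, 123, "label")  # any mix of inputs, any length
```

As the comment notes, nothing in such a construction lets the original inputs be recovered from the output, which is why from_seed could not be implemented on top of it.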

@JeffBezanson (Member)

We just changed the random stream anyway, so might as well do this too.

@JeffBezanson removed the "triage" (This should be discussed on a triage call) label on Sep 16, 2021
@JeffBezanson added this to the 1.7 milestone on Sep 16, 2021
@JeffBezanson removed the "needs decision" (A decision on this change is needed) label on Sep 21, 2021
@JeffBezanson force-pushed the rf/rand/xoshiro-hash-seed branch from 6177901 to 248fcc0 on September 21, 2021 19:42
@JeffBezanson marked this pull request as ready for review on September 21, 2021 19:44
@JeffBezanson force-pushed the rf/rand/xoshiro-hash-seed branch 4 times, most recently from c025875 to 0535975, on September 22, 2021 05:55
@JeffBezanson force-pushed the rf/rand/xoshiro-hash-seed branch from 0535975 to f69daf1 on September 22, 2021 17:47
@test cond(a,2) ≈ 78.44837047684149 atol=0.5
@test cond(a,Inf) ≈ 174.10761543202744 atol=0.4
@test cond(a[:,1:5]) ≈ 6.7492840150789135 atol=0.01
@test cond(a,1) ≈ 198.3324294531168 atol=0.5
Member

These tests are the worst...

Member

Seriously. What a badly designed test...

KristofferC added a commit that referenced this pull request Oct 9, 2021
@rfourquet (Member, Author)

One question: it struck me that the SHA hashing should maybe go in the make_seed function instead, but I'm assuming you didn't change that because it's used for seeding MersenneTwister and you want to keep that RNG stream unchanged?

That sounds like a good idea to me; it would make make_seed more generally useful. Now that MersenneTwister is no longer the default, changing the streams is probably less disruptive to users. Concerning from_seed, we could probably just store the input seed inside the RNG before passing it to make_seed.
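The idea of keeping the input seed next to the hashed state could look something like the sketch below. `SeededState` and `original_seed` are hypothetical names invented for this example, and the byte-order handling with `hton`/`ntoh` is an assumption carried over from the review comments, not a description of what was merged.

```julia
using SHA  # standard library

# Hypothetical container: keeps the seed as given alongside the state
# words derived from it, so the seed can be reported back later even
# though the hash itself cannot be inverted.
struct SeededState
    seed::Vector{UInt32}
    words::NTuple{4,UInt64}
end

function SeededState(seed::Vector{UInt32})
    digest = sha256(collect(reinterpret(UInt8, hton.(seed))))
    words = Tuple(ntoh.(reinterpret(UInt64, digest)))
    return SeededState(copy(seed), words)  # copy so later mutation of `seed` has no effect
end

# Hypothetical accessor playing the role of from_seed.
original_seed(s::SeededState) = s.seed

st = SeededState(UInt32[123])
```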
Follow-up on this can happen at #37766.

@rfourquet (Member, Author)

And thanks @JeffBezanson for finishing this up!

Labels
randomness Random number generation and the Random stdlib