Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partial fix for RNG on 1.7 #42158

Closed
wants to merge 3 commits into from

Conversation

PallHaraldsson
Copy link
Contributor

See #42150. SIMD code seems to have same error, not considered here.

See JuliaLang#42150. SIMD code seems to have same error, not considered here.
@PallHaraldsson
Copy link
Contributor Author

PallHaraldsson commented Sep 8, 2021

It seems obvious both versions (in Xoshiro.jl and task.c, i.e. already merged code) can't be right, as contradictory. So a fix needs to be backported to 1.7. Can anyone tell me why there are two codes, doing the same (for sure both used?)? And the third SIMD code, and take a lot at that one, and add a commit here for that.

@PallHaraldsson
Copy link
Contributor Author

PallHaraldsson commented Sep 8, 2021

Doc-tests have many predictable failures, not really indicating the change invalid (rather random streams of numbers different, presumably right now, wrong before). Some odd looking until you look closer (e.g. timing difference a red herring):

��� julia> function strange_twos(n)
���            a = Vector{rand(Bool) ? Int64 : Float64}(undef, n)
[..]
��� julia> strange_twos(3)
��� 3-element Vector{Float64}:
���  2.0
���  2.0
���  2.0
��� ```
��� 
��� Subexpression:
��� 
��� strange_twos(3)
��� 
��� Evaluated output:
��� 
��� 3-element Vector{Int64}:
���  2
���  2
���  2
��� 
��� Expected output:
��� 
��� 3-element Vector{Float64}:
���  2.0
���  2.0
���  2.0

Line 311: @time sum_arg(x)

Line 95:

rand(Die, 3)
��� 
��� Evaluated output:
��� 
��� 3-element Vector{Die}:
���  Die(13)
���  Die(8)
���  Die(20)
��� 
��� Expected output:
��� 
��� 3-element Vector{Die}:
���  Die(15)
���  Die(19)
���  Die(4)

@KristofferC
Copy link
Member

You can fix the doctests automatically with make html doctest=fix

@PallHaraldsson
Copy link
Contributor Author

PallHaraldsson commented Sep 9, 2021

Can someone review and fix this file (it seems wrong in at least 4 places):

res = _rotl23(_plus(s0,s3))

If it's time-critical to get 1.7 out, is it an option to drop/disable this SIMD-variant module, and introduce later? In 1.7.x would then be ok IF you get the same stream, see #42165 OR if changing the stream in that case is considered ok for a minor release (we should at least document well in either case that the exact stream is not to be relied upon, I think we already do not commit to that, at least between major version). If not might have to wait for 1.8.

I may not have time to look at this, so best would be if someone looks at and fixes this (and discovers there's no non-determinisitc SIMD issue...).

I haven't looked into how, but it should be possible to use that module and compare with the other (two) implementations, and confirm it doesn't give you the same stream of numbers. I'm not sure, but I think it should give the same, even if SIMD, but even if you see (after fix) the same, it seems would depend on the length with the SIMD code more complex.

@JeffBezanson
Copy link
Member

JeffBezanson commented Sep 9, 2021

The fix seems pretty simple AFAICT. Go ahead and add this patch:

diff --git a/stdlib/Random/src/XoshiroSimd.jl b/stdlib/Random/src/XoshiroSimd.jl
index e115533bb6..9fb03f9572 100644
--- a/stdlib/Random/src/XoshiroSimd.jl
+++ b/stdlib/Random/src/XoshiroSimd.jl
@@ -158,7 +158,7 @@ end
 
     i = 0
     while i+8 <= len
-        res = _rotl23(_plus(s0,s3))
+        res = _plus(_rotl23(_plus(s0,s3)),s0)
         unsafe_store!(reinterpret(Ptr{UInt64}, dst + i), f(res, T))
         t = _shl17(s1)
         s2 = _xor(s2, s0)
@@ -170,7 +170,7 @@ end
         i += 8
     end
     if i < len
-        res = _rotl23(_plus(s0,s3))
+        res = _plus(_rotl23(_plus(s0,s3)),s0)
         t = _shl17(s1)
         s2 = _xor(s2, s0)
         s3 = _xor(s3, s1)
@@ -200,7 +200,7 @@ end
 
     i = 0
     while i+8 <= len
-        res = _rotl23(_plus(s0,s3))
+        res = _plus(_rotl23(_plus(s0,s3)),s0)
         shift = 0
         while i+8 <= len && shift < 8
             resLoc = _and(_lshr(res, shift), 0x0101010101010101)
@@ -219,7 +219,7 @@ end
     end
     if i < len
         # we may overgenerate some bytes here, if len mod 64 <= 56 and len mod 8 != 0
-        res = _rotl23(_plus(s0,s3))
+        res = _plus(_rotl23(_plus(s0,s3)),s0)
         resLoc = _and(res, 0x0101010101010101)
         ref = Ref(resLoc)
         ccall(:memcpy, Ptr{Cvoid}, (Ptr{UInt8}, Ptr{UInt64}, Csize_t), dst+i, ref, len-i)
@@ -245,7 +245,7 @@ end
 
     i = 0
     while i + 8*N <= len
-        res = _rotl23(_plus(s0,s3))
+        res = _plus(_rotl23(_plus(s0,s3)),s0)
         t = _shl17(s1)
         s2 = _xor(s2, s0)
         s3 = _xor(s3, s1)
@@ -264,7 +264,7 @@ end
     msk = ntuple(i->VecElement(0x0101010101010101), Val(N))
     i = 0
     while i + 64*N <= len
-        res = _rotl23(_plus(s0,s3))
+        res = _plus(_rotl23(_plus(s0,s3)),s0)
         t = _shl17(s1)
         s2 = _xor(s2, s0)
         s3 = _xor(s3, s1)

@PallHaraldsson
Copy link
Contributor Author

Thanks Kristofer (and Jeff). It seems you fixed the rest, not just typos, that's an understatement.

I also applied Jeff's changes, I thought might be this simple, but wanted to be sure nothing overlooked.

Maybe a typo, at least I assume intentional change (that I would have missed):
randn() -> rand()

@PallHaraldsson PallHaraldsson deleted the patch-5 branch September 10, 2021 13:32
@PallHaraldsson PallHaraldsson restored the patch-5 branch September 11, 2021 16:29
@oscardssmith
Copy link
Member

Why reopen this? Isn't it superseded?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants