Random stream per task structure and same seed reproducibility #3322
Comments
My first thought is to go back to having the same initial seed for all the task-private random streams and fast-forwarding their state by how many elements came before them, but we don't know how many iterations the inner algorithm will take. So this gets us back to the potentially repeated values / non-independent problem.
I think making a random stream per element would work, since this would only depend on the size of the return array, which would be the same across different machines; but as we've discussed previously, this would take up a significant amount of memory.
What about making a random stream per fixed-size chunk of elements? This way, you could have a consistent number of seeds for a given array size without having to generate a unique stream for each array element (which sounds like a lot of overhead).
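A minimal sketch of this chunked-stream idea, in Python with NumPy for illustration (the `elements_per_stream` parameter and the `[seed, chunk]` seeding scheme are assumptions for this sketch, not Arkouda's actual implementation). Because each chunk's stream is seeded only by the base seed and the chunk index, the output is independent of how chunks get assigned to tasks or locales:

```python
import numpy as np

def chunked_poisson(n, lam, seed, elements_per_stream=1024, order=None):
    # One independent stream per fixed-size chunk of the output array;
    # seeding depends only on (seed, chunk index), never on task count.
    out = np.empty(n, dtype=np.int64)
    num_chunks = -(-n // elements_per_stream)  # ceiling division
    for chunk in (order if order is not None else range(num_chunks)):
        rng = np.random.default_rng([seed, chunk])
        lo = chunk * elements_per_stream
        hi = min(lo + elements_per_stream, n)
        out[lo:hi] = rng.poisson(lam, hi - lo)
    return out

# Processing chunks in any order (simulating different task/locale
# assignments) yields identical results for the same seed.
forward = chunked_poisson(5000, 3.0, seed=7)
shuffled = chunked_poisson(5000, 3.0, seed=7, order=reversed(range(5)))
print(np.array_equal(forward, shuffled))  # True
```

The memory cost here is one generator at a time per worker rather than one per element, which is the overhead tradeoff mentioned above.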
@ajpotts and I had discussed this a bit a while back, and it could work if we have some sort of guarantee that the inner loop will complete after a bounded number of iterations. EDIT:
I thought about this a bit. I didn't count it out, but I was afraid it could really limit the performance on large machines if it leaves lots of threads idle. And it seems much harder to take advantage of the way data is distributed, because we can't factor the number of locales into how many seeds we use. But something along these lines might just be the only way, even if it's not ideal. Also, there are edge cases, like a small number of elements, to think about.

EDIT: The only thing is we'd need to keep up with the remainder from previous locales (when a locale's local subdomain doesn't line up with a chunk boundary). If that's true, we might have to do the locales in order (using a blocking loop with atomics or something) but still be able to do the computation on those locales in parallel, which does seem much better.
Also, if the seed isn't set, we can still do the more optimal stream-per-task way, so there's a perf tradeoff for reproducibility. Okay, I'm really warming up to this idea.
Instead, I think you could have the last generator on locale 0 handle the last
ooooo great idea!! Thanks for all the insights @jeremiah-corrado 🎉 :)
This PR fixes Bears-R-Us#3322. The previous implementation of poisson created a random stream per task, but this led to different results on different machines or with a different number of locales. To address this, @jeremiah-corrado suggested a fixed number of elements per random stream. Since this depends only on the total number of elements to be generated (which, unlike the number of locales or tasks per locale, won't change), this should always give the same results for the same seed. I added tests with pre-generated data from a 2 locale run at multiple orders of magnitude, covering the cases where all the data is pulled down to locale 0 (total elements < `elementsPerRandomStream`) and where each locale is responsible for multiple chunks that are not evenly divisible by our chunk size.
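The chunking described above implies a bit of bookkeeping covering both tested cases: everything in a single chunk when the total element count is below the chunk size, and a shorter remainder chunk when the size doesn't divide evenly. A hedged illustration in Python (`chunk_bounds` is an invented helper for this sketch, not Arkouda code):

```python
def chunk_bounds(n, elements_per_stream):
    """Half-open (lo, hi) index ranges, one per random stream;
    the final chunk absorbs any remainder."""
    bounds = []
    lo = 0
    while lo < n:
        hi = min(lo + elements_per_stream, n)
        bounds.append((lo, hi))
        lo = hi
    return bounds

# total elements < chunk size: a single chunk (all handled on locale 0)
print(chunk_bounds(5, 10))   # [(0, 5)]
# not evenly divisible: several full chunks plus a remainder chunk
print(chunk_bounds(25, 10))  # [(0, 10), (10, 20), (20, 25)]
```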
@brandon-neth encountered some failures with `test_poisson_hypothesis_testing`. My first thought was that I must have missed fixing the seed, but I double-checked and that wasn't the case. Then I realized the way I structured the algorithm (where we have a random stream per task) could give different results on different machines despite them using the same seed. Using the same seed on a machine with the same number of locales and tasks per locale should give the same results, but different machines with a different number of total tasks wouldn't. In fact, running your code on even the same HPC but with a different number of nodes wouldn't give the same answer.
The same seed not giving the same results kinda takes away the usefulness of having a seed. I'm not sure yet how to handle this, since random streams aren't thread safe, but I think I gotta revisit the implementation of `poisson` and `exponential` with the zig method.

@Bears-R-Us/arkouda-core-dev, @jeremiah-corrado, @bmcdonald3, @ShreyasKhandekar let me know if y'all have any ideas on how to address this
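To see why a stream per task breaks seed reproducibility, here is a hedged Python/NumPy sketch (the per-task `[seed, task_id]` seeding is illustrative, not the actual Chapel implementation). With the same seed, changing the task count changes both the per-task seeds in play and the block boundaries, so the outputs diverge:

```python
import numpy as np

def per_task_poisson(n, lam, seed, num_tasks):
    # Each task draws from its own stream seeded by (seed, task id),
    # filling a contiguous block of the output -- mirroring the
    # stream-per-task structure described above.
    out = np.empty(n, dtype=np.int64)
    edges = np.linspace(0, n, num_tasks + 1).astype(int)
    for t in range(num_tasks):
        rng = np.random.default_rng([seed, t])
        out[edges[t]:edges[t + 1]] = rng.poisson(lam, edges[t + 1] - edges[t])
    return out

a = per_task_poisson(100, 5.0, seed=42, num_tasks=4)
b = per_task_poisson(100, 5.0, seed=42, num_tasks=8)
print(np.array_equal(a, b))  # almost surely False: same seed, different task layout
```

Rerunning with the same `num_tasks` does reproduce, which matches the observation above that only machines with identical locale/task counts agree.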
EDIT:
I was able to verify this locally by running the following line of code with a different number of locales.
The output is: