Random stream per task structure and same seed reproducibility #3322

Closed · Fixed by #3331
stress-tess opened this issue Jun 12, 2024 · 8 comments
Labels: bug

stress-tess (Member) commented Jun 12, 2024:

@brandon-neth encountered some failures with test_poisson_hypothesis_testing. My first thought was that I must have missed fixing the seed, but I double-checked and that wasn't the case.

Then I realized the way I structured the algorithm (where we have a random stream per task) could give different results on different machines despite using the same seed. Using the same seed on a machine with the same number of locales and tasks per locale should give the same results, but different machines with a different number of total tasks wouldn't match. In fact, even running the same code on the same HPC but with a different number of nodes wouldn't give the same answer.

The same seed not giving the same results kinda takes away the usefulness of having a seed. I'm not sure yet how to handle this, since random streams aren't thread safe, but I think I need to revisit the implementation of poisson and exponential with the zig method.

@Bears-R-Us/arkouda-core-dev, @jeremiah-corrado, @bmcdonald3, @ShreyasKhandekar: let me know if y'all have any ideas on how to address this.

EDIT:
I was able to verify this locally by running the following line of code with different numbers of locales:

```python
ak.random.default_rng(10).poisson(size=10)
```

The output is:

```
# with 2 locales
array([2 0 3 2 1 1 2 0 1 0])

# with 3 locales
array([2 0 3 2 1 2 0 2 0 1])

# with 4 locales
array([2 0 3 1 2 2 0 1 2 0])
```
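As a point of reference, here's a minimal client-side sketch of the property a seed is expected to provide (assuming a running arkouda server reachable on the default host/port); the calls are the same ones used above:

```python
import arkouda as ak

ak.connect()  # assumes a server is already running on the default host/port

# Within a single server configuration, the seed does pin down the output:
a = ak.random.default_rng(10).poisson(size=10)
b = ak.random.default_rng(10).poisson(size=10)
print(a)
print(b)  # identical to a

# The bug: restart the server with a different number of locales and the same
# two lines produce a different sequence, as shown in the outputs above.
ak.disconnect()
```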
stress-tess self-assigned this Jun 12, 2024
stress-tess added the bug label Jun 12, 2024
stress-tess (Member Author) commented:

My first thought is to go back to having the same initial seed for all the task-private random streams and fast-forwarding their state by how many elements came before them, but we don't know how many iterations the inner algorithm will take. So this gets us back to the potentially-repeated-values / non-independence problem.

stress-tess (Member Author) commented:

I think making a random stream per element would work, since this would only depend on the size of the return array, which would be the same across different machines. But, as we've discussed previously, this would take up a significant amount of memory.

jeremiah-corrado (Contributor) commented:

What about making a random stream per n elements, where n is a fixed constant (e.g., 4096)?

This way, you could have a consistent number of seeds for a given array size without having to generate a unique stream for each array element (which sounds like a lot of overhead).
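A rough host-side model of this suggestion (NumPy here just to illustrate; the names `N` and `chunked_poisson` are made up, and this is not the arkouda/Chapel implementation):

```python
import numpy as np

N = 4096  # the fixed chunk size "n" from the suggestion (illustrative value)

def chunked_poisson(seed, size, lam=1.0):
    """Element i is drawn from the stream for chunk i // N, so the
    seed-to-output mapping depends only on `size`, never on how many
    locales or tasks the work happens to be split across."""
    out = np.empty(size, dtype=np.int64)
    for chunk in range(-(-size // N)):              # ceil(size / N) streams
        lo, hi = chunk * N, min((chunk + 1) * N, size)
        rng = np.random.default_rng([seed, chunk])  # one stream per chunk
        out[lo:hi] = rng.poisson(lam, hi - lo)
    return out
```

Whichever locale ends up computing chunk c, it uses the stream derived from (seed, c), which is what makes the result reproducible.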

stress-tess (Member Author) commented Jun 13, 2024:

> My first thought is to go back to having the same initial seed for all the task-private random streams and fast-forwarding their state by how many elements came before them, but we don't know how many iterations the inner algorithm will take. So this gets us back to the potentially-repeated-values / non-independence problem.

@ajpotts and I discussed this a bit a while back: if we had some guarantee that the inner loop completes within k iterations with probability p, then we could give each value k iterations' worth of non-overlapping state by fast-forwarding each task's state by `(here.id * nTasksPerLoc + tid) * k`. This would allow us to make a probabilistic statement about the values being independent.

EDIT:
Wait, never mind, this doesn't fix the problem at all lol, because the state of the generators would still depend on the number of tasks... ignore me 😅

stress-tess (Member Author) commented Jun 13, 2024:

> What about making a random stream per n elements, where n is a fixed constant (e.g., 4096)?
>
> This way, you could have a consistent number of seeds for a given array size without having to generate a unique stream for each array element (which sounds like a lot of overhead).

I thought about this a bit. I didn't rule it out, but I was afraid it could really limit performance on large machines if it leaves lots of threads idle. It also seems much harder to take advantage of the way the data is distributed, because we can't factor the number of locales into how many seeds we use. But something along these lines might just be the only way, even if it's not ideal.

Also, there are edge cases to think about, like a small number of elements.

EDIT:
Once again I commented before really thinking things through, haha. Okay, so we just loop over each locale's local subdomain and chunk it up by n.

The only thing is we'd need to keep track of the remainder from previous locales. Say the local subdomain of locale 0 isn't divisible by n; then the first generator on locale 1 needs to have the same seed and be fast-forwarded to pick up where locale 0 left off, right? But we won't know how much to fast-forward until locale 0 finishes.

So if that's true, we might have to process the locales in order (using a blocking loop with atomics or something) while still doing the computation within each locale in parallel, which does still seem much better.

stress-tess (Member Author) commented:

Also, if the seed isn't set, we can still use the more optimal stream-per-task approach, so the perf tradeoff only applies when reproducibility is requested. Okay, I'm really warming up to this idea.

jeremiah-corrado (Contributor) commented Jun 13, 2024:

> The only thing is we'd need to keep track of the remainder from previous locales. Say the local subdomain of locale 0 isn't divisible by n...

Instead, I think you could have the last generator on locale 0 handle the last `remainder` elements on locale 0 and the first `n - remainder` elements on locale 1. This would incur some communication cost for those boundary elements, but would avoid making the outer loop sequential as you mentioned.
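A small sketch of that ownership rule (hypothetical names and inclusive per-locale index ranges; not the actual Chapel code):

```python
def chunk_owner(chunk_idx, n, locale_subdomains):
    """A chunk of n consecutive global indices is handled entirely by the
    locale that owns the chunk's first element, even if the chunk spills
    onto the next locale (some communication for those boundary elements)."""
    start = chunk_idx * n
    for loc, (lo, hi) in enumerate(locale_subdomains):  # inclusive (lo, hi)
        if lo <= start <= hi:
            return loc
    raise ValueError("chunk start is outside the global domain")

# 10 elements block-distributed over 2 locales (indices 0..4 and 5..9), n = 4:
# chunk 0 starts at 0 (locale 0); chunk 1 starts at 4 (locale 0, spilling onto
# locale 1 for indices 5..7); chunk 2 starts at 8 (locale 1)
print([chunk_owner(c, 4, [(0, 4), (5, 9)]) for c in range(3)])  # [0, 0, 1]
```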

stress-tess (Member Author) commented:

ooooo great idea!! Thanks for all the insights @jeremiah-corrado 🎉 :)

stress-tess added a commit to stress-tess/arkouda that referenced this issue Jun 14, 2024
This PR fixes Bears-R-Us#3322. The previous implementation of poisson created a random stream per task, but this led to different results on different machines or with a different number of locales. To address this, @jeremiah-corrado suggested a fixed number of elements per random stream. Since this depends only on the total number of elements to be generated (which, unlike the number of locales or tasks per locale, won't change), it should always give the same results for the same seed.

I added tests with pre-generated data from a 2-locale run at multiple orders of magnitude of size, to cover both the case where all the data is pulled down to locale 0 (total elements < `elementsPerRandomStream`) and the case where each locale is responsible for multiple chunks that are not evenly divisible by our chunk size.
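To make the reproducibility claim concrete, here's a small NumPy model (again not the Chapel code; `N` and the function name are placeholders) showing that with one stream per fixed-size chunk the result can't depend on the order or grouping in which the chunks are processed, in both of the regimes the tests cover:

```python
import numpy as np

N = 4096  # stands in for `elementsPerRandomStream`; the real value lives in the arkouda source

def chunked_poisson_in_order(seed, size, order):
    """Processes chunks in the given order, mimicking different locale/task
    decompositions. Chunk c always uses the stream seeded from (seed, c),
    so the processing order cannot change the result."""
    out = np.empty(size, dtype=np.int64)
    for c in order:
        lo, hi = c * N, min((c + 1) * N, size)
        out[lo:hi] = np.random.default_rng([seed, c]).poisson(1.0, hi - lo)
    return out

# Several chunks, with a size that doesn't divide evenly by N:
size = 3 * N + 17
chunks = list(range(-(-size // N)))
assert np.array_equal(chunked_poisson_in_order(10, size, chunks),
                      chunked_poisson_in_order(10, size, chunks[::-1]))

# Everything fits in a single chunk (size < N):
assert np.array_equal(chunked_poisson_in_order(10, 100, [0]),
                      chunked_poisson_in_order(10, 100, [0]))
```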