Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Ziggurat as a method for standard_normal #3278

Closed
stress-tess opened this issue Jun 3, 2024 · 2 comments · Fixed by #3596
Closed

Add Ziggurat as a method for standard_normal #3278

stress-tess opened this issue Jun 3, 2024 · 2 comments · Fixed by #3596
Assignees

Comments

@stress-tess
Copy link
Member

stress-tess commented Jun 3, 2024

Right now our implementation of standard_normal does a box-muller transform, but numpy uses the Ziggurat algorithm. Ziggurat algorithm's wikipedia page says:

Nevertheless, the algorithm is computationally much faster than the two most commonly used methods of generating normally distributed random numbers, the Marsaglia polar method and the Box–Muller transform, which require at least one logarithm and one square root calculation for each pair of generated values. However, since the ziggurat algorithm is more complex to implement it is best used when large quantities of random numbers are required.

considering it's used by numpy and we're likely to need to generate large quantities of numbers, I think it would be worthwhile. It's also used in numpy's implementation of other distributions such as exponential which will be added in #3183

Numpy's c based implementation can be found here

This will require saving off some constants . The nature of chpl means we'll want these constants on each locale or else we'll kill performance by going back and forth to the lookup tables on locale 0. This has caused us issues previously (see #2354)

@stress-tess stress-tess self-assigned this Jun 3, 2024
@stress-tess stress-tess changed the title modify standard_normal to use Ziggurat algorithm Modify standard_normal to use Ziggurat algorithm Jun 3, 2024
@stress-tess stress-tess changed the title Modify standard_normal to use Ziggurat algorithm Should we modify standard_normal to use Ziggurat algorithm? Jun 3, 2024
@stress-tess stress-tess added the question Further information is requested label Jun 3, 2024
@stress-tess
Copy link
Member Author

stress-tess commented Jun 3, 2024

hmm it seems like box-muller's wikipedia page argues for it's use:

The Box–Muller transform was developed as a more computationally efficient alternative to the inverse transform sampling method.The ziggurat algorithm gives a more efficient method for scalar processors (e.g. old CPUs), while the Box–Muller transform is superior for processors with vector units (e.g. GPUs or modern CPUs).

So it's not immediately obvious which would be better for our use case. It is worth noting that box-muller can't produce values more than 9.419 standard deviations from the mean

@stress-tess
Copy link
Member Author

stress-tess commented Jun 3, 2024

I think we're going to have to add the stuff need for ziggurat for exponential anyway, so we could always add a method argument like in standard_exponential. Then do a performance test to decide which should be the default

we also be implementing the inverse cdf method / inverse transform sampling method for standard_exponential. So it seems like we'll end up using them all eventually

@stress-tess stress-tess changed the title Should we modify standard_normal to use Ziggurat algorithm? Add Ziggurat as a method for standard_normal Jul 1, 2024
@stress-tess stress-tess removed the question Further information is requested label Jul 1, 2024
@stress-tess stress-tess added this to the Random module alignment milestone Jul 1, 2024
stress-tess added a commit to stress-tess/arkouda that referenced this issue Jul 29, 2024
This PR (closes Bears-R-Us#3278) adds `ziggurat` method for calculating standard normal based on numpys implemenation.

I also realized that even though the zig method for both `standard_normal` and `standard_exponential` can't handle multi dimensional arrays yet, their other methods can. I created two versions of each proc with `where array_nd > 1` and `where array_nd == 1`, so now we can generate multi-dim exponentially and normally arrays if we use the compatible method.

I added parameterization for the methods in the tests
stress-tess added a commit to stress-tess/arkouda that referenced this issue Jul 29, 2024
This PR (closes Bears-R-Us#3278) adds `ziggurat` method for calculating standard normal based on numpys implemenation.

I also realized that even though the zig method for both `standard_normal` and `standard_exponential` can't handle multi dimensional arrays yet, their other methods can. I created two versions of each proc with `where array_nd > 1` and `where array_nd == 1`, so now we can generate multi-dim exponentially and normally arrays if we use the compatible method.

I added parameterization for the methods in the tests
stress-tess added a commit to stress-tess/arkouda that referenced this issue Jul 29, 2024
This PR (closes Bears-R-Us#3278) adds `ziggurat` method for calculating standard normal based on numpys implemenation.

I also realized that even though the zig method for both `standard_normal` and `standard_exponential` can't handle multi dimensional arrays yet, their other methods can. I created two versions of each proc with `where array_nd > 1` and `where array_nd == 1`, so now we can generate multi-dim exponentially and normally arrays if we use the compatible method.

I added parameterization for the methods in the tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant