Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rand: implement an unbiased random integer from a range #22499

Closed
wants to merge 5 commits into from

Conversation

paulidale
Copy link
Contributor

Refer: swiftlang/swift#39143 for a description of the algorithm.

It is optimal in the sense of having:

  • no divisions
  • minimal number of blocks of random bits from the generator
  • documentation is added or updated
  • tests are added or updated

@paulidale paulidale added branch: master Merge to master branch triaged: feature The issue/pr requests/adds a feature labels Oct 25, 2023
@paulidale paulidale self-assigned this Oct 25, 2023
@paulidale paulidale mentioned this pull request Oct 25, 2023
1 task
@github-actions github-actions bot added the severity: fips change The pull request changes FIPS provider sources label Oct 25, 2023
Copy link
Member

@t8m t8m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also have a convenience function with lower bound? I.e., lower + ossl_rand_uniform_uint32(upper - lower) ? Or just add lower parameter here?

crypto/rand/rand_uniform.c Outdated Show resolved Hide resolved
@paulidale
Copy link
Contributor Author

paulidale commented Oct 25, 2023

[0, max) is the commonest range required.

A helper function is a reasonable idea still.

@sftcd
Copy link
Contributor

sftcd commented Oct 25, 2023

I temporarily copied this into my HPKE code and it seems to work when picking GREASE values.

There's a snippet of code I used to informally test this here. Not sure how one could write a good test that could be part of the test harness without generating some bogus test fails.

@paulidale
Copy link
Contributor Author

The pathological case will be around 2³¹+1 where just under half the values are in the rejection region. Given the test only needs the third random sample less than one time in 2³², testing isn't practical using less than (at a guess) 2⁶⁴ samples.

It would be possible to test this better by reducing the size. E.g. generating 4 bit numbers under 9. A mistake should become obvious in reasonable time. Not useful for our test suite but good for confirmation. χ² is probably good enough here.

@paulidale
Copy link
Contributor Author

paulidale commented Oct 26, 2023

I've run a χ² test (p=0.99) for the output of this algorithm modified to work with 4 bit "words" and it seems good. Test harness is attached.

There seems to be a slight problem with the χ² test for this discrete uniform distribution. I'm seeing test failures for small samples at a higher than expected rate (gut feel, not measured). For large samples it comes good and the test statistic reduces nicely. I'm pretty sure this is because the sum of uniform distributions is pretty non-normal in nature. As the number increases the central limit theorem kicks in and it tends towards normal. Convergence to normal being, proportional to the square root of the number of samples accumulated (i.e. 100 gives approx 1 digit, 10,000 gives approx 2 digits, 1,000,000 gives approx 3, etc).

Given the instability for small sample sizes and the time required to generate large samples, I do not see how this can reasonably be directly tested as part of our CIs.

Run using 1,000,000 samples (note the jmax of -1 for powers of two which means no extra sampling was required which is optimal):

max =   2: 5000209,4999791,
max: 2, chi_sq: 0.017472, exp: 5000000.000000, jmax: -1
max =   3: 3334176,3333923,3331901,
max: 3, chi_sq: 0.932812, exp: 3333333.333333, jmax: 5
max =   4: 2499708,2501113,2499781,2499398,
max: 4, chi_sq: 0.693759, exp: 2500000.000000, jmax: -1
max =   5: 1999827,1998874,2000430,2001946,1998923,
max: 5, chi_sq: 3.214775, exp: 2000000.000000, jmax: 5
max =   6: 1668577,1665917,1664706,1666297,1667846,1666657,
max: 6, chi_sq: 5.749897, exp: 1666666.666667, jmax: 5
max =   7: 1430465,1428065,1427089,1431038,1428160,1427327,1427856,
max: 7, chi_sq: 10.047356, exp: 1428571.428571, jmax: 6
max =   8: 1249737,1249194,1249188,1250900,1250879,1247944,1250256,1251902,
max: 8, chi_sq: 8.696853, exp: 1250000.000000, jmax: -1
max =   9: 1110782,1112213,1110730,1109803,1111333,1110297,1112125,1110449,1112268,
max: 9, chi_sq: 6.026075, exp: 1111111.111111, jmax: 6
max =  10: 1001779,999831,1000708,999221,999167,999181,1000838,1000199,999904,999172,
max: 10, chi_sq: 7.102802, exp: 1000000.000000, jmax: 6
max =  11: 909015,909267,908319,909278,909771,909127,909099,909859,908548,909467,908250,
max: 11, chi_sq: 3.151274, exp: 909090.909091, jmax: 5
max =  12: 833981,833851,832271,833759,831721,834955,833605,831482,833055,833270,833272,834778,
max: 12, chi_sq: 15.480190, exp: 833333.333333, jmax: 5
max =  13: 769947,769383,769361,768888,769172,768927,770154,770032,769173,768483,769544,769447,767489,
max: 13, chi_sq: 7.802330, exp: 769230.769231, jmax: 6
max =  14: 714751,714341,715299,715759,715366,714639,715462,714295,713665,713116,713868,713069,712071,714299,
max: 14, chi_sq: 20.168345, exp: 714285.714286, jmax: 5
max =  15: 666945,666580,666357,666583,666835,665888,666941,666078,666901,667459,666406,667807,666785,665420,667015,
max: 15, chi_sq: 7.477271, exp: 666666.666667, jmax: 5

Run using 100,000,000 samples:

max =   2: 50000227,49999773,
max: 2, chi_sq: 0.002061, exp: 50000000.000000, jmax: -1
max =   3: 33333736,33335688,33330576,
max: 3, chi_sq: 0.399284, exp: 33333333.333333, jmax: 6
max =   4: 24999959,24991483,25010198,24998360,
max: 4, chi_sq: 7.169191, exp: 25000000.000000, jmax: -1
max =   5: 19997333,19999866,19998849,20005677,19998275,
max: 5, chi_sq: 2.182980, exp: 20000000.000000, jmax: 6
max =   6: 16662808,16667571,16670404,16665877,16662650,16670690,
max: 6, chi_sq: 3.757151, exp: 16666666.666667, jmax: 6
max =   7: 14292468,14286611,14284433,14281405,14284730,14290095,14280258,
max: 7, chi_sq: 8.159124, exp: 14285714.285714, jmax: 7
max =   8: 12501541,12502211,12498975,12494419,12499109,12498508,12503983,12501254,
max: 8, chi_sq: 4.793451, exp: 12500000.000000, jmax: -1
max =   9: 11115182,11112197,11108881,11111277,11111626,11110032,11107010,11109843,11113952,
max: 9, chi_sq: 4.561170, exp: 11111111.111111, jmax: 6
max =  10: 10003841,9996950,9997881,10000856,10006798,9996979,9997489,10003917,9998301,9996988,
max: 10, chi_sq: 11.822468, exp: 10000000.000000, jmax: 6
max =  11: 9082970,9093585,9093643,9088370,9092530,9093158,9090855,9093816,9090446,9092860,9087767,
max: 11, chi_sq: 12.555633, exp: 9090909.090909, jmax: 7
max =  12: 8333088,8336416,8331807,8332156,8332507,8333341,8331224,8337499,8332227,8336944,8327919,8334872,
max: 12, chi_sq: 9.804860, exp: 8333333.333333, jmax: 6
max =  13: 7696332,7692870,7692381,7688343,7697700,7691600,7689949,7693650,7697364,7690380,7694020,7684898,7690513,
max: 13, chi_sq: 20.737228, exp: 7692307.692308, jmax: 7
max =  14: 7145699,7144015,7140442,7139170,7146915,7145519,7140261,7147240,7141788,7141980,7139704,7141792,7143237,7142238,
max: 14, chi_sq: 12.860781, exp: 7142857.142857, jmax: 6
max =  15: 6664666,6667752,6665883,6667425,6671139,6667166,6666848,6669380,6662383,6664535,6663906,6663461,6665028,6668392,6672036,
max: 15, chi_sq: 16.394865, exp: 6666666.666667, jmax: 7

@paulidale paulidale marked this pull request as ready for review October 26, 2023 02:37
@paulidale paulidale added the tests: exempted The PR is exempt from requirements for testing label Oct 26, 2023
mspncp
mspncp previously requested changes Oct 26, 2023
include/internal/common.h Outdated Show resolved Hide resolved
@mspncp
Copy link
Contributor

mspncp commented Oct 26, 2023

nit: unbiassed -> unbiased

@paulidale paulidale changed the title rand: implement an unbiassed random integer from a range rand: implement an unbiased random integer from a range Oct 26, 2023
@paulidale
Copy link
Contributor Author

Addressed comments...

@paulidale paulidale added the project Add item to the Project Board label Oct 26, 2023
@paulidale
Copy link
Contributor Author

Ping @openssl/committers for reviews.

The algorithm isn't as daunting as it first appears.

@tom-cosgrove-arm
Copy link
Contributor

@mspncp Are you happy that your requested changes have been made?

@mspncp
Copy link
Contributor

mspncp commented Oct 28, 2023

Yes I am. You may dismiss my previous change request if you like. (I didn't find a way how to do it from the mobile app.)

@tom-cosgrove-arm tom-cosgrove-arm dismissed mspncp’s stale review October 28, 2023 12:28

Requested changes have been made

Copy link
Contributor

@tom-cosgrove-arm tom-cosgrove-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tom-cosgrove-arm tom-cosgrove-arm added approval: otc review pending and removed tests: exempted The PR is exempt from requirements for testing labels Oct 29, 2023
@t8m t8m removed the branch: 3.2 Merge to openssl-3.2 label Oct 31, 2023
@t-j-h t-j-h removed the hold: need otc decision The OTC needs to make a decision label Oct 31, 2023
@t-j-h
Copy link
Member

t-j-h commented Oct 31, 2023

OTC: vote to include PR subject to normal review process

@mspncp
Copy link
Contributor

mspncp commented Oct 31, 2023

This was written from scratch. CLA hold does not apply.

I didn't doubt the fact that it was written from scratch. But except for minor adjustments, it is the same algorithm. As a mathematician, I find it remarkable and non-intuitive that it's only the written program and not the underlying algorithm which is considered an intellectual property. Fortunately, I'm not a lawyer, and if they are happy with this interpretation, I'll not dive down further into this rabbit hole.

@openssl-machine
Copy link
Collaborator

24 hours has passed since 'approval: done' was set, but as this PR has been updated in that time the label 'approval: ready to merge' is not being automatically set. Please review the updates and set the label manually.

@paulidale
Copy link
Contributor Author

The code here is different to the one referenced. The referenced one only generates a second 64 bit value and accepts a 1 in 2^64 chance of bias. This PR, instead, generates 32 bit values and continues to do so until a definitive result is obtained -- this is not covered by the referenced code.

Algorithms cannot be copyrighted. They are patentable (unfortunately).

Refer: swiftlang/swift#39143 for a description
of the algorithm.

It is optimal in the sense of having:

* no divisions
* minimal number of blocks of random bits from the generator
Copy link
Member

@t8m t8m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still ok, no need to restart the 24h timer

@t8m
Copy link
Member

t8m commented Nov 1, 2023

@tom-cosgrove-arm please reconfirm

@t8m t8m added the branch: 3.2 Merge to openssl-3.2 label Nov 1, 2023
Copy link
Contributor

@mspncp mspncp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work done on the comments. Just two typos.

crypto/rand/rand_uniform.c Outdated Show resolved Hide resolved
crypto/rand/rand_uniform.c Outdated Show resolved Hide resolved
Copy link
Member

@t8m t8m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still OK, no need to restart the timer

@tom-cosgrove-arm
Copy link
Contributor

Still okay

@t8m
Copy link
Member

t8m commented Nov 1, 2023

Merged to the master and 3.2 branches. Thank you.

@t8m t8m closed this Nov 1, 2023
openssl-machine pushed a commit that referenced this pull request Nov 1, 2023
Refer: swiftlang/swift#39143 for a description
of the algorithm.

It is optimal in the sense of having:

* no divisions
* minimal number of blocks of random bits from the generator

Reviewed-by: Tom Cosgrove <[email protected]>
Reviewed-by: Matthias St. Pierre <[email protected]>
Reviewed-by: Tomas Mraz <[email protected]>
(Merged from #22499)
openssl-machine pushed a commit that referenced this pull request Nov 1, 2023
Reviewed-by: Tom Cosgrove <[email protected]>
Reviewed-by: Matthias St. Pierre <[email protected]>
Reviewed-by: Tomas Mraz <[email protected]>
(Merged from #22499)
openssl-machine pushed a commit that referenced this pull request Nov 1, 2023
Reviewed-by: Tom Cosgrove <[email protected]>
Reviewed-by: Matthias St. Pierre <[email protected]>
Reviewed-by: Tomas Mraz <[email protected]>
(Merged from #22499)
openssl-machine pushed a commit that referenced this pull request Nov 1, 2023
Refer: swiftlang/swift#39143 for a description
of the algorithm.

It is optimal in the sense of having:

* no divisions
* minimal number of blocks of random bits from the generator

Reviewed-by: Tom Cosgrove <[email protected]>
Reviewed-by: Matthias St. Pierre <[email protected]>
Reviewed-by: Tomas Mraz <[email protected]>
(Merged from #22499)

(cherry picked from commit 55755fb)
openssl-machine pushed a commit that referenced this pull request Nov 1, 2023
Reviewed-by: Tom Cosgrove <[email protected]>
Reviewed-by: Matthias St. Pierre <[email protected]>
Reviewed-by: Tomas Mraz <[email protected]>
(Merged from #22499)

(cherry picked from commit d05e0e4)
openssl-machine pushed a commit that referenced this pull request Nov 1, 2023
Reviewed-by: Tom Cosgrove <[email protected]>
Reviewed-by: Matthias St. Pierre <[email protected]>
Reviewed-by: Tomas Mraz <[email protected]>
(Merged from #22499)

(cherry picked from commit dfb26e0)
@paulidale paulidale deleted the rand-range branch November 1, 2023 20:12
wanghao75 pushed a commit to openeuler-mirror/openssl that referenced this pull request Nov 4, 2023
Refer: swiftlang/swift#39143 for a description
of the algorithm.

It is optimal in the sense of having:

* no divisions
* minimal number of blocks of random bits from the generator

Reviewed-by: Tom Cosgrove <[email protected]>
Reviewed-by: Matthias St. Pierre <[email protected]>
Reviewed-by: Tomas Mraz <[email protected]>
(Merged from openssl/openssl#22499)

Signed-off-by: fly2x <[email protected]>
wanghao75 pushed a commit to openeuler-mirror/openssl that referenced this pull request Nov 4, 2023
Reviewed-by: Tom Cosgrove <[email protected]>
Reviewed-by: Matthias St. Pierre <[email protected]>
Reviewed-by: Tomas Mraz <[email protected]>
(Merged from openssl/openssl#22499)

Signed-off-by: fly2x <[email protected]>
wanghao75 pushed a commit to openeuler-mirror/openssl that referenced this pull request Nov 4, 2023
Reviewed-by: Tom Cosgrove <[email protected]>
Reviewed-by: Matthias St. Pierre <[email protected]>
Reviewed-by: Tomas Mraz <[email protected]>
(Merged from openssl/openssl#22499)

Signed-off-by: fly2x <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approval: done This pull request has the required number of approvals branch: master Merge to master branch branch: 3.2 Merge to openssl-3.2 project Add item to the Project Board severity: fips change The pull request changes FIPS provider sources tests: present The PR has suitable tests present triaged: feature The issue/pr requests/adds a feature
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

7 participants