Temporarily change `test_randint_randomness` not to run #1672

stress-tess · 2022-08-10T18:51:42Z

`test_randint_randomness` has been failing in CI runs a bit too frequently. We should lower the threshold for now and decide how we want to proceed (either relying solely on randtest after #1665, or accepting the lower-confidence test).


stress-tess · 2022-08-10T19:31:16Z

Upon thinking about this a bit more, I think we should actually not run the test for now.

I did some quick back-of-the-envelope math (which should def be double-checked because I'm very bad at stats):
Each trial provides a 95% confidence. So I think that if randint is truly random, then every trial has a 95% chance of passing and a 5% chance of failing. Since the test should be 20 (I think independent) trials, this is a sequence of Bernoulli trials (a binomial experiment) with $n = 20$ and $p = 0.95$.

So the probability of getting $k$ successes, with $q = 1 - p = 0.05$, is
$$P(X= k) = { n \choose k}p^k q^{n-k}$$

Since the overall test only fails if we have >4 failures, to calculate the probability of overall success we can sum the probabilities of passing exactly 16, 17, 18, 19, and 20 trials (since these outcomes are mutually exclusive):
$$\sum_{i=16}^{20}{20 \choose i}0.95^{i} 0.05^{20-i} = 0.997426$$

So the probability of failure is $1 - 0.997426 \approx 0.0025739$. The overall test should only fail about $0.25$% of the time... So I'm thinking there is something wrong with randint given the frequency of failures.
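For anyone who wants to double check the arithmetic above, here is a minimal sketch in plain Python. It assumes what the comment assumes: 20 independent trials, each passing with probability 0.95, and an overall pass when at most 4 trials fail. The function name `prob_overall_pass` and its parameters are made up for illustration and are not part of the test suite.

```python
from math import comb


def prob_overall_pass(n=20, p=0.95, max_failures=4):
    """Probability of at most `max_failures` failures in n independent trials.

    Hypothetical helper: each trial is assumed to pass with probability p.
    """
    q = 1.0 - p
    # Sum the binomial probabilities for k = n - max_failures, ..., n passes.
    return sum(
        comb(n, k) * p**k * q ** (n - k)
        for k in range(n - max_failures, n + 1)
    )


pass_prob = prob_overall_pass()
fail_prob = 1.0 - pass_prob
print(f"P(overall pass) = {pass_prob:.6f}")
print(f"P(overall fail) = {fail_prob:.6f}")
```

If the independence assumption does not hold for the real trials, these numbers would not apply directly.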

Ethan-DeBandi99 · 2022-08-10T19:33:00Z

> Upon thinking about this a bit more, I think we should actually not run the test for now.
>
> I did some quick back of the envelope math (which should def be double checked because i'm very bad at stats): if provides a 95% confidence. So I think that if randint is truly random, then every trial has a 95% chance of passing and 5% chance of failing. Since the test should be 20 (i think independent) trials, this is a Bernoulli experiment with $n = 20$ and $p = 0.95$
>
> So the probability of getting $k$ successes is
> $$P(X= k) = { n \choose k}p^k q^{n-k}$$
>
> Since the overall test only fails if we have >4 failures, to calculate the odds of overall success we can sum the probabilities of passing 16, 17, 18, 19, and 20 trials (since these are all independent events)
> $$\sum_{i=16}^{20}{20 \choose i}0.95^{i} 0.05^{20-i} = 0.997426$$
>
> so the odds of failure is $1-$ this and should be $0.0025739$. The overall test should only fail $0.25$% of the time... So I'm thinking there is something wrong with randint given the frequency of failures

@pierce314159 - based on this, I am inclined to agree. We should probably pull the test for now.

…t to run This PR (Closes Bears-R-Us#1672): - Changes the name of `test_randint_randomness` to `randint_randomness` which will cause the test to not run

…1673) This PR (Closes #1672): - Changes the name of `test_randint_randomness` to `randint_randomness` which will cause the test to not run Co-authored-by: Pierce Hayes
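The rename works because test runners collect tests by name. As a rough sketch using only the standard library `unittest` loader (pytest by default also collects only functions prefixed with `test`), a method without the `test` prefix is simply not picked up. The class `RandintTest` and the method `test_runs` below are hypothetical stand-ins, not the project's actual test code.

```python
import unittest


class RandintTest(unittest.TestCase):
    # Hypothetical test that keeps the `test` prefix, so it is collected.
    def test_runs(self):
        self.assertTrue(True)

    # Renamed without the `test` prefix: the loader skips it entirely.
    def randint_randomness(self):
        self.fail("this body is never executed by the runner")


# Ask the default loader which method names it treats as tests.
names = unittest.TestLoader().getTestCaseNames(RandintTest)
print(names)
```

One trade-off of this approach is that the disabled test does not show up as skipped in the report, so it is easy to forget; restoring the `test_` prefix re-enables it.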

stress-tess self-assigned this Aug 10, 2022

stress-tess changed the title ~~Lower threshold of test_randint_randomness~~ Temporarily change test_randint_randomness not to run Aug 10, 2022

stress-tess mentioned this issue Aug 10, 2022

Closes #1672: Temporarily change test_randint_randomness not to run #1673

Merged

Ethan-DeBandi99 closed this as completed in #1673 Aug 10, 2022

stress-tess mentioned this issue Aug 10, 2022

Integrate randtest for validation of ak.randint #1665

Closed
