Use biased_coin in FeatureFlags #2485

DRMacIver · 2020-07-11T09:11:56Z

Part of why #2483 was so fiddly is that our feature flags implementation turns out to shrink very badly. On the one hand this was good, in that it revealed problems with the pass that were useful to catch; on the other hand it'd be nicer if it just didn't shrink badly.

The problem is basically that it leaves high bytes in the choice sequence as an intrinsic feature, so the shrinker potentially has to do a lot of work repeatedly checking if it can lower those. Ideally all bytes would be 0 or 1 after a successful shrink. This is easy to achieve by swapping over the implementation to biased_coin, which already has optimisations for this.

To enable this more elegantly, I added a feature to biased_coin which allows you to force the value it returns. Once I'd done that, it made sense to make use of it in the many utils class, which essentially faked that feature badly on its own.

Zac-HD · 2020-07-11T09:34:24Z

This looks good to me in itself, but after staring at the code for a while I think we get the wrong distribution when p < 1/256, because drawing [1, 0] is interpreted as True when it ought to consider both bytes and return False.

DRMacIver · 2020-07-11T11:15:05Z

> This looks good to me in itself, but after staring at the code for a while I think we get the wrong distribution when p < 1/256, because drawing [1, 0] is interpreted as True when it ought to consider both bytes and return False.

I don't think this is true.

In [1]: from hypothesis.internal.conjecture.data import ConjectureData

In [2]: import hypothesis.internal.conjecture.utils as cu

In [3]: cu.biased_coin(ConjectureData.for_buffer([1, 0]), 1.0 / 257)
Out[3]: False

Zac-HD · 2020-07-11T13:27:21Z

You're right, but in this case don't we need to write [255, 1] for a 1/257 forced-True? The current implementation looks like it would write [1] and that gets parsed as False.

DRMacIver · 2020-07-11T13:54:15Z

> You're right, but in this case don't we need to write [255, 1] for a 1/257 forced-True? The current implementation looks like it would write [1] and that gets parsed as False.

Hmm. You're right that it does. I thought it wasn't supposed to and it was always choosing the bit length large enough to avoid doing that. I'll add some tests and fix that. Thanks, good catch!


DRMacIver · 2020-07-11T16:22:05Z

> Hmm. You're right that it does. I thought it wasn't supposed to and it was always choosing the bit length large enough to avoid doing that. I'll add some tests and fix that. Thanks, good catch!

It wasn't doing that. Apparently it was reducing the bit length where it could but never raising it. I've now modified the implementation of biased_coin to pick a bit width that works for its initial choice of p to allow 0 to always be False and 1 to always be True and then stick with it.
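A minimal sketch of the idea described above, as a toy model rather than the actual biased_coin code. The names pick_bits, decode_coin and forced_value are hypothetical, and the sketch ignores the leftover probability mass that a real implementation would need to handle, so the resulting probabilities are only approximate. It only illustrates choosing a bit width once from p so that a drawn 0 always means False and a drawn 1 always means True:

```python
import math


def pick_bits(p: float) -> int:
    """Pick a width (at least 8 bits) wide enough that both outcomes
    get at least one value in range(2 ** bits). Hypothetical helper."""
    assert 0 < p < 1
    return max(8, math.ceil(-math.log2(min(p, 1 - p))))


def decode_coin(value: int, p: float) -> bool:
    """Interpret a drawn integer as a coin flip with roughly probability p.

    Toy model: 0 is always False, 1 is always True, and any remaining
    True values sit at the top of the range.
    """
    bits = pick_bits(p)
    size = 2 ** bits
    assert 0 <= value < size
    # Number of values that mean True, clamped so each outcome keeps one value.
    truthy = min(size - 1, max(1, math.floor(size * p)))
    if value == 0:
        return False
    if value == 1:
        return True
    return value > size - truthy


def forced_value(result: bool) -> int:
    """The value a forced draw would record: always 1 for True, 0 for False."""
    return 1 if result else 0
```

With p = 1/257 this toy picks 9 bits, so a forced True can be recorded as the single value 1 and still decodes as True, which is the property the fix is after.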

Zac-HD

A nice set of changes for clarity as well as performance 🙂

One comment below, but I don't need to review it again.

hypothesis-python/src/hypothesis/internal/conjecture/utils.py


DRMacIver · 2020-07-12T11:01:26Z

It looks like this made the shrink quality tests slightly flaky, so I added another patch to biased_coin so that it can take advantage of the remove_discarded functionality to instantly turn all biased coins into a 0 or 1 during shrinking. You might want to review that last commit before I merge?

Zac-HD · 2020-07-12T11:08:50Z

I was thinking that ConjectureData.draw_bits could also have *, forced=None, but that's not a big deal.

Final commit looks good, I'm a big fan of remove_discarded 😁
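The `*, forced=None` suggestion above can be illustrated with a small stand-in class. ToyData is hypothetical and is not ConjectureData; it only shows what a keyword-only forced argument looks like and how it behaves when passed positionally:

```python
import random


class ToyData:
    """Hypothetical recorder of drawn values, for illustration only."""

    def __init__(self):
        self.choices = []

    def draw_bits(self, n, *, forced=None):
        # The bare * makes forced keyword-only: draw_bits(8, 1) is a TypeError.
        if forced is not None:
            assert 0 <= forced < 2 ** n
            value = forced
        else:
            value = random.getrandbits(n)
        self.choices.append(value)
        return value
```

Calling `ToyData().draw_bits(8, forced=1)` records and returns 1, while `ToyData().draw_bits(8, 1)` raises TypeError, so call sites have to spell out that they are forcing a value.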

DRMacIver · 2020-07-12T11:15:34Z

> I was thinking that ConjectureData.draw_bits could also have *, forced=None, but that's not a big deal.

Oh, sure. Good idea.

DRMacIver · 2020-07-12T11:36:31Z

> Oh, sure. Good idea.

Looks like this breaks some tests, and this PR has scope creeped enough as it is, so I'm not going to do this now.

Zac-HD · 2020-07-12T12:10:36Z

I'll open an issue then, it's a nice small one for the SciPy sprint today 😄

DRMacIver added 3 commits July 11, 2020 10:05

Use biased_coin in FeatureFlags

e2830fa

Make use of new forced flag in many

840ac76

Update release a bit

dc78116

Pick a number of bits for the biased coin that allows us to always have 1 as True and 0 as False

f58f86e

Zac-HD approved these changes Jul 12, 2020


DRMacIver added 2 commits July 12, 2020 10:23

Use keyword only args

e305178

Always write a 0 or 1 into the choice sequence when the last loop iteration isn't already <= 1

8f752f0

DRMacIver force-pushed the DRMacIver/shrinker-friendly-flags branch from de2d98d to 8f752f0 Compare July 12, 2020 11:36

Zac-HD mentioned this pull request Jul 12, 2020

Make forced a keyword-only argument in ConjectureData.draw_bits #2487

Closed

DRMacIver merged commit 119893d into master Jul 12, 2020

DRMacIver deleted the DRMacIver/shrinker-friendly-flags branch July 12, 2020 12:20
