Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use biased_coin in FeatureFlags #2485

Merged
merged 6 commits into from
Jul 12, 2020
Merged

Conversation

DRMacIver
Copy link
Member

Part of why #2483 was so fiddly was that it turns out that our feature flags implementation shrinks very badly. On the one hand this was good in that it revealed problems with the pass that it was useful to catch, on the other hand it'd be nicer if it just didn't shrink badly.

The problem is basically that it leaves high bytes in the choice sequence as an intrinsic feature, so the shrinker potentially has to do a lot of work repeatedly checking if it can lower those. Ideally all bytes would be 0 or 1 after a successful shrink. This is easy to achieve by swapping over the implementation to biased_coin, which already has optimisations for this.

In order to enable this more elegantly this caused me to add a feature to biased_coin which allows you to force the value it returns. Once I'd done that it made sense to make use of that in the many utils class which essentially faked that feature badly on its own.

@Zac-HD
Copy link
Member

Zac-HD commented Jul 11, 2020

This looks good to me in itself, but after starting at the code for a while I think we get the wrong distribution when p < 1/256 because drawing [1, 0] is interpreted as True when it ought to consider both bytes and return False.

@DRMacIver
Copy link
Member Author

This looks good to me in itself, but after starting at the code for a while I think we get the wrong distribution when p < 1/256 because drawing [1, 0] is interpreted as True when it ought to consider both bytes and return False.

I don't think this is true.

In [1]: from hypothesis.internal.conjecture.data import ConjectureData

In [2]: import hypothesis.internal.conjecture.utils as cu

In [3]: cu.biased_coin(ConjectureData.for_buffer([1, 0]), 1.0 / 257)
Out[3]: False

@Zac-HD
Copy link
Member

Zac-HD commented Jul 11, 2020

You're right, but in this case don't we need to write [255, 1] for a 1/257 forced-True? The current implementation looks like it would write [1] and that gets parsed as False.

@DRMacIver
Copy link
Member Author

You're right, but in this case don't we need to write [255, 1] for a 1/257 forced-True? The current implementation looks like it would write [1] and that gets parsed as False.

Hmm. You're right that it does. I thought it wasn't supposed to and it was always choosing the bit length large enough to avoid doing that. I'll add some tests and fix that. Thanks, good catch!

@DRMacIver
Copy link
Member Author

Hmm. You're right that it does. I thought it wasn't supposed to and it was always choosing the bit length large enough to avoid doing that. I'll add some tests and fix that. Thanks, good catch!

It wasn't doing that. Apparently it was reducing the bit length where it could but never raising it. I've now modified the implementation of biased_coin to pick a bit width that works for its initial choice of p to allow 0 to always be False and 1 to always be True and then stick with it.

Copy link
Member

@Zac-HD Zac-HD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nice set of changes for clarity as well as performance 🙂

One comment below, but I don't need to review it again.

@DRMacIver
Copy link
Member Author

It looks like this made the shrink quality tests slightly flaky, so I added another patch to biased_coin so that it can take advantage of the remove_discarded functionality to instantly turn all biased coins into a 0 or 1 during shrinking. You might want to review that last commit before I merge?

@Zac-HD
Copy link
Member

Zac-HD commented Jul 12, 2020

I was thinking that ConjectureData.draw_bits could also have *, forced=None, but that's not a big deal.

Final commit looks good, I'm a big fan of remove_discarded 😁

@DRMacIver
Copy link
Member Author

I was thinking that ConjectureData.draw_bits could also have *, forced=None, but that's not a big deal.

Oh, sure. Good idea.

@DRMacIver DRMacIver force-pushed the DRMacIver/shrinker-friendly-flags branch from de2d98d to 8f752f0 Compare July 12, 2020 11:36
@DRMacIver
Copy link
Member Author

Oh, sure. Good idea.

Looks like this breaks some tests, and this PR has scope creeped enough as it is, so I'm not going to do this now.

@Zac-HD
Copy link
Member

Zac-HD commented Jul 12, 2020

I'll open an issue then, it's a nice small one for the SciPy sprint today 😄

@DRMacIver DRMacIver merged commit 119893d into master Jul 12, 2020
@DRMacIver DRMacIver deleted the DRMacIver/shrinker-friendly-flags branch July 12, 2020 12:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants