
Improve shrink ordering of text #2482

Merged (6 commits) on Jul 26, 2020

Conversation

@DRMacIver (Member) commented Jul 9, 2020:

OK, so bear with me: this is a lot of machinery for what is really a very small actual change, but the machinery is the main point, and the actual change is just a nice hook to hang it off.

User visible change: Characters now shrink in a more sensible order.

This showed that we didn't really bias towards small values very well, which caused some of the new tests to fail, so I also added a strong bias toward the low end of the range for characters.
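
For intuition, here is a minimal sketch of that kind of low-end bias (illustrative only: the function and the `p_small`/`window` parameters are made up, and this is not the actual OneCharacterStringStrategy code):

```python
import random

def draw_codepoint(low, high, p_small=0.75, window=128):
    """Hypothetical sketch: draw a codepoint from [low, high] with a
    strong bias toward the low end, so typical draws are already small
    and shrinking has less distance to cover."""
    if random.random() < p_small:
        # Most draws come from a small window at the bottom of the range.
        return random.randint(low, min(high, low + window))
    # The rest come from the full range, to keep coverage.
    return random.randint(low, high)
```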

Actually interesting change:

  1. We can now test what the actual smallest values of a strategy are by enumerating its values in shortlex-ascending order (only available in testing at present; the ordering is sketched just after this list).
  2. We've acquired a bunch of machinery for the inference of regular languages using the L* algorithm, which is used to implement (1).
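
To make "shortlex-ascending" concrete, here is a brute-force sketch of the ordering itself (the real machinery enumerates via a learned DFA rather than generating every string; the names here are illustrative):

```python
from itertools import count, islice, product

def shortlex(alphabet):
    """Yield all strings over `alphabet` in shortlex-ascending order:
    shorter strings first, ties broken lexicographically."""
    symbols = sorted(alphabet)
    for n in count():
        for chars in product(symbols, repeat=n):
            yield "".join(chars)

print(list(islice(shortlex("ab"), 7)))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```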

Currently we are not using L* nearly enough to justify its inclusion in the code base, but the reason I have this implementation at all is that it's part of a new and exciting plan. That plan is only 80% baked, but it led me to notice the problem with characters shrinking badly, which led to this test and this patch, and splitting that out seemed like a convenient opportunity to reduce the review burden. Once this is merged and the plan gets fully baked, we're going to start using L* to (offline) learn automata for shrinking. This will let us automatically (!!) take any user example that shrinks badly and learn a new shrink pass that can handle it.

I've also done something I've been meaning to do for a while, which is to add an (internal use only) flag to ConjectureRunner to disable all of the limits we have, as they're super annoying for a lot of use cases that use the runner directly (as is the case in these tests).

Suggested reviewing order:

  1. Look at the actual change to OneCharacterStringStrategy
  2. Look at the testing of it in tests/quality/test_shrinking_order.py
  3. Look at the L* implementation

I've tried to make (3) as clear as I can, but you might find it helpful to watch my PyCon UK talk about the L* algorithm and/or read some of the papers if you're very keen. Feel free to ask me lots of questions if any of it doesn't make sense.

@DRMacIver DRMacIver requested a review from Zalathar July 9, 2020 18:26
@DRMacIver DRMacIver force-pushed the DRMacIver/shrink-ordering-tests branch 2 times, most recently from f0b5d99 to 5c8eb76 on July 9, 2020 at 20:13
@Zac-HD (Member) commented Jul 10, 2020:

This seems relevant to our issues on unicode-category-based text generation, #1401 and #1621. That approach was much better at generating things like non-NFC strings (even before adding swarm testing), but had problems with shrinking quickly, or shrinking to the minimal example.

I'm not suggesting that this PR needs to include such changes, but would appreciate your thoughts on whether it would make reviving them practical - or force us to find an alternative approach!

@DRMacIver (Member, Author) replied:

> I'm not suggesting that this PR needs to include such changes, but would appreciate your thoughts on whether it would make reviving them practical - or force us to find an alternative approach!

I don't think this PR will have much impact on that, one way or the other. However, watch this space for future PRs, which may give us magic pixie dust that we can just sprinkle on such problems to make them go away, by telling Hypothesis to do its own damn work in figuring out how to shrink such things without us having to hold its hand. 😁

@DRMacIver DRMacIver force-pushed the DRMacIver/shrink-ordering-tests branch from 27ce2c2 to a8945a7 on July 12, 2020 at 12:57
@DRMacIver (Member, Author) commented:

Huh, I'm a little surprised that this passes without #2483 now; I guess the changes to biased_coin made enough of a difference on their own.

@Zalathar (Contributor) commented:

I think I have a decent handle on L* now, and I’ve been mapping it back to what our implementation is actually doing.

A few things I’ve noticed/inferred:

  • We assume that our alphabet is unsigned byte values, and that our string classifications are booleans.

  • We aren’t necessarily trying to exhaustively learn the underlying DFA, just a good-enough approximation that we can incrementally update as we stumble across organic counterexamples.

  • Our learner doesn’t persistently store the state set or state-experiment table. Instead we hand the experiment list and oracle function to an ExperimentDFA, which lazily re-derives the states and transitions as it encounters them. We discard this every time we learn a new experiment. (A rough sketch of this shape follows below.)
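
To check my understanding, here is a rough sketch of that shape (illustrative only; the class name and details are mine, not the actual ExperimentDFA):

```python
class LazyExperimentDFA:
    """Sketch of an L*-style DFA derived lazily from a membership oracle
    plus a list of distinguishing experiments (suffixes). States are
    identified by their "row" of oracle answers, and nothing is
    precomputed, so rebuilding after adding an experiment only revisits
    the states we actually touch."""

    def __init__(self, oracle, experiments):
        self.oracle = oracle            # bytes -> bool, the black box
        self.experiments = experiments  # list[bytes], with b"" included
        self._canon = {}                # row -> canonical access string

    def _state(self, s):
        row = tuple(self.oracle(s + e) for e in self.experiments)
        # The first access string seen with a given row becomes the
        # canonical representative of that state.
        return self._canon.setdefault(row, s)

    def accepts(self, s):
        state = self._state(b"")
        for byte in s:
            # Transitions are derived on first use, not up front.
            state = self._state(state + bytes([byte]))
        # With b"" among the experiments, a state's classification is
        # the oracle's answer on its canonical representative.
        return self.oracle(state)

# e.g. learning "even number of zero bytes" with the single experiment b"":
dfa = LazyExperimentDFA(lambda s: s.count(0) % 2 == 0, [b""])
assert dfa.accepts(bytes([0, 0])) and not dfa.accepts(bytes([0]))
```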

@DRMacIver (Member, Author) replied:

> We assume that our alphabet is unsigned byte values, and that our string classifications are booleans.

Yup. We could potentially lift either of these restrictions relatively easily as needed (the latter is more likely to be needed than the former).

> We aren’t necessarily trying to exhaustively learn the underlying DFA, just a good-enough approximation that we can incrementally update as we stumble across organic counterexamples.

Correct. And we can't really hope to do better than that: the structure of the DFA can be basically arbitrarily weird, and there's no way to discover that without opening the black box of the predicate. For example, you could have a predicate that only matches a single ridiculously long string, and there's no way to tell that it is different from the DFA that matches nothing without knowing what that string is.
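
A concrete (made-up) illustration of that point:

```python
import secrets

SECRET = secrets.token_bytes(64)

def predicate(s: bytes) -> bool:
    # Matches exactly one 64-byte string. To a learner limited to
    # membership queries, this is indistinguishable from the empty
    # language unless it happens to query SECRET itself, which it has
    # no realistic way of doing.
    return s == SECRET
```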

> Our learner doesn’t persistently store the state set or state-experiment table. Instead we hand the experiment list and oracle function to an ExperimentDFA which lazily re-derives all the states and transitions as it encounters them. We discard this every time we learn a new experiment.

Right. This is a modification of classic L*, but I think it's a good one: it means we only ever recalculate the bit of the state space we actually need (when learning, only the states we can reach on a given string; when enumerating, only the states reachable by relatively short strings).

If we retained the states every time we added an experiment, we would have to recalculate the rows of each of the previously known states, which is potentially quite expensive. By discarding those and creating a fresh DFA we only need to revisit the states we actually use.

@DRMacIver (Member, Author) commented:

As well as addressing the comments, I also realised that the way learn was structured missed a fast path: it was making O(log(n)) membership calls in a common case where O(1) would do (namely, when the string is already correctly predicted). I've added a check for that, plus a test.
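
Roughly, the fast path looks like this (a hedged sketch with hypothetical names; `refine` stands in for the O(log(n))-membership-call binary search that extracts a new experiment from a counterexample):

```python
def learn(dfa, oracle, string, refine):
    # Fast path: a single membership call. If the learned DFA already
    # agrees with the oracle on this string, there is nothing to learn.
    if dfa.accepts(string) == oracle(string):
        return
    # Slow path: the string is a genuine counterexample, so pay the
    # O(log(n)) cost of locating a new distinguishing experiment.
    refine(string)
```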

@DRMacIver DRMacIver force-pushed the DRMacIver/shrink-ordering-tests branch 2 times, most recently from 30b9314 to b541052 on July 21, 2020 at 16:04
@Zalathar (Contributor) commented:

I've looked through everything one or more times and I'm pretty happy with it; realistically, I don't think I have much else to add.

So with a rebase I think this will be in a fine position to land.

@DRMacIver DRMacIver force-pushed the DRMacIver/shrink-ordering-tests branch from a034998 to 96a815c on July 26, 2020 at 11:08
@DRMacIver DRMacIver merged commit 87b10e6 into master Jul 26, 2020
@DRMacIver DRMacIver deleted the DRMacIver/shrink-ordering-tests branch July 26, 2020 16:45