Improve shrink ordering of text #2482
Conversation
Force-pushed from f0b5d99 to 5c8eb76.
This seems relevant to our issues on unicode-category-based text generation, #1401 and #1621 - an approach that was much better at generating things like non-NFC strings (even before adding swarm testing), but had problems with shrinking quickly or shrinking to the minimal example. I'm not suggesting that this PR needs to include such changes, but I'd appreciate your thoughts on whether it would make reviving them practical - or force us to find an alternative approach!
This PR will not have much impact on that one way or the other, I think. However, watch this space for future PRs which may give us magic pixie dust that we can just sprinkle on such problems to make them go away, by telling Hypothesis to do its own damn work in figuring out how to shrink such things without us having to hold its hand. 😁
Force-pushed from 27ce2c2 to a8945a7.
Huh, I'm a little surprised that this passes without #2483 now - I guess the changes to …
I think I have a decent handle on L* now, and I've been mapping it back to what our implementation is actually doing. A few things I've noticed/inferred: …
Yup. We could potentially lift either of these restrictions relatively easily as needed (the latter is more likely to be needed than the former).
Correct. And we can't really hope to do better than that - the structure of the DFA can be basically arbitrarily weird, and there's no way to discover that without opening the black box of the predicate. E.g. you could have a predicate that only matches a single ridiculously long string, and there's no way to tell that it is different from a DFA that matches nothing without knowing what that string is.
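To make that concrete, here's a minimal illustration (the names and sizes are invented for the example): the two predicates below are indistinguishable under black-box membership queries unless you happen to guess the secret string.

```python
import secrets

# Hypothetical adversarial case: a predicate that matches exactly one
# very long secret string. A learner that can only ask "is this string
# in the language?" will, in practice, never distinguish it from the
# predicate that matches nothing.
SECRET = secrets.token_hex(512)  # a 1024-character needle

def matches_one_long_string(s):
    return s == SECRET

def matches_nothing(s):
    return False
```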
Right. This is a modification of the classic L*, but I think it's a good one. The reason is that it means we only need to recalculate the part of the state space we actually use: when learning, that's just the set of states we can reach on a given string; when enumerating, it's just the states reachable by relatively short strings. If we retained the states every time we added an experiment, we would have to recalculate the rows of every previously known state, which is potentially quite expensive. By discarding them and creating a fresh DFA, we only need to revisit the states we actually use.
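A minimal sketch of that "discard and rebuild lazily" idea, with invented names (this is not the actual `lstar.py` code): a state is just the tuple of answers the predicate gives for each experiment, rows are memoised on demand, and adding an experiment means starting over with a fresh DFA rather than patching every known row.

```python
from functools import lru_cache

class LazyDFA:
    """Illustrative lazily-computed DFA over a black-box predicate."""

    def __init__(self, member, experiments):
        self.member = member
        self.experiments = tuple(experiments)
        # Rows are only computed for states we actually visit, so states
        # that are never reached (or are only reachable by long strings)
        # cost nothing.
        self.row = lru_cache(maxsize=None)(self._row)

    def _row(self, string):
        # The state reached on `string`, identified by how it answers
        # each experiment.
        return tuple(self.member(string + e) for e in self.experiments)

    def with_experiment(self, suffix):
        # Rather than recalculating every previously known row to include
        # the new experiment, throw everything away; only the states we
        # revisit afterwards get recomputed.
        return LazyDFA(self.member, self.experiments + (suffix,))
```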
Review thread on hypothesis-python/src/hypothesis/strategies/_internal/strings.py (outdated; resolved).
As well as addressing the comments, I also had a realisation that the way …
Review thread on hypothesis-python/src/hypothesis/internal/conjecture/dfa/lstar.py (outdated; resolved).
Review thread on hypothesis-python/src/hypothesis/internal/conjecture/dfa/__init__.py (outdated; resolved).
Force-pushed from 30b9314 to b541052.
I've looked through everything one or more times and I'm pretty happy with it; realistically, I don't think I have much else to add. With a rebase, I think this will be in a fine position to land.
Force-pushed from a034998 to 96a815c.
OK, so bear with me - this is a lot of machinery for what is really a very small actual change, but the machinery is the main point and the change itself is just a nice hook to hang it off.
User-visible change: characters now shrink in a more sensible order.

This showed that we didn't really bias towards small values very well, which caused some tests to fail, so I also added a strong bias toward the low end of the range for characters.
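As a hedged illustration of that kind of bias (the real logic lives in `OneCharacterStringStrategy`; this function and its constants are made up):

```python
import random

def draw_codepoint(rng: random.Random, min_cp: int, max_cp: int) -> int:
    # Illustrative only: most draws come from a small window at the
    # bottom of the allowed range, so "simple" characters dominate and
    # shrinking rarely has far to travel, while the full range stays
    # reachable for coverage.
    if rng.random() < 0.8:
        return rng.randint(min_cp, min(max_cp, min_cp + 127))
    return rng.randint(min_cp, max_cp)
```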
Actually interesting change:
Currently we are not using L* nearly enough to justify its inclusion in the code base, but the reason I have this implementation at all is as part of a new and exciting plan. That plan is only 80% baked, but it led me to notice the problem with `characters` shrinking badly, which led to this test and this patch, and that was a convenient opportunity to reduce the review burden. Once this is merged and that plan gets fully baked, we're going to start using `L*` to (offline) learn automata for shrinking. This will let us automatically (!!) take any user example that shrinks badly and learn a new shrink pass that can handle it.

I've also done something I've been meaning to do for a while, which is to add an (internal use only) flag to `ConjectureRunner` to disable all of the limits we have, as they're super annoying for a lot of use cases that use the runner directly (as is the case in these tests).

Suggested reviewing order:

1. `OneCharacterStringStrategy`
2. `tests/quality/test_shrinking_order.py`
3. The `L*` implementation

I've tried to make (3) as clear as I can, but you might find it helpful to watch my PyCon UK talk about the `L*` algorithm and/or read some of the papers if you're very keen. Feel free to ask me lots of questions if any of it doesn't make sense.
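For orientation, here is a compact, self-contained sketch of the L* loop - illustrative only, not the actual `lstar.py` implementation, which differs in detail (e.g. in how it processes counterexamples). It assumes the target predicate is regular, and it stands in for the equivalence oracle by exhaustively checking short strings.

```python
from itertools import product

def learn_dfa(member, alphabet, max_check_len=6):
    # Distinguishing suffixes; "" asks "is this string itself accepted?"
    experiments = [""]

    def row(s):
        # The learner's notion of a state: how `s` answers each experiment.
        return tuple(member(s + e) for e in experiments)

    def build_dfa():
        # Build a fresh hypothesis DFA by exploring the rows reachable
        # from the empty string (the "discard and rebuild" step).
        start = row("")
        access = {start: ""}  # one representative string per state
        transitions = {}
        frontier = [start]
        while frontier:
            state = frontier.pop()
            rep = access[state]
            transitions[state] = {}
            for a in alphabet:
                target = row(rep + a)
                if target not in access:
                    access[target] = rep + a
                    frontier.append(target)
                transitions[state][a] = target
        accepting = {s for s in access if s[0]}  # experiments[0] == ""
        return start, transitions, accepting

    def accepts(dfa, s):
        start, transitions, accepting = dfa
        state = start
        for ch in s:
            state = transitions[state][ch]
        return state in accepting

    while True:
        dfa = build_dfa()
        # Stand-in equivalence oracle: search short strings for a mismatch.
        counterexample = None
        for n in range(max_check_len + 1):
            for chars in product(alphabet, repeat=n):
                s = "".join(chars)
                if member(s) != accepts(dfa, s):
                    counterexample = s
                    break
            if counterexample is not None:
                break
        if counterexample is None:
            return dfa
        # Add every suffix of the counterexample as an experiment (the
        # Maler-Pnueli variant of L*); the rebuilt DFA is then guaranteed
        # to classify this counterexample correctly.
        for i in range(len(counterexample) + 1):
            suffix = counterexample[i:]
            if suffix not in experiments:
                experiments.append(suffix)
```

For example, `learn_dfa(lambda s: s.count("1") % 2 == 0, "01")` converges on the two-state DFA for "even number of 1s". The loop terminates because each counterexample contributes at least one new experiment and only finitely many suffixes of short strings exist.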