Improve shrink ordering of text #2482

DRMacIver · 2020-07-09T18:26:19Z

OK, so bear with me, this is a lot of machinery for what is really a very small actual change, but actually the machinery is the main point and the actual change is just a nice hook to hang it off.

User visible change: Characters now shrink in a more sensible order.

This showed that we didn't really bias towards small values very well, so it caused some tests to pass, so I also added a strong bias toward the low end of the range for characters.

Actually interesting change:

We now can test what the actual smallest values of a strategy are by enumerating a strategy's values in shortlex-ascending order (only available in testing at present).
We've acquired a bunch of machinery for the inference of regular languages using the L* algorithm, which is used to implement (1).

Currently we are not using L* nearly enough to justify its inclusion in the code base, but the reason I have this implementation at all is actually as part of a new and exciting plan. That plan is only 80% baked but lead me to notice the problem with characters shrinking badly, which lead to this test and this patch, and that was a convenient opportunity to reduce the review burden. Once this is merged and that plan gets fully baked, we're going to start using L* to (offline) learn automata for shrinking. This will let us automatically (!!) take any user example that shrinks badly and learn a new shrink pass that can handle it.

I've also done something I've been meaning to do for a while, which is to add an (internal use only) flag to ConjectureRunner to disable all of the limits we have, as they're super annoying for a lot of use cases that use the runner directly (as is the case in these tests).

Suggested reviewing order:

Look at the actual change to OneCharacterStringStrategy
Look at the testing of it in tests/quality/test_shrinking_order.py
Look at the L* implementation

I've tried to make (3) as clear as I can, but you might find it helpful to watch my PyCon UK talk about the L* algorithm and/or read some of the papers if you're very keen. Feel free to ask me lots of questions if any of it doesn't make sense.

Zac-HD · 2020-07-10T09:51:08Z

This seems relevant to our issues on unicode-category-based text generation, #1401 and #1621, which was much better at generating things like non-NFC strings (even before adding swarm testing) but had problems with shrinking quickly or to the minimal example.

I'm not suggesting that this PR needs to include such changes, but would appreciate your thoughts on whether it would make reviving them practical - or force us to find an alternative approach!

DRMacIver · 2020-07-10T09:57:03Z

I'm not suggesting that this PR needs to include such changes, but would appreciate your thoughts on whether it would make reviving them practical - or force us to find an alternative approach!

This PR will not have much impact on that one way or the other I think. However watch this space for future PRs which may give us magic pixie dust that we can just sprinkle on such problems to make them go away by telling Hypothesis to do its own damn work in figuring out how to shrink such things without us having to hold its hand. 😁

DRMacIver · 2020-07-12T19:55:36Z

Huh I'm a little surprised that this passes without #2483 now - I guess the changes to biased_coin made enough of a difference on their own.

Zalathar · 2020-07-13T03:36:02Z

I think I have a decent handle on L* now, and I’ve been mapping it back to what our implementation is actually doing.

A few things I’ve noticed/inferred:

We assume that our alphabet is unsigned byte values, and that our string classifications are booleans.
We aren’t necessarily trying to exhaustively learn the underlying DFA, just a good-enough approximation that we can incrementally update as we stumble across organic counterexamples.
Our learner doesn’t persistently store the state set or state-experiment table. Instead we hand the experiment list and oracle function to an ExperimentDFA which lazily re-derives all the states and transitions as it encounters them. We discard this every time we learn a new experiment.

DRMacIver · 2020-07-13T09:01:39Z

We assume that our alphabet is unsigned byte values, and that our string classifications are booleans.

Yup. We could potentially lift either of these restrictions relatively easily as needed (the latter is more likely to be needed than the former).

We aren’t necessarily trying to exhaustively learn the underlying DFA, just a good-enough approximation that we can incrementally update as we stumble across organic counterexamples.

Correct. And we can't really hope to do better than that - the structure of the DFA can be basically arbitrarily weird, and there's no way to discover that without opening the black box of the predicate. e.g. you could have a predicate that only matches a single ridiculously long string, and there's no way to tell that is different from the DFA that matches nothing without knowing what that string is.

Our learner doesn’t persistently store the state set or state-experiment table. Instead we hand the experiment list and oracle function to an ExperimentDFA which lazily re-derives all the states and transitions as it encounters them. We discard this every time we learn a new experiment

Right. This is a modification from the classic L* but I think it's a good one: The reason for this is that it means we only need to recalculate the bit of the state space we actually need (which, when learning is only the set of states we can reach on a string, and when enumerating is only the states reachable by relatively short strings).

If we retained the states every time we added an experiment, we would have to recalculate the rows of each of the previously known states, which is potentially quite expensive. By discarding those and creating a fresh DFA we only need to revisit the states we actually use.

hypothesis-python/src/hypothesis/internal/conjecture/dfa/lstar.py

hypothesis-python/src/hypothesis/strategies/_internal/strings.py

hypothesis-python/src/hypothesis/internal/conjecture/junkdrawer.py

DRMacIver · 2020-07-14T11:58:45Z

As well as addressing the comments I also had a realisation that the way learn was structured missed a fast path where it would do O(log(n)) membership calls in a common case where it could do O(1) (which is when the string is already correctly predicted) so I've added in a test and a check for that.

hypothesis-python/src/hypothesis/internal/conjecture/dfa/lstar.py

hypothesis-python/src/hypothesis/internal/conjecture/dfa/__init__.py

hypothesis-python/tests/quality/test_shrinking_order.py

Zalathar · 2020-07-26T10:34:26Z

I've looked through everything one or more times and been pretty happy with it, and realistically I don't think I have much else to add.

So with a rebase I think this will be in a fine position to land.

DRMacIver requested a review from Zalathar July 9, 2020 18:26

DRMacIver force-pushed the DRMacIver/shrink-ordering-tests branch 2 times, most recently from f0b5d99 to 5c8eb76 Compare July 9, 2020 20:13

DRMacIver mentioned this pull request Jul 10, 2020

Add shrink pass for pairs of bytes #2483

Closed

DRMacIver force-pushed the DRMacIver/shrink-ordering-tests branch from 27ce2c2 to a8945a7 Compare July 12, 2020 12:57

Zalathar reviewed Jul 13, 2020

View reviewed changes

hypothesis-python/src/hypothesis/internal/conjecture/dfa/lstar.py Show resolved Hide resolved

Zalathar reviewed Jul 13, 2020

View reviewed changes

hypothesis-python/src/hypothesis/strategies/_internal/strings.py Outdated Show resolved Hide resolved

Zalathar reviewed Jul 13, 2020

View reviewed changes

hypothesis-python/src/hypothesis/internal/conjecture/junkdrawer.py Show resolved Hide resolved

Zalathar reviewed Jul 16, 2020

View reviewed changes

hypothesis-python/src/hypothesis/internal/conjecture/dfa/lstar.py Outdated Show resolved Hide resolved

Zalathar reviewed Jul 18, 2020

View reviewed changes

hypothesis-python/src/hypothesis/internal/conjecture/dfa/__init__.py Outdated Show resolved Hide resolved

Zalathar reviewed Jul 18, 2020

View reviewed changes

hypothesis-python/tests/quality/test_shrinking_order.py Outdated Show resolved Hide resolved

DRMacIver force-pushed the DRMacIver/shrink-ordering-tests branch 2 times, most recently from 30b9314 to b541052 Compare July 21, 2020 16:04

DRMacIver added 6 commits July 26, 2020 12:01

Add a flag to ConjectureRunner to suppress limits

73761fe

Move find_integer into junk drawer

c471f90

Add LStar implementation

d5405b3

Use LStar to test that things shrink in the right order

ae8e055

Fix shrinking order of strings

a049ff6

Add release file

96a815c

DRMacIver force-pushed the DRMacIver/shrink-ordering-tests branch from a034998 to 96a815c Compare July 26, 2020 11:08

DRMacIver merged commit 87b10e6 into master Jul 26, 2020

DRMacIver deleted the DRMacIver/shrink-ordering-tests branch July 26, 2020 16:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve shrink ordering of text #2482

Improve shrink ordering of text #2482

DRMacIver commented Jul 9, 2020 •

edited

Loading

Zac-HD commented Jul 10, 2020

DRMacIver commented Jul 10, 2020

DRMacIver commented Jul 12, 2020

Zalathar commented Jul 13, 2020

DRMacIver commented Jul 13, 2020

DRMacIver commented Jul 14, 2020

Zalathar commented Jul 26, 2020

Improve shrink ordering of text #2482

Improve shrink ordering of text #2482

Conversation

DRMacIver commented Jul 9, 2020 • edited Loading

Zac-HD commented Jul 10, 2020

DRMacIver commented Jul 10, 2020

DRMacIver commented Jul 12, 2020

Zalathar commented Jul 13, 2020

DRMacIver commented Jul 13, 2020

DRMacIver commented Jul 14, 2020

Zalathar commented Jul 26, 2020

DRMacIver commented Jul 9, 2020 •

edited

Loading