Note: This is part of #1093, where I encourage people to work on the shrinker. As such I will not be working on it myself and encourage you to take a look at it if it takes your fancy. Note: This one is a bit fiddly and requires experimentation. I do not recommend it as a first attempt on the shrinker, but it's probably a decent second attempt.
Hypothesis has a special case of sorts (actually, two) for things that look like the following:
Basically it will try to lower n while simultaneously deleting some data afterwards. It does this by noticing that lowering n reduces the total amount of data drawn and using this to trigger some heuristics that say that it's worth trying to simultaneously lower n and delete data.
Currently it can only delete data in two ways:
Deleting the data that immediately comes after n
Deleting one or two adjacent draw calls anywhere in the byte sequence after n.
This works surprisingly well for a lot of different things, especially in combination with the way that the reordering passes tend to move the interesting stuff to the end, so deleting right after n is often exactly the right thing to do.
Where it doesn't work is when the decisions we make after n become non-local, so that the things that need to be deleted don't line up neatly. We can sometimes get stuck and fail to reduce this well. For example, imagine the following:
Here we should always get ([1], [1], [1]) as an outcome, but (rarely) we don't.
This is actually quite difficult to reliably demonstrate with natural examples because we have some backtracking logic (escape_local_minimum) which helps get us out of messes like this. It usually works, but it's non-deterministic and thus sometimes fails.
The easiest way to demonstrate this is with a unit test. The following follows the conventions of tests/cover/test_conjecture_engine.py and uses monkeypatching to put the shrinker into the right state to run into problems:
Here we create a scenario like this and turn off the backtracking logic, so the shrinker is forced to do this the hard way. The right thing for it to do here is to lower n to 1 and delete all the zero bytes, but there is currently no way for it to discover this.
Note: The start/stop example calls allow us to wrap a boundary around each drawn section so we know what goes where. If you want to play this on hard mode it would be cool if this also worked with those removed, but I suspect it will end up much harder.
I do not know exactly what the correct solution is going to be, but I suspect something like the following:
Add an expensive shrink pass that looks for this scenario by trying each shrinking block and lowering it to its predecessor.
Do a comparison of the draws in the current shrink target to look for things that "go missing" - places where in the current shrink target some draw contains a child with some particular label and byte pattern, and in the shrunk version it doesn't.
Try to delete some values prior to those points until they no longer go missing. e.g. if it lost k draw calls of a particular label, try deleting both the first k and last k draw calls that ended up in that label in the child.
Note that this might not work and you might have to try something else! But I do think the general idea of looking for what went missing and trying to repair the draw is the way to go.
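The comparison-and-repair steps above could be sketched as plain helpers, independent of the real shrinker internals (all names and the `(label, bytes)` representation here are hypothetical, for illustration only):

```python
from collections import Counter

def missing_labels(current_draws, attempted_draws):
    """Count, per label, how many draw calls 'went missing' between
    the current shrink target and an attempted shrink.

    Each argument is a list of (label, bytes) pairs, one per draw call.
    """
    current = Counter(label for label, _ in current_draws)
    attempted = Counter(label for label, _ in attempted_draws)
    return {
        label: current[label] - attempted[label]
        for label in current
        if current[label] > attempted[label]
    }

def deletion_candidates(draws, label, k):
    """Indices of the first k and last k draws with the given label -
    the two deletion attempts suggested in the step above."""
    indices = [i for i, (lbl, _) in enumerate(draws) if lbl == label]
    return indices[:k], indices[-k:]
```

For instance, if an attempted shrink loses one "elem" draw, `missing_labels` reports `{"elem": 1}` and `deletion_candidates` points at the first and last "elem" draws as the places to try deleting.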
I'm closing this because after four years I don't expect a volunteer to take it on soon - and as we said in #1099:
At present there is literally no good reason to improve the Hypothesis shrinker except enjoyment and weird aesthetic obsessions: due to the combination of better underlying model and a truly disproportionate amount of invested effort, it is currently so good that everyone else's shrinking looks comedically bad in comparison.