-
Notifications
You must be signed in to change notification settings - Fork 588
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New core strategies: datetimes, dates, times, timedeltas #621
Conversation
cf8bfba
to
78011ca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments from an initial skim (sorry, I haven’t had time to review it properly yet).
docs/changes.rst
Outdated
3.9.0 - 2017-05-12 | ||
------------------ | ||
|
||
This is a feature release, adding datetime-related strategies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make it clear these are part of the core lib.
docs/changes.rst
Outdated
|
||
- ``times`` and ``datetimes`` take an optional ``timezones=`` argument, which | ||
defaults to ``none()`` for naive times. You can use our extra strategy | ||
based on pytz, or roll your timezones strategy with dateutil or even the stdlib. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"roll your own timezones strategies"
docs/changes.rst
Outdated
|
||
- The old ``dates``, ``times``, and ``datetimes`` strategies in | ||
``hypothesis.extra.datetimes`` are deprecated in favor of the new | ||
strategies, which are more flexible and have no dependencies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be presented as bullets? It's all part of the same concept – a new set of date-related strategies – let’s have it as top-level paragraphs instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to give a deprecation timeline?
Q for @DRMacIver – how long do you want to keep around these deprecated strategies?
src/hypothesis/extra/datetime.py
Outdated
timezones = list(pytz.all_timezones) | ||
timezones.remove(u'UTC') | ||
timezones.insert(0, u'UTC') | ||
return sampled_from(list(map(pytz.timezone, timezones))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style: I prefer list comprehensions to maps, as I think they’re more readable.
src/hypothesis/extra/datetime.py
Outdated
def timezones(): | ||
"""A strategy for pytz tzinfo objects. | ||
|
||
Essentially ``sampled_from(map(pytz.timezone, pytz.all_timezones))``, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let’s not put implementation details in the docstring, particularly because this isn’t accurate. Would be better to highlight the key difference – this strategy shrinks to UTC.
src/hypothesis/strategies.py
Outdated
|
||
There are a number of ways you can create a timezones strategy, depending | ||
mostly on your timezone objects (e.g. from the standard library, | ||
``datetutil``, or ``pytz``). In each case, you should probably use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"probably"?
When would I not want to use sampled_from
?
src/hypothesis/strategies.py
Outdated
working purely in UTC to avoid such issues. | ||
|
||
There are a number of ways you can create a timezones strategy, depending | ||
mostly on your timezone objects (e.g. from the standard library, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: drop "mostly", unless we want to give pointers to alternative reasons
src/hypothesis/strategies.py
Outdated
if not isinstance(timezones, SearchStrategy): | ||
raise InvalidArgument( | ||
'timezones=%r must be a SearchStrategy that can provide tzinfo ' | ||
'for datetimes (either None or dt.tzinfo objects)' % (timezones,)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have we suddenly wrapped a single format string argument in a tuple when you didn't above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that one obvious mistake would be to supply e.g. timezones=[None, pytz.UTC]
instead of a strategy. Above, we know that it's either None or a datetime so there's no need to handle the sequence case.
src/hypothesis/strategies.py
Outdated
return just(min_date) | ||
return datetimes(min_datetime=dt.datetime.combine(min_date, dt.time.min), | ||
max_datetime=dt.datetime.combine(max_date, dt.time.max) | ||
).map(dt.datetime.date) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code makes me uncomfortable – what if we try this on a day that didn’t start at midnight, or doesn’t finish at 23:59? The former is definitely possible.
I’d be tempted to implement this in a way that avoids letting times get involved at all:
interval_length = (max_date - min_date).days
return integers(min=0, max=interval_length).map(
lambda d: min_date + dt.timedelta(days=d)
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those scenarios are only representable with tz-aware datetimes, luckily. I agree that the delta-based strategy is nicer, but the advantage of using the datetimes strat is it minimises towards 2000-01-01 rather than min_date (usually 0001-01-01).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That can be solved relatively easily by using the center argument to integer_range
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Derp, of course I can just calculate the timedelta in days to the millennium. Yeah, that's much nicer.
tests/cover/test_datetimes.py
Outdated
from hypothesis.strategytests import strategy_test_suite | ||
from hypothesis.internal.compat import hrange | ||
|
||
# TIMEDELTAS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this comment is useful.
3474dad
to
65be6dd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, thanks! Happy with the API except for possible the one comment about None vs none(). Specific comments inline.
tests/common/utils.py
Outdated
with settings(strict=False): | ||
try: | ||
func(*args, **kwargs) | ||
except HypothesisDeprecationWarning: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem right. Surely the correct thing to do here is to never raise the error when strict=False?
Perhaps you wanted a capture_warnings call here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, but as it happens strategy_test_suite(...)
overrides it somehow. Should I just delete those tests for the deprecated strategies, and remove the try/except?
tests/cover/test_datetimes.py
Outdated
assert datetimes(val, val).example() is val | ||
|
||
|
||
TestStandardDescriptorFeatures_dates1 = strategy_test_suite(dates()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a request for changes, but I'm vaguely thinking we should just get rid of these. I'm not convinced they are testing anything useful any more - they're kinda a legacy of the old much more complicated way of writing strategies, and I'm not sure I've ever seen them catch an error in the new implementation that wasn't caught by another test.
tests/datetime/test_datetime.py
Outdated
|
||
|
||
@checks_deprecated_behaviour |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This decorator was a really good idea, thanks. It's much nicer just being able to slap it on all the old tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we merge this first, it would also be useful in #580 for a few places where you've put settings(strict=False)
.
tests/datetime/test_timezones.py
Outdated
lambda d: assume(d.tzinfo) and d.tzinfo.zone != u'UTC') | ||
|
||
|
||
@given(just('min_time') | just('max_time'), times(timezones=timezones())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just(a) | just(b)
would be better written sampled_from((a, b))
I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I'm not totally convinced this benefits from the given and might suggest just hard-coding a time value and using a parametrize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking about future changes - if UTC or a fixed offset is later allowed, I want that change to make a test fail.
self.tz_strat = timezones_strat | ||
|
||
def do_draw(self, data): | ||
# I've never seen this take eight iterations, but let's be safe... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can break in weird ways when things are being mutated and/or shrunk. It's important to remember that a ConjectureData object is not actually a random number generator and can, by design, fuck around arbitrarily with the data it generates.
The comparable logic in filter is to try 3 times and then fail which might be better to emulate here (though three is an arbitrary number and it could be 8 just as easily - filter might be expensive so it makes sense to keep it smaller).
Failing (by calling mark_invalid()
on the data object) is definitely the correct thing to do at the end of the loop rather than raising an exception.
Regardless of the number you end up settling on, here's how you might test this:
- Factor out the loop body into its own method.
- Use
strategies.binary()
+find()
to find a buffer such that whendo_draw
is called onConjectureData.for_buffer
the loop body fails. - Call
data.draw(datetimes())
on aConjectureData.for_buffer(failure_inducing * 100)
.
docs/changes.rst
Outdated
core strategies, which are more flexible and have no dependencies. | ||
|
||
The default field mapping for DateTimeField in the Django extra now respects | ||
the ``USE_TZ`` setting when choosing a strategy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth referencing #439 here?
Also, I feel like I should ask you to separate this out into its own patch release, but I won't actually insist on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current convention seems to be no issue links in the changelog; I'm happy to change that but it should happen via an entry in the review guide (or, better, part of a more detailed "making a pull" guide).
If it's a patch before, I then repatch it as part of this pull. If it's a patch after, why not include it? (I'm willing to split it out but inclined not to, given the build issues I've had lately)
src/hypothesis/extra/datetime.py
Outdated
@@ -15,111 +15,120 @@ | |||
# | |||
# END HEADER | |||
|
|||
"""This module provides time and date related strategies. | |||
|
|||
It depends on the ``pytz`` package, which is stable enough that almost any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like it would be better to put the timezones strategy into a hypothesis.extra.pytz
package and let htis one become entirely deprecated.
src/hypothesis/strategies.py
Outdated
due to daylight savings, leap seconds, timezone and calendar | ||
adjustments, etc. This is intentional, as malformed timestamps are a | ||
common source of bugs. Consider validating all datetime values, or | ||
working purely in UTC to avoid such issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does working purely in UTC actually fully avoid such issues? It's not like UTC doesn't have leap seconds.
src/hypothesis/strategies.py
Outdated
@@ -910,6 +912,108 @@ def do_draw(self, data): | |||
return PermutationStrategy() | |||
|
|||
|
|||
@defines_strategy | |||
def datetimes(min_datetime=dt.datetime.min, max_datetime=dt.datetime.max, | |||
timezones=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to have timezones
default to none()
and not allow None as a value here? That way it's always consistently a strategy object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I was trying to allow a just(min_datetime)
case, but consistency of API is probably better than implementation details.
src/hypothesis/strategies.py
Outdated
check_type(dt.datetime, max_datetime, 'max_datetime') | ||
if min_datetime.tzinfo is not None: | ||
raise InvalidArgument('min_datetime=%r must not have tzinfo' | ||
% min_datetime) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My preference is fairly strongly to always wrap % arguments in a tuple even when it's valid not to FWIW. I know it can't go wrong here, but it's better to be obviously correct than non-obviously correct.
src/hypothesis/extra/datetime.py
Outdated
|
||
""" | ||
all_timezones = [pytz.timezone(tz) for tz in pytz.all_timezones] | ||
static = [pytz.UTC] + sorted( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, one final thing: I don't quite understand this logic and would appreciate a comment here explaining what the significance of a a StaticTzInfo is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. A StaticTzInfo represents a timezone that has always had a constant offset from UTC - no daylight savings, and no other tricks like moving across the date line. IMO that makes them simpler than the others, and smaller absolute offsets simpler within that.
310ebf7
to
f0f7e87
Compare
tests/cover/test_datetimes.py
Outdated
lambda b: strat._attempt_one_draw( | ||
ConjectureData.for_buffer(b)) is None | ||
) | ||
# Should raise some kind of error? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It definitely should. I wonder why it isn't. I think the most likely explanation is that datetimes() is for some reason not exactly the same strategy as strat. Try passing strat instead of datetimes?
The other thing to check if that doesn't work is to make sure it's actually consuming the whole buffer. e.g. do a data = ConjectureData.for_buffer(failure_inducing)
, pass that to strat._attempt_one_draw, and assert afterwards that data.buffer == failure_inducing
and the result
docs/changes.rst
Outdated
|
||
This is a feature release, adding datetime-related strategies to the core strategies. | ||
|
||
``extra.datetime.timezones`` allows you to sample pytz timezones from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra.pytz.timezones
rather than extra.datetime.timezones
I think?
251dcd4
to
b20d42f
Compare
OK, rebased and generally fixed up. However, the "maybe a test" is passing when it shouldn't be, and this is the cause of failing coverage. No idea why, but it does produce the minimal |
Something weird is going on here. I'm going to have a bit of a poke into it. |
OK. I understand the problem now. The min_size parameter you're adding to I think you want to do something like: def test_DatetimeStrategy_draw_may_fail():
def is_failure_inducing(b):
try:
return strat._attempt_one_draw(
ConjectureData.for_buffer(b)) is None
except StopTest:
return False
strat = DatetimeStrategy(dt.datetime.min, dt.datetime.max, none())
failure_inducing = find(binary(), is_failure_inducing)
data = ConjectureData.for_buffer(failure_inducing * 100)
with pytest.raises(StopTest):
data.draw(strat)
assert data.status == Status.INVALID (Status and StopTest can both be imported from hypothesis.internal.conjecture.data) |
Ah, that makes sense - I got a StopTest error in the previous find because it raised, silenced that by adding the min_size, and then it couldn't raise later. |
Yeah, StopTest in this context basically gets raised to, err, stop the test when something has prevented data from being correctly drawn. It mostly happens when you run out of bytes, or when mark_invalid is called. Here you're getting the former. The problem is that having too much data here is bad as well as too little, because the goal of the test is to get things to line up perfectly so that the strategy keeps getting given the wrong data. (As a side note this test would have very occasionally failed with a StopTest before hand: The amount of data drawn by |
To include the type of the argument that was invalid, and optionally the name too.
Allowing them to exist unchanged.
And wow, is it nice not to have compatibility constraints.
7e8f4ba
to
0f23443
Compare
Largely taken from the tests for deprecated versions, because why mess with a working implementation?
Ping @DRMacIver for final review. It looks like the appveyor build simply timed out, so if you rerun that job I'm confident it will pass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 ❤️
@alexwlchan you still have an open review of this but I think @Zac-HD addressed all the issues. Do you want to do a more in depth final review or should we dismiss your review? |
Addressed comments on an early version
@alexwlchan - if you have any issues I haven't addressed, let me know; otherwise I'll merge this late tonight (UTC). [hmm, the comment above hadn't loaded. I'll wait!] |
I think it's fine to just merge if you're keen to get it through FWIW. |
While I'm very keen to get this merged, I'd rather err on the side of more feedback. Since midnight UTC is 10am local, it's no trouble for me to merge in the morning 😄 |
Released! 🥂 |
This time with feeling! Successor to #556 and #520. Closes #333, closes #352, closes #441, closes #599. (Apparently we've wanted to do something about time for a while)
599 outlines most of the design lessons I learned from 556, and has a few explanatory comments I won't repeat here.
I will note that I see the
timezones()
strategy largely as a placeholder: there are several ways I could reorder timezones, or allow them to be grouped and selected. Until I've seen it used in the wild though, I'd prefer to keep it forward-compatible and let user code shape how it evolves - especially since it's trivial to copy and modify.