New core strategies: datetimes, dates, times, timedeltas #621

Zac-HD · 2017-05-12T11:33:32Z

This time with feeling! Successor to #556 and #520. Closes #333, closes #352, closes #441, closes #599. (Apparently we've wanted to do something about time for a while.)

#599 outlines most of the design lessons I learned from #556, and has a few explanatory comments I won't repeat here.

I will note that I see the timezones() strategy largely as a placeholder: there are several ways I could reorder timezones, or allow them to be grouped and selected. Until I've seen it used in the wild though, I'd prefer to keep it forward-compatible and let user code shape how it evolves - especially since it's trivial to copy and modify.

alexwlchan

A few comments from an initial skim (sorry, I haven’t had time to review it properly yet).

alexwlchan · 2017-05-13T06:55:06Z

docs/changes.rst

+3.9.0 - 2017-05-12
+------------------
+
+This is a feature release, adding datetime-related strategies.


Let's make it clear these are part of the core lib.

alexwlchan · 2017-05-13T06:55:43Z

docs/changes.rst

+
+- ``times`` and ``datetimes`` take an optional ``timezones=`` argument, which
+  defaults to ``none()`` for naive times.  You can use our extra strategy
+  based on pytz, or roll your timezones strategy with dateutil or even the stdlib.


"roll your own timezones strategies"

alexwlchan · 2017-05-13T06:56:31Z

docs/changes.rst

+
+- The old ``dates``, ``times``, and ``datetimes`` strategies in
+  ``hypothesis.extra.datetimes`` are deprecated in favor of the new
+  strategies, which are more flexible and have no dependencies.


Does this need to be presented as bullets? It's all part of the same concept – a new set of date-related strategies – let’s have it as top-level paragraphs instead.

Do we want to give a deprecation timeline?

Q for @DRMacIver – how long do you want to keep around these deprecated strategies?

alexwlchan · 2017-05-13T06:57:23Z

src/hypothesis/extra/datetime.py

+    timezones = list(pytz.all_timezones)
+    timezones.remove(u'UTC')
+    timezones.insert(0, u'UTC')
+    return sampled_from(list(map(pytz.timezone, timezones)))


Style: I prefer list comprehensions to maps, as I think they’re more readable.

alexwlchan · 2017-05-13T06:58:02Z

src/hypothesis/extra/datetime.py

+def timezones():
+    """A strategy for pytz tzinfo objects.
+
+    Essentially ``sampled_from(map(pytz.timezone, pytz.all_timezones))``,


Let’s not put implementation details in the docstring, particularly because this isn’t accurate. Would be better to highlight the key difference – this strategy shrinks to UTC.

alexwlchan · 2017-05-13T07:11:38Z

src/hypothesis/strategies.py

+
+    There are a number of ways you can create a timezones strategy, depending
+    mostly on your timezone objects (e.g. from the standard library,
+    ``datetutil``, or ``pytz``).  In each case, you should probably use


"probably"?

When would I not want to use sampled_from?

alexwlchan · 2017-05-13T07:12:10Z

src/hypothesis/strategies.py

+        working purely in UTC to avoid such issues.
+
+    There are a number of ways you can create a timezones strategy, depending
+    mostly on your timezone objects (e.g. from the standard library,


nit: drop "mostly", unless we want to give pointers to alternative reasons

alexwlchan · 2017-05-13T07:13:07Z

src/hypothesis/strategies.py

+    if not isinstance(timezones, SearchStrategy):
+        raise InvalidArgument(
+            'timezones=%r must be a SearchStrategy that can provide tzinfo '
+            'for datetimes (either None or dt.tzinfo objects)' % (timezones,))


Why have we suddenly wrapped a single format string argument in a tuple when you didn't above?

I was thinking that one obvious mistake would be to supply e.g. timezones=[None, pytz.UTC] instead of a strategy. Above, we know that it's either None or a datetime so there's no need to handle the sequence case.

alexwlchan · 2017-05-13T07:22:16Z

src/hypothesis/strategies.py

+        return just(min_date)
+    return datetimes(min_datetime=dt.datetime.combine(min_date, dt.time.min),
+                     max_datetime=dt.datetime.combine(max_date, dt.time.max)
+                     ).map(dt.datetime.date)


This code makes me uncomfortable – what if we try this on a day that didn’t start at midnight, or doesn’t finish at 23:59? The former is definitely possible.

I’d be tempted to implement this in a way that avoids letting times get involved at all:

  interval_length = (max_date - min_date).days
  return integers(min_value=0, max_value=interval_length).map(
      lambda d: min_date + dt.timedelta(days=d)
  )

Those scenarios are only representable with tz-aware datetimes, luckily. I agree that the delta-based strategy is nicer, but the advantage of using the datetimes strat is it minimises towards 2000-01-01 rather than min_date (usually 0001-01-01).

That can be solved relatively easily by using the center argument to integer_range

Derp, of course I can just calculate the timedelta in days to the millennium. Yeah, that's much nicer.
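To make the offset idea concrete, here is a minimal stdlib-only sketch of the arithmetic, not the code that was merged. It assumes the simplest value is 2000-01-01, clamped into the allowed range; the names `CENTER`, `day_offset_bounds` and `date_from_offset` are hypothetical. An integer strategy over `[lo, hi]` that shrinks towards zero would then shrink dates towards the centre without any time-of-day being involved.

```python
import datetime as dt

# Assumed "simplest" date, following the millennium suggestion above.
CENTER = dt.date(2000, 1, 1)


def day_offset_bounds(min_date, max_date, center=CENTER):
    """Return (center, lo, hi): the clamped centre and the inclusive
    range of whole-day offsets from it that stay within the bounds."""
    center = min(max(center, min_date), max_date)
    return center, (min_date - center).days, (max_date - center).days


def date_from_offset(center, offset):
    """Map an integer day offset back to a date."""
    return center + dt.timedelta(days=offset)


center, lo, hi = day_offset_bounds(dt.date(1999, 12, 30), dt.date(2000, 1, 2))
print(center, lo, hi)
print(date_from_offset(center, 0))
```

Offset 0 always maps to the centre, so whatever minimises the integer also minimises the date.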

alexwlchan · 2017-05-13T07:23:54Z

tests/cover/test_datetimes.py

+from hypothesis.strategytests import strategy_test_suite
+from hypothesis.internal.compat import hrange
+
+# TIMEDELTAS


I don't think this comment is useful.

DRMacIver

Overall looks good, thanks! Happy with the API except for possibly the one comment about None vs none(). Specific comments inline.

DRMacIver · 2017-05-15T08:43:48Z

tests/common/utils.py

+        with settings(strict=False):
+            try:
+                func(*args, **kwargs)
+            except HypothesisDeprecationWarning:


This doesn't seem right. Surely the correct thing to do here is to never raise the error when strict=False?

Perhaps you wanted a capture_warnings call here?

Yep, but as it happens strategy_test_suite(...) overrides it somehow. Should I just delete those tests for the deprecated strategies, and remove the try/except?

DRMacIver · 2017-05-15T08:45:45Z

tests/cover/test_datetimes.py

+    assert datetimes(val, val).example() is val
+
+
+TestStandardDescriptorFeatures_dates1 = strategy_test_suite(dates())


Not a request for changes, but I'm vaguely thinking we should just get rid of these. I'm not convinced they are testing anything useful any more - they're kinda a legacy of the old much more complicated way of writing strategies, and I'm not sure I've ever seen them catch an error in the new implementation that wasn't caught by another test.

DRMacIver · 2017-05-15T08:46:55Z

tests/datetime/test_datetime.py



+@checks_deprecated_behaviour


This decorator was a really good idea, thanks. It's much nicer just being able to slap it on all the old tests.

If we merge this first, it would also be useful in #580 for a few places where you've put settings(strict=False).
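For readers who have not seen the decorator, the general pattern can be sketched with the standard library alone. This is an analogue under stated assumptions, not the `checks_deprecated_behaviour` helper from this pull: the name `expects_deprecation` is hypothetical, and it checks for the builtin `DeprecationWarning` rather than `HypothesisDeprecationWarning`.

```python
import functools
import warnings


def expects_deprecation(func):
    """Run the wrapped test and require that it emitted a deprecation warning."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with warnings.catch_warnings(record=True) as caught:
            # Record every warning, even ones already shown once elsewhere.
            warnings.simplefilter("always")
            result = func(*args, **kwargs)
        assert any(
            issubclass(w.category, DeprecationWarning) for w in caught
        ), "expected a deprecation warning"
        return result
    return wrapper


@expects_deprecation
def test_old_api():
    warnings.warn("old_api is deprecated", DeprecationWarning)
    return "ran"


print(test_old_api())
```

Recording warnings instead of turning them into errors lets the old test body run to completion, which is what makes it possible to apply the decorator to existing tests unchanged.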

DRMacIver · 2017-05-15T08:48:09Z

tests/datetime/test_timezones.py

+            lambda d: assume(d.tzinfo) and d.tzinfo.zone != u'UTC')
+
+
+@given(just('min_time') | just('max_time'), times(timezones=timezones()))


just(a) | just(b) would be better written sampled_from((a, b)) I think.

Although I'm not totally convinced this benefits from the given and might suggest just hard-coding a time value and using a parametrize.

I'm thinking about future changes - if UTC or a fixed offset is later allowed, I want that change to make a test fail.

DRMacIver · 2017-05-15T09:04:01Z

src/hypothesis/searchstrategy/datetime.py

+        self.tz_strat = timezones_strat
+
+    def do_draw(self, data):
+        # I've never seen this take eight iterations, but let's be safe...


I think this can break in weird ways when things are being mutated and/or shrunk. It's important to remember that a ConjectureData object is not actually a random number generator and can, by design, fuck around arbitrarily with the data it generates.

The comparable logic in filter is to try 3 times and then fail which might be better to emulate here (though three is an arbitrary number and it could be 8 just as easily - filter might be expensive so it makes sense to keep it smaller).

Failing (by calling mark_invalid() on the data object) is definitely the correct thing to do at the end of the loop rather than raising an exception.

Regardless of the number you end up settling on, here's how you might test this:

1. Factor out the loop body into its own method.
2. Use strategies.binary() + find() to find a buffer such that when do_draw is called on ConjectureData.for_buffer the loop body fails.
3. Call data.draw(datetimes()) on a ConjectureData.for_buffer(failure_inducing * 100).
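The bounded-retry shape being suggested might look roughly like the sketch below. It is a plain-Python model, not Hypothesis internals: `Invalid` is a hypothetical stand-in for the effect of calling `mark_invalid()`, `attempt_one_draw` is any callable that returns `None` on a failed attempt, and the limit of 3 is the arbitrary number mentioned above.

```python
class Invalid(Exception):
    """Hypothetical stand-in for what mark_invalid() does to a test case."""


MAX_ATTEMPTS = 3  # arbitrary, as noted in the review comment


def draw_with_retries(attempt_one_draw, max_attempts=MAX_ATTEMPTS):
    """Try a few times, then give up by marking the data invalid
    instead of raising an ordinary error."""
    for _ in range(max_attempts):
        result = attempt_one_draw()
        if result is not None:
            return result
    raise Invalid("no valid value after %d attempts" % (max_attempts,))


# The third attempt succeeds, so the value is returned.
attempts = iter([None, None, "2000-01-01"])
print(draw_with_retries(lambda: next(attempts)))
```

Factoring the loop body into its own callable is what makes the failure path testable: a test can feed it inputs that fail every time and check that the draw ends up invalid.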

DRMacIver · 2017-05-15T09:25:54Z

docs/changes.rst

+core strategies, which are more flexible and have no dependencies.
+
+The default field mapping for DateTimeField in the Django extra now respects
+the ``USE_TZ`` setting when choosing a strategy.


Worth referencing #439 here?

Also, I feel like I should ask you to separate this out into its own patch release, but I won't actually insist on it.

The current convention seems to be no issue links in the changelog; I'm happy to change that but it should happen via an entry in the review guide (or, better, part of a more detailed "making a pull" guide).

If it's a patch before, I then repatch it as part of this pull. If it's a patch after, why not include it? (I'm willing to split it out but inclined not to, given the build issues I've had lately)

DRMacIver · 2017-05-15T09:26:59Z

src/hypothesis/extra/datetime.py

@@ -15,111 +15,120 @@
 #
 # END HEADER

+"""This module provides time and date related strategies.
+
+It depends on the ``pytz`` package, which is stable enough that almost any


I feel like it would be better to put the timezones strategy into a hypothesis.extra.pytz package and let this one become entirely deprecated.

DRMacIver · 2017-05-15T09:29:56Z

src/hypothesis/strategies.py

+        due to daylight savings, leap seconds, timezone and calendar
+        adjustments, etc.  This is intentional, as malformed timestamps are a
+        common source of bugs.  Consider validating all datetime values, or
+        working purely in UTC to avoid such issues.


Does working purely in UTC actually fully avoid such issues? It's not like UTC doesn't have leap seconds.
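One relevant detail is that the standard library cannot represent a leap second at all, so no strategy built on `datetime.datetime` can generate one. A small check, where `representable` is a hypothetical helper and 2016-12-31T23:59:60 UTC is used as an example of a real leap second:

```python
import datetime as dt


def representable(*fields):
    """Return True if datetime accepts these fields as a UTC timestamp."""
    try:
        dt.datetime(*fields, tzinfo=dt.timezone.utc)
    except ValueError:
        return False
    return True


print(representable(2016, 12, 31, 23, 59, 59))
print(representable(2016, 12, 31, 23, 59, 60))  # seconds must be in 0..59
```

So UTC sidesteps daylight savings and offset changes, but leap seconds are simply outside what these objects can express, whatever the timezone.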

DRMacIver · 2017-05-15T09:30:38Z

src/hypothesis/strategies.py

@@ -910,6 +912,108 @@ def do_draw(self, data):
    return PermutationStrategy()


+@defines_strategy
+def datetimes(min_datetime=dt.datetime.min, max_datetime=dt.datetime.max,
+              timezones=None):


Would it be better to have timezones default to none() and not allow None as a value here? That way it's always consistently a strategy object.

Sure. I was trying to allow a just(min_datetime) case, but consistency of API is probably better than implementation details.

DRMacIver · 2017-05-15T09:38:05Z

src/hypothesis/strategies.py

+    check_type(dt.datetime, max_datetime, 'max_datetime')
+    if min_datetime.tzinfo is not None:
+        raise InvalidArgument('min_datetime=%r must not have tzinfo'
+                              % min_datetime)


My preference is fairly strongly to always wrap % arguments in a tuple even when it's valid not to FWIW. I know it can't go wrong here, but it's better to be obviously correct than non-obviously correct.
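A short illustration of why the tuple matters, using a shortened version of the error message; the helper names `describe` and `describe_unwrapped` are hypothetical. When the value being formatted is itself a tuple, the unwrapped form either raises or silently formats the wrong thing.

```python
MESSAGE = 'timezones=%r must be a SearchStrategy'


def describe(value):
    # Always wrapped: the value is formatted as a single argument.
    return MESSAGE % (value,)


def describe_unwrapped(value):
    # Unwrapped: a tuple value is unpacked as the argument list.
    return MESSAGE % value


print(describe((None, 'UTC')))
print(describe_unwrapped(('UTC',)))  # the one-element tuple is unpacked away
try:
    describe_unwrapped((None, 'UTC'))
except TypeError as err:
    print('TypeError:', err)
```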

DRMacIver · 2017-05-15T09:42:02Z

src/hypothesis/extra/datetime.py

+
+    """
+    all_timezones = [pytz.timezone(tz) for tz in pytz.all_timezones]
+    static = [pytz.UTC] + sorted(


Oh, one final thing: I don't quite understand this logic and would appreciate a comment here explaining what the significance of a StaticTzInfo is.

Will do. A StaticTzInfo represents a timezone that has always had a constant offset from UTC - no daylight savings, and no other tricks like moving across the date line. IMO that makes them simpler than the others, and smaller absolute offsets simpler within that.
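The ordering described here can be modelled with fixed-offset zones from the standard library. This sketch uses `datetime.timezone` as a stand-in for pytz's `StaticTzInfo` and a hypothetical helper `simplest_first`; it is not the code in the pull.

```python
import datetime as dt


def simplest_first(zones):
    """Put UTC first, then the other fixed-offset zones ordered by how far
    they are from UTC (smaller absolute offset counts as simpler)."""
    utc = dt.timezone.utc
    others = [z for z in zones if z != utc]
    return [utc] + sorted(others, key=lambda z: abs(z.utcoffset(None)))


zones = [dt.timezone(dt.timedelta(hours=h)) for h in (9, -5, 0, 1)]
for zone in simplest_first(zones):
    print(zone)
```

Because `sampled_from` shrinks towards earlier elements, an ordering like this is what makes failing examples shrink towards UTC and then towards small offsets.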

DRMacIver · 2017-05-22T08:54:15Z

tests/cover/test_datetimes.py

+        lambda b: strat._attempt_one_draw(
+            ConjectureData.for_buffer(b)) is None
+    )
+    # Should raise some kind of error?


It definitely should. I wonder why it isn't. I think the most likely explanation is that datetimes() is for some reason not exactly the same strategy as strat. Try passing strat instead of datetimes?

The other thing to check if that doesn't work is to make sure it's actually consuming the whole buffer. e.g. do a data = ConjectureData.for_buffer(failure_inducing), pass that to strat._attempt_one_draw, and assert afterwards that data.buffer == failure_inducing and the result

DRMacIver · 2017-05-22T10:08:31Z

docs/changes.rst

+
+This is a feature release, adding datetime-related strategies to the core strategies.
+
+``extra.datetime.timezones`` allows you to sample pytz timezones from


extra.pytz.timezones rather than extra.datetime.timezones I think?

Zac-HD · 2017-05-22T13:33:53Z

OK, rebased and generally fixed up.

However, the "maybe a test" is passing when it shouldn't be, and this is the cause of failing coverage. No idea why, but it does produce the minimal datetime(2000, 1, 1, 0, 0).

DRMacIver · 2017-05-22T13:43:35Z

However, the "maybe a test" is passing when it shouldn't be

Something weird is going on here. I'm going to have a bit of a poke into it.

DRMacIver · 2017-05-22T13:51:12Z

OK. I understand the problem now.

The min_size parameter you're adding to binary() is breaking things: You're ending up with a bunch of extra zeros at the end which don't result in a failed draw when it retries there.

I think you want to do something like:

  def test_DatetimeStrategy_draw_may_fail():
      def is_failure_inducing(b):
          try:
              return strat._attempt_one_draw(
                  ConjectureData.for_buffer(b)) is None
          except StopTest:
              return False

      strat = DatetimeStrategy(dt.datetime.min, dt.datetime.max, none())
      failure_inducing = find(binary(), is_failure_inducing)

      data = ConjectureData.for_buffer(failure_inducing * 100)
      with pytest.raises(StopTest):
          data.draw(strat)
      assert data.status == Status.INVALID

(Status and StopTest can both be imported from hypothesis.internal.conjecture.data)

Zac-HD · 2017-05-22T14:05:17Z

Ah, that makes sense - I got a StopTest error in the previous find because it raised, silenced that by adding the min_size, and then it couldn't raise later.

DRMacIver · 2017-05-22T14:09:25Z

I got a StopTest error in the previous find because it raised,

Yeah, StopTest in this context basically gets raised to, err, stop the test when something has prevented data from being correctly drawn. It mostly happens when you run out of bytes, or when mark_invalid is called. Here you're getting the former. The problem is that having too much data here is bad as well as too little, because the goal of the test is to get things to line up perfectly so that the strategy keeps getting given the wrong data.

(As a side note this test would have very occasionally failed with a StopTest beforehand: The amount of data drawn by integer_range and similar is slightly variable, so sometimes it needed more than 30 bytes. I saw this happen about once in ten runs)
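A toy model may help show why both too much and too little data break the test. Everything here is hypothetical and far simpler than ConjectureData: `BufferData` replays a fixed byte string, `OutOfData` stands in for StopTest on overrun, and the made-up rule is that an attempt fails when its first byte is 0xFF.

```python
class OutOfData(Exception):
    """Stand-in for StopTest being raised when the buffer runs out."""


class BufferData:
    """Replays a fixed buffer one byte at a time."""

    def __init__(self, buffer):
        self.buffer = bytes(buffer)
        self.index = 0

    def draw_byte(self):
        if self.index >= len(self.buffer):
            raise OutOfData()
        value = self.buffer[self.index]
        self.index += 1
        return value


def attempt_one_draw(data):
    # Toy rule: consume two bytes, reject when the first one is 0xFF.
    first, second = data.draw_byte(), data.draw_byte()
    return None if first == 0xFF else (first, second)


def draw(data, max_attempts=3):
    for _ in range(max_attempts):
        result = attempt_one_draw(data)
        if result is not None:
            return result
    return "INVALID"


failing = b"\xff\x00"
print(draw(BufferData(failing * 100)))        # every retry sees the failing bytes
print(draw(BufferData(failing + bytes(28))))  # zero padding lets a retry succeed
```

Repeating the failing buffer lines every retry up with the same rejected bytes, whereas zero padding hands the second attempt valid data, and a buffer holding only `failing` would run out on the second attempt instead.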

Zac-HD · 2017-05-23T09:22:40Z

Ping @DRMacIver for final review. It looks like the appveyor build simply timed out, so if you rerun that job I'm confident it will pass.

DRMacIver

🎉 ❤️

DRMacIver · 2017-05-23T10:26:08Z

@alexwlchan you still have an open review of this but I think @Zac-HD addressed all the issues. Do you want to do a more in depth final review or should we dismiss your review?

Addressed comments on an early version

Zac-HD · 2017-05-23T10:35:01Z

@alexwlchan - if you have any issues I haven't addressed, let me know; otherwise I'll merge this late tonight (UTC).

[hmm, the comment above hadn't loaded. I'll wait!]

DRMacIver · 2017-05-23T12:31:24Z

[hmm, the comment above hadn't loaded. I'll wait!]

I think it's fine to just merge if you're keen to get it through FWIW.

Zac-HD · 2017-05-23T12:40:11Z

While I'm very keen to get this merged, I'd rather err on the side of more feedback. Since midnight UTC is 10am local, it's no trouble for me to merge in the morning 😄

Zac-HD · 2017-05-23T23:57:50Z

Released! 🥂

Zac-HD force-pushed the core-time-strats branch 5 times, most recently from cf8bfba to 78011ca May 12, 2017 14:39

Zac-HD mentioned this pull request May 13, 2017

False positive HypothesisDeprecationWarning #595

Closed

alexwlchan previously requested changes May 13, 2017

Zac-HD force-pushed the core-time-strats branch 2 times, most recently from 3474dad to 65be6dd May 14, 2017 03:11

DRMacIver requested changes May 15, 2017

DRMacIver reviewed May 15, 2017

Zac-HD force-pushed the core-time-strats branch from 65be6dd to 4bd400c May 16, 2017 14:11

Zac-HD mentioned this pull request May 16, 2017

Respect Django's USE_TZ setting in the Django extra #633

Merged

Zac-HD force-pushed the core-time-strats branch 2 times, most recently from 310ebf7 to f0f7e87 May 17, 2017 00:29

Zac-HD mentioned this pull request May 19, 2017

Bias st.datetimes() towards bug-revealing values such as DST transitions and other oddities #69

Open

DRMacIver reviewed May 22, 2017

Zac-HD force-pushed the core-time-strats branch 3 times, most recently from 251dcd4 to b20d42f May 22, 2017 13:15

Zac-HD added 4 commits May 23, 2017 17:31

Expand error message in type check

b43f485

To include the type of the argument that was invalid, and optionally the name too.

Decorator for deprecated tests

8aef501

Allowing them to exist unchanged.

Core strategies for datetime types

34cbb3c

And wow, is it nice not to have compatibility constraints.

timedeltas() strategy

bf16311

Zac-HD added 3 commits May 23, 2017 17:31

Add extra.timezones() strategy

fa4e3e1

Use new datetimes strat for Django extra

99c7136

Check deprecation for old datetime tests

05f268a

Zac-HD force-pushed the core-time-strats branch 2 times, most recently from 7e8f4ba to 0f23443 May 23, 2017 07:35

Zac-HD added 2 commits May 23, 2017 18:56

Add tests for new datetimes strategies

352fd3e

Largely taken from the tests for deprecated versions, because why mess with a working implementation?

Bump version, add datetimes changelog entry

441f92c

Zac-HD force-pushed the core-time-strats branch from 0f23443 to 441f92c May 23, 2017 08:56

DRMacIver approved these changes May 23, 2017

Zac-HD merged commit 67d6922 into HypothesisWorks:master May 23, 2017

Zac-HD mentioned this pull request May 24, 2017

Same type range specification for datetimes and related #421

Closed

Zac-HD deleted the core-time-strats branch January 31, 2018 14:30
