Masking and overflow checks for datetimeindex and timedeltaindex ops #18020

jbrockmendel · 2017-10-29T17:30:14Z

There are a bunch of new tests (it's not obvious where a WIP test matrix like this belongs, pls advise). The ones that fail under master are test_timedeltaindex_add_timestamp_nat_masking and test_datetimeindex_sub_timestamp_overflow.

Start filling out a test matrix of arithmetic ops.

- closes #17991 (NaT in TimedeltaIndex + Timestamp overflows)
- tests added / passed
- passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
- whatsnew entry
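
The masking idea behind the NaT fix can be sketched in plain Python. This is an illustration only, not the pandas implementation: `masked_checked_add` and the constant names are hypothetical. The one fact carried over from pandas is that NaT is stored as the smallest int64 value, which is why an unmasked addition can look like an overflow.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
NAT = INT64_MIN  # NaT is stored as the smallest int64, as pandas' iNaT is


def masked_checked_add(values, scalar):
    """Add ``scalar`` to each int64-like value, skipping NaT slots.

    Hypothetical helper for illustration; raises OverflowError only when a
    non-NaT entry would leave the int64 range.
    """
    out = []
    for v in values:
        if v == NAT:
            # Masked slot: propagate NaT and do not run the overflow check.
            out.append(NAT)
            continue
        total = v + scalar
        if not INT64_MIN < total <= INT64_MAX:
            raise OverflowError(f"{v} + {scalar} does not fit in int64")
        out.append(total)
    return out


# Without masking, the NaT sentinel itself would trip the overflow check.
print(masked_checked_add([86_400, NAT], -1) == [86_399, NAT])
```

The design point is that the overflow check and the mask have to be applied together: checking first flags NaT slots spuriously, while skipping the check entirely lets real overflows wrap around silently.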


jreback · 2017-10-29T19:10:01Z

pandas/tests/indexes/datetimelike.py

+    #   - timezone-aware variants
+    #   - object-dtype, categorical dtype
+    #   - PeriodIndex
+    #   - consistency with .map(...) ?


have a look thru test_ops, there is lots of coverage for things like this already (or maybe test_base). don't create a giant matrix, rather parametrize as much as possible.

I'll take a look. After adding tests for #7996 (separate branch/PR), this class gets pretty huge. So yah, parameterization sounds nice.

(Also it looks like tests in this module don't get run, so that needs changing anyway).

> (Also it looks like tests in this module don't get run, so that needs changing anyway).

sure they do, classes inherit from this. Pls pls pls don't create a huge matrix of tests w/o looking thru the existing. we cover quite a bit of this already.

> Pls pls pls don't create a huge matrix of tests w/o looking thru the existing. we cover quite a bit of this already.

Message received. Worrying about correctness first, brevity later.

codecov · 2017-10-29T19:32:59Z

Codecov Report

Merging #18020 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #18020      +/-   ##
==========================================
- Coverage   91.25%   91.23%   -0.02%     
==========================================
  Files         163      163              
  Lines       50123    50124       +1     
==========================================
- Hits        45741    45733       -8     
- Misses       4382     4391       +9

Flag         Coverage Δ
#multiple    89.05% <100%> (ø)        ⬆️
#single      40.32% <20%> (-0.06%)    ⬇️

Impacted Files                         Coverage Δ
pandas/core/indexes/timedeltas.py      91.19% <100%> (ø)     ⬆️
pandas/core/indexes/datetimelike.py    97.1% <100%> (ø)      ⬆️
pandas/core/indexes/datetimes.py       95.51% <100%> (ø)     ⬆️
pandas/io/gbq.py                       25% <0%> (-58.34%)    ⬇️
pandas/core/frame.py                   97.75% <0%> (-0.1%)   ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3268f4e...4dbdfd7.

jreback

see comments


jbrockmendel · 2017-10-29T20:49:08Z

OK, just removed the new file, took the two tests that currently fail on master and moved them into existing datetime and timedelta test files.

jorisvandenbossche · 2017-10-30T08:59:30Z

Does this also fix #7996 ?

jbrockmendel · 2017-10-30T15:18:43Z

> Does this also fix #7996 ?

No, but it does fix a related bug that probably belongs in the same issue

In [3]: s = pd.Series(pd.date_range('20130101',periods=3))
In [4]: dti = pd.DatetimeIndex(s)

In [5]: s-np.datetime64('2013-01-01')
Out[5]: 
0   15705 days 23:59:59.999984
1   15706 days 23:59:59.999984
2   15707 days 23:59:59.999984
dtype: timedelta64[ns]

In [6]: dti - np.datetime64('2013-01-01')
Out[6]: DatetimeIndex(['1970-01-01', '1970-01-02', '1970-01-03'], dtype='datetime64[ns]', freq=None)
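
For comparison, here is a small sketch of the behaviour one would expect when `Series` and `DatetimeIndex` agree on this operation. It assumes a pandas release in which both paths return timedeltas (it was not run against 0.21), and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(pd.date_range("20130101", periods=3))
dti = pd.DatetimeIndex(s)
other = np.datetime64("2013-01-01")

ser_result = s - other
idx_result = dti - other

# Both results should be timedeltas of 0, 1 and 2 days.
expected = [pd.Timedelta(days=n) for n in range(3)]
print(list(ser_result) == expected)
print(isinstance(idx_result, pd.TimedeltaIndex) and list(idx_result) == expected)
```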

jbrockmendel · 2017-10-30T15:23:51Z

Rebased and pushed; hoping that magically fixes the CI errors

jbrockmendel · 2017-10-30T17:17:58Z

AFAICT the test failures here were caused by fragility in test_NaT_methods and TestTimestamp.test_to_datetime_depr. Specifically, a test that this PR added (until moments ago) called ts.to_datetime() instead of ts.to_pydatetime(). This triggered a FutureWarning, which resulted in FutureWarning not being raised in the tests that specifically check for it.

I think the immediate issue is now fixed, but ideally test_to_datetime_depr and test_NaT_methods would be isolated enough not to depend on test ordering.
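
One way to make a warning check independent of test ordering is to reset the filter inside the check itself. The sketch below uses only the standard `warnings` module; `legacy_to_datetime` and `emits_future_warning` are hypothetical names, not pandas helpers, and the sketch does not try to reproduce the original failure, only the isolated pattern.

```python
import warnings


def legacy_to_datetime():
    """Hypothetical stand-in for a deprecated method such as ts.to_datetime()."""
    warnings.warn(
        "to_datetime is deprecated, use to_pydatetime",
        FutureWarning,
        stacklevel=2,
    )


def emits_future_warning(func):
    """Return True if calling ``func`` emits a FutureWarning.

    The filter is reset to "always" inside the context manager, so the
    result does not depend on whether the warning already fired earlier.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func()
    return any(issubclass(w.category, FutureWarning) for w in caught)


# An earlier, unrelated call (like the stray ts.to_datetime() in this PR).
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    legacy_to_datetime()

print(emits_future_warning(legacy_to_datetime))
```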

jreback

needs a whatsnew note. 0.21.1 is fine.

jreback · 2017-10-31T01:09:29Z

pandas/tests/indexes/datetimes/test_datetime.py

@@ -447,6 +447,40 @@ def f():
            t - offset
        pytest.raises(OverflowError, f)

+    def test_datetimeindex_sub_timestamp_overflow(self):
+        dtimax = pd.to_datetime(['now', pd.Timestamp.max])


add the issue for these

There is no issue for this; I noticed it when tracking down the TimedeltaIndex bug.

sure there is, the PR number!

jreback · 2017-10-31T01:10:52Z

pandas/tests/indexes/timedeltas/test_ops.py

+
+        for variant in ts_neg_variants + ts_pos_variants:
+            res = tdinat + variant
+            assert res[1] is pd.NaT


check sub as well

also would check both add/sub for the reverse, e.g. variant + tdinat (and -)

might as well assert fully, e.g.

tm.assert_index_equal(pd.TimedeltaIndex(['NaT', 'NaT']))

Happy to do the full add/sub/radd/rsub matrix... that was kind of what I started out with. The question becomes where to put it, since arithmetic tests are scattered about. My preference would be new test modules test_arithmetic in each of indexes.timedeltas, indexes.datetimes, and indexes.periods where I can a) put these new tests and b) collect the arithmetic tests that are currently scattered about.

See discussion in #18026, #18036.

But for now I'll just edit the contents of the tests already introduced.

> might as well assert fully, e.g.
>
> tm.assert_index_equal(pd.TimedeltaIndex(['NaT', 'NaT']))

The first entry is not NaT, will not be constant across all of the variants (though the variants could be split into two groups over which it should be unchanging)

test_arithmetic sounds like a good name. key is to share code as much as possible, via fixtures / parametrization / inheritance. We ideally want these objects to act as similarly as possible, so keeping special cases to a minimum is important.

note before we do this, I think splitting out the tz-aware tests to its own hierarchy should be done.

see #17583 and #17694
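
The add/radd/rsub matrix discussed above could be parametrized roughly as follows. This is a sketch, not the tests that were merged: the test names, `CASES`, and the sample values are made up for illustration, and it assumes a pandas version that already includes the NaT masking fix.

```python
import pandas as pd
import pytest

TS = pd.Timestamp("2017-01-01")
TDI_NAT = pd.TimedeltaIndex(["1 day", "NaT"])

# (operation, expected first element); the second element must stay NaT.
CASES = [
    (lambda tdi, ts: tdi + ts, pd.Timestamp("2017-01-02")),  # add
    (lambda tdi, ts: ts + tdi, pd.Timestamp("2017-01-02")),  # radd
    (lambda tdi, ts: ts - tdi, pd.Timestamp("2016-12-31")),  # rsub
]


@pytest.mark.parametrize("op, first", CASES, ids=["add", "radd", "rsub"])
def test_tdi_timestamp_nat_masking(op, first):
    res = op(TDI_NAT, TS)
    assert isinstance(res, pd.DatetimeIndex)
    assert res[0] == first
    assert res[1] is pd.NaT


def test_tdi_sub_timestamp_raises():
    # timedelta - datetime is not a meaningful operation.
    with pytest.raises(TypeError):
        TDI_NAT - TS
```

Keeping the cases in one list means a new operand variant (for example a `datetime` or `np.datetime64` version of the timestamp) only adds a row rather than another copy of the test body.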

> But for now I'll just edit the contents of the tests already introduced.

Actually as I look at this, I'd much rather close out this bug fix and follow up with the Do It Right approach.

certainly. bug fixes are good. separate, self-contained refactorings to make things more readable are better!

Great. After the current deluge of clears up, I'll circle back to this.

jreback · 2017-10-31T01:12:37Z

pandas/tests/indexes/datetimes/test_datetime.py

+
+        expected = pd.Timestamp.min.value - tsneg.value
+        for variant in ts_neg_variants:
+            res = dtimin - variant


same comment as below, assert fully the result type.

jbrockmendel · 2017-10-31T16:14:17Z

Where did we land on this? My preference is to get this bug-fix in and worry about the rest in #18049+followups.

jreback · 2017-11-01T01:21:40Z

needs a rebase

jbrockmendel · 2017-11-01T03:56:18Z

Just rebased. The two new tests are unchanged, just moved to the appropriate locations in test_arithmetic.

jreback · 2017-11-04T18:14:54Z

looks fine, tiny doc change. ping on green.

jreback · 2017-11-04T18:13:13Z

doc/source/whatsnew/v0.21.1.txt

@@ -57,6 +57,8 @@ Documentation Changes
 Bug Fixes
 ~~~~~~~~~
 - Bug in ``DataFrame.resample(...).apply(...)`` when there is a callable that returns different columns (:issue:`15169`)
+- Bug in :class:`TimedeltaIndex` subtraction could incorrectly overflow when `NaT` is present (:issue:`17791`)


double backticks around NaT

jreback · 2017-11-04T18:13:45Z

doc/source/whatsnew/v0.21.1.txt

@@ -57,6 +57,8 @@ Documentation Changes
 Bug Fixes
 ~~~~~~~~~
 - Bug in ``DataFrame.resample(...).apply(...)`` when there is a callable that returns different columns (:issue:`15169`)
+- Bug in :class:`TimedeltaIndex` subtraction could incorrectly overflow when `NaT` is present (:issue:`17791`)
+- Bug in :class:`DatetimeIndex` subtraction could fail to overflow (:issue:`18020`)


can you expand this a bit (e.g. subtracting what)

jreback · 2017-11-04T18:57:48Z

ping on green.

jbrockmendel · 2017-11-04T21:44:04Z

ping

jreback · 2017-11-04T21:47:05Z

thanks!


jbrockmendel added 3 commits October 29, 2017 10:22

masking and overflow checks for datetimeindex and timedeltaindex ops

e9dd35b

WIP filling out a test matrix of arithmetic ops closes pandas-dev#17991

flake8 fixup

36aea69

add todo notes

dd1eddd

jreback requested changes Oct 29, 2017


Move test somewhere it will actually get run

92afc80

remove invalid test

de663a1

jreback requested changes Oct 29, 2017


jbrockmendel added 3 commits October 29, 2017 13:12

Move towards parameterized tests; fix for np.datetime64 arith pandas-dev#7996

183e8d0

test cases for datetime64/timedelta64 with different units

75a403c

Remove new test file; move two currently-failing tests to existing modules

ff72d98

jbrockmendel added 2 commits October 29, 2017 16:42

typo fixup

ac246b1

fixup NameErrors

05254e2

jbrockmendel mentioned this pull request Oct 30, 2017

duplicate tests #18026

Closed

gfyoung added the Bug and Timedelta (Timedelta data type) labels Oct 30, 2017

Merge branch 'master' of https://github.com/pandas-dev/pandas into arith

7214823

to_datetime-->to_pydatetime; fixes unrelated test failure

f048ffa

jreback added this to the 0.21.1 milestone Oct 31, 2017

jreback requested changes Oct 31, 2017


whatsnew notes

7be5984

jreback requested changes Oct 31, 2017


Merge branch 'master' of https://github.com/pandas-dev/pandas into arith

95f801f

Merge branch 'master' of https://github.com/pandas-dev/pandas into arith

586d872

jreback requested changes Nov 4, 2017


jbrockmendel added 2 commits November 4, 2017 11:53

clarify todo note per reviewre suggestion

24dd75b

Merge branch 'master' of https://github.com/pandas-dev/pandas into arith

4dbdfd7

jreback approved these changes Nov 4, 2017


jreback merged commit 8388a47 into pandas-dev:master Nov 4, 2017

jreback added the Needs Backport label Nov 4, 2017

1kastner pushed a commit to 1kastner/pandas that referenced this pull request Nov 5, 2017

Masking and overflow checks for datetimeindex and timedeltaindex ops (pandas-dev#18020) closes pandas-dev#17991

ffd363b

No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017

Masking and overflow checks for datetimeindex and timedeltaindex ops (pandas-dev#18020) closes pandas-dev#17991

636ccf8

jbrockmendel deleted the arith branch December 8, 2017 19:40

TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request Dec 11, 2017

Masking and overflow checks for datetimeindex and timedeltaindex ops (pandas-dev#18020) closes pandas-dev#17991 (cherry picked from commit 8388a47)

8adc68f

TomAugspurger added a commit to TomAugspurger/pandas that referenced this pull request Dec 11, 2017

fixup! Masking and overflow checks for datetimeindex and timedeltaindex ops (pandas-dev#18020)

6b2e0a4

TomAugspurger pushed a commit that referenced this pull request Dec 12, 2017

Masking and overflow checks for datetimeindex and timedeltaindex ops (#18020) closes #17991 (cherry picked from commit 8388a47)

64339a6

TomAugspurger added a commit that referenced this pull request Dec 12, 2017

fixup! Masking and overflow checks for datetimeindex and timedeltaindex ops (#18020)

1e81abb

TomAugspurger removed the Needs Backport label Dec 12, 2017

spencerkclark mentioned this pull request Dec 24, 2018

Fix failure in time encoding for pandas < 0.21.1 pydata/xarray#2630

Merged

3 tasks
