Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: fix Period.asfreq conversion near datetime(1, 1, 1) #19650

Merged
merged 17 commits into from
Feb 21, 2018

Conversation

jbrockmendel
Copy link
Member

@jbrockmendel jbrockmendel commented Feb 11, 2018

@jorisvandenbossche
Copy link
Member

Is there a reason you moved the tests?

@jbrockmendel
Copy link
Member Author

Is there a reason you moved the tests?

Ongoing process of centralizing and de-duplicating tests. The moved tests cover code near the bug being fixed, so it seemed close enough to in-scope.

@codecov
Copy link

codecov bot commented Feb 12, 2018

Codecov Report

Merging #19650 into master will increase coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #19650      +/-   ##
==========================================
+ Coverage   91.61%   91.61%   +<.01%     
==========================================
  Files         150      150              
  Lines       48887    48882       -5     
==========================================
- Hits        44786    44782       -4     
+ Misses       4101     4100       -1
Flag Coverage Δ
#multiple 89.98% <ø> (ø) ⬆️
#single 41.79% <ø> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/sparse/series.py 95.25% <0%> (-0.02%) ⬇️
pandas/core/sparse/frame.py 94.81% <0%> (ø) ⬆️
pandas/core/sparse/array.py 91.46% <0%> (+0.08%) ⬆️
pandas/core/ops.py 96.86% <0%> (+0.12%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 740ad9a...e4e453b. Read the comment docs.

@gfyoung gfyoung added Bug Period Period data type labels Feb 12, 2018
@jorisvandenbossche
Copy link
Member

It feels you are rather spreading the Period tests to me? This are tests for the public Period class, and tests related to our scalar classes (Timestamp, Period, Timedelta, Interval) are centralized in the /tests/scalar/ directory. My preference is to keep it that way.
(I think the tslibs tests are currently tests on functionality contained in tslibs we want to test but is not public?)

@jreback
Copy link
Contributor

jreback commented Feb 12, 2018

lets create pandas/tests/scalar/period and move these tests to freq.py

@@ -154,12 +154,30 @@ cdef inline int get_freq_group(int freq) nogil:
return (freq // 1000) * 1000


@cython.cdivision
@cython.cdivision(False) # specifically _dont_ use cdvision GH#19643
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment on why

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather does this actually matter now that you are always using floor division?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather does this actually matter now that you are always using floor division?

Yes. With cdivision -1 // 7 returns 0. I find it really counter-intuitive, not at all surprising that this bug slipped through.

Parameters
----------
dinfo : date_info*
absdate : int64_t (as returned from get_python_ordinal or absdate_from_ymd)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a great comment (nor does it follow doc-string conventions)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change.


Parameters
----------
dinfo : date_info*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pointer to

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what the asterisk is for (this notation is used in docstrings elsewhere). Have a preferred syntax?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, you are removing this anyhow IIRC? (in follow on)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Parameters
----------
dinfo : date_info*
abstime : double
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you be more specific here , e.g. what is abstime

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The next line seconds elapsed since beginning of day described by absdate isn't clear?


Returns
-------
code : int (always 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

success / error

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually a dummy; there is no error-state. I guess just make it return None?

@jbrockmendel
Copy link
Member Author

re test location: will change.

@jorisvandenbossche jorisvandenbossche changed the title Fix asfreq conversion BUG: fix Period.asfreq conversion near datetime(1, 1, 1) Feb 12, 2018


class TestPeriodFreqConversion(object):
@pytest.mark.parametrize('freq', ['A', 'Q', 'M', 'W', 'B', 'D'])
def test_asfreq_near_zero(self, freq):
Copy link
Contributor

@jreback jreback Feb 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a whatsnew note for this
nvm saw that

tup2 = (prev.year, prev.month, prev.day)
assert tup2 < tup1

@pytest.mark.xfail
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a reason

@jreback jreback added this to the 0.23.0 milestone Feb 13, 2018
@jreback
Copy link
Contributor

jreback commented Feb 13, 2018

small comment. ping on green. ok with moving tests in another PR (to a period scalar sub-module) as discussed.

@jreback
Copy link
Contributor

jreback commented Feb 13, 2018

The following asvs benchmarks (if any) failed.
[ 43.09%] ··· Running io.excel.Excel.time_read_excel                  1/3 failed
                   xlwt      failed 

its possible this actually broke this asv

@jbrockmendel
Copy link
Member Author

its possible this actually broke this asv

Is there a test this corresponds to?

@jbrockmendel
Copy link
Member Author

It executes fine locally, will run asv on it to see if the timing is massively changed.

@jbrockmendel
Copy link
Member Author

asv runs fine on both 3.5 and 2.7

@jbrockmendel
Copy link
Member Author

Thoughts on this? I assumed the asv concern superseded the "ping on green"

@jbrockmendel jbrockmendel mentioned this pull request Feb 16, 2018
4 tasks

Parameters
----------
dinfo : date_info*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, you are removing this anyhow IIRC? (in follow on)

abstime : double
seconds elapsed since beginning of day described by absdate

Returns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should no longer be needed. IOW this is the point of an errors declaration (and raising)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, will be removing the leftover-from-C bits (and camelCase) in the next pass. For now this PR is focused on just the one bug.

Parameters
----------
dinfo : date_info*
abstime : double
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here

@@ -336,7 +407,7 @@ cdef int dInfoCalc_SetFromAbsTime(date_info *dinfo, double abstime) nogil:
dinfo.hour = hour
dinfo.minute = minute
dinfo.second = second
return 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need a return?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't. Will remove when killing this function off.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just remove the return

@jreback jreback removed this from the 0.23.0 milestone Feb 16, 2018
@@ -336,7 +407,7 @@ cdef int dInfoCalc_SetFromAbsTime(date_info *dinfo, double abstime) nogil:
dinfo.hour = hour
dinfo.minute = minute
dinfo.second = second
return 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just remove the return

abstime : double
seconds elapsed since beginning of day described by absdate

Returns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't have Returns for None, remove


Returns
-------
absdate : int days elapsed since datetime(1, 1, 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc-string format, put the expl on the next line


Returns
-------
qtr_freq : int describing the implied quarterly frequency
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same


Parameters
----------
ordinal : int64_t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very odd that these take pointers

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point these functions are translated as directly as possible from the C versions. In future passes I'd like to make them more pythonic too, will need to be careful about perf.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that’s fine

@@ -1,10 +1,32 @@
# -*- coding: utf-8 -*-

import pytest

from pandas.errors import OutOfBoundsDatetime
from pandas._libs.tslibs.frequencies import get_freq
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would not be averse to makeing this a submodule in future
tests/tslibls/period/test_asfreq

@jbrockmendel
Copy link
Member Author

I think all the requests have been hit.

@jorisvandenbossche
Copy link
Member

Is there a reason you are adding the test in /tests/tslibs/test_period_asfreq.py and not in tests/scalar/test_period_asfreq.py ?

@jbrockmendel
Copy link
Member Author

Is there a reason you are adding the test in /tests/tslibs/test_period_asfreq.py and not in tests/scalar/test_period_asfreq.py ?

test.tslibs is intended for testing tslibs in isolation. These asfreq tests are largely for period_helper (or were until recently) and so are self-contained. This is the right long-term home for these tests.

@jorisvandenbossche
Copy link
Member

test.tslibs is intended for testing tslibs in isolation.

Yes, and the tests you added are testing behaviour of the Period class (not a private utility of tslib), which for the rest is tested in tests/scalar/test_period(_asfreq).py ?

@jreback
Copy link
Contributor

jreback commented Feb 19, 2018

@jorisvandenbossche I asked @jbrockmendel mendel to move these. not quite happy where the scalar tests are ATM, now whether actualy in tslib or the scalar/period is a matter of opinion here. I agree if these are actually user facing then they should be in the scalar/period/....

however as that doesn't exist yet (well in a minute it will), it was a bit confusing.

@jorisvandenbossche
Copy link
Member

however as that doesn't exist yet (well in a minute it will)

tests/scalar/test_period_asfreq.py does actually already exist

agree if these are actually user facing then they should be in the scalar/period/....

This is the case here

We can reorganize tests, but let's do that in a separate PR, and add the tests here in the location where similar tests already are? (which is AFAIK tests/scalar/test_period_asfreq.py)

@jreback
Copy link
Contributor

jreback commented Feb 19, 2018

@jorisvandenbossche you will see shortly. the test reorg has been going on for quite a while and is moving in the right direction.

tup1 = (per.year, per.hour, per.day)

prev = per - 1
assert (per - 1).ordinal == per.ordinal - 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think as a general rule if you are importing Period, then it should go in pandas/tests/scalar/period/* else it can go here (e.g. you are testing a helper function)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing as how the recent addition of tests.scalar.period will necessitate a new look at this, I propose we merge this bug fix and the next pass (later today) can attempt to standardize this convention.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can also simply move the tests there now (as travis has to run again anyhow since you just added a commit)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update for this comment

@jreback jreback added this to the 0.23.0 milestone Feb 20, 2018
@jreback
Copy link
Contributor

jreback commented Feb 20, 2018

lgtm. I changed the whatsnew slightly, so needs a rebase. ping on green. can fixup period tests after.

@jbrockmendel
Copy link
Member Author

ping

tup1 = (per.year, per.hour, per.day)

prev = per - 1
assert (per - 1).ordinal == per.ordinal - 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update for this comment

@jorisvandenbossche jorisvandenbossche merged commit e5be6bd into pandas-dev:master Feb 21, 2018
@jorisvandenbossche
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Period Period data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Period.asfreq issue caused(?) by C division rules
4 participants