-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Fix Period and PeriodIndex support of combined offsets aliases #13874
Conversation
can you run an asv on this, make sure looks good for perf. |
also would need to add tests from the issue. I dont' think we are actually testing this (and that's why it doesn't fail). |
@@ -1203,6 +1208,7 @@ class Period(_Period): | |||
|
|||
def _ordinal_from_fields(year, month, quarter, day, | |||
hour, minute, second, freq): | |||
freq = Period._maybe_convert_freq(freq) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't needed, as _ordinal_from_field
is only called from here (after 1130th line)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right. I was being overly cautious (maybe some other part of the code might need to call this in the future?) but I can remove it.
I ran the asv benchmarks. Here are the results:
I have no experience whatsoever with these, so I don't know how to interpret the results above, but I find it perplexing that the two tests that run slower have absolutely nothing to do with time series... |
Thanks for the test. The slowness looks to be caused by regex search. Maybe we should try dict lookup first like |
@sinhrks I'm confused. The asv test that has to do with periods is faster with this PR... |
Ah sorry, I looked first 2 ratios (looks unrelated to the period). Pls ignore my comment. |
Current coverage is 85.31% (diff: 96.29%)@@ master #13874 diff @@
==========================================
Files 139 139
Lines 50143 50146 +3
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 42777 42780 +3
Misses 7366 7366
Partials 0 0
|
pls rebase. Need tests to close #13730 ; and add to whatsnew. |
Rebased. I have also added tests for There is one standing issue: the function
In view of this, I propose deprecating |
I implemented the changes and deprecation I proposed in my last comment in the last commit, including tests for the deprecation warning anywhere |
@@ -864,3 +865,4 @@ Bug Fixes | |||
- Bug in ``.to_excel()`` when DataFrame contains a MultiIndex which contains a label with a NaN value (:issue:`13511`) | |||
- Bug in ``pd.read_csv`` in Python 2.x with non-UTF8 encoded, multi-character separated data (:issue:`3404`) | |||
- Bug in ``Index`` raises ``KeyError`` displaying incorrect column when column is not in the df and columns contains duplicate values (:issue:`13822`) | |||
- Bug in ``Period`` and ``PeriodIndex`` creating wrong dates when frequency has multiple offsets (:issue:`13874`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"combined offsets" or "combined aliases" should be better.
@sinhrks Is "combined offset aliases" okay? |
@agraboso Thx, it sounds ok. |
@@ -478,7 +485,7 @@ def asfreq(self, freq=None, how='E'): | |||
""" | |||
how = _validate_end_alias(how) | |||
|
|||
freq = frequencies.get_standard_freq(freq) | |||
freq = Period._maybe_convert_freq(freq).freqstr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks better to pass freq as it is, because _gfc
can accept offset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean stripping the .freqstr
at the end? Sure, that works (tests pass)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, thx to confirm.
@jreback How does it look? |
|
||
Parameters | ||
---------- | ||
freq : str or tuple or datetime.timedelta or DateOffset or None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use commas rather than or
@jreback Done. |
|
||
See Also | ||
-------- | ||
DateOffset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this pandas.DateOffset
you are going to have a flake8 minor issue (long lines). otherwise lgtm. can you run an asv vs master (just |
|
are u just showing the single change in asv? |
Here is the whole output:
By the way, I just realized that |
thanks! |
* Remove frequencies.get_standard_freq * Drop the "freqstr" keyword from frequencies.to_offset Deprecated in v0.19.0 xref pandas-devgh-13874
* Remove frequencies.get_standard_freq * Drop the "freqstr" keyword from frequencies.to_offset Deprecated in v0.19.0 xref pandas-devgh-13874
* Remove frequencies.get_standard_freq * Drop the "freqstr" keyword from frequencies.to_offset Deprecated in v0.19.0 xref pandas-devgh-13874
* Remove frequencies.get_standard_freq * Drop the "freqstr" keyword from frequencies.to_offset Deprecated in v0.19.0 xref pandas-devgh-13874
* Remove frequencies.get_standard_freq * Drop the "freqstr" keyword from frequencies.to_offset Deprecated in v0.19.0 xref gh-13874
git diff upstream/master | flake8 --diff
Essentially, makes sure any
freq
string passed to aPeriod
orPeriodIndex
goes through_Period._maybe_convert_freq()
, which callsto_offset()
, which is where the logic for combining aliases is.All the examples in #13730 result in the correct output, and all existing tests pass. I have not written any new ones yet — I first wanted to get the opinion of the maintainers.
This PR builds on #13868 (without it, some existing tests fail).