-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Last of the timezones funcs #17669
Last of the timezones funcs #17669
Conversation
refactor a large timezone-specific chunk of tslib.tz_localize_to_utc to tslibs.timezones._infer_dst
pandas/_libs/tslibs/timezones.pyx
Outdated
ndarray[int64_t] result_b): | ||
cdef: | ||
Py_ssize_t n = len(vals) | ||
ndarray[int64_t] dst_hours |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sure u r typing s the original
there are lots of issues here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The typings should be identical. There are a couple of variables that do not have type declarations; those can be added.
def _infer_tzinfo(start, end): | ||
def _infer(a, b): | ||
tz = a.tzinfo | ||
if b and b.tzinfo: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is this actually used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outside of tests, its used once in indexes.datetimes
pandas/_libs/tslibs/timezones.pyx
Outdated
@@ -275,3 +275,84 @@ cdef object get_dst_info(object tz): | |||
dst_cache[cache_key] = (trans, deltas, typ) | |||
|
|||
return dst_cache[cache_key] | |||
|
|||
|
|||
def _infer_tzinfo(start, end): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
de-privatize these
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
pandas/_libs/tslibs/timezones.pyx
Outdated
|
||
def _infer_tzinfo(start, end): | ||
def _infer(a, b): | ||
tz = a.tzinfo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be an outside function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to bother changing it, might as well get rid of _infer
. It's only actually relevant in one case (of four)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
pandas/_libs/tslib.pyx
Outdated
@@ -105,7 +105,7 @@ from tslibs.timezones cimport ( | |||
is_utc, is_tzlocal, is_fixed_offset, | |||
treat_tz_as_dateutil, treat_tz_as_pytz, | |||
get_timezone, get_utcoffset, maybe_get_tz, | |||
get_dst_info | |||
get_dst_info, _infer_dst |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's give this a more descriptive name: infer_dst_transitions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
pandas/_libs/tslibs/timezones.pyx
Outdated
|
||
def _infer_tzinfo(start, end): | ||
def _infer(a, b): | ||
tz = a.tzinfo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
pandas/_libs/tslibs/timezones.pyx
Outdated
return tz | ||
|
||
|
||
cdef ndarray[int64_t] _infer_dst(ndarray[int64_t] vals, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are removing part of this function. I think its better to move almost all of it to timezones or leave it. (ok with latter for now, maybe do former at some point). Otherwise you end up splitting the logic / comments in 2 places.
I think this requires a bit more thought here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the appropriate block to separate out. It is exclusively focused on getting DST info for use in the original function. The rest of the function is thematically more related to _TSObject conversion.
In particular, pandas_datetimestruct
doesn't belong anywhere near timezones
.
Codecov Report
@@ Coverage Diff @@
## master #17669 +/- ##
==========================================
- Coverage 91.25% 91.23% -0.02%
==========================================
Files 163 163
Lines 49808 49798 -10
==========================================
- Hits 45454 45435 -19
- Misses 4354 4363 +9
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #17669 +/- ##
==========================================
+ Coverage 91.24% 91.25% +<.01%
==========================================
Files 163 163
Lines 49766 49770 +4
==========================================
+ Hits 45411 45417 +6
+ Misses 4355 4353 -2
Continue to review full report at Codecov.
|
pandas/_libs/tslib.pyx
Outdated
both_eq = result_a == result_b | ||
trans_idx = np.squeeze(np.nonzero(np.logical_and(both_nat, ~both_eq))) | ||
if trans_idx.size == 1: | ||
stamp = Timestamp(vals[trans_idx]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert this routine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just do the infer_tzinfo for now
ok rebase, ping on green. |
thanks |
* 'master' of github.com:pandas-dev/pandas: (188 commits) Separate out _convert_datetime_to_tsobject (pandas-dev#17715) DOC: remove whatsnew note for xref pandas-dev#17131 BUG: Regression in .loc accepting a boolean Index as an indexer (pandas-dev#17738) DEPR: Deprecate cdate_range and merge into bdate_range (pandas-dev#17691) CLN: replace %s syntax with .format in pandas.core: categorical, common, config, config_init (pandas-dev#17735) Fixed the memory usage explanation of categorical in gotchas from O(nm) to O(n+m) (pandas-dev#17736) TST: add backward compat for offset testing for pickles (pandas-dev#17733) remove unused time conversion funcs (pandas-dev#17711) DEPR: Deprecate convert parameter in take (pandas-dev#17352) BUG:Time Grouper bug fix when applied for list groupers (pandas-dev#17587) BUG: Fix some PeriodIndex resampling issues (pandas-dev#16153) BUG: Fix unexpected sort in groupby (pandas-dev#17621) DOC: Fixed typo in documentation for 'pandas.DataFrame.replace' (pandas-dev#17731) BUG: Fix series rename called with str altering name rather index (GH17407) (pandas-dev#17654) DOC: Add examples for MultiIndex.get_locs + cleanups (pandas-dev#17675) Doc improvements for IntervalIndex and Interval (pandas-dev#17714) BUG: DataFrame sort_values and multiple "by" columns fails to order NaT correctly Last of the timezones funcs (pandas-dev#17669) Add missing file to _pyxfiles, delete commented-out (pandas-dev#17712) update imports of DateParseError, remove unused imports from tslib (pandas-dev#17713) ...
Moves
tools.datetimes._infer_tzinfo
totslibs.timezones._infer_tzinfo
.Refactor a large timezone-specific chunk of
tslib.tz_localize_to_utc
totslibs.timezones._infer_dst
The new func
timezones._infer_dst
is taken out oftslib.tz_localize_to_utc
. It is very nearly a cut/paste. The only change is removed a couple calls toTimestamp
that are used for rendering an exception message, used np.datetime64 in its place.git diff upstream/master -u -- "*.py" | flake8 --diff