-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move timezone conversion logic to DatetimeColumn
#15545
Move timezone conversion logic to DatetimeColumn
#15545
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor cleanups, looks good to me (though I am definitely tz-naive).
|
||
|
||
@lru_cache(maxsize=20) | ||
def get_tz_data(zone_name): | ||
def get_tz_data(zone_name: str) -> Tuple[DatetimeColumn, TimeDeltaColumn]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: Please update the docstring description of the return value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the catch. Updated
@@ -77,99 +76,23 @@ def _find_and_read_tzfile_tzdata(zone_name): | |||
raise zoneinfo.ZoneInfoNotFoundError(zone_name) | |||
|
|||
|
|||
def _read_tzfile_as_frame(tzdir, zone_name): | |||
def _read_tzfile_as_frame( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
polish: rename to _read_tzfile_as_columns
, since we no longer return a frame?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Change it to that
cond = clock_1 > clock_2 | ||
nonexistent_begin = clock_2.apply_boolean_mask(cond) | ||
return (as_column([min_date]), as_column([np.timedelta64(0, "s")])) | ||
return tuple(transition_times_and_offsets) # type: ignore[return-value] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: This type ignore is because the cython wrapper has no typing information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I believe so. mypy is picking up the call from the cython function as list[Any]
) -> DatetimeTZColumn: | ||
def check_ambiguous_and_nonexistent( | ||
ambiguous: Literal["NaT"], nonexistent: Literal["NaT"] | ||
) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: parse don't validate, check_ambiguous_and_nonexistent(...) -> tuple[Literal["NaT"], Literal["NaT"]]
and return those values (and then use them in callers).
Though here it's a bit of a wash since you're not doing any type narrowing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing can change it to parse (although these arguments are not actually used)
@@ -688,6 +702,119 @@ def _with_type_metadata(self, dtype): | |||
) | |||
return self | |||
|
|||
def _find_ambiguous_and_nonexistent( | |||
self, zone_name: str | |||
) -> Tuple[NumericalColumn | bool, NumericalColumn | bool]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
) -> Tuple[NumericalColumn | bool, NumericalColumn | bool]: | |
) -> Tuple[NumericalColumn, NumericalColumn] | Tuple[bool, bool]: |
size as `data`, that respectively indicate ambiguous and | ||
nonexistent timestamps in `data` with the value `True`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
size as `data`, that respectively indicate ambiguous and | |
nonexistent timestamps in `data` with the value `True`. | |
size as `self`, that respectively indicate ambiguous and | |
nonexistent timestamps in `self` with the value `True`. |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Changed to self
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
/merge |
Description
Moves methods/logic in
python/cudf/cudf/core/_internals/timezones.py
to the newly createdDatetimeColumn.tz_localize
andDatetimeColumn.tz_convert
.Additionally adds typing and improves an error message when doing
tz_convert(None)
on a tz-naive Series/Index to raise aTypeError
(like pandas) instead of anAttributeError
Checklist