Commit
Raise NotImplementedError for Categoricals with timezones (#14032)
Currently, `cudf.from_pandas` called on a pandas Categorical with a datetimetz dtype drops the timezone information (due to pyarrow):

```python
In [5]: import pandas as pd

In [6]: ci = pd.CategoricalIndex(pd.date_range("2016-01-01 01:01:00", periods=5, freq="D").tz_localize("UTC"))

In [7]: ci
Out[7]:
CategoricalIndex(['2016-01-01 01:01:00+00:00', '2016-01-02 01:01:00+00:00',
                  '2016-01-03 01:01:00+00:00', '2016-01-04 01:01:00+00:00',
                  '2016-01-05 01:01:00+00:00'],
                 categories=[2016-01-01 01:01:00+00:00, 2016-01-02 01:01:00+00:00, 2016-01-03 01:01:00+00:00, 2016-01-04 01:01:00+00:00, 2016-01-05 01:01:00+00:00], ordered=False, dtype='category')

In [8]: ci_cudf = cudf.from_pandas(ci)

In [10]: ci_cudf
Out[10]:
CategoricalIndex(['2016-01-01 01:01:00', '2016-01-02 01:01:00',
                  '2016-01-03 01:01:00', '2016-01-04 01:01:00',
                  '2016-01-05 01:01:00'],
                 categories=[2016-01-01 01:01:00, 2016-01-02 01:01:00, 2016-01-03 01:01:00, 2016-01-04 01:01:00, 2016-01-05 01:01:00], ordered=False, dtype='category')
```

As is already done for `IntervalIndex`, this change raises a `NotImplementedError` for now to avoid this incorrect behavior.

Authors:
- Matthew Roeschke (https://github.com/mroeschke)

Approvers:
- Lawrence Mitchell (https://github.com/wence-)

URL: #14032
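The kind of check described above can be illustrated on the pandas side alone. This is a minimal sketch, not the code from this commit: the helper name `reject_tz_aware_categorical` and the error message are hypothetical, and it assumes the check is made by inspecting the dtype of the categories before any conversion happens.

```python
import pandas as pd


def reject_tz_aware_categorical(obj):
    """Hypothetical guard: raise if a categorical's categories are tz-aware."""
    dtype = obj.dtype
    # A categorical stores its categories in an Index; timezone-aware
    # datetimes there have a DatetimeTZDtype.
    if isinstance(dtype, pd.CategoricalDtype) and isinstance(
        dtype.categories.dtype, pd.DatetimeTZDtype
    ):
        raise NotImplementedError(
            "Categoricals with timezone-aware categories are not supported."
        )
    return obj


tz_aware = pd.CategoricalIndex(
    pd.date_range("2016-01-01 01:01:00", periods=5, freq="D").tz_localize("UTC")
)
tz_naive = pd.CategoricalIndex(
    pd.date_range("2016-01-01 01:01:00", periods=5, freq="D")
)

# The timezone-naive index passes through unchanged.
reject_tz_aware_categorical(tz_naive)

# The timezone-aware index is rejected instead of losing its timezone.
try:
    reject_tz_aware_categorical(tz_aware)
except NotImplementedError as err:
    print(f"rejected: {err}")
```

Failing early in this way trades convenience for correctness: the caller sees an explicit error rather than receiving data whose timezone has been dropped.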