BUG: convert_dtypes(dtype_backend="pyarrow") losing tz for tz-aware dtypes (#53382)

* BUG: convert_dtypes(dtype_backend="pyarrow") losing tz for tz-aware dtypes

* whatsnew
lukemanley authored May 25, 2023
1 parent d6ee9ad commit d15041b
Showing 4 changed files with 19 additions and 2 deletions.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.0.2.rst
@@ -35,6 +35,7 @@ Bug fixes
- Bug in :func:`to_timedelta` was raising ``ValueError`` with ``pandas.NA`` (:issue:`52909`)
- Bug in :meth:`DataFrame.__getitem__` not preserving dtypes for :class:`MultiIndex` partial keys (:issue:`51895`)
- Bug in :meth:`DataFrame.convert_dtypes` ignores ``convert_*`` keywords when set to False ``dtype_backend="pyarrow"`` (:issue:`52872`)
+ - Bug in :meth:`DataFrame.convert_dtypes` losing timezone for tz-aware dtypes and ``dtype_backend="pyarrow"`` (:issue:`53382`)
- Bug in :meth:`DataFrame.sort_values` raising for PyArrow ``dictionary`` dtype (:issue:`53232`)
- Bug in :meth:`Series.describe` treating pyarrow-backed timestamps and timedeltas as categorical data (:issue:`53001`)
- Bug in :meth:`Series.rename` not making a lazy copy when Copy-on-Write is enabled when a scalar is passed to it (:issue:`52450`)
3 changes: 3 additions & 0 deletions pandas/core/arrays/arrow/array.py
@@ -39,6 +39,7 @@
is_object_dtype,
is_scalar,
)
+ from pandas.core.dtypes.dtypes import DatetimeTZDtype
from pandas.core.dtypes.missing import isna

from pandas.core import roperator
@@ -170,6 +171,8 @@ def to_pyarrow_type(
return dtype.pyarrow_dtype
elif isinstance(dtype, pa.DataType):
return dtype
+ elif isinstance(dtype, DatetimeTZDtype):
+ return pa.timestamp(dtype.unit, dtype.tz)
elif dtype:
try:
# Accepts python types too
4 changes: 3 additions & 1 deletion pandas/core/dtypes/cast.py
@@ -1097,7 +1097,9 @@ def convert_dtypes(
and not isinstance(inferred_dtype, StringDtype)
)
):
- if isinstance(inferred_dtype, PandasExtensionDtype):
+ if isinstance(inferred_dtype, PandasExtensionDtype) and not isinstance(
+ inferred_dtype, DatetimeTZDtype
+ ):
base_dtype = inferred_dtype.base
elif isinstance(inferred_dtype, (BaseMaskedDtype, ArrowDtype)):
base_dtype = inferred_dtype.numpy_dtype
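
A plausible reading of why `DatetimeTZDtype` is now excluded from the `.base` branch: the `base` attribute of a tz-aware dtype is a plain NumPy `datetime64` dtype, which has no place to store a timezone. The sketch below only inspects that attribute and does not call the internal `convert_dtypes` helper; it assumes pandas 2.0 or later.

```python
import numpy as np
import pandas as pd

tz_dtype = pd.DatetimeTZDtype(unit="ns", tz="UTC")

# The same attribute the branch above reads for other PandasExtensionDtype instances.
base = tz_dtype.base

# The base is a NumPy datetime64 dtype, so the timezone is not part of it.
print(tz_dtype, "->", base)
```

Skipping this branch for tz-aware dtypes lets the full `DatetimeTZDtype` reach `to_pyarrow_type`, where the timezone is passed on to `pa.timestamp`.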
13 changes: 12 additions & 1 deletion pandas/tests/frame/methods/test_convert_dtypes.py
@@ -53,7 +53,8 @@ def test_pyarrow_dtype_backend(self):
"c": pd.Series([True, False, None], dtype=np.dtype("O")),
"d": pd.Series([np.nan, 100.5, 200], dtype=np.dtype("float")),
"e": pd.Series(pd.date_range("2022", periods=3)),
- "f": pd.Series(pd.timedelta_range("1D", periods=3)),
+ "f": pd.Series(pd.date_range("2022", periods=3, tz="UTC").as_unit("s")),
+ "g": pd.Series(pd.timedelta_range("1D", periods=3)),
}
)
result = df.convert_dtypes(dtype_backend="pyarrow")
Expand All @@ -76,6 +77,16 @@ def test_pyarrow_dtype_backend(self):
)
),
"f": pd.arrays.ArrowExtensionArray(
+ pa.array(
+ [
+ datetime.datetime(2022, 1, 1),
+ datetime.datetime(2022, 1, 2),
+ datetime.datetime(2022, 1, 3),
+ ],
+ type=pa.timestamp(unit="s", tz="UTC"),
+ )
+ ),
+ "g": pd.arrays.ArrowExtensionArray(
pa.array(
[
datetime.timedelta(1),