clib.conversion._to_numpy: Add tests for pandas.Series with datetime dtypes #3670
This PR adds tests for pandas.Series with datetime dtypes. Addresses #3600.
In pandas, datetime dtypes can be specified in the following ways:
"datetime64[s]"
pd.DatetimeTZDtype("s", tz="UTC") or "datetime64[s, UTC]"
pd.ArrowDtype(pa.timestamp("s", tz="UTC")) or "timestamp[s, UTC][pyarrow]"
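As a rough illustration of the first two spellings listed above, the sketch below builds the same data with each dtype. It assumes pandas 2.0 or later (non-nanosecond units are not accepted before that); the variable names and sample timestamps are made up for this example, and the pyarrow-backed variant is left out so the snippet does not depend on an optional package.

```python
import pandas as pd

# Sample values chosen only for illustration.
data = [pd.Timestamp("2024-01-02T03:04:05"), pd.Timestamp("2024-01-02T03:04:06")]

# NumPy dtype without a time zone.
naive = pd.Series(data, dtype="datetime64[s]")

# pandas time-zone-aware dtype, given as an object and as its string alias.
aware_obj = pd.Series(data, dtype=pd.DatetimeTZDtype("s", tz="UTC"))
aware_str = pd.Series(data, dtype="datetime64[s, UTC]")

print(naive.dtype, aware_obj.dtype, aware_str.dtype)
```

The object form and the string alias should yield the same dtype, so either can be used in test parametrization.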
The default conversion behaviors for each case are:
Via NumPy dtypes. The conversions are done as expected.
Via pd.DatetimeTZDtype with a time zone. The pandas.Series object is converted to object dtype, so we need to handle the conversion manually. The expected NumPy dtype and time zone information can be accessed via series.dtype.base and series.dtype.tz.
In pandas 2.0, there was a bug (pandas-dev/pandas#52705) where pd.DatetimeTZDtype with any unit was stored with ns resolution. The bug was fixed in pandas 2.1 (pandas-dev/pandas#52706), but there is no workaround on our side, so the related tests are marked as xfail for pandas 2.0.
Via pa.timestamp. The pandas.Series object is converted to object dtype, so we need to handle it manually. The expected NumPy datetime dtype and time zone information can be accessed via series.dtype.numpy_dtype and series.dtype.pyarrow_dtype.tz.
In pandas 2.0, series.dtype.numpy_dtype is dtype('O'), and the series doesn't have the series.dt.tz_convert method.
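The pd.DatetimeTZDtype case can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: it assumes pandas 2.0 or later, the variable names and sample timestamps are hypothetical, and the pyarrow-backed case is not covered here.

```python
import numpy as np
import pandas as pd

# Sample values chosen only for illustration.
data = [pd.Timestamp("2024-01-02T03:04:05"), pd.Timestamp("2024-01-02T03:04:06")]
series = pd.Series(data, dtype=pd.DatetimeTZDtype("s", tz="UTC"))

# Default conversion: tz-aware data falls back to an object array of Timestamps.
default_array = series.to_numpy()

# Manual conversion: take the NumPy dtype from series.dtype.base.
# The time zone is dropped and the values are expressed in UTC.
numpy_dtype = series.dtype.base
utc_array = np.ascontiguousarray(series, dtype=numpy_dtype)

print(default_array.dtype, utc_array.dtype, series.dtype.tz)
```

On pandas 2.0 the resulting unit may be ns rather than s because of the bug mentioned above, which is why the sketch does not rely on a specific unit.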