BUG: Raise when casting NaT to int #28492

Merged
merged 18 commits into from
Jan 1, 2020
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.0.0.rst
@@ -136,6 +136,7 @@ Datetimelike
- Bug in :class:`Timestamp` subtraction when subtracting a :class:`Timestamp` from a ``np.datetime64`` object incorrectly raising ``TypeError`` (:issue:`28286`)
- Addition and subtraction of integer or integer-dtype arrays with :class:`Timestamp` will now raise ``NullFrequencyError`` instead of ``ValueError`` (:issue:`28268`)
- Bug in :class:`Series` and :class:`DataFrame` with integer dtype failing to raise ``TypeError`` when adding or subtracting a ``np.datetime64`` object (:issue:`28080`)
- Bug in :meth:`Series.astype` failing to handle ``pd.NaT`` when casting to an integer dtype (:issue:`28492`)
-


4 changes: 4 additions & 0 deletions pandas/core/dtypes/cast.py
@@ -8,6 +8,8 @@
from pandas._libs.tslibs import NaT, OutOfBoundsDatetime, Period, iNaT
from pandas.util._validators import validate_bool_kwarg

import pandas as pd

from .common import (
    _INT64_DTYPE,
    _NS_DTYPE,
@@ -696,6 +698,8 @@ def astype_nansafe(arr, dtype, copy=True, skipna=False):
        if is_object_dtype(dtype):
            return tslib.ints_to_pydatetime(arr.view(np.int64))
        elif dtype == np.int64:
Contributor
This likely needs to be is_integer_dtype(dtype).

Member Author

@dsaxton dsaxton Sep 19, 2019
I think this might actually need to stay as np.int64, since above we check for datetime64 (I'm getting some weird test failures after this change). According to the NumPy documentation, the output of view is "unpredictable" when the precision differs between the two data types (the shape of the output can even change).
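
To illustrate the point about view, here is a minimal NumPy-only sketch (not part of the PR; the array contents and variable names are made up for the example). It shows that viewing a datetime64[ns] array as an 8-byte integer keeps the shape, while viewing it as a 4-byte integer reinterprets each element as two values and doubles the last axis:

```python
import numpy as np

# Two datetime64[ns] values; the second one is NaT.
arr = np.array(["2019-09-19", "NaT"], dtype="datetime64[ns]")

# Same item size (8 bytes): one integer per element, shape is preserved.
as_int64 = arr.view(np.int64)

# Smaller item size (4 bytes): every element is split into two integers,
# so the last axis doubles in length.
as_int32 = arr.view(np.int32)

print(as_int64.shape, as_int32.shape)

# NaT is stored as the smallest int64, which is why a plain view
# silently turns it into a large negative number.
print(as_int64[1] == np.iinfo(np.int64).min)
```

This is why a check written with is_integer_dtype would let narrower integer dtypes reach arr.view(dtype) and produce reshaped output rather than a conversion.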

            if pd.isnull(arr).any():
                raise ValueError("Cannot convert NaT values to integer")
            return arr.view(dtype)

# allow frequency conversions
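
The guard added in this hunk can be sketched outside pandas as follows. This is an illustrative stand-in, not the pandas implementation: the helper name datetime64_to_int64 is hypothetical, and it uses np.isnat in place of pd.isnull so that the example depends only on NumPy.

```python
import numpy as np


def datetime64_to_int64(arr: np.ndarray) -> np.ndarray:
    """Hypothetical helper mirroring the guard in this PR.

    Refuses to reinterpret a datetime64 array as int64 when it contains NaT,
    instead of silently returning the NaT sentinel value.
    """
    if arr.dtype.kind != "M":
        raise TypeError("expected a datetime64 array")
    if np.isnat(arr).any():
        raise ValueError("Cannot convert NaT values to integer")
    # Both dtypes are 8 bytes wide, so the view keeps the shape.
    return arr.view(np.int64)


valid = np.array(["1970-01-01T00:00:00.000000001"], dtype="datetime64[ns]")
print(datetime64_to_int64(valid))  # one nanosecond after the epoch

try:
    datetime64_to_int64(np.array(["NaT"], dtype="datetime64[ns]"))
except ValueError as err:
    print(err)
```

Raising is the conservative choice here: the integer result has no slot for a missing value, so any returned number would be indistinguishable from a real timestamp.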
9 changes: 9 additions & 0 deletions pandas/tests/dtypes/test_common.py
@@ -3,6 +3,7 @@

import pandas.util._test_decorators as td

from pandas.core.dtypes.cast import astype_nansafe
import pandas.core.dtypes.common as com
from pandas.core.dtypes.dtypes import (
    CategoricalDtype,
@@ -731,3 +732,11 @@ def test__is_dtype_type_sparse():
    result = np.dtype("int32")
    assert com._is_dtype_type(ser, lambda tipo: tipo == result)
    assert com._is_dtype_type(ser.dtype, lambda tipo: tipo == result)


def test__nansafe_nat_to_int():
Contributor
Rename to test_astype_nansafe. Can you parametrize this on all ints at the very least? Ideally we would have more testing for this (but that can be a follow-up).

Member Author
For the other (non-int64) dtypes we end up hitting the TypeError here (related to the above comment):

raise TypeError(

Would you recommend breaking these out into a separate test?

    arr = np.array([np.datetime64("NaT")])

    msg = "Cannot convert NaT values to integer"
    with pytest.raises(ValueError, match=msg):
        astype_nansafe(arr, dtype=np.int64)