BUG: Raise when casting NaT to int #28492
@@ -696,6 +698,8 @@ def astype_nansafe(arr, dtype, copy=True, skipna=False):
        if is_object_dtype(dtype):
            return tslib.ints_to_pydatetime(arr.view(np.int64))
        elif dtype == np.int64:
this likely needs to be is_integer_dtype(dtype)
I think this might actually need to be np.int64 since above we check for datetime64 (getting some weird test failures after this change). Looking at the numpy documentation, it seems the output of view is "unpredictable" when the precision differs between the data types (the shape of the output can even change).
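To illustrate the point about view, here is a minimal sketch using only NumPy (the variable names are made up for illustration). Reinterpreting 8-byte datetime64 values as a 4-byte integer type does not convert them; it splits each element in two, so the shape changes:

```python
import numpy as np

# Two datetime64[ns] values, 8 bytes each.
arr = np.array(["2000-01-01", "NaT"], dtype="M8[ns]")

# Same item size: one integer per datetime, shape is preserved.
same_width = arr.view(np.int64)

# Smaller item size: each 8-byte value is reinterpreted as two
# 4-byte integers, so the last axis doubles in length.
narrower = arr.view(np.int32)

print(same_width.shape, narrower.shape)
```

This is why a blanket is_integer_dtype(dtype) check combined with arr.view(dtype) would be unsafe for anything other than a 64-bit integer target.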
pandas/tests/dtypes/test_common.py
Outdated
@@ -731,3 +732,11 @@ def test__is_dtype_type_sparse():
    result = np.dtype("int32")
    assert com._is_dtype_type(ser, lambda tipo: tipo == result)
    assert com._is_dtype_type(ser.dtype, lambda tipo: tipo == result)


def test__nansafe_nat_to_int():
rename to test_astype_nansafe; can you parameterize this on all ints at the very least. ideally we would have more testing for this (but can be a followup)
For the other (non-int64) dtypes we end up hitting the TypeError here (related to the above comment):
pandas/pandas/core/dtypes/cast.py
Line 707 in b81f433
raise TypeError(
Would you recommend breaking these out into a separate test?
is there a user visible level that hits this?
This hits Series.astype, Index.astype, and DataFrame.astype:
In [6]: a = [pd.Timestamp('2000'), pd.NaT]
In [7]: pd.Series(a).astype(int)
Out[7]:
0     946684800000000000
1   -9223372036854775808
dtype: int64
In [8]: pd.Index(a).astype(int)
Out[8]: Int64Index([946684800000000000, -9223372036854775808], dtype='int64')
In [10]: pd.DataFrame({"a": a}).astype(int)
Out[10]:
                     a
0   946684800000000000
1 -9223372036854775808
In addition to the
And we should double check: we want this behavior right? Some people may be relying on this behavior (treating
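For context on the second value in the outputs above: it is the minimum int64, which NumPy uses internally as the NaT sentinel. A small NumPy-only sketch (variable names are illustrative) shows where the number comes from:

```python
import numpy as np

# Reinterpret a NaT datetime64 as its underlying 64-bit integer.
nat_as_int = np.array(["NaT"], dtype="M8[ns]").view(np.int64)[0]

# The smallest representable int64 value.
sentinel = np.iinfo(np.int64).min

print(int(nat_as_int), int(sentinel))
```

So the old astype(int) result was not a meaningful timestamp; it was the raw sentinel leaking through the cast.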
@@ -706,3 +707,13 @@ def test__get_dtype_fails(input_param, expected_error_message):
)
def test__is_dtype_type(input_param, result):
    assert com._is_dtype_type(input_param, lambda tipo: tipo == result)


@pytest.mark.parametrize("val", [np.datetime64("NaT"), np.timedelta64("NaT")])
can you add an additional test for when dtype == object (which is the other path that hits M8 and m8)
This would be a test separate from test_astype_nansafe to check that NaT is still null after casting to object?
yep
Did we decide on whether we're going to deprecate this behavior before changing it? IMO, we should
I'd defer to you guys on that one. If we deprecate, the change would be to use a deprecation warning instead of an error and xfail the tests?
@TomAugspurger do you feel strongly about this? I don't think it's a big deal, but we could deprecate.
Not *too* strongly, just a slight preference. I could go either way.
Given how unusual the current behavior is, I'd be surprised if anyone was relying on it. I'd say it's more likely to be a source of unexpected bugs.
Is there a legitimate use case for using Does seem kind of buggy so I think OK without deprecation too
@dsaxton can you merge master to be sure? I think we can move forward with this as is, unless @TomAugspurger you feel otherwise
can you merge master
@jreback all green - OK to merge?
can you merge master and we'll look again
lgtm. @jbrockmendel if you'd have a glance (and merge if ok)
thanks @dsaxton
* BUG: Raise when casting NaT to int
* Add release note
* Add PR number
* Use isna
* Parametrize test
* Check all integer dtypes
* Fix merge
* Edit release note
* Check for np.int64
* Handle timedelta64
* Add astype_nansafe datetime tests
* Add test for NaT object casting
black pandas
Fixes a bug in astype_nansafe where NaT was ignored when casting a datetime or timedelta to int. I put the test in pandas/tests/dtypes/test_common.py since I couldn't find another place where astype_nansafe was tested. Also adds various other tests for astype_nansafe.
Related: #28438
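The behavior described above can be sketched roughly as follows. This is not the pandas implementation: cast_datetimelike_to_int64 is a hypothetical helper written with NumPy only, and the error messages are illustrative. It only mirrors the idea of refusing to cast when NaT is present and restricting the target to int64:

```python
import numpy as np


def cast_datetimelike_to_int64(arr, dtype=np.int64):
    """Hypothetical helper: cast datetime64/timedelta64 values to int64,
    raising instead of silently leaking the NaT sentinel."""
    if arr.dtype.kind not in "mM":
        raise TypeError("expected a datetime64 or timedelta64 array")
    if np.dtype(dtype) != np.dtype(np.int64):
        # A view to a narrower integer type would change the shape.
        raise TypeError("only int64 is supported as a target dtype")
    if np.isnat(arr).any():
        raise ValueError("Cannot convert NaT values to integer")
    return arr.view(np.int64)


# Casting succeeds when no NaT is present.
ok = cast_datetimelike_to_int64(np.array(["2000-01-01"], dtype="M8[ns]"))
print(ok.tolist())

# Casting raises when NaT is present, for both datetime64 and timedelta64.
for bad in (
    np.array([np.datetime64("NaT", "ns")]),
    np.array([np.timedelta64("NaT", "ns")]),
):
    try:
        cast_datetimelike_to_int64(bad)
    except ValueError as err:
        print("raised:", err)
```

Raising here trades backward compatibility for safety, which is the deprecation question discussed in the thread.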