Separate NaT values for Timedelta ("NaTD") and Period? #24983

shoyer · 2019-01-28T17:32:43Z

Separate scalar missing values for Timedelta, Timestamp and Period would go a long way towards achieving predictable types in pandas. As noted in #19124, some operations cannot be made consistent with the current state of affairs. Most recently this came up in #24957.

This is listed in the pandas2 tracker (wesm/pandas2#74), but I think it might even be achievable for pandas 1.x? There would only be backwards compatibility issues if people are explicitly checking object identity against the pd.NaT scalar, which is a bit of an anti-pattern.
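For illustration, here is a minimal sketch of the identity check in question next to a check that does not depend on which missing-value object is returned. The helper name `is_missing` is hypothetical and not part of pandas; the snippet assumes a pandas version in which the Timedelta and Timestamp constructors both return the shared pd.NaT singleton.

```python
import pandas as pd

td_missing = pd.Timedelta("NaT")
ts_missing = pd.Timestamp("NaT")

# Today both constructors hand back the very same pd.NaT object,
# so the "missing" scalar carries no information about its origin.
shares_singleton = (td_missing is pd.NaT) and (ts_missing is pd.NaT)


def is_missing(value) -> bool:
    """Hypothetical helper: test for missingness without relying on identity.

    `value is pd.NaT` would stop matching if a separate timedelta NaT
    were introduced; pd.isna() is the documented way to ask the question.
    """
    return bool(pd.isna(value))


print(shares_singleton)
print(is_missing(td_missing), is_missing(pd.Timedelta("1s")))
```

Code written along the lines of `is_missing` should keep working regardless of how many distinct missing scalars exist.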


jbrockmendel · 2019-01-28T17:57:37Z

xref #24645

burnpanck · 2022-08-16T11:02:04Z

We ran into this in our production code, where we do some timedelta gymnastics. An innocent-looking piece of code iterates over the rows of a pandas DataFrame and, among other things, applies np.maximum to a timedelta from that row and a constant lower bound. This fails with a UFuncTypeError only when that row happens to contain a NaT, even though the code would otherwise work fine with NaTs. We would therefore consider this an issue rather than an enhancement, as it breaks invariants that one would expect from the type system.
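The production code itself is not shown, so the following is only a sketch of how such a failure can arise at the NumPy level. It assumes the missing value reaches np.maximum as a datetime-flavoured NaT rather than a timedelta-flavoured one; the names `lower_bound` and `clamp_to_lower_bound` are hypothetical.

```python
import numpy as np

lower_bound = np.timedelta64(10, "s")


def clamp_to_lower_bound(value):
    """Return np.maximum(value, lower_bound), or the name of the error raised.

    UFuncTypeError is a subclass of TypeError, so catching TypeError
    covers the failure described in the comment above.
    """
    try:
        return np.maximum(value, lower_bound)
    except TypeError as exc:
        return type(exc).__name__


# An ordinary timedelta works as expected.
regular = clamp_to_lower_bound(np.timedelta64(5, "s"))

# A NaT that is known to be a timedelta also works (the result is NaT).
typed_nat = clamp_to_lower_bound(np.timedelta64("NaT"))

# A NaT that is interpreted as a datetime cannot be combined with a timedelta.
datetime_nat = clamp_to_lower_bound(np.datetime64("NaT"))

print(regular, typed_nat, datetime_nat)
```

A missing value that carries its own timedelta type behaves like `typed_nat` here, which is the invariant the comment is asking for.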

While searching for this issue, I also came across #46171. If I understand correctly, that PR made the type annotations less accurate for the convenience of users. I believe that this inconvenience was actually an alarm signal from the type checkers pointing to the underlying issue, and the PR simply swept that signal under the rug. Without that PR, the type checker might have prevented us from running into this in production.
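To make the typing concern concrete, here is a small sketch of a runtime guard that narrows a value to a real Timedelta. The function `require_timedelta` is hypothetical and not a pandas API; it relies only on the fact that pd.NaT is not an instance of pd.Timedelta.

```python
import pandas as pd


def require_timedelta(value) -> pd.Timedelta:
    """Hypothetical guard: reject NaT where a real Timedelta is required.

    pd.Timedelta("NaT") returns pd.NaT, which is not a Timedelta instance,
    so an isinstance check separates the two cases at runtime.
    """
    if not isinstance(value, pd.Timedelta):
        raise TypeError(f"expected Timedelta, got {type(value).__name__}")
    return value


maybe_missing = pd.Timedelta("NaT")
print(isinstance(maybe_missing, pd.Timedelta))  # the constructor did not yield a Timedelta

try:
    require_timedelta(maybe_missing)
except TypeError as exc:
    print(exc)

print(require_timedelta(pd.Timedelta("1s")))
```

An annotation that admits NaTType as a possible constructor result would push callers towards a guard of this kind at type-check time rather than at runtime.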

gfyoung added the Missing-data and Timedelta labels Feb 7, 2019

jbrockmendel mentioned this issue Feb 7, 2019

Series[Period] +/- NaT returns Series[datetime64] #19389

Closed

frexvahi mentioned this issue Apr 10, 2019

Comparing Timedelta and NaT gives inconsistent results depending on order #26039

Closed

jbrockmendel mentioned this issue Jul 10, 2019

BUG: Fix insertion of wrong-dtypes NaT into Series[m8ns] #27323

Merged

jorisvandenbossche mentioned this issue Oct 2, 2019

ROADMAP: Consistent missing value handling with new NA scalar #28095

Open

jbrockmendel added the Needs Discussion label Oct 16, 2019

mroeschke added the Enhancement label May 3, 2020

mroeschke added the Period label Jun 26, 2021

jbrockmendel mentioned this issue Feb 28, 2022

TYP: remove NaTType as possible result of Timestamp and Timedelta constructor #46171

Merged

burnpanck mentioned this issue Aug 16, 2022

BUG: groupby(Grouper) with all-NaT grouping keys #43486

Open

3 tasks

randolf-scholz mentioned this issue Apr 29, 2024

ENH: Introduce type-safe constructors for Timestamp and Timedelta. #58475

Open

3 tasks
