API: ser[dt64].astype("string") vs ser[dt64]._values.astype("string") #36153
Comments
This already happens in general for extension dtypes: pandas/pandas/core/internals/blocks.py Lines 583 to 585 in aca77f7
But DatetimeBlock (and probably TimedeltaBlock as well?) is the special case of not being an extension block, so it takes a different path. Now, I suppose that the different path should in theory also handle it correctly:
so I'm not really sure what is going wrong there.
@jorisvandenbossche I could use your help here (medium priority), as this is becoming a blocker on the ArrayManager work.
In [12] I think we want the first entry to be pd.NA, not "NaT". On the flip side, we probably want the latter two entries of [13] to match the latter two entries of [12]. i.e. dta.astype("string")[~dta.isna()] == dta.astype(str)[~dta.isna()] seems desirable. This opens a new can of worms w/r/t consistency:
Yes, agreed. First question: why is it different? pandas/pandas/core/internals/blocks.py Lines 2211 to 2216 in 8dbb593
The DatetimeArray.astype for tz-naive data (case In[13]), on the other hand, uses pandas/pandas/core/arrays/datetimelike.py Lines 346 to 351 in 8dbb593
I suppose we can fix [...] In the end, this also relates a lot to #22384 for cleaning this up more thoroughly (but I think we can certainly already fix the inconsistency above). I will try to carve out some time to look back at this issue.
@jbrockmendel this seems to be working correctly now (using your initial example):
(not sure if this was done intentionally and is tested, or if we should still add tests for it)
best guess is it still merits a test
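A rough sketch of what such a check might look like, not the test that was actually added to pandas. It compares the Series-level cast with the array-level cast and with `astype(str)` on the non-missing entries, as suggested earlier in the thread. The helper name `compare_string_casts` and the sample data are made up for illustration, and `ser.array` is used as the public counterpart of `ser._values` on the assumption that both give a DatetimeArray for datetime64 data.

```python
import pandas as pd


def compare_string_casts(ser: pd.Series) -> dict:
    """Compare Series.astype("string") with the array-level casts.

    Hypothetical helper for illustration only.
    """
    via_series = ser.astype("string")
    # Public accessor; assumed to be a DatetimeArray for datetime64 data.
    via_array = ser.array.astype("string")
    via_str = ser.array.astype(str)

    missing = ser.isna().to_numpy()
    present = ~missing

    return {
        # Missing entries should be flagged as missing on both cast paths.
        "missing_match": bool(
            (via_series.isna().to_numpy() == missing).all()
            and (pd.isna(via_array) == missing).all()
        ),
        # Non-missing entries should format identically on all three paths.
        "values_match": (
            [str(x) for x in via_series[present]]
            == [str(x) for x in via_array[present]]
            == [str(x) for x in via_str[present]]
        ),
    }


# Made-up sample data: one missing value followed by two dates.
ser = pd.Series(pd.to_datetime([None, "2020-01-01", "2020-01-02"]))
print(compare_string_casts(ser))
```

On a pandas version where the inconsistency is fixed, both entries would be expected to be `True`; on the version the issue was filed against, at least one of them would presumably be `False`.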
It would be nice to have the Series code (i.e. the Block.astype) dispatch to the array code.
Note also: