regression: ts64->int64 #31634

Closed
lmeyerov opened this issue Feb 3, 2020 · 8 comments
Labels
Bug; Datetime (Datetime data dtype); Dtype Conversions (Unexpected or buggy dtype conversions); Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate); Regression (Functionality that used to work in a prior pandas version)

Comments

@lmeyerov

lmeyerov commented Feb 3, 2020

Code Sample, a copy-pastable example if possible

# Run in a shell first (or prefix with "!" in a notebook): pip install pandas==1.0.0
import datetime
import pandas as pd
pd.Series([datetime.datetime.now(), datetime.datetime.now(), None]).astype('int64')

=>

ValueError: Cannot convert NaT values to integer

Problem description

In 0.25.*, the above ran without an exception, while in 1.0.0 it raises. The new behavior may be OK, but it would be better to first issue a deprecation warning with a suggested migration.

(We had the same code for at least 1-2 years without issue. It is still unclear what the proper fix is for getting the int64 or int32 epoch out in a way that works both before and after 1.0.0.)

Expected Output

Same behavior as 0.25.*:

0    1580769049934654000
1    1580769049934659000
2   -9223372036854775808
dtype: int64
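
The third value in that output is not arbitrary: it is the smallest value an int64 can hold, which is also the bit pattern NumPy and pandas use for NaT. A minimal sketch of that relationship, assuming only that NumPy and pandas are installed (the variable names are illustrative):

```python
import numpy as np
import pandas as pd

# Smallest value an int64 can hold.
INT64_MIN = np.iinfo(np.int64).min

# Reinterpreting the 8 bytes of a NaT as an integer yields that same value.
nat_as_int = np.array(["NaT"], dtype="datetime64[ns]").view("int64")[0]

print(INT64_MIN)
print(nat_as_int == INT64_MIN)
print(pd.NaT.value == INT64_MIN)
```

Any code consuming these integers has to treat that value as "missing" rather than as a real timestamp.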

Output of pd.show_versions()

/usr/local/lib/python3.6/dist-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: http://initd.org/psycopg/docs/install.html#binary-install-from-pypi.
""")
/usr/local/lib/python3.6/dist-packages/pandas_datareader/compat/__init__.py:18: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
from pandas.util.testing import assert_frame_equal

INSTALLED VERSIONS

commit : None
python : 3.6.9.final.0
python-bits : 64
OS : Linux
OS-release : 4.14.137+
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.0.0
numpy : 1.17.5
pytz : 2018.9
dateutil : 2.6.1
pip : 19.3.1
setuptools : 45.1.0
Cython : 0.29.14
pytest : 3.6.4
hypothesis : None
sphinx : 1.8.5
blosc : None
feather : 0.4.0
xlsxwriter : None
lxml.etree : 4.2.6
html5lib : 1.0.1
pymysql : None
psycopg2 : 2.7.6.1 (dt dec pq3 ext lo64)
jinja2 : 2.11.1
IPython : 5.5.0
pandas_datareader: None
bs4 : 4.6.3
bottleneck : 1.3.1
fastparquet : None
gcsfs : None
lxml.etree : 4.2.6
matplotlib : 3.1.2
numexpr : 2.7.1
odfpy : None
openpyxl : 2.5.9
pandas_gbq : 0.11.0
pyarrow : 0.14.1
pytables : None
pytest : 3.6.4
pyxlsb : None
s3fs : 0.4.0
scipy : 1.4.1
sqlalchemy : 1.3.13
tables : 3.4.4
tabulate : 0.8.6
xarray : 0.14.1
xlrd : 1.1.0
xlwt : 1.3.0
xlsxwriter : None
numba : 0.47.0

@TomAugspurger
Contributor

In #28492 we decided to call that a bug and make the change without deprecation. cc @dsaxton

I believe an alternative is to use .view()

   ...: pd.Series([datetime.datetime.now(), datetime.datetime.now(), None]).view('datetime64[ns]')
Out[7]:
0   2020-02-03 16:46:05.797981
1   2020-02-03 16:46:05.797986
2                          NaT
dtype: datetime64[ns]

It's unclear to me how to move forward, but my initial preference is to leave the change as is and not add a deprecation.

@TomAugspurger
Contributor

FYI, we publish nightly wheels and release candidates for major versions. You might consider watching "Releases only" and testing out release candidates to catch this before things change.

@lmeyerov
Author

lmeyerov commented Feb 3, 2020

I still get an exception when doing .view() in 1.0.0: https://colab.research.google.com/drive/1BFk-KVLh77e6kVf1SW93Ccs67N0cZITo

@lmeyerov
Author

lmeyerov commented Feb 3, 2020

pd.Series([datetime.datetime.now(), datetime.datetime.now(), None]).apply(lambda t: t.value) seems safer, and we can do a case statement on pre/post 1.0.0 to do something not as slow? (Or maybe roundtrip through arrow?)
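
For reference, a small sketch of the element-wise approach from this comment, with fixed datetimes in place of `datetime.datetime.now()` so the result is reproducible. It relies on `Timestamp.value` (and `NaT.value`) being nanoseconds since the epoch; the variable names are illustrative.

```python
import datetime
import pandas as pd

s = pd.Series([datetime.datetime(2020, 2, 3), datetime.datetime(2020, 2, 4), None])

# Each element reaches the lambda as a pd.Timestamp (or pd.NaT for the None);
# .value is the integer number of nanoseconds since the Unix epoch.
epoch_ns = s.apply(lambda t: t.value)

print(epoch_ns.tolist())
```

This calls a Python function once per row, so it is likely to be much slower than a vectorized conversion on large Series.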

@TomAugspurger
Contributor

Sorry, my example was incorrect. I meant .view('int64')

   ...: pd.Series([datetime.datetime.now(), datetime.datetime.now(), None]).view('int64')
Out[2]:
0    1580748933365711000
1    1580748933365716000
2   -9223372036854775808
dtype: int64
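
A variation on the same idea, sketched under the assumption that the goal is one code path for pandas both before and after 1.0.0: do the reinterpretation on the underlying NumPy array, so the result does not depend on how `Series.astype` or `Series.view` behave in a particular release. The helper name `to_epoch_ns` is hypothetical, not part of pandas.

```python
import datetime
import numpy as np
import pandas as pd


def to_epoch_ns(s):
    """Hypothetical helper: datetime Series -> int64 nanoseconds since the epoch.

    Missing values come back as the int64 minimum, as in the 0.25 output.
    """
    # Normalise to nanosecond resolution, then reinterpret the raw 8-byte values.
    raw = s.astype("datetime64[ns]").to_numpy().view("int64")
    # Copy so the result does not share memory with the original Series.
    return pd.Series(raw.copy(), index=s.index)


s = pd.Series([datetime.datetime(2020, 2, 3), None])
out = to_epoch_ns(s)
print(out)
```

The missing entry still comes back as the int64 minimum, so callers need to check for it explicitly.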

@lmeyerov
Author

lmeyerov commented Feb 3, 2020

Ah awesome, thank you!

It's a bit of a nasty & deep change that is hard to catch without good unit tests, so I'd recommend deprecation warning, but I'm not too familiar with pandas policies here. (We actually have 2 libs that I bet were broken by it.)
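
One way to catch this kind of change early is a small regression test that pins the expected integers and escalates `FutureWarning` to an error. This is only a sketch: `epoch_ns` is a hypothetical stand-in for whatever conversion the application actually uses, and the test is written so it can run under pytest or be called directly.

```python
import datetime
import warnings

import numpy as np
import pandas as pd


def epoch_ns(s):
    # Hypothetical stand-in for the application's own conversion.
    return s.astype("datetime64[ns]").to_numpy().view("int64")


def test_epoch_conversion_is_stable():
    s = pd.Series([datetime.datetime(1970, 1, 1, 0, 0, 1), None])
    with warnings.catch_warnings():
        # Make deprecations fail the test instead of scrolling past in logs.
        warnings.simplefilter("error", FutureWarning)
        out = epoch_ns(s)
    assert out[0] == 1_000_000_000  # one second after the epoch, in ns
    assert out[1] == np.iinfo(np.int64).min  # NaT sentinel


test_epoch_conversion_is_stable()
```

Running a test like this against release candidates or nightly wheels would surface a behavior change before it reaches a stable release.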

lmeyerov added a commit to graphistry/pygraphistry that referenced this issue Feb 3, 2020
@dsaxton
Member

dsaxton commented Feb 3, 2020

@lmeyerov Apologies for the disruption, but hopefully @TomAugspurger 's suggestion works for you.

lmeyerov added a commit to graphistry/pygraphistry that referenced this issue Feb 3, 2020
simonjayhawkins added the Dtype Conversions, Missing-data, Regression, and Datetime labels Apr 25, 2020
mroeschke added the Bug label May 11, 2020
@jbrockmendel
Member

This now issues a FutureWarning instead of raising. Closing.

lmeyerov added a commit to graphistry/pygraphistry that referenced this issue Mar 13, 2022
pull bot pushed a commit to admariner/pygraphistry that referenced this issue Mar 13, 2022