pd.read_sql timestamptz converted to object dtype #30207
I have found the culprit, and there was a similar problem in PR #11216. So in the end the issue is not directly related to SQL, and I think the example below pinpoints the problem. I don't see a solution other than discarding the offsets.

import pandas as pd
import datetime
import psycopg2
from pandas.api.types import is_datetime64_any_dtype

# create datetimes with different offsets (60 and 120 minutes respectively)
data = [[datetime.datetime(2019, 11, 14, 16, 12,
                           tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=60))],
        [datetime.datetime(2019, 8, 7, 15, 37, 4,
                           tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=120))]]

# the different offsets cause the data to be read as object dtype
# instead of any datetime dtype
# pd.DataFrame.from_records is what the functions used by read_sql rely on
df = pd.DataFrame.from_records(data, columns=['ts'])
df.dtypes
ts    object
dtype: object

# this also outputs False instead of True
is_datetime64_any_dtype(df['ts'])
False

# when using pd.to_sql, pd.to_datetime will be executed, but
# it won't work since there are different offsets
pd.to_datetime(df['ts'])

Traceback:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~/GitHub/pandas_master/pandas/core/arrays/datetimes.py in objects_to_datetime64ns(data, dayfirst, yearfirst, utc, errors, require_iso8601, allow_object)
1968 try:
-> 1969 values, tz_parsed = conversion.datetime_to_datetime64(data)
1970 # If tzaware, these values represent unix timestamps, so we
~/GitHub/pandas_master/pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.datetime_to_datetime64()
ValueError: Array must be all same time zone
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-55-46efc3100ca6> in <module>
----> 1 pd.to_datetime(df['ts'])
~/GitHub/pandas_master/pandas/core/tools/datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, format, exact, unit, infer_datetime_format, origin, cache)
717 result = arg.map(cache_array)
718 else:
--> 719 values = convert_listlike(arg._values, format)
720 result = arg._constructor(values, index=arg.index, name=arg.name)
721 elif isinstance(arg, (ABCDataFrame, abc.MutableMapping)):
~/GitHub/pandas_master/pandas/core/tools/datetimes.py in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
431 errors=errors,
432 require_iso8601=require_iso8601,
--> 433 allow_object=True,
434 )
435
~/GitHub/pandas_master/pandas/core/arrays/datetimes.py in objects_to_datetime64ns(data, dayfirst, yearfirst, utc, errors, require_iso8601, allow_object)
1972 return values.view("i8"), tz_parsed
1973 except (ValueError, TypeError):
-> 1974 raise e
1975
1976 if tz_parsed is not None:
~/GitHub/pandas_master/pandas/core/arrays/datetimes.py in objects_to_datetime64ns(data, dayfirst, yearfirst, utc, errors, require_iso8601, allow_object)
1963 dayfirst=dayfirst,
1964 yearfirst=yearfirst,
-> 1965 require_iso8601=require_iso8601,
1966 )
1967 except ValueError as e:
~/GitHub/pandas_master/pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()
~/GitHub/pandas_master/pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()
ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True
Solution with caveat (offsets are dropped):
pd.to_datetime(df['ts'], utc=True)
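For reference, a minimal sketch of what the utc=True workaround yields for the df built above (the shown results assume standard pandas behaviour and are not output from the original report):

converted = pd.to_datetime(df['ts'], utc=True)

converted.dtype                       # datetime64[ns, UTC]
converted.iloc[0]                     # Timestamp('2019-11-14 15:12:00+0000', tz='UTC')
converted.iloc[1]                     # Timestamp('2019-08-07 13:37:04+0000', tz='UTC')
is_datetime64_any_dtype(converted)    # True

The caveat still applies: the original +01:00 and +02:00 offsets are no longer recoverable from the converted column.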
I tested this directly on master, so you'll find the output of pd.show_versions() here again.

Output of pd.show_versions():
INSTALLED VERSIONS
------------------
commit : None
python : 3.7.3.final.0
python-bits : 64
OS : Linux
OS-release : 5.3.0-20-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 0.26.0.dev0+1382.g3577b5a34.dirty
It's pandas' policy to convert timezone-aware data like this to UTC; see https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#datetime-data-types
Hi @mroeschke, thanks for your answer :). I was able to patch the to_sql function.
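For anyone who cannot patch pandas, a possible workaround sketch on the read side (the table name, column name, and connection string are placeholders): read_sql's parse_dates argument accepts per-column keyword arguments that are forwarded to pd.to_datetime, so utc=True can be forced there.

import pandas as pd
from sqlalchemy import create_engine

# placeholder connection string and table/column names
engine = create_engine("postgresql+psycopg2://user:password@localhost:5432/mydb")

# parse_dates takes {column_name: kwargs for pd.to_datetime}; utc=True
# converts the mixed-offset timestamptz values instead of leaving an object column
df = pd.read_sql("my_table", engine, parse_dates={"ts": {"utc": True}})
df.dtypes  # ts    datetime64[ns, UTC]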
take |
Code Sample, a copy-pastable example if possible
Output
Problem description
There are 2 problems here:
1. pandas reads 2 columns of the test DataFrame I save in postgres as "object" and not "datetime64[ns, UTC]", although in postgres the data types are all "timestamp with time zone".
2. when I attempt to append the postgres table to itself via pandas (pd.read_sql then pd.DataFrame.to_sql), it fails when trying to convert the 2 "object" columns to datetime. So perhaps there is an issue with pd.to_datetime, or I am missing something here.
I would open another issue for the second problem, but I cannot reproduce it other than with this workflow; a sketch of the round trip is given below.
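A minimal sketch of the round trip described above (the connection string, table name, and DST-spanning timestamps are placeholder assumptions, not the original test data):

import pandas as pd
from sqlalchemy import create_engine

# placeholder connection string and table name
engine = create_engine("postgresql+psycopg2://user:password@localhost:5432/mydb")

# tz-aware timestamps that fall on different UTC offsets (CET vs CEST)
df = pd.DataFrame({
    "ts": pd.to_datetime(["2019-11-14 16:12:00",
                          "2019-08-07 15:37:04"]).tz_localize("Europe/Berlin")
})
df.to_sql("tz_test", engine, if_exists="replace", index=False)  # stored as timestamp with time zone

roundtrip = pd.read_sql("tz_test", engine)
roundtrip.dtypes  # ts comes back as object instead of datetime64[ns, UTC]

# appending the table to itself then fails, because to_sql runs
# pd.to_datetime on the object column and the rows carry different fixed offsets
roundtrip.to_sql("tz_test", engine, if_exists="append", index=False)

Because the two rows carry different fixed offsets, DataFrame.from_records keeps them as plain Python objects rather than building a datetime64 column, which is what produces the object dtype on the read side.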
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.3.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 85 Stepping 4, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None
pandas : 0.25.3
numpy : 1.16.4
pytz : 2019.1
dateutil : 2.8.0
pip : 19.1.1
setuptools : 41.0.1
Cython : 0.29.12
pytest : 5.0.1
hypothesis : None
sphinx : 2.1.2
blosc : None
feather : None
xlsxwriter : 1.1.8
lxml.etree : 4.3.4
html5lib : 1.0.1
pymysql : None
psycopg2 : 2.8.3 (dt dec pq3 ext lo64)
jinja2 : 2.10.1
IPython : 7.6.1
pandas_datareader: None
bs4 : 4.7.1
bottleneck : 1.2.1
fastparquet : None
gcsfs : None
lxml.etree : 4.3.4
matplotlib : 3.1.0
numexpr : 2.6.9
odfpy : None
openpyxl : 2.6.2
pandas_gbq : None
pyarrow : 0.14.0
pytables : None
s3fs : None
scipy : 1.2.1
sqlalchemy : 1.3.5
tables : 3.5.2
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.1.8