Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: SQLAlchemy 2.0.0 creates incompatibility with Pandas read_sql (fails to execute) #51015

Closed
3 tasks done
ephe-meral opened this issue Jan 27, 2023 · 5 comments
Closed
3 tasks done
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member

Comments

@ephe-meral
Copy link

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
from sqlalchemy import create_engine # SQLAlchemy v. 2.0!

sql_engine = create_engine('postgresql://user:pass@host:5432/database')

df = pd.read_sql(f'''
    SELECT * FROM table LIMIT 100''',
    sql_engine)

Issue Description

With the newest release of SQLAlchemy (2.0, released on 26. Jan 23), the read_sql function fails with the following stacktrace:

Traceback (most recent call last):
   File "/app/test.py", line 31, in <module>
     df = pd.read_sql(f'''
  File "/usr/local/lib/python3.10/site-packages/pandas/io/sql.py", line 590, in read_sql
     return pandas_sql.read_query(
   File "/usr/local/lib/python3.10/site-packages/pandas/io/sql.py", line 1560, in read_query
     result = self.execute(*args)
   File "/usr/local/lib/python3.10/site-packages/pandas/io/sql.py", line 1405, in execute
     return self.connectable.execution_options().execute(*args, **kwargs)
 AttributeError: 'OptionEngine' object has no attribute 'execute'

Expected Behavior

Should return the table as it did moments ago before I ran the package update.
I.e. this works with sqlalchemy vs. 1.4.46 - installing that version is a working workaround!

Installed Versions

INSTALLED VERSIONS
 ------------------
 commit           : 2e218d10984e9919f0296931d92ea851c6a6faf5
 python           : 3.10.9.final.0
 python-bits      : 64
 OS               : Linux
 OS-release       : 5.10.76-linuxkit
 Version          : #1 SMP PREEMPT Mon Nov 8 11:22:26 UTC 2021
 machine          : aarch64
 processor        :
 byteorder        : little
 LC_ALL           : None
 LANG             : C.UTF-8
 LOCALE           : en_US.UTF-8

 pandas           : 1.5.3
 numpy            : 1.24.1
 pytz             : 2022.7.1
 dateutil         : 2.8.2
 setuptools       : 65.5.1
 pip              : 22.3.1
 Cython           : None
 pytest           : None
 hypothesis       : None
 sphinx           : None
 blosc            : None
 feather          : None
 xlsxwriter       : None
 lxml.etree       : None
 html5lib         : None
 pymysql          : None
 psycopg2         : 2.9.5
 jinja2           : 3.1.2
 IPython          : None
 pandas_datareader: None
 bs4              : None
 bottleneck       : None
 brotli           : None
 fastparquet      : None
 fsspec           : None
 gcsfs            : None
 matplotlib       : None
 numba            : None
 numexpr          : None
 odfpy            : None
 openpyxl         : None
 pandas_gbq       : None
 pyarrow          : None
 pyreadstat       : None
 pyxlsb           : None
 s3fs             : None
 scipy            : 1.10.0
 snappy           : None
 sqlalchemy       : 2.0.0
 tables           : None
 tabulate         : None
 xarray           : None
 xlrd             : None
 xlwt             : None
 zstandard        : None
 tzdata           : None
@ephe-meral ephe-meral added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 27, 2023
@aberres
Copy link
Contributor

aberres commented Jan 27, 2023

See #40686

@phofl
Copy link
Member

phofl commented Jan 27, 2023

Yep agree, can you post your example there? Closing here

@phofl phofl closed this as completed Jan 27, 2023
@ryan-williams
Copy link

ryan-williams commented Jan 28, 2023

Seems like pinning sqlalchemy<2 is the workaround for now.


I ran into this starting Saturday as an unpinned sqlalchemy requirement started picking up 2.0.0, which is broken with current latest Pandas (1.5.3).

Here is a Docker-based repro:

# test.py
import pandas as pd
pd.DataFrame([{"n": 11}]).to_sql("test", "sqlite:///test.db")
print(pd.read_sql_table("test", "sqlite:///test.db"))
# Dockerfile
FROM python:3.9-slim
ARG SQLALCHEMY
RUN pip install pandas==1.5.3 sqlalchemy==$SQLALCHEMY
COPY test.py test.py
ENTRYPOINT [ "python", "test.py" ]
# bash
docker build -t pd-sqla2-test --build-arg SQLALCHEMY=2.0.0 .
docker run --rm pd-sqla2-test  # ❌ fails with sqlalchemy 2.0.0
# Traceback (most recent call last):
#   File "//test.py", line 3, in <module>
#     print(pd.read_sql_table('test', 'sqlite:///test.db'))
#   File "/usr/local/lib/python3.9/site-packages/pandas/io/sql.py", line 286, in read_sql_table
#     table = pandas_sql.read_table(  # type: ignore[union-attr]
#   File "/usr/local/lib/python3.9/site-packages/pandas/io/sql.py", line 1460, in read_table
#     return table.read(
#   File "/usr/local/lib/python3.9/site-packages/pandas/io/sql.py", line 1003, in read
#     result = self.pd_sql.execute(sql_select)
#   File "/usr/local/lib/python3.9/site-packages/pandas/io/sql.py", line 1405, in execute
#     return self.connectable.execution_options().execute(*args, **kwargs)
# AttributeError: 'OptionEngine' object has no attribute 'execute'

docker build -t pd-sqla1-test --build-arg SQLALCHEMY=1.4.46 .
docker run --rm pd-sqla1-test  # ✅ works with sqlalchemy 1.4.46
#    index   n
# 0      0  11
# 1      1  22

For some reason this manifested for me as a hang in a papermill invocation in a GitHub Action, resulting in the action just timing out after 6 hours! (Update: the hang seems to be an issue with jupyter-client>=8: nteract/papermill#711)

#40686 lists this as an "enhancement" dating back a couple years, but the "BUG" and description here feels more apt at this point, as pip install pandas sqlalchemy is broken as of 2d ago.

ryan-williams added a commit to hudcostreets/nj-crashes that referenced this issue Jan 28, 2023
@phofl phofl mentioned this issue Jan 31, 2023
3 tasks
@RMagician RMagician mentioned this issue Feb 1, 2023
3 tasks
@mikekeith52
Copy link

The problem with using sqlalchemy <2.0 as the workaround is that those of us using oracle cannot use the new oracledb library in conjunction with sqlachemy.create_engine() for versions before 2.0. The old way of doing it, using the cx_oracle library, is not compatible with newer operating systems. So we are stuck in dependency hell. Any way pandas can work on a solution to restore functionality to the engine object for post 2.0?

@mikekeith52 mikekeith52 mentioned this issue Aug 24, 2023
3 tasks
@squarewave24
Copy link

using .connection() also works

pd.read_sql(query, db_session.connection())

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member
Projects
None yet
Development

No branches or pull requests

6 participants