Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

how to get QueryExecutionId from cursor when using sqlalchemy + pyathena #339

Open
mdeshmu opened this issue Jun 27, 2022 · 8 comments
Open

Comments

@mdeshmu
Copy link
Contributor

mdeshmu commented Jun 27, 2022

I am using sqlalchemy + pyathena

Here is an example code:

from urllib.parse import quote_plus
from sqlalchemy import create_engine, inspect
conn_str = "awsathena+rest://@athena.{region_name}.amazonaws.com:443/{schema_name}?s3_staging_dir={s3_staging_dir}"
engine = create_engine(conn_str.format(region_name="us-east-1", schema_name="default", s3_staging_dir=quote_plus("s3://aws-athena-query-results-bucket/")))
conn = engine.raw_connection()
cursor = conn.cursor()
cursor.execute("""SELECT * FROM "database"."table""")

for sqlalchemy + pyathena setup, I want to know if there is a way to get QueryExecutionId of a running query from its cursor so that it can be later used to cancel the query using cancel method.

@laughingman7743
Copy link
Owner

laughingman7743 commented Jun 28, 2022

The cursor.execute method blocks until the query execution is finished.
Cursor has a cancel method, so it can be canceled from another thread. The query ID can also be found in cursor.query_id.
https://github.com/laughingman7743/PyAthena/blob/master/tests/test_cursor.py#L474-L491
https://github.com/laughingman7743/PyAthena/blob/master/pyathena/cursor.py#L120-L124
AsyncCursor makes query execution asynchronous, so it is easy to get the query ID. However, it may be difficult to use in combination with SQLAlchemy.
https://github.com/laughingman7743/PyAthena#asynccursor
https://github.com/laughingman7743/PyAthena/blob/master/tests/test_async_cursor.py#L179-L195

@mdeshmu
Copy link
Contributor Author

mdeshmu commented Jun 28, 2022

Are query_id and cancel methods available with PyAthenaJDBC ?

@laughingman7743
Copy link
Owner

No, it is not available; the JDBC version is also not asynchronous in query execution.

@mdeshmu
Copy link
Contributor Author

mdeshmu commented Jun 29, 2022

Does that mean, with PyAthena + SQL Alchemy, when a cursor is executing a query, the same cursor cannot be used to call cursor.query_id?

@laughingman7743
Copy link
Owner

You can easily get the query ID from the cursor if you execute the query in a separate thread.

@jack-miller-xc
Copy link

Following up on this @laughingman7743.

Quick context. I have a FastAPI application that facilitates data from RDS and Athena backends using asynchronous calls. Currently to do this async for Athena I use PyAthenas Connector and AsyncCursor. I’d love to be able to just create another async SQL Alchemy session like I do for RDS queries but that doesn’t appear to be possible ATM.

Before spending time investigating further is an async dialect for SQL Alchemy completely impossible? If so is this from a design choice from PyAthena, a limitation within SQL Alchemy, a blocker from Athena, or some combination of the three?

@laughingman7743
Copy link
Owner

AsyncCursor is a cursor that I created with a self-defined interface, not the interface as summarized in PEP 249, which makes it difficult to use SQLAlchemy. I believe SQLAlchemy only supports the DB-API of the PEP249 specification.

@paavanb
Copy link

paavanb commented Apr 10, 2024

Hi! Running into a similar usecase, the SQLAlchemy interface has fantastic ergonomics but I'm having trouble parallelizing multiple requests to cut down on response time. Is it possible to grab the underlying query built up using SQLAlchemy and then use AsyncCursor to execute it instead? Or would that not even help since execution blocks?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants