feat: Add cancel and status queries to server-side async execution #192

Merged
merged 93 commits into from
Aug 24, 2022
Commits
cfd856b
Added async_execution to async_db/cursor and db/cursor. Added an erro…
ericf-firebolt Jul 8, 2022
321b021
Removed ignore C901 from flake8 settings in setup.cfg.
ericf-firebolt Jul 8, 2022
370e5cf
Merge branch 'main' into async_queries
ericf-firebolt Jul 8, 2022
6ad3be4
Fixed a couple of missing arguments in async_db/cursor.py on execute …
ericf-firebolt Jul 8, 2022
1bea472
Merge branch 'async_queries' of https://github.com/firebolt-db/firebo…
ericf-firebolt Jul 8, 2022
bf7f1f1
mypy and black cleanup.
ericf-firebolt Jul 8, 2022
4196204
Removed set_parameters argument from all(?) functions. Started adding…
ericf-firebolt Jul 15, 2022
7ff71a7
Added a bunch of callbacks to cursor tests.
ericf-firebolt Jul 18, 2022
0bace5b
Merge branch 'main' into async_queries
ericf-firebolt Jul 19, 2022
c69f585
Pulled out a couple more set_parameters variables from function signa…
ericf-firebolt Jul 20, 2022
15df8cc
Added, and commented out, server_side_async_url to unit/conftests.py.
ericf-firebolt Jul 20, 2022
bec0804
Removed test_set_parameters() from tests/async/cursor.py.
ericf-firebolt Jul 20, 2022
37f2373
Removed test_set_parameters() from tests/async/cursor.py.
ericf-firebolt Jul 20, 2022
cb04815
Added ability to see which exectute is failing (execute or executeman…
ericf-firebolt Jul 20, 2022
fc7d116
Added more explicit error messages to async and sync test_cursor.py m…
ericf-firebolt Jul 20, 2022
ec0fd55
Added some periods.
ericf-firebolt Jul 20, 2022
f7b9d09
Replaced cursor reset call in async _do_execute().
ericf-firebolt Jul 20, 2022
d2ae886
Updated query/message tuple decomposition in test_cursor to be more h…
ericf-firebolt Jul 21, 2022
f77ba35
Fixed a typo and function signature for db/test_cursor_server_side_as…
ericf-firebolt Jul 21, 2022
8cae1d2
Removed second hard-coding of query_id in server-side async id callback.
ericf-firebolt Jul 21, 2022
95f5e4d
Needed to add an await to an _api_request() call.
ericf-firebolt Jul 21, 2022
9d1f1ff
Used InternalError to error out on no response to async server-side q…
ericf-firebolt Jul 21, 2022
442c420
Added additional checks on rowcount and description in test_cursor_se…
ericf-firebolt Jul 21, 2022
e76a545
Added QueryResponse class.
ericf-firebolt Jul 22, 2022
224f7c4
Minor changes requested on PR.
ericf-firebolt Jul 22, 2022
9d20bba
Added an OperationalError is asynchronous query response is missing q…
ericf-firebolt Jul 22, 2022
f403673
Had a typo.
ericf-firebolt Jul 22, 2022
16ed40d
Added a warning if asyc_execution is set via a SET parameter rather t…
ericf-firebolt Jul 22, 2022
b58f881
Started adding test_cursor_async_execute_error().
ericf-firebolt Jul 25, 2022
8753a99
Updated query_url argument in test_cursor_async_execute_error().
ericf-firebolt Jul 25, 2022
3d1c5b2
Added AsyncExecutionUnavailableError on server-side async query execu…
ericf-firebolt Jul 26, 2022
1abaaa4
Seem to have dealt with auth issues in test_cursor_async_execute_erro…
ericf-firebolt Jul 27, 2022
9d4ffe8
Added all necessary set params to url string in test_cursor_async_exe…
ericf-firebolt Jul 27, 2022
c9e8947
Cleaned up string input in test_cursor_async_execute_error().
ericf-firebolt Jul 27, 2022
2ba105f
Now no token error.
ericf-firebolt Jul 27, 2022
52bd72c
Multi-statement queries now error out correctly.
ericf-firebolt Jul 28, 2022
22e4540
Reworked a string to try to get commit/push to work.
ericf-firebolt Jul 28, 2022
3320495
Had to add an extra auth callback to get all cursor.execute() calls t…
ericf-firebolt Jul 28, 2022
aef7dff
Removed some parameters from various fns in unit/async_db/test_cursor…
ericf-firebolt Jul 28, 2022
65b61da
Added error check for missing query_id on async_execution. A little b…
ericf-firebolt Jul 28, 2022
3a4655d
Merge branch 'main' into async_queries
ericf-firebolt Jul 28, 2022
ba6fd52
Finished merge by hand.
ericf-firebolt Jul 29, 2022
e9c70f8
Merge branch 'main' into async_queries
ericf-firebolt Jul 29, 2022
7671506
Merge branch 'async_queries' into async_queries_add_cancel_status
ericf-firebolt Jul 29, 2022
c3a02af
Fixed error for empty response.json on asynch execution. Also changed…
ericf-firebolt Jul 29, 2022
2bc20db
Merge branch 'main' into async_queries
ericf-firebolt Jul 29, 2022
d482caf
Fixed error for empty response.json on asynch execution. Also changed…
ericf-firebolt Jul 29, 2022
04a9e2f
Fixed error for empty response.json on asynch execution. Also changed…
ericf-firebolt Jul 29, 2022
cc749a3
Fixed failed auto merge.
ericf-firebolt Jul 29, 2022
3650755
Added a test to check that an server-side asynchronous execution retu…
ericf-firebolt Jul 29, 2022
3a33790
Added an integration test to check that an server-side asynchronous e…
ericf-firebolt Jul 29, 2022
e373cf0
Finished merge by hand.
ericf-firebolt Jul 29, 2022
18d8fbe
Added cancel() to async/cursor.py. Also fixed an error where I was ge…
ericf-firebolt Aug 1, 2022
36d0bbe
Forgot that I'd commented out most of test_cursor.py.
ericf-firebolt Aug 1, 2022
50b89f9
Trying to get rid of coroutine 'BaseCursor.execute' was never awaited…
ericf-firebolt Aug 2, 2022
66604c0
Added unit tests for cancel and cancel errors.
ericf-firebolt Aug 2, 2022
bdb1063
Fixed a mistake that would have failed the cancel() integration test.
ericf-firebolt Aug 2, 2022
4888d3f
Completed auto-merge fail.
ericf-firebolt Aug 2, 2022
a37d250
Fixed several imports that had disappeared (maybe during a merge?). A…
ericf-firebolt Aug 2, 2022
6939e28
get_status() and two unit tests are added. Integration test is failin…
ericf-firebolt Aug 2, 2022
634541d
Added a new QueryStatus, NOT_AVAILABLE, because checking status will …
ericf-firebolt Aug 3, 2022
3278e94
Added a comment.
ericf-firebolt Aug 3, 2022
f7dc846
Updated a comment.
ericf-firebolt Aug 3, 2022
4072793
Added stub fn for async execution fetch.
ericf-firebolt Aug 4, 2022
8830361
Keep forgetting to uncomment test code and the pre-commit checks are …
ericf-firebolt Aug 4, 2022
df3709d
Removed some extraneous testing code.
ericf-firebolt Aug 4, 2022
0e2c553
Updated test_ss_async_execution_get_status() after Yoni pointed out t…
ericf-firebolt Aug 5, 2022
7a798be
Had to comment out test_ss_async_execution_get_status(), as it basica…
ericf-firebolt Aug 5, 2022
b428ba9
Added ability to specify output_format in _api_request(), as status r…
ericf-firebolt Aug 9, 2022
3a98c18
First set of requested changes on PR.
ericf-firebolt Aug 9, 2022
e9f2fc4
Removed noqa on _do_execute().
ericf-firebolt Aug 9, 2022
98c3252
Renamed _find_async_problems() to _validate_ss_async_settings(). Remo…
ericf-firebolt Aug 10, 2022
c840433
Moved call to _validate_ss_async_settings() into try.
ericf-firebolt Aug 10, 2022
58807a2
Added asyncio_mode=auto to pytest config in config.cfg, because I was…
ericf-firebolt Aug 10, 2022
99540ed
Changed long query in test_queries_async integration tests. Paused ex…
ericf-firebolt Aug 11, 2022
d8d6d8f
Updated all unit tests that test SET parameters to not have output_fo…
ericf-firebolt Aug 11, 2022
0ab70cc
Changed query_loop() in integration tests/async/test_queries to check…
ericf-firebolt Aug 11, 2022
3704b9a
Merge branch 'main' into async_queries_add_cancel_status
ericf-firebolt Aug 11, 2022
18d7c83
Noticed that test_anyio_backend_import_issue() was commented out in s…
ericf-firebolt Aug 11, 2022
8d4ce91
Added query tests to integration/dbapi/sync/test_queries.py. Changed …
ericf-firebolt Aug 12, 2022
c93bc08
Added query tests to integration/dbapi/sync/test_queries.py. Changed …
ericf-firebolt Aug 12, 2022
ef3b668
Merge branch 'async_queries_add_cancel_status' of https://github.com/…
ericf-firebolt Aug 12, 2022
7ad25c9
Now errors out when use_standard_sql=0 rather than when it equals 1. …
ericf-firebolt Aug 12, 2022
b5efbf4
Changed order of synchronous unit tests to move all server-side async…
ericf-firebolt Aug 15, 2022
143b0bd
Changed order of asynchronous cursor unit tests to move all server-si…
ericf-firebolt Aug 15, 2022
3090dc1
Reordered integration and unit test modules to move all server-side a…
ericf-firebolt Aug 15, 2022
b24a263
Merge branch 'main' into async_queries_add_cancel_status
ericf-firebolt Aug 16, 2022
923d931
Moving JSON_OUTPUT_FORMAT outside of _api_request (#196)
ptiurin Aug 19, 2022
316f50a
Updated docs to include information on server-side async query execut…
ericf-firebolt Aug 21, 2022
ad18500
Merge branch 'main' into async_queries_add_cancel_status
ericf-firebolt Aug 23, 2022
51fb204
Updated external table mention in comments and removed sentence in do…
ericf-firebolt Aug 23, 2022
0e816a7
Made a change to server-side execution explanation for clarity and to…
ericf-firebolt Aug 23, 2022
5398f77
Renamed a function and moved table create and drop out of test_querie…
ericf-firebolt Aug 24, 2022
168 changes: 123 additions & 45 deletions docsrc/Connecting_and_queries.rst
@@ -3,23 +3,23 @@
Connecting and running queries
###############################

This topic provides a walkthrough and examples for how to use the Firebolt Python SDK to connect to Firebolt resources to run commands and query data.
This topic provides a walkthrough and examples for how to use the Firebolt Python SDK to connect to Firebolt resources to run commands and query data.


Setting up a connection
=========================

To connect to a Firebolt database to run queries or commands, you must provide your account credentials through a connection request.
To connect to a Firebolt database to run queries or commands, you must provide your account credentials through a connection request.

To get started, follow the steps below:
To get started, follow the steps below:

**1. Import modules**

The Firebolt Python SDK requires you to import the following modules before making any command or query requests to your Firebolt database.
The Firebolt Python SDK requires you to import the following modules before making any command or query requests to your Firebolt database.

.. _required_connection_imports:

::
::

from firebolt.db import connect
from firebolt.client import DEFAULT_API_URL
@@ -30,9 +30,9 @@ To get started, follow the steps below:
**2. Connect to your database and engine**


Your account information can be provided as parameters to the ``connect()`` function.
Your account information can be provided as parameters to the ``connect()`` function.

A connection requires the following parameters:
A connection requires the following parameters:

+------------------------------------+-------------------------------------------------------------------+
| ``username`` | The email address associated with your Firebolt user. |
@@ -50,9 +50,9 @@ To get started, follow the steps below:

* **Set credentials manually**

You can manually include your account information in a connection object in your code for any queries you want to request.
You can manually include your account information in a connection object in your code for any queries you want to request.

Replace the values in the example code below with your Firebolt account credentials as appropriate.
Replace the values in the example code below with your Firebolt account credentials as appropriate.

::

@@ -61,19 +61,19 @@ To get started, follow the steps below:
engine_name = "your_engine"
database_name = "your_database"

connection = connect(
connection = connect(
engine_name=engine_name,
database=database_name,
username=username,
password=password,
)

cursor = connection.cursor()


* **Use an .env file**

Consolidating all of your Firebolt credentials into a ``.env`` file can help protect sensitive information from exposure. Create an ``.env`` file with the following key-value pairs, and replace the values with your information.
Consolidating all of your Firebolt credentials into a ``.env`` file can help protect sensitive information from exposure. Create an ``.env`` file with the following key-value pairs, and replace the values with your information.

::

@@ -82,9 +82,9 @@ To get started, follow the steps below:
FIREBOLT_ENGINE="your_engine"
FIREBOLT_DB="your_database"

Be sure to place this ``.env`` file into your root directory.
Be sure to place this ``.env`` file into your root directory.

Your connection script can load these environmental variables from the ``.env`` file by using the `python-dotenv <https://pypi.org/project/python-dotenv/>`_ package. Note that the example below imports the ``os`` and ``dotenv`` modules in order to load the environmental variables.
Your connection script can load these environmental variables from the ``.env`` file by using the `python-dotenv <https://pypi.org/project/python-dotenv/>`_ package. Note that the example below imports the ``os`` and ``dotenv`` modules in order to load the environmental variables.

::

@@ -105,35 +105,35 @@ To get started, follow the steps below:

**3. Execute commands using the cursor**

The ``cursor`` object can be used to send queries and commands to your Firebolt database and engine. See below for examples of functions using the ``cursor`` object.
The ``cursor`` object can be used to send queries and commands to your Firebolt database and engine. See below for examples of functions using the ``cursor`` object.

Command and query examples
Server-side synchronous command and query examples
============================

This section includes Python examples of various SQL commands and queries.
This section includes Python examples of various SQL commands and queries.


Inserting and selecting data
-----------------------------

.. _basic_execute_example:

The example below uses ``cursor`` to create a new table called ``test_table``, insert rows into it, and then select the table's contents.
The example below uses ``cursor`` to create a new table called ``test_table``, insert rows into it, and then select the table's contents.

The engine attached to your specified database must be started before executing any queries. For help, see :ref:`starting an engine`.
The engine attached to your specified database must be started before executing any queries. For help, see :ref:`starting an engine`.

::

cursor.execute(
'''CREATE FACT TABLE IF NOT EXISTS test_table (
id INT,
name TEXT
)
id INT,
name TEXT
)
PRIMARY INDEX id;'''
)

cursor.execute(
'''INSERT INTO test_table VALUES
'''INSERT INTO test_table VALUES
(1, 'hello'),
(2, 'world'),
(3, '!');'''
@@ -145,23 +145,23 @@ The engine attached to your specified database must be started before executing

cursor.close()

.. note::
.. note::

For reference documentation on ``cursor`` functions, see :ref:`Db.cursor`
For reference documentation on ``cursor`` functions, see :ref:`Db.cursor`


Fetching query results
-----------------------

After running a query, you can fetch the results using a ``cursor`` object. The examples below use the data queried from ``test_table`` created in the :ref:`Inserting and selecting data`.
After running a query, you can fetch the results using a ``cursor`` object. The examples below use the data queried from ``test_table`` created in the :ref:`Inserting and selecting data`.

.. _fetch_example:

::

print(cursor.fetchone())

**Returns**: ``[2, 'world']``
**Returns**: ``[2, 'world']``

::

@@ -181,25 +181,25 @@ Executing parameterized queries

.. _parameterized_query_execute_example:

Parameterized queries (also known as “prepared statements”) format a SQL query with placeholders and then pass values into those placeholders when the query is run. This protects against SQL injection attacks and also helps manage dynamic queries that are likely to change, such as filter UIs or access control.
Parameterized queries (also known as “prepared statements”) format a SQL query with placeholders and then pass values into those placeholders when the query is run. This protects against SQL injection attacks and also helps manage dynamic queries that are likely to change, such as filter UIs or access control.

To run a parameterized query, use the ``execute()`` cursor method. Add placeholders to your statement using question marks ``?``, and in the second argument pass a tuple of parameters equal in length to the number of ``?`` in the statement.


::
::

cursor.execute(
'''CREATE FACT TABLE IF NOT EXISTS test_table2 (
id INT,
name TEXT,
name TEXT,
date_value DATE
)
PRIMARY INDEX id;'''
)


::

cursor.execute(
"INSERT INTO test_table2 VALUES (?, ?, ?)",
(1, "apple", "2018-01-01"),
@@ -216,8 +216,8 @@ If you need to run the same statement multiple times with different parameter in
cursor.executemany(
"INSERT INTO test_table2 VALUES (?, ?, ?)",
(
(2, "banana", "2019-01-01"),
(3, "carrot", "2020-01-01"),
(2, "banana", "2019-01-01"),
(3, "carrot", "2020-01-01"),
(4, "donut", "2021-01-01")
)
)
@@ -231,7 +231,7 @@ Executing multiple-statement queries

Multiple-statement queries allow you to run a series of SQL statements sequentially with just one method call. Statements are separated using a semicolon ``;``, similar to making SQL statements in the Firebolt UI.

::
::

cursor.execute(
"""
@@ -246,32 +246,110 @@ Multiple-statement queries allow you to run a series of SQL statements sequentia

cursor.close()

**Returns**:
**Returns**:

::
::

First query: [[2, 'banana', datetime.date(2019, 1, 1)], [3, 'carrot', datetime.date(2020, 1, 1)], [1, 'apple', datetime.date(2018, 1, 1)]]
Second query: [[3, 'carrot', datetime.date(2020, 1, 1)], [4, 'donut', datetime.date(2021, 1, 1)]]

.. note::
.. note::

Multiple statement queries are not able to use placeholder values for parameterized queries.



Server-side asynchronous query execution
==========================================

Server-side asynchronous query execution allows you to run a long query in the background while executing other asynchronous or synchronous queries. An additional benefit of server-side async execution is that it can free up a connection: you can close the connection while the query is running, or potentially even spin down an entire service (such as AWS Lambda) while a long-running database job is still underway. Note that it is not possible to retrieve the results of a server-side asynchronous query, so these queries are best suited to DML and DDL statements. Use SELECT statements only for warming the cache.

Review comment (Contributor): What do you think about adding a brief paragraph here explaining the differences between server-side async vs. client-side async?

Review comment (Contributor): @ericf-firebolt I see you merged the PR, but what about this comment above?

Running DDL commands
-----------------------------

.. _basic_execute_example:

Running queries server-side asynchronously is similar to running server-side synchronous queries, but the ``execute()`` command receives an extra parameter, ``async_execution=True``. The example below uses ``cursor`` to create a new table called ``test_table``. ``execute(query, async_execution=True)`` returns a query ID, which can subsequently be used to check the query status.

Review comment (Contributor): Do you mean that it's similar to client-side async?

Review comment (Contributor): @ericf-firebolt I see you merged the PR, but what about this comment above?


::

query_id = cursor.execute(
'''CREATE FACT TABLE IF NOT EXISTS test_table (
id INT,
name TEXT
)
PRIMARY INDEX id;''',
async_execution=True
)


To check the status of a query, pass the query ID to ``get_status()`` to receive a ``QueryStatus`` enumeration object. Possible statuses are:


* ``RUNNING``
* ``ENDED_SUCCESSFULLY``
* ``ENDED_UNSUCCESSFULLY``
* ``NOT_READY``
* ``STARTED_EXECUTION``
* ``PARSE_ERROR``
* ``CANCELED_EXECUTION``
* ``EXECUTION_ERROR``


Once the status of the table creation is ``ENDED_SUCCESSFULLY``, data can be inserted into it:

Review comment (Contributor): delete "created" in this sentence?

Review comment (Contributor): @ericf-firebolt I see you merged the PR, but what about this comment above?


::

from firebolt.async_db.cursor import QueryStatus

query_status = cursor.get_status(query_id)

if query_status == QueryStatus.ENDED_SUCCESSFULLY:
    cursor.execute(
        '''INSERT INTO test_table VALUES
        (1, 'hello'),
        (2, 'world'),
        (3, '!');'''
    )
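
A single status check only succeeds if the query has already finished, so in practice the check is usually repeated until the query reaches a final state. The sketch below is one possible way to do that and is not part of the SDK: ``wait_for_query()`` and ``FakeCursor`` are hypothetical names, the ``QueryStatus`` enum here is a local stand-in built from the statuses listed above, and the choice of which statuses count as final is an assumption. Statuses are compared by name so the helper would also work with any cursor whose ``get_status()`` returns an enum with the same member names.

```python
import time
from enum import Enum, auto


class QueryStatus(Enum):
    """Local stand-in for the SDK enum, limited to the statuses listed above."""

    RUNNING = auto()
    ENDED_SUCCESSFULLY = auto()
    ENDED_UNSUCCESSFULLY = auto()
    NOT_READY = auto()
    STARTED_EXECUTION = auto()
    PARSE_ERROR = auto()
    CANCELED_EXECUTION = auto()
    EXECUTION_ERROR = auto()


# Assumption: these statuses mean the query will not change state again.
FINAL_STATUS_NAMES = {
    "ENDED_SUCCESSFULLY",
    "ENDED_UNSUCCESSFULLY",
    "PARSE_ERROR",
    "CANCELED_EXECUTION",
    "EXECUTION_ERROR",
}


def wait_for_query(cursor, query_id, poll_interval=1.0, timeout=60.0):
    """Poll cursor.get_status(query_id) until a final status or the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = cursor.get_status(query_id)
        if status.name in FINAL_STATUS_NAMES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Query {query_id} still {status.name} after {timeout}s")
        time.sleep(poll_interval)


class FakeCursor:
    """Hypothetical test double that replays a fixed sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def get_status(self, query_id):
        return next(self._statuses)


fake_cursor = FakeCursor(
    [
        QueryStatus.STARTED_EXECUTION,
        QueryStatus.RUNNING,
        QueryStatus.ENDED_SUCCESSFULLY,
    ]
)
final_status = wait_for_query(fake_cursor, "hypothetical-query-id", poll_interval=0.0)
print(final_status.name)  # ENDED_SUCCESSFULLY
```

With a real cursor, the follow-up ``INSERT`` would run only when the returned status is ``ENDED_SUCCESSFULLY``. Any other final status means the table creation did not complete.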


In addition, server-side asynchronous queries can be cancelled by calling ``cancel()``.

::

query_id = cursor.execute(
'''CREATE FACT TABLE IF NOT EXISTS test_table (
id INT,
name TEXT
)
PRIMARY INDEX id;''',
async_execution=True
)

cursor.cancel(query_id)

query_status = cursor.get_status(query_id)

print(query_status)

**Returns**: ``CANCELED_EXECUTION``

Multiple statement queries are not able to use placeholder values for parameterized queries.


Using DATE and DATETIME values
---------------------------------
==============================

DATE, DATETIME and TIMESTAMP values used in SQL insertion statements must be provided in a specific format; otherwise they could be read incorrectly.
DATE, DATETIME and TIMESTAMP values used in SQL insertion statements must be provided in a specific format; otherwise they could be read incorrectly.

* DATE values should be formatted as **YYYY-MM-DD**
* DATE values should be formatted as **YYYY-MM-DD**

* DATETIME and TIMESTAMP values should be formatted as **YYYY-MM-DD HH:MM:SS.SSSSSS**

The `datetime <https://docs.python.org/3/library/datetime.html>`_ module from the Python standard library contains various classes and methods to format DATE, TIMESTAMP and DATETIME data types.
The `datetime <https://docs.python.org/3/library/datetime.html>`_ module from the Python standard library contains various classes and methods to format DATE, TIMESTAMP and DATETIME data types.

You can import this module as follows:
You can import this module as follows:

::
::

from datetime import datetime
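
As a minimal sketch of the two formats listed above, the standard-library ``strftime()`` method can produce both strings. The variable names and sample dates are illustrative only, and the example also imports ``date``, which the import line above does not.

```python
from datetime import date, datetime

# DATE: YYYY-MM-DD
date_value = date(2019, 1, 1).strftime("%Y-%m-%d")

# DATETIME / TIMESTAMP: YYYY-MM-DD HH:MM:SS.SSSSSS (%f is zero-padded microseconds)
timestamp_value = datetime(2019, 1, 1, 12, 30, 45, 123456).strftime(
    "%Y-%m-%d %H:%M:%S.%f"
)

print(date_value)  # 2019-01-01
print(timestamp_value)  # 2019-01-01 12:30:45.123456
```

Strings built this way can be passed as parameters to a parameterized ``INSERT``, in the same way as the ``"2018-01-01"`` literal in the earlier example.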

2 changes: 1 addition & 1 deletion docsrc/firebolt.async_db.rst
@@ -2,7 +2,7 @@
Async DB
==========================

The Async DB package enables connecting to a Firebolt database for asynchronous queries.
The Async DB package enables connecting to a Firebolt database for `client-side` asynchronous queries. For running queries in `server-side` asynchronous mode see :ref:`server-side asynchronous query execution`.
ericf-firebolt marked this conversation as resolved.

Connect
------------------------------------