-
Notifications
You must be signed in to change notification settings - Fork 14.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: cancel db query on stop #15403
Conversation
Slightly off-topic: |
It behaves the same and Stop button in Explore doesn't stop the query in the database. I'm planning to implement the same functionality for Explore as well by reusing some components of this PR once it get merged. This PR is focusing only on SQLLab. |
Codecov Report
@@ Coverage Diff @@
## master #15403 +/- ##
==========================================
- Coverage 76.95% 76.71% -0.24%
==========================================
Files 976 976
Lines 51326 51410 +84
Branches 6912 6918 +6
==========================================
- Hits 39498 39441 -57
- Misses 11607 11748 +141
Partials 221 221
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code looks good to me, thank you for the awesome contribution!
Would you mind adding a few tests for the new functionality?
@suddjian any chance to get this merged? |
Huh, I'm pretty sure the stop button used to work with Hive/Presto, killing the query. I wonder if we had a regression at some point. |
I'm not sure if navigating to another page should cancel the query - I can see wanting to navigate away from a long running query and coming back to it. |
@betodealmeida I think you're right. Hive and presto drivers are running queries in async mode with custom cursor handlers. They periodically selects the query status from the superset db in every second and if the query status is The behaviour of Db engines that doesn't implement the cancel_query method remains the same, so no changes in hive and presto queries. They're still running in async mode, polling query statuses and terminating queries by
@yousoph you can still keep running the long queries by not closing the browser tab. This is a very similar behaviour when running queries in a desktop SQL client. If we close a desktop SQL client then the running queries are terminated. I believe that SQL Lab should behave the same and running queries need to be terminated when navigating to another window or when closing the browser. This is a default behaviour in Looker (and maybe in Tableau?) as well. A bit more context about this PR: We're about to give SQL Lab access to 2000 registered users and expecting about 100 concurrent users in every hour. Manual queries accidentally will be be self joins, cross joins, ones with missing WHERE conditions or simple very long running queries. People will press the stop button or close the browser windows in the hope their queries got killed to let other people to run queries. In concurrency sensitive databases (like snowflake or redshift), stuck queries blocking people to run new queries and features like this PR is crucial. Please let me know your thoughts. 🙇♂️ |
@@ -222,6 +224,7 @@ class SqlEditor extends React.PureComponent { | |||
|
|||
componentWillUnmount() { | |||
window.removeEventListener('resize', this.handleWindowResize); | |||
window.removeEventListener('beforeunload', this.onBeforeUnload.bind(this)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't think you need the bind here, since this has already been bound when the method was declared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, fixed by efb86e5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, true. And also I think it actually won't work with .bind
since that returns a new function.
@@ -212,6 +213,7 @@ class SqlEditor extends React.PureComponent { | |||
this.setState({ height: this.getSqlEditorHeight() }); | |||
|
|||
window.addEventListener('resize', this.handleWindowResize); | |||
window.addEventListener('beforeunload', this.onBeforeUnload.bind(this)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same thing.. don't think you need the bind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks fixed by efb86e5
can you please approve the CI workflow so github actions will run again? 🙇♂️
I agree on ending the query when closing the browser, but I believe this will also end the query when you go to a different tab outside of sql lab as well. |
That's true. Clicking on a non sqllab link (like Charts or Dashboards) is terminating the running query but opening a new tab in the browser should keep it running. What do you think is that acceptable and can it be the default behaviour? Or maybe we should make it configurable? |
I think configurable is a good approach, since it changes the current behavior. And just to confirm, when you leave the page it only cancels sync queries, not async? |
I only tested on sync queries so far, yes. |
Also testing with Postgres, I'm seeing an error on a long-running query. Small nit, maybe @yousoph can comment, but on shorter queries, there's a bit of a race condition going on. I can hit "stop", see the confirmation that it was stopped but then a few seconds later, still see the results. LMK if you want a video. |
The popup appears only in Firefox. It's a little bit annoying indeed but looking into the beforeunload spec this is what might happen in Chrome as well soon. 👍 I'll make killing queries on window close events configurable in the DB config window and setting it to False as default. I hope that's fine and will update this PRs soon. |
@koszti if you want to look at the error above, I may have some simple workaround suggestions for the alert. Since FF is going to alert anyway with no real meaningful message, we could go ahead and put a |
I like this idea! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as Beto, but this looks pretty good otherwise. cc @john-bodley for python review, but i have no concerns besides the comments here
superset/db_engine_specs/base.py
Outdated
@@ -1304,6 +1304,28 @@ def get_column_spec( | |||
) | |||
return None | |||
|
|||
@classmethod | |||
def get_cancel_query_payload(cls, cursor: Any, query: Query) -> Any: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we improve the types here (e.g. better than Any
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the cursor
is db driver specific, defined in the driver and it's hard to set anything better than Any
that remains compatible with every driver. The cursor
is currently using Any
in the entire codebase. But please correct me if I'm wrong or if you have better idea that doesn't require major refactoring of the codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like now the return type here is typed, so this lgtm
superset/db_engine_specs/base.py
Outdated
return None | ||
|
||
@classmethod | ||
def cancel_query(cls, cursor: Any, query: Query, payload: Any) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same type question here
superset/db_engine_specs/mysql.py
Outdated
def get_cancel_query_payload(cls, cursor: Any, query: Query) -> Any: | ||
cursor.execute("SELECT CONNECTION_ID()") | ||
row = cursor.fetchone() | ||
return row[0] | ||
|
||
@classmethod | ||
def cancel_query(cls, cursor: Any, query: Query, payload: Any) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same typing questions here
superset/db_engine_specs/postgres.py
Outdated
cursor.execute( | ||
"SELECT pg_terminate_backend(pid) " | ||
"FROM pg_stat_activity " | ||
"WHERE pid <> pg_backend_pid() and usename='%s'" % payload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, same question here. ideally we can only cancel this query
superset/sql_lab.py
Outdated
@@ -438,12 +449,18 @@ def execute_sql_statements( # pylint: disable=too-many-arguments, too-many-loca | |||
with closing(engine.raw_connection()) as conn: | |||
# closing the connection closes the cursor as well | |||
cursor = conn.cursor() | |||
cancel_query_payload = db_engine_spec.get_cancel_query_payload(cursor, query) | |||
logger.info(cancel_query_payload) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe remove the logger?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed by ec39477
superset/sql_lab.py
Outdated
|
||
|
||
def cancel_query(query: Query, user_name: Optional[str] = None) -> None: | ||
"""Cancal a running query.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spelling nit: cancel
also please update the docblock to match all the others with arguments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
improved by 4706b43
superset/db_engine_specs/base.py
Outdated
@classmethod | ||
def get_cancel_query_payload(cls, cursor: Any, query: Query) -> Any: | ||
""" | ||
Returns None if query can not be cancelled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you provide a more descriptive description of what this function does, especially for the non-base case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed by ef8436b
superset/db_engine_specs/base.py
Outdated
:param cursor: Cursor instance in which the query will be executed | ||
:param query: Query instance | ||
:return: Type of the payload can vary depends on databases | ||
but must be jsonable. None if query can't be cancelled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None if query can't be cancelled
could possibly be implied by better type hints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed by ef8436b
|
||
@classmethod | ||
def cancel_query(cls, cursor: Any, query: Query, payload: Any) -> None: | ||
cursor.execute("SELECT SYSTEM$CANCEL_ALL_QUERIES(%s)" % payload) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use a Python 3 format string instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed by bfc8d60
I liked it too, it's elegant but turns out there are some problems. We can't check what the user answered in the page leave confirmation popup hence we can't stop the query conditionally. Checking the user's answer requires to introduce the unload event but in the Using standard I made the following changes in 54da5e4:
The only solution I can see to make the auto-kill optional is to introduce the "Cancel db query on window close" configurable option in the databases table but that'd require some changes in the core |
At the moment, editing a dashboard / chart and navigating away will prompt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! I have a small question, and may main concern is that I think we should never cancel a running query if the DB is configured to be async.
onBeforeUnload(event) { | ||
if (this.props.latestQuery?.state === 'running') { | ||
event.preventDefault(); | ||
this.stopQuery(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more question about this: can we make sure that we stop only queries when the DB is in sync mode? if the DB is in async mode the expectation I don't think we should ever cancel the query by navigating away from the browser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make it at least configurable? I’m happy to go for the extra mile if needed and add a new extra param to the db config if no objections. We really need to cancel all db queries both in sync and async modes.
This is because we’d like to go with async mode to utilise the celery workers but we still expect a lot of accidental cross joins, missing WHERE clauses and unwanted long running queries. In this case people will just close the browser and that’s making other people impossible running new queries. DBs like snowflake start queuing queries very quickly and unwanted long running queries easily could block using the entire db for long hours. Or DBAs have to kill unwanted queries manually.
Furthermore if it works only in sync mode then the stop button is basically doing nothing in async mode with non hive/presto databases.
Let me know your thoughts 🙇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, if there's a legit use case let's make it configurable. I like the idea of having a per-DB option, this way for Hive we could have it off (since queries can take hours), and for Postgres (say) we could have it on.
@yousoph, any thoughts here?
Furthermore if it works only in sync mode then the stop button is basically doing nothing in async mode with non hive/presto databases.
Oh, sorry if I wasn't clear, I meant stopping a query by navigating away. If we press the stop button we should still cancel async queries, regardless of the configuration flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now configurable at Database -> Performance -> Cancel query on window unload event
and defaults to False to keep the current behaviour. Commit: e5223ec
superset/db_engine_specs/base.py
Outdated
return None | ||
|
||
@classmethod | ||
def cancel_query(cls, cursor: Any, query: Query, cancel_query_id: str) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to return a boolean indicating if the query was canceled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed by 517c792
Added a new option to Commit: e5223ec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome! <3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this feature. I wasn't aware that the STOP
button in SQL Lab didn't actually stop the query. From a UX perspective, given that not all engines are supported, it seems the frontend should only show the STOP
button for eligible databases.
""" | ||
try: | ||
cursor.execute(f"KILL CONNECTION {cancel_query_id}") | ||
except Exception: # pylint: disable=broad-except |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catching broad exceptions (per the Pylint message) is undesirable. This should most likely be sqlalchemy.exc.DBAPIError.
:param query: Query instance | ||
:return: Snowflake Session ID | ||
""" | ||
cursor.execute("SELECT CURRENT_SESSION()") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@koszti does this work? The reason I ask is per the documentation, CURRENT_SESSION
Returns a unique system identifier for the Snowflake session corresponding to the present connection
yet the connection associated with said cursor doesn't not the same connection which instantiated the query (in all likelihood for async queries that connection resides on a Celery host). Thus I would suspect there are actually no queries running within the current session.
The same is true for the other engines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The query that you're referring to is running when the query initiated and not when the stop button pressed.
It's using the same cursor that is created in the first place in execute_sql_statements
here. The returned session id then saved into the query
table in the superset backend database.
When the stop button pressed, we get the saved session id from the query
table and kill the query by another SQL. In this way queries can be killed from every celery worker.
It's tested on Postgres, MySQL and Snowflake engines with sync and async query mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies. I misread the code logic. I originally thought the new cursor which is used to stop the query was being used to obtain the session identifier.
@koszti at Airbnb we predominantly use Presto (where only the system connector is able to kill queries) and though the new behavior is more reflective of what's actually happening, i.e., the query isn't stopped it does result in a subtle UX "regression". Specifically users have to wait for the query to complete before attempting to run another query, whereas previously hitting |
* feat: cancel db query on stop * fix pylint * Add unit tests * Do not bind multiple times * Stop only running queries * Postgres to cancel only the required query * Remove extra log * Add docstring * Better types, docstring and naming * Use python3 format strings * Update superset/sql_lab.py Co-authored-by: Beto Dealmeida <[email protected]> * Add cancel_query_on_windows_unload option to database * Return cancel_query as bool Co-authored-by: Beto Dealmeida <[email protected]>
* feat: cancel db query on stop * fix pylint * Add unit tests * Do not bind multiple times * Stop only running queries * Postgres to cancel only the required query * Remove extra log * Add docstring * Better types, docstring and naming * Use python3 format strings * Update superset/sql_lab.py Co-authored-by: Beto Dealmeida <[email protected]> * Add cancel_query_on_windows_unload option to database * Return cancel_query as bool Co-authored-by: Beto Dealmeida <[email protected]>
* feat: cancel db query on stop * fix pylint * Add unit tests * Do not bind multiple times * Stop only running queries * Postgres to cancel only the required query * Remove extra log * Add docstring * Better types, docstring and naming * Use python3 format strings * Update superset/sql_lab.py Co-authored-by: Beto Dealmeida <[email protected]> * Add cancel_query_on_windows_unload option to database * Return cancel_query as bool Co-authored-by: Beto Dealmeida <[email protected]>
SUMMARY
This is addressing #15402
SQL Lab has Stop button but it only updates the state of superset without actually stopping the database query. Leaving the queries running in the background affect database performance. This PR is to stop running queries in the backend databases when:
Database -> Performance -> Cancel query on window unload event
.It currently supports mysql, postgres and snowflake databases but new database support can be added easily. The PR is supporting sync and async query execution modes.
This is a refactored and extended version of the unmerged and inactive #7611
TESTING INSTRUCTIONS
SELECT sleep(60)
SHOW PROCESSLIST
should not show the query from point 1)ADDITIONAL INFORMATION