-
Notifications
You must be signed in to change notification settings - Fork 14.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up presto queries! #10139
Comments
Issue-Label Bot is automatically applying the label Links: app homepage, dashboard and code for this bot. |
@john-bodley @villebro @mistercrunch, is this a viable solution? What are the performance/load implications on polling a cursor 10 times more often? Is there a way where we don't need to poll the cursor but instead use some async magic to await on a response? |
Can probably adjust the polling interval according to number of retries, i.e., longer interval if waited longer.
|
Good idea to use some form of exponential or other backoff. However, perhaps leverage Having said all this, while unnecessary waiting should be avoided, I'm personally of the inclination OLAP-type queries are ok to take a few seconds. As I expect people will tend to be hitting fairly expensive queries through SQL Lab, waiting for a few seconds for something that completed in 0.1 seconds, and conversely waiting 32 seconds for something that took 30.1 seconds, usually isn't that big a deal for me. Especially if it can spare a busy db some unnecessary load. |
Echoing @villebro and @ktmud, some sort of backoff sounds good, so that a 0.1 second query comes back in 0.2 or 0.4 seconds, but the 30 second query comes back in 32. I will note that performance around the 1 to 5 second long queries does appear especially bad on Superset's side, especially because of all the steps required to run the query:
This means that a query that takes a second or 2 on cli, can take 2-3 times as long in sql lab. Any improvements we can make around this without overloading services seems worth considering In the long term, moving to the approach outlined in #9190 will solve a lot of these issues too, but i think updating the cursor wait time here is a valid thing to start with |
I reckon there is no harm at polling every 0.1 seconds, presto jdbc driver does that anyway (website says "Presto is an open source distributed SQL query engine for running interactive analytic queries"). I heard of the 3 second rule with keeping attention span (https://paceadv.com/the-3-second-rule/), if your query takes 3.5seconds you have a bit of disengagement but if its a snappy < 1.5 seconds stay hooked. If others concerned about increasing load on presto then perhaps allow these to be set in config of the data source? |
Is it required for Superset to write query results cache to s3? Why not Redis? And why does it need to download the cache from s3 again? |
@ktmud it is configurable in superset config, s3 is more scalable if you have many users and large results @villebro it seems like poll interval is configurable: https://github.com/dropbox/PyHive/blob/8eb0aeab8ca300f3024655419b93dad926c1a351/pyhive/presto.py#L93 I don't have capacity to make the change, but can review the code if needed to expose it as sqllachemy conn stirng parameter |
SQLAlchemy supports the setting this per here so no changes to PyHive are necessary. |
@ktmud Perhaps I was a bit unclear in my description. When running the query on the async worker, once completed it gets written to a results backend. We use s3 for the reasons @bkyryliuk said. We then mark the query as successfully completed, and add the cache key to the query record Then, we need to get the results to the client. This is done by polling the |
Thanks for the detailed explanation, @etr2460 and @bkyryliuk . |
I started working on this today (bf0025f), but I don't think it makes a lot of sense to add a backoff within Superset if pyhive only supports constant polling intervals. Therefore, I have a few options, ranked in my preference order:
@john-bodley @bkyryliuk @ktmud @villebro @tooptoop4, thoughts? |
I like approach #1. |
@tooptoop4 if you use master branch then you should be able to set the |
"SELECT 1" - the simplest of presto queries takes over 3 seconds (3.18-3.4 seconds for me) when run from superset but only 0.2 seconds when run using presto JDBC.
Sleep 1 second at this line is too long:
https://github.com/apache/incubator-superset/blob/0.36.0/superset/db_engine_specs/presto.py#L760
see dropbox/PyHive#337 for another culprit line
I changed to 0.1 for both cases and now I can get 0.47 seconds from superset!
I can also get select count(1) from sometable in 0.86 seconds from superset after the 2 changes (was 3.5 seconds before)
The text was updated successfully, but these errors were encountered: