Add Ability to Save Results from Previous Queries Using S3 #1797
Comments
Is anyone assigned to this issue?
It was partially implemented here: #2113
@bkyryliuk - it sounds like Ben did the work to implement the S3 cache. What's the remaining gap? Actually viewing/downloading the data?
Notice: this issue has been closed because it has been inactive for 429 days. Feel free to comment and request for this issue to be reopened.
@mistercrunch I'd like to ask for this to be reopened. Our use case is showing query results from today together with a "snapshot" of yesterday's data. So, like @jfeng15 asked, what's the remaining gap that needs to be developed?
The S3 cache works as a place to hold SQL Lab results when the database is operating in async mode. In async mode, the query is executed on a Superset worker, and the worker updates the state of the query and flushes the results to the "results backend", which can be S3, Redis, or whatever else. It's documented in the installation docs.
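To make the async flow described above concrete, here is a minimal sketch in plain Python. It is not Superset's actual code: `InMemoryResultsBackend`, `Query`, `run_query_on_worker`, and `fetch_results` are hypothetical names, and the in-memory dict only stands in for S3 or Redis.

```python
import json
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


class InMemoryResultsBackend:
    """Hypothetical stand-in for a results backend such as S3 or Redis."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def set(self, key: str, value: bytes) -> None:
        self._store[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._store.get(key)


@dataclass
class Query:
    """Assumed shape of a query record: SQL text, state, and a results key."""

    sql: str
    status: str = "pending"
    results_key: Optional[str] = None


def run_query_on_worker(
    query: Query,
    execute: Callable[[str], List[list]],
    backend: InMemoryResultsBackend,
) -> None:
    """What the worker does: run the SQL, flush the rows, update the state."""
    query.status = "running"
    rows = execute(query.sql)
    key = str(uuid.uuid4())
    # Flush the serialized rows to the results backend under a fresh key.
    backend.set(key, json.dumps(rows).encode("utf-8"))
    query.results_key = key
    query.status = "success"


def fetch_results(
    query: Query, backend: InMemoryResultsBackend
) -> Optional[List[list]]:
    """What the web side does: read rows back once the query has succeeded."""
    if query.status != "success" or query.results_key is None:
        return None
    payload = backend.get(query.results_key)
    return None if payload is None else json.loads(payload)
```

In this sketch each query gets its own key, so results from an earlier query stay readable until something deletes them from the backend.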
So is this the right issue for what we're asking, or should I open a feature request?
Why not create a summary table (CREATE TABLE AS) and then slice and dice the result set in that summary table?
Because at large scale this would be abusive: if everyone did this with CTAS, the database would become bloated very quickly. My thought was to have a "save result" feature where you could slice on a saved result set.
Seems like data engineering / data preparation would be best in this case. You'd ETL your user conf data into a data warehouse while keeping track of the history/changes. While Superset has caching features, there's no chart "snapshotting" feature that would keep the cache saved as of a point in time for a specific chart. I don't see that coming anytime soon, so your best bet is ETL.
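As a rough illustration of the ETL route suggested above, the sketch below appends a dated copy of a source table to a history table, then joins today's snapshot against yesterday's. Everything here is an assumption made for the example: the table names `user_conf` and `user_conf_history`, the columns, and the use of SQLite in place of a real data warehouse.

```python
import sqlite3
from datetime import date


def create_tables(conn: sqlite3.Connection) -> None:
    """Create the hypothetical source table and its history table."""
    conn.execute("CREATE TABLE user_conf (user_id INTEGER, setting TEXT)")
    conn.execute(
        "CREATE TABLE user_conf_history "
        "(snapshot_date TEXT, user_id INTEGER, setting TEXT)"
    )


def snapshot(conn: sqlite3.Connection, as_of: date) -> None:
    """Append the current contents of user_conf, stamped with a date."""
    conn.execute(
        "INSERT INTO user_conf_history (snapshot_date, user_id, setting) "
        "SELECT ?, user_id, setting FROM user_conf",
        (as_of.isoformat(),),
    )
    conn.commit()


def compare(conn: sqlite3.Connection, today: date, yesterday: date) -> list:
    """Return (user_id, yesterday's setting, today's setting) per user."""
    return conn.execute(
        "SELECT t.user_id, y.setting, t.setting "
        "FROM user_conf_history t "
        "LEFT JOIN user_conf_history y "
        "  ON y.user_id = t.user_id AND y.snapshot_date = ? "
        "WHERE t.snapshot_date = ? "
        "ORDER BY t.user_id",
        (yesterday.isoformat(), today.isoformat()),
    ).fetchall()
```

A scheduled job would call `snapshot` once a day, and a chart could then be built on a query like the one in `compare` to show today's data next to yesterday's.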
Currently, we do not provide a way to save query results outside of CTAS (Create Table As). We should add a feature that allows a user to view/download the results of a previous query even after a new query is run, using S3 as the backend for holding those results.