[RFC] Automatic Workload-Driven Query Acceleration by OpenSearch #128
Comments
Hi @dai-chen, the design is excellent. I always appreciate that people like me in the community have the opportunity to read a great design like this! But I have a couple of questions after reading it.
Thanks @chloe-zh for the comment.
-> OpenSearch will be used for joins and intermediate materialized views. -> No in-memory stuff. OpenSearch data structures are awesome. Hope this answers some questions; we will post more details here shortly, with some demo videos :)
Is your feature request related to a problem?
In a database engine, there are different ways to optimize query performance. For instance, a rule-based or cost-based optimizer and a distributed execution layer try to find the best execution plan through cost estimation and equivalent transformations of the query plan. Here we propose an alternative approach: accelerating query execution with materialized views, trading storage space for query time.
What solution would you like?
Architecture
Here is a reference architecture that illustrates the components and the entire workflow, which is essentially a workload-driven feedback loop:
Here, feedback refers to the materialized views that are prebuilt (either online or offline) and that hint at acceleration opportunities to the query optimizer.
There are two areas and paths moving forward, both of which lack open source solutions:
General Acceleration Workflow
1. Workload Telemetry Collection
Collect the query plan telemetry generated during query execution and emit it as input to feedback generation.
2. Workload Telemetry Preprocessing
Preprocess the query plan telemetry into a uniform workload representation.
3. View Selection
Analyze the workload data and select sub-queries as materialization candidates according to a view selection algorithm.
4. View Materialization and Maintenance
Materialize the selected views and keep the materialized view data consistent with the source data, for example by refreshing it incrementally.
5. Query Plan Optimization
Finally, the query optimizer checks the existing materialized views and replaces the original operator with a scan on the matching materialized view.
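To make the feedback loop above more concrete, here is a minimal Python sketch of steps 2, 3 and 5. Everything in it is a hypothetical illustration and not part of the proposed design: the `Plan` class, the helper names, the `mv_0` view name, and especially the selection heuristic, which simply picks sub-plans that recur in the workload and stands in for a real view selection algorithm. Step 4 is reduced to assigning a view name; no data is actually materialized or refreshed.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    """A tiny immutable query plan node (hypothetical, for illustration only)."""
    op: str                          # e.g. "scan", "filter", "aggregate"
    arg: str = ""                    # table name, predicate, or grouping key
    child: Optional["Plan"] = None   # single-input operators only


def subplans(plan):
    """Yield the plan itself and every sub-plan beneath it."""
    node = plan
    while node is not None:
        yield node
        node = node.child


def normalize(plan):
    """Step 2: map a plan to a uniform representation (here: lowercased text)."""
    child = normalize(plan.child) if plan.child else None
    return Plan(plan.op.lower(), plan.arg.strip().lower(), child)


def select_views(workload, min_count=2):
    """Step 3: pick non-scan sub-plans that occur at least min_count times.

    Frequency is an assumed stand-in for a real view selection algorithm.
    """
    counts = Counter(
        sub for plan in workload for sub in subplans(plan) if sub.op != "scan"
    )
    return [sub for sub, n in counts.items() if n >= min_count]


def rewrite(plan, views):
    """Step 5: replace any sub-plan that matches a view with a scan on it."""
    if plan in views:
        return Plan("scan", views[plan])
    if plan.child is None:
        return plan
    return Plan(plan.op, plan.arg, rewrite(plan.child, views))


# Step 1 (assumed input): two collected query plans sharing a filter sub-plan.
logs = Plan("scan", "logs")
errors = Plan("Filter", "status >= 500", logs)
q1 = Plan("aggregate", "count by host", errors)
q2 = Plan("aggregate", "count by region", errors)

workload = [normalize(q) for q in (q1, q2)]                # step 2
candidates = select_views(workload)                        # step 3
views = {v: f"mv_{i}" for i, v in enumerate(candidates)}   # step 4 (names only)
optimized = rewrite(workload[0], views)                    # step 5
```

In this toy workload the shared filter over `logs` is the only sub-plan that appears twice, so it becomes the single candidate, and the first query is rewritten to aggregate over a scan of the hypothetical view `mv_0` instead of filtering the source table again.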