Support logical plan compilation #2648
Conversation
Signed-off-by: Ruihang Xia <[email protected]>
Signed-off-by: Ruihang Xia <[email protected]>
Hi @waynexia, and thanks for the contribution. I understand using the JIT to compile expression trees, but I don't understand why we would want to compile a logical plan rather than leverage the JIT in the physical plan. Could you explain that some more? The logical plan has the concept of a join, but we don't know whether that is a HashJoin, CrossJoin, or SortMergeJoin until we get to the physical plan, so I don't see how we could compile anything from the logical plan for joins, as one example.
Hi @andygrove, thanks for your review. My proposal is to leverage JIT in the phase where we convert the logical plan into the physical execution plan (in the physical planner).
I suppose in this phase we have enough information on how to "physically" execute a SQL query; the JIT module would only replace some of the existing execution logic. And from another perspective, I think it's not easy to compile a physical plan. One main reason is that we can translate a logical expr into a JIT expr, but not a physical expr into a JIT expr. A logical expr is a sort of AST, which is easy to compile, while a physical expr is an "opaque" operation. However, this is not infeasible at all, as shown in this paper. One of the ideas is to let the operators that operate on data generate IR instead, and then use that IR to operate on the data. We could achieve something like:
// add a new method to this existing trait
trait ExecutionPlan {
    fn jit_compile(&self, jit_ctx: JITContext) {
        // add the logic to the JIT context. E.g. for a filter plan:
        jit_ctx.expr(
            // let column_c = column_a + column_b
            // if column_c > 0, materialize column_d to the output
        )
    }
}

fn jit_exec(plan: ExecutionPlan, ctx: JITContext) {
    // let every physical plan register its logic to the context,
    // then finalize that logic into an executable program.
    ctx.compile(plan);
    ctx.execute()
}

I haven't considered and compared these two approaches deeply, but at the current stage I think they only differ in how we structure the implementation. They may, however, influence future topics such as minimizing the memory footprint or optimizing (on our side) the generated code.
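To make the "a logical expr is an AST which is easy to compile" point concrete, here is a minimal, hypothetical sketch, not DataFusion's actual API: it compiles a simplified expression tree into a plain Rust closure, where a real JIT would emit Cranelift IR instead. The SimpleExpr type and the column layout are invented for the example.

// Hypothetical, simplified expression AST (not DataFusion's `Expr`).
enum SimpleExpr {
    Column(usize),
    Literal(i64),
    Add(Box<SimpleExpr>, Box<SimpleExpr>),
    Gt(Box<SimpleExpr>, Box<SimpleExpr>),
}

// "Compile" the tree once into a closure so evaluation no longer walks
// the AST per row; a real JIT would emit IR / machine code here instead.
fn compile(expr: &SimpleExpr) -> Box<dyn Fn(&[i64]) -> i64> {
    match expr {
        SimpleExpr::Column(i) => {
            let i = *i;
            Box::new(move |row: &[i64]| row[i])
        }
        SimpleExpr::Literal(v) => {
            let v = *v;
            Box::new(move |_row: &[i64]| v)
        }
        SimpleExpr::Add(l, r) => {
            let (l, r) = (compile(l), compile(r));
            Box::new(move |row: &[i64]| l(row) + r(row))
        }
        SimpleExpr::Gt(l, r) => {
            let (l, r) = (compile(l), compile(r));
            Box::new(move |row: &[i64]| (l(row) > r(row)) as i64)
        }
    }
}

fn main() {
    // column_a + column_b > 0, matching the filter example above
    let expr = SimpleExpr::Gt(
        Box::new(SimpleExpr::Add(
            Box::new(SimpleExpr::Column(0)),
            Box::new(SimpleExpr::Column(1)),
        )),
        Box::new(SimpleExpr::Literal(0)),
    );
    let predicate = compile(&expr);
    let row: &[i64] = &[3, -1];
    assert_eq!(predicate(row), 1); // 3 + (-1) > 0 -> true
}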
Signed-off-by: Ruihang Xia <[email protected]>
Signed-off-by: Ruihang Xia <[email protected]>
OK, I think it is ready for review. The generated code and test preparation are verbose 😢 Maybe there is some way to do it better?
Signed-off-by: Ruihang Xia <[email protected]>
Hi @waynexia, I think it would help move this effort forward if we could have some sort of high-level issue with a brief overview of the issues this is seeking to tackle, along with a link to some sort of design document. Recent examples of this might be #2502 or #2199. This will help provide context for reviewers and allow discussion of different approaches with all the necessary information available. On a related note, I wonder if perhaps this could live as a repo in https://github.com/datafusion-contrib/ ?
Appreciate your advice and suggestions, @tustvold! I'm drafting the proposal and hope I can put it up for discussion this week.
I'm not very familiar with this mechanism. Which aspects would differ between living in the main repo and living as a standalone repo?
This is a good choice; I'm OK with both. What are others' opinions? And by the way, I have drafted a proposal for JIT. Please let me know what you think of this too 😃 cc @yjshen @Dandandan @alamb @houqp @viirya (sorry for the disturbance)
Thank you for writing this up, I've left some comments on the document. Some high-level thoughts:
I agree that plan-level JIT is a great idea; thank you @waynexia for writing up the document as well as this PR. I am sorry it took so long to review it. TLDR:
I am not sure if you have seen the following paper, but it gives a good treatment of the various tradeoffs between vectorized and JIT-compiled query plans, and I think it is quite relevant to this discussion: https://db.in.tum.de/~kersten/vectorization_vs_compilation.pdf?lang=de

Here is the canonical plan I think of that benefits from JIT'ing:
In this case, the code to filter and to update the hash table is compiled together, so that input rows from the data source update the hash table directly without ever leaving registers. I believe this kind of plan can be made super fast, with state-of-the-art query performance as the result. The idea of JIT'ing multiple plan nodes together is certainly a necessary part of this, but it is one of the last steps; the first step is fully JIT'ing the expression evaluation. I really like the idea of using DataFusion's extensibility model to develop / prototype this approach in datafusion-contrib or another repo until it is mature enough to bring into the core codebase. Though, to be honest, perhaps the same approach could/should be used for the new scheduler @tustvold is working on too 🤔

In terms of the cache invalidation issues, I think #2199 will help the current vectorized approach minimize the number of times a batch has to leave the cache. I will also try to update #2122 with some more specifics.
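To make the fused-operator idea concrete, here is a minimal, hypothetical sketch in plain Rust of what a filter fused with a hash aggregate conceptually looks like once compiled into a single loop; a real JIT would generate this body rather than hand-write it, and the columns and predicate are invented for the example.

use std::collections::HashMap;

// Hypothetical fused loop: filter and aggregation compiled into one pass,
// so each row's values stay in registers between the two operators.
// In the vectorized model this would instead be two kernels with an
// intermediate (filtered) batch materialized in between.
fn fused_filter_hash_agg(keys: &[i64], values: &[i64]) -> HashMap<i64, i64> {
    let mut sums: HashMap<i64, i64> = HashMap::new();
    for (k, v) in keys.iter().zip(values) {
        // filter: value > 0 (stands in for an arbitrary compiled predicate)
        if *v > 0 {
            // aggregate: SUM(value) GROUP BY key, updated immediately
            *sums.entry(*k).or_insert(0) += *v;
        }
    }
    sums
}

fn main() {
    let keys = [1, 1, 2, 2];
    let values = [10, -5, 7, 3];
    let result = fused_filter_hash_agg(&keys, &values);
    assert_eq!(result[&1], 10); // -5 was filtered out
    assert_eq!(result[&2], 10); // 7 + 3
}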
I would prefer if this were in datafusion-contrib, but I think having it in the jit module is OK too.
let jit_exec_plan = jit_ctx.compile_logical_plan(projection_plan).unwrap();

// execute
let output_batch = jit_exec_plan.execute(input_batch).unwrap();
this is pretty neat
Signed-off-by: Ruihang Xia <[email protected]>
Thank you all for the reviews ❤️
How should I get started in datafusion-contrib?
Looks cool. I'll try to get involved. And I really like @alamb's reason 1!
This makes sense. I shall revisit the task list. That paper also gives some interesting data, such as that compilation sometimes misses more branches. And besides the conclusion, it shows (me) a concrete way to measure cost and improvement.
I have spent some time thinking about this, and I am not so sure that JIT'ing expressions other than multi-column row comparisons will really improve performance over Arrow's optimized kernels -- I think, as @waynexia has suggested, the only way we will ever really know is to try it.
Marking as draft so we don't accidentally merge this.
Closing this. I'll continue under a separate repo. Appreciate it again ❤️
Which issue does this PR close?
Closes #.
Rationale for this change
This PR implements logical plan compilation and executes the compiled plan on record batches.
This implementation only covers a few types: only batches of i64 and f64 arrays, and the projection plan, are supported. It wouldn't be difficult to extend this to all primitive types. However, more complex array types, as well as plans that require variable output length (filter) or complex algorithms (join, aggregation, etc.), are hard to implement in pure Cranelift IR; calling into Rust code would be unavoidable. This is acceptable in my view, as the overhead can be alleviated or eliminated (via LLVM, I guess).
What changes are included in this PR?
This PR adds several JIT-related structs. The most important are JITContext and JITExecutionPlan. JITContext supports compiling a logical plan into an executable JITExecutionPlan, and JITExecutionPlan is a minimal ExecutionPlan which supports execute() on a RecordBatch.
Now we can do something like:
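A minimal sketch of the intended usage, mirroring the snippet quoted in the review comment above; it assumes a jit_ctx (JITContext), a projection_plan logical plan, and an input_batch RecordBatch have been prepared beforehand (their construction is omitted here).

// compile the logical plan into an executable JITExecutionPlan
let jit_exec_plan = jit_ctx.compile_logical_plan(projection_plan).unwrap();

// execute the compiled plan on a record batch
let output_batch = jit_exec_plan.execute(input_batch).unwrap();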
Are there any user-facing changes?
No
Does this PR break compatibility with Ballista?
No