Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We should be able to compile expressions.
The benefit is that we can speed up complex / nested expressions by avoiding unnecessary allocations.
Describe the solution you'd like
We should be able to take a RecordBatch or a collection of named Arrays and compile an expression like (a + b) / 2 into a loop that produces a new Array.
The loop itself must also be included in the compiled expression, to remove call overhead and allow for possible use of SIMD instructions, either explicitly by instrumenting cranelift enough or through auto-vectorization.
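To make the target shape concrete, here is a minimal hand-written sketch of what a compiled `(a + b) / 2` could correspond to: a single function that contains the loop. It uses plain `f64` slices rather than Arrow arrays, ignores nulls, and the function name `fused_avg` is hypothetical. It is not cranelift output, only an illustration of "the loop lives inside the compiled unit".

```rust
/// Hypothetical stand-in for a compiled `(a + b) / 2` kernel.
/// The loop is part of the function, so there is one call per batch
/// rather than one call per element or per sub-expression.
fn fused_avg(a: &[f64], b: &[f64], out: &mut [f64]) {
    // All columns in a batch are assumed to have the same length.
    assert_eq!(a.len(), b.len());
    assert_eq!(a.len(), out.len());

    // A simple zipped loop without per-element calls; a compiler is
    // free to auto-vectorize it, though that is not guaranteed.
    for ((o, x), y) in out.iter_mut().zip(a).zip(b) {
        *o = (x + y) / 2.0;
    }
}
```

A caller would allocate `out` once per batch and pass all input columns in a single call.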
Describe alternatives you've considered
n/a
Additional context
DataFusion's PhysicalExpr and the arrow-rs library currently evaluate expressions by "materializing intermediate results". For example, (a + b) + c first evaluates (a + b) into a temporary array and then adds c to form the final result.
Note, however, that there is a tradeoff between the LLVM-optimized vectorized kernels in arrow-rs and cranelift-generated JIT expressions: the JIT code may not actually be faster. I think this is what @Dandandan is referring to when he says "allow for possible use of SIMD instructions, either explicitly by instrumenting cranelift enough or through auto-vectorization."
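The difference between the two evaluation strategies for (a + b) + c can be sketched in plain Rust. This is a simplified illustration on `f64` slices, not the arrow-rs or DataFusion API; the names `add`, `materialized`, and `fused` are hypothetical, and nulls are ignored. It shows only where the temporary allocation comes from and makes no claim about which version runs faster.

```rust
/// Element-wise addition that allocates a new result vector,
/// standing in for a generic "add" kernel.
fn add(x: &[f64], y: &[f64]) -> Vec<f64> {
    x.iter().zip(y).map(|(l, r)| l + r).collect()
}

/// Kernel-at-a-time evaluation: `(a + b)` is written to a temporary
/// vector, which is then read back to add `c`.
fn materialized(a: &[f64], b: &[f64], c: &[f64]) -> Vec<f64> {
    let tmp = add(a, b); // intermediate result, one extra allocation
    add(&tmp, c)
}

/// Fused evaluation: a single pass over the inputs, with only the
/// output vector allocated.
fn fused(a: &[f64], b: &[f64], c: &[f64]) -> Vec<f64> {
    a.iter()
        .zip(b)
        .zip(c)
        .map(|((x, y), z)| (x + y) + z)
        .collect()
}
```

Both functions compute the same values; the fused version trades the reuse of pre-optimized kernels for one fewer allocation and one fewer pass over memory.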