Remove some recursive cloning from logical planning #9050
It would be interesting to see what the effects are on the planning benchmarks in sql_planner.
Thanks @ozankabak
My little concern with this is https://doc.rust-lang.org/std/vec/struct.Vec.html#method.swap_remove
Removes an element from the vector and returns it.
The removed element is replaced by the last element of the vector.
This does not preserve ordering, but is O(1). If you need to preserve the element order, use remove instead.
So we probably need to keep in mind that the collection can be in a different order after swap_remove, which may cause surprises?
We basically drain the vector, so there shouldn't be any leftover collection, if I am not misunderstanding your concern (we make N calls to swap_remove when dealing with an operator with N children). I would also prefer a neater way to drain the vector instead of making swap_remove calls; let me know if you have any suggestions.
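To make the ordering question concrete, here is a minimal sketch of what repeated `swap_remove(0)` calls yield. The helper name `drain_with_swap_remove` is hypothetical and not part of the PR; it only illustrates the standard `Vec::swap_remove` behavior.

```rust
/// Drains `items` by calling `swap_remove(0)` until the vector is empty and
/// returns the elements in the order they came out.
/// (Hypothetical helper for illustration only.)
fn drain_with_swap_remove<T>(mut items: Vec<T>) -> Vec<T> {
    let mut removed = Vec::with_capacity(items.len());
    while !items.is_empty() {
        // O(1): the last element is moved into slot 0 after removal.
        removed.push(items.swap_remove(0));
    }
    removed
}
```

With one or two elements the removal order matches the original order. With three elements `[a, b, c]` the calls yield `a, c, b`, so an operator with more than two children would see its inputs reordered if it relied on this pattern.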
I will run these and report my findings here.
Thank you @ozankabak -- I went through this PR carefully and I think it is a nice improvement in general. I am in the process of running the sql_planner benchmarks, but I think this is an improvement in code quality, so even if that doesn't show any measurable benefit I still think we should merge this PR.
  ) -> Result<LogicalPlan> {
      match self {
          // Since expr may be different than the previous expr, schema of the projection
          // may change. We need to use try_new method instead of try_new_with_schema method.
          LogicalPlan::Projection(Projection { .. }) => {
-             Projection::try_new(expr, Arc::new(inputs[0].clone()))
+             Projection::try_new(expr, Arc::new(inputs.swap_remove(0)))
I think swap_remove on a one element vector is basically the same as pop().unwrap()
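A small sketch of that equivalence, using hypothetical helper names. Both calls move the single element out without cloning; the difference is only in how an empty vector is handled.

```rust
/// Takes the only element via `swap_remove(0)`; panics if `v` is empty.
/// (Hypothetical helper for illustration only.)
fn take_only_swap_remove<T>(mut v: Vec<T>) -> T {
    v.swap_remove(0)
}

/// Takes the last element via `pop()`; returns `None` if `v` is empty,
/// so the caller decides whether to `unwrap()` or report an error.
fn take_only_pop<T>(mut v: Vec<T>) -> Option<T> {
    v.pop()
}
```

On a one-element vector both return the same value, so `pop()` mainly buys an explicit `Option` where the empty case can be turned into a planner error instead of a panic.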
@@ -752,8 +752,8 @@ impl LogicalPlan {
              }).collect::<Result<Vec<(Expr, Expr)>>>()?;

              Ok(LogicalPlan::Join(Join {
-                 left: Arc::new(inputs[0].clone()),
-                 right: Arc::new(inputs[1].clone()),
+                 left: Arc::new(inputs.swap_remove(0)),
I think this is correct, but it relies on there being exactly 2 inputs, which seems like a reasonable assumption at this point.
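If the "exactly 2 inputs" assumption ever needs to be checked rather than assumed, one possible sketch is to convert the vector into a fixed-size array, which the standard library supports via `TryFrom<Vec<T>> for [T; N]`. The function name `into_pair` and the `String` error type are assumptions for illustration; the PR itself uses `swap_remove`.

```rust
use std::convert::TryInto;

/// Splits an owned vector into `(left, right)` without cloning.
/// Returns an error if the vector does not hold exactly two elements.
/// (Hypothetical helper; the error type is a placeholder.)
fn into_pair<T>(inputs: Vec<T>) -> Result<(T, T), String> {
    let n = inputs.len();
    // The conversion fails and hands the vector back if the length is not 2.
    let [left, right]: [T; 2] = inputs
        .try_into()
        .map_err(|_| format!("expected exactly 2 inputs, got {}", n))?;
    Ok((left, right))
}
```

This keeps the original order and makes the arity check explicit, at the cost of a slightly more verbose call site.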
  };
  if !join_conditions.is_empty() {
      new_exprs.push(join_conditions.into_iter().reduce(Expr::and).unwrap());
      let mut exprs = join_plan.expressions();
that is certainly a nice cleanup too 👍
Speaking of a neater way, nothing comes to mind tbh. Perhaps we can play with mem::take or mem::replace; it won't drain the vector, but it would eliminate a clone here. It would require some stub on LogicalPlan, so I'm not sure it's worth it.
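A rough sketch of the `mem::take` idea, using a toy `Plan` enum as a stand-in for `LogicalPlan`. The `Plan::Empty` variant plays the role of the "stub" mentioned above; both the enum and the helper are hypothetical and not DataFusion types.

```rust
use std::mem;

/// Toy stand-in for a plan node (hypothetical, not DataFusion's type).
#[derive(Debug, PartialEq)]
enum Plan {
    /// Placeholder left behind after a node is taken out.
    Empty,
    Scan(String),
}

impl Default for Plan {
    fn default() -> Self {
        Plan::Empty
    }
}

/// Moves the first input out of the slice without cloning it,
/// leaving `Plan::Empty` in its place. Panics if `inputs` is empty.
fn take_first(inputs: &mut [Plan]) -> Plan {
    mem::take(&mut inputs[0])
}
```

The vector keeps its length and order, which avoids the reordering concern, but every taken slot now holds a placeholder that later code must not treat as a real plan.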
There's also this: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9f1c30995ed62e56694195d6edb14db4
The iterator returned by vec.into_iter() yields owned values.
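Along the same lines, a minimal sketch of pulling owned children out of a vector with `into_iter()`. The `Node` type deliberately does not implement `Clone`, to show that the values are moved. `Node` and `first_two` are hypothetical names, and this is not the content of the linked playground.

```rust
/// Toy node type without `Clone` (hypothetical, for illustration only).
#[derive(Debug, PartialEq)]
struct Node(String);

/// Moves the first two elements out of `inputs`, in their original order.
/// Returns `None` if fewer than two elements are present.
fn first_two(inputs: Vec<Node>) -> Option<(Node, Node)> {
    let mut iter = inputs.into_iter();
    let left = iter.next()?;
    let right = iter.next()?;
    Some((left, right))
}
```

Unlike repeated `swap_remove(0)` calls, the iterator preserves order for any number of elements; any remaining elements are dropped when the iterator goes out of scope.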
Here are the results of running the planning benchmark:
Command: cargo bench --bench sql_planner
I compared to 92104a5
Results show a small but positive improvement (0 - 3%) ✅
Thank you for all the reviews! I will keep looking at the logical planning code for further improvements. Also, if we find out neater ways than
Which issue does this PR close?
Closes #.
Rationale for this change
Investigating why the logical planner seems to run out of available stack depth, I saw that recursive cloning is unfortunately quite prevalent in this part of the codebase.
What changes are included in this PR?
Some internal helper function signatures, their implementations, and the with_new_exprs API are changed to take owned LogicalPlan objects to avoid unnecessary recursive cloning.
Are these changes tested?
Yes, by existing tests (no new features are added).
Are there any user-facing changes?
The with_new_exprs API is slightly changed from taking in a slice (which results in unnecessary recursive cloning) to an owned vector.
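The shape of that signature change can be sketched with a toy plan type. Everything below (`Plan`, `filter_from_slice`, `filter_from_vec`) is hypothetical and much simpler than the real `LogicalPlan` and `with_new_exprs`; it only shows why a slice forces a clone while an owned vector allows a move.

```rust
use std::sync::Arc;

/// Toy stand-in for a plan tree (hypothetical, not DataFusion's type).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Leaf(String),
    Filter { input: Arc<Plan> },
}

/// Slice-based version: the callee only borrows the inputs, so it has to
/// clone the child to build a node that owns it.
fn filter_from_slice(inputs: &[Plan]) -> Plan {
    Plan::Filter { input: Arc::new(inputs[0].clone()) }
}

/// Owned-vector version: the callee owns the inputs and can move the child
/// into the new node without cloning. Panics if `inputs` is empty.
fn filter_from_vec(mut inputs: Vec<Plan>) -> Plan {
    Plan::Filter { input: Arc::new(inputs.swap_remove(0)) }
}
```

Both versions build the same node; the difference is that callers of the owned version hand over their vector, which is the user-facing change described above.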