
Remove some recursive cloning from logical planning #9050

Merged
1 commit merged into apache:main on Jan 30, 2024

Conversation

ozankabak
Contributor

Which issue does this PR close?

Closes #.

Rationale for this change

Investigating why the logical planner seems to run out of available stack depth, I saw that recursive cloning is unfortunately quite prevalent in this part of the codebase.

What changes are included in this PR?

Some internal helper function signatures, their implementations, and the with_new_exprs API are changed to take owned LogicalPlan objects, avoiding unnecessary recursive cloning.

Are these changes tested?

Yes, by existing tests (no new features are added).

Are there any user-facing changes?

The with_new_exprs API is slightly changed: it now takes an owned vector instead of a slice (which forced unnecessary recursive cloning).
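
As background for the signature change described above, here is a minimal sketch of the difference between a slice-based and an owned-vector API; the Node type and function names are made up for illustration and are not DataFusion's actual types:

```rust
// Hypothetical stand-ins for LogicalPlan and with_new_exprs, for illustration only.
#[derive(Clone, Debug)]
enum Node {
    Leaf(i64),
    Branch(Vec<Node>),
}

// Slice-based signature: the callee can only borrow the children, so rebuilding
// the parent forces a deep (recursive) clone of every subtree.
fn with_new_children_borrowed(children: &[Node]) -> Node {
    Node::Branch(children.to_vec()) // recursive clone
}

// Owned signature: the caller hands over the Vec and the children are simply
// moved into the new parent; no cloning happens.
fn with_new_children_owned(children: Vec<Node>) -> Node {
    Node::Branch(children)
}

fn main() {
    let kids = vec![Node::Leaf(1), Node::Branch(vec![Node::Leaf(2)])];
    let cloned = with_new_children_borrowed(&kids); // kids still usable, but copied
    let moved = with_new_children_owned(kids);      // kids consumed, zero copies
    println!("{cloned:?} {moved:?}");
}
```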

@github-actions bot added the logical-expr, physical-expr, optimizer, and core labels on Jan 29, 2024
@github-actions bot removed the physical-expr and core labels on Jan 29, 2024
@ozankabak added the api change label on Jan 29, 2024
@ozankabak requested a review from alamb on January 29, 2024 14:01
@ozankabak marked this pull request as ready for review on January 29, 2024 14:01
@simonvandel
Contributor

It would be interesting to see what the effects are on the planning benchmarks in sql_planning.

@comphead (Contributor) left a comment

Thanks @ozankabak
My one small concern with this is https://doc.rust-lang.org/std/vec/struct.Vec.html#method.swap_remove:

Removes an element from the vector and returns it.

The removed element is replaced by the last element of the vector.

This does not preserve ordering, but is O(1). If you need to preserve the element order, use remove instead.

So we probably need to keep in mind that the collection can be in a different order after swap_remove, which may cause surprises?

@ozankabak
Contributor Author

We basically drain the vector so there shouldn't be any leftover collection if I am not misunderstanding your concern (we make N calls to swap_remove when dealing with an operator with N children).

I would also prefer a neater way to drain the vector instead of making swap_remove calls, let me know if you have any suggestions.
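
As an aside for readers following the thread, a standalone illustration (plain values rather than LogicalPlan objects) of what draining via repeated swap_remove(0) does: with exactly two elements the values come out in their original order, while with more elements the drain order differs from insertion order:

```rust
fn main() {
    // Two children (the common join case): the values still come out in order.
    let mut two = vec!["left", "right"];
    assert_eq!(two.swap_remove(0), "left");  // vec is now ["right"]
    assert_eq!(two.swap_remove(0), "right"); // vec is now empty

    // Four elements: the last element is swapped into slot 0 on every call,
    // so the drain order is 0, 3, 2, 1 rather than 0, 1, 2, 3.
    let mut four = vec![0, 1, 2, 3];
    let mut drained = Vec::new();
    while !four.is_empty() {
        drained.push(four.swap_remove(0));
    }
    assert_eq!(drained, vec![0, 3, 2, 1]);
}
```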

@alamb
Contributor

alamb commented Jan 29, 2024

It would be interesting to see what the effects are on the planning benchmarks in sql_planning.

I will run these and report my findings here

@alamb (Contributor) left a comment

Thank you @ozankabak -- I went through this PR carefully and I think it is a nice improvement in general. I am in the process of running the sql_planner benchmarks, but since this is an improvement in code quality, I think we should merge this PR even if the benchmarks don't show any measurable benefit.

    ) -> Result<LogicalPlan> {
        match self {
            // Since expr may be different than the previous expr, schema of the projection
            // may change. We need to use try_new method instead of try_new_with_schema method.
            LogicalPlan::Projection(Projection { .. }) => {
-               Projection::try_new(expr, Arc::new(inputs[0].clone()))
+               Projection::try_new(expr, Arc::new(inputs.swap_remove(0)))

I think swap_remove on a one-element vector is basically the same as pop().unwrap().
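
A quick sanity check of that equivalence, using toy values rather than the actual plan inputs:

```rust
fn main() {
    let mut a = vec![42];
    let mut b = vec![42];
    // On a one-element vector both calls remove the only element in O(1)
    // and leave the vector empty.
    assert_eq!(a.swap_remove(0), b.pop().unwrap());
    assert!(a.is_empty() && b.is_empty());
}
```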

@@ -752,8 +752,8 @@ impl LogicalPlan {
            }).collect::<Result<Vec<(Expr, Expr)>>>()?;

            Ok(LogicalPlan::Join(Join {
-               left: Arc::new(inputs[0].clone()),
-               right: Arc::new(inputs[1].clone()),
+               left: Arc::new(inputs.swap_remove(0)),

I think this is correct, but relies on there being exactly 2 inputs which seems like a reasonable assumption at this point

        };
        if !join_conditions.is_empty() {
            new_exprs.push(join_conditions.into_iter().reduce(Expr::and).unwrap());
            let mut exprs = join_plan.expressions();

that is certainly a nice cleanup too 👍

@comphead
Contributor

We basically drain the vector so there shouldn't be any leftover collection if I am not misunderstanding your concern (we make N calls to swap_remove when dealing with an operator with N children).

I would also prefer a neater way to drain the vector instead of making swap_remove calls, let me know if you have any suggestions.

What comes to my mind: if we have let x = vec![0, 1, 2, 3]
and call x.swap_remove(0), the remaining collection can be either vec![3, 1, 2] or vec![3, 2, 1]. So I'm wondering if it can be an issue, having a potentially random element on subsequent x.swap_remove(0) calls? 🤔

Speaking of a neater way, nothing comes to mind tbh; perhaps we can play with mem::take or mem::replace. It won't drain the vector, but it would eliminate a clone here. It would require some stub on LogicalPlan though, so I am not sure it is worth it:

                Projection::try_new(expr, Arc::new(std::mem::replace(&mut inputs[0], LogicalPlan::Empty)))
                    .map(LogicalPlan::Projection)
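
To make the suggestion concrete, here is a self-contained sketch of the mem::replace idea under the assumption mentioned above, i.e. that some cheap placeholder variant exists; Plan and its Empty variant are hypothetical stand-ins, not DataFusion's actual LogicalPlan:

```rust
use std::mem;

// Toy stand-in for LogicalPlan with a cheap placeholder variant.
#[derive(Debug, Default)]
enum Plan {
    #[default]
    Empty,
    Projection(Vec<String>),
}

fn take_first_input(inputs: &mut Vec<Plan>) -> Plan {
    // Move the first child out and leave the cheap stub behind, no clone needed.
    mem::replace(&mut inputs[0], Plan::Empty)
    // Because Plan implements Default, mem::take(&mut inputs[0]) is equivalent.
}

fn main() {
    let mut inputs = vec![Plan::Projection(vec!["a".to_string()])];
    let owned = take_first_input(&mut inputs);
    println!("took {owned:?}, left behind {:?}", inputs[0]);
}
```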

@simonvandel
Contributor

What comes to my mind: if we have let x = vec![0, 1, 2, 3]
and call x.swap_remove(0), the remaining collection can be either vec![3, 1, 2] or vec![3, 2, 1]. So I'm wondering if it can be an issue, having a potentially random element on subsequent x.swap_remove(0) calls? 🤔

There's also this https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9f1c30995ed62e56694195d6edb14db4

The iterator returned by vec.into_iter() returns owned values.
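
Presumably the playground link shows something along these lines; a toy version (not the actual planner code) of moving owned children out of a Vec via into_iter():

```rust
#[derive(Debug)]
struct Child(u32);

fn main() {
    let inputs = vec![Child(1), Child(2)];

    // into_iter() consumes the Vec and yields owned values, so both children
    // are moved out without cloning and without reordering the collection.
    let mut iter = inputs.into_iter();
    let left = iter.next().expect("expected a left child");
    let right = iter.next().expect("expected a right child");

    println!("left = {left:?}, right = {right:?}");
}
```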

@alamb
Contributor

alamb commented Jan 29, 2024

Here are the results of running the planning benchmark:

Command:

cargo bench --bench sql_planner

I compared to 92104a5

Results show a small but positive improvement (0 - 3%) ✅

logical_select_one_from_700
                        time:   [2.1687 ms 2.1705 ms 2.1725 ms]
                        change: [-1.1189% -0.9112% -0.7395%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

physical_select_one_from_700
                        time:   [17.260 ms 17.280 ms 17.308 ms]
                        change: [+0.0368% +0.1677% +0.3832%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

Benchmarking logical_trivial_join_low_numbered_columns: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.3s, enable flat sampling, or reduce sample count to 60.
logical_trivial_join_low_numbered_columns
                        time:   [1.0610 ms 1.0630 ms 1.0651 ms]
                        change: [-0.3738% -0.0743% +0.2045%] (p = 0.61 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

Benchmarking logical_trivial_join_high_numbered_columns: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.5s, enable flat sampling, or reduce sample count to 60.
logical_trivial_join_high_numbered_columns
                        time:   [1.0949 ms 1.0966 ms 1.0984 ms]
                        change: [-1.5726% -0.8182% -0.3244%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

Benchmarking logical_aggregate_with_join: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.9s, enable flat sampling, or reduce sample count to 50.
logical_aggregate_with_join
                        time:   [1.5551 ms 1.5570 ms 1.5592 ms]
                        change: [-0.6869% -0.4593% -0.2266%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch      time:   [4.6632 ms 4.6695 ms 4.6766 ms]
                        change: [-3.8696% -3.4432% -3.1385%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  7 (7.00%) high mild
  3 (3.00%) high severe

@ozankabak
Contributor Author

ozankabak commented Jan 30, 2024

Thank you for all the reviews! I will keep looking at the logical planning code for further improvements. Also, if we find a neater way than swap_remove, we will improve it with a quick follow-on PR.

@ozankabak ozankabak merged commit ee9736f into apache:main Jan 30, 2024
22 checks passed