Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract CSE logic to datafusion_common #13002

Merged
merged 3 commits into from
Oct 21, 2024
Merged

Conversation

peter-toth
Copy link
Contributor

@peter-toth peter-toth commented Oct 18, 2024

Which issue does this PR close?

Part of #12599.

Rationale for this change

This PR moves common subexpression elimination logic to datafusion_common to be able to share it with physical expressions.

What changes are included in this PR?

This PR:

  • Moves CSE logic to datafusion_common::cse. The logic is splited to CSE struct, that does the elimination of common subtrees and CSEController trait, that can be implemented for different TreeNode types.
    The PR contains 2 implementations.ExprCSEController can be used to eliminate Expr common subtrees, and TestTreeNodeCSEController is used with TestTreeNodes in tests. A new PhysicalExprCSEController will be added in a follow-up PR to eliminate common dyn PhysicalExpr subtrees.
  • Extracts the HashNode trait to be able to implement it for dyn PhysicalExpr in a folllow-up PR.
  • Makes TestTreeNode crate public for CSE tests.

Are these changes tested?

Yes, with existing but refactored tests.

Are there any user-facing changes?

No.

@github-actions github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules common Related to common crate labels Oct 18, 2024
@peter-toth
Copy link
Contributor Author

cc @alamb

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @peter-toth -- this looks really nice to me. I have a few comment suggestions, and then I would like to run benchmarks to make sure this refactor doesn't accidentally slow down planning

But otherwise this looks great to me

use std::hash::Hasher;
use std::sync::Arc;

pub trait HashNode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we please add some documentation to this trait about its purpose?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I've added it in 1cae61c.


impl HashNode for Expr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

datafusion/common/src/cse.rs Outdated Show resolved Hide resolved
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to add some documentation explaining what is in the module and what the main entry points are/

Something like

Suggested change
//! Common Subexpression Elimination logic: [`SubTreeNodeEliminator`]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, good idea. Added i documentation in 1cae61c.

@alamb
Copy link
Contributor

alamb commented Oct 18, 2024

FYI @andygrove -- CSE on physical plans is happening

@alamb
Copy link
Contributor

alamb commented Oct 20, 2024

I ran the benchmarks and there is no change in performance 🚀

++ critcmp main extract-cse-logic
group                                         extract-cse-logic                      main
-----                                         -----------------                      ----
logical_aggregate_with_join                   1.01  1420.8±40.68µs        ? ?/sec    1.00  1408.6±14.31µs        ? ?/sec
logical_plan_tpcds_all                        1.00    194.6±1.41ms        ? ?/sec    1.00    194.1±2.83ms        ? ?/sec
logical_plan_tpch_all                         1.00     24.0±0.24ms        ? ?/sec    1.01     24.1±0.30ms        ? ?/sec
logical_select_all_from_1000                  1.01      5.4±0.06ms        ? ?/sec    1.00      5.3±0.05ms        ? ?/sec
logical_select_one_from_700                   1.00  1130.2±22.28µs        ? ?/sec    1.00  1127.6±13.80µs        ? ?/sec
logical_trivial_join_high_numbered_columns    1.00  1091.3±34.33µs        ? ?/sec    1.00  1094.4±27.40µs        ? ?/sec
logical_trivial_join_low_numbered_columns     1.01  1073.2±28.06µs        ? ?/sec    1.00  1065.6±17.13µs        ? ?/sec
physical_plan_tpcds_all                       1.01  1237.2±21.72ms        ? ?/sec    1.00  1228.5±13.18ms        ? ?/sec
physical_plan_tpch_all                        1.00     80.3±0.50ms        ? ?/sec    1.00     80.5±0.72ms        ? ?/sec
physical_plan_tpch_q1                         1.02      3.0±0.05ms        ? ?/sec    1.00      2.9±0.02ms        ? ?/sec
physical_plan_tpch_q10                        1.03      4.2±0.10ms        ? ?/sec    1.00      4.1±0.09ms        ? ?/sec
physical_plan_tpch_q11                        1.02      3.5±0.07ms        ? ?/sec    1.00      3.5±0.05ms        ? ?/sec
physical_plan_tpch_q12                        1.01      2.8±0.04ms        ? ?/sec    1.00      2.8±0.02ms        ? ?/sec
physical_plan_tpch_q13                        1.00      2.2±0.04ms        ? ?/sec    1.02      2.3±0.04ms        ? ?/sec
physical_plan_tpch_q14                        1.00      2.5±0.02ms        ? ?/sec    1.02      2.6±0.06ms        ? ?/sec
physical_plan_tpch_q16                        1.00      3.5±0.08ms        ? ?/sec    1.00      3.6±0.07ms        ? ?/sec
physical_plan_tpch_q17                        1.00      3.2±0.03ms        ? ?/sec    1.01      3.2±0.04ms        ? ?/sec
physical_plan_tpch_q18                        1.02      3.8±0.08ms        ? ?/sec    1.00      3.7±0.03ms        ? ?/sec
physical_plan_tpch_q19                        1.00      5.1±0.07ms        ? ?/sec    1.00      5.1±0.05ms        ? ?/sec
physical_plan_tpch_q2                         1.01      6.7±0.14ms        ? ?/sec    1.00      6.7±0.04ms        ? ?/sec
physical_plan_tpch_q20                        1.00      4.1±0.04ms        ? ?/sec    1.00      4.1±0.04ms        ? ?/sec
physical_plan_tpch_q21                        1.00      5.3±0.03ms        ? ?/sec    1.00      5.4±0.03ms        ? ?/sec
physical_plan_tpch_q22                        1.00      3.1±0.03ms        ? ?/sec    1.04      3.3±0.10ms        ? ?/sec
physical_plan_tpch_q3                         1.01      3.1±0.07ms        ? ?/sec    1.00      3.0±0.05ms        ? ?/sec
physical_plan_tpch_q4                         1.00      2.3±0.03ms        ? ?/sec    1.00      2.3±0.05ms        ? ?/sec
physical_plan_tpch_q5                         1.00      4.1±0.07ms        ? ?/sec    1.01      4.2±0.09ms        ? ?/sec
physical_plan_tpch_q6                         1.01  1685.3±43.27µs        ? ?/sec    1.00  1667.0±27.51µs        ? ?/sec
physical_plan_tpch_q7                         1.00      5.2±0.04ms        ? ?/sec    1.02      5.3±0.13ms        ? ?/sec
physical_plan_tpch_q8                         1.02      6.4±0.17ms        ? ?/sec    1.00      6.3±0.12ms        ? ?/sec
physical_plan_tpch_q9                         1.04      5.1±0.15ms        ? ?/sec    1.00      4.9±0.03ms        ? ?/sec
physical_select_all_from_1000                 1.01     40.6±0.19ms        ? ?/sec    1.00     40.1±0.89ms        ? ?/sec
physical_select_one_from_700                  1.00      3.6±0.03ms        ? ?/sec    1.00      3.6±0.03ms        ? ?/sec

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @peter-toth

I do think some of the suggestions I had for comments would help this code, but I also think we could do it as a follow on PR

@peter-toth
Copy link
Contributor Author

Thanks @alamb for the benchmark, I will address your comments tomorrow in this PR.

… shorter names for eliminator and controller, change `CSE::extract_common_nodes()` to return `Result<FoundCommonNodes<N>>` (instead of `Result<Transformed<FoundCommonNodes<N>>>`)
# Conflicts:
#	datafusion-cli/Cargo.lock
@peter-toth
Copy link
Contributor Author

Thank you @peter-toth

I do think some of the suggestions I had for comments would help this code, but I also think we could do it as a follow on PR

I've fixed a few things in 1cae61c and addressed your comments.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @peter-toth

@@ -108,7 +131,8 @@ type CommonNodes<'n, N> = IndexMap<Identifier<'n, N>, (N, String)>;

type ChildrenList<N> = (Vec<N>, Vec<N>);

pub trait SubTreeNodeEliminatorController {
/// The [`TreeNode`] specific definition of elimination.
pub trait CSEController {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for the new name

@alamb alamb merged commit 45a316c into apache:main Oct 21, 2024
26 checks passed
@peter-toth
Copy link
Contributor Author

Thanks for the review @alamb!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
common Related to common crate logical-expr Logical plan and expressions optimizer Optimizer rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants