Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consolidate TreeNode transform and rewrite APIs #8891

Merged
merged 54 commits into from
Mar 4, 2024

Conversation

peter-toth
Copy link
Contributor

@peter-toth peter-toth commented Jan 17, 2024

Which issue does this PR close?

Part of #8913

Rationale for this change

  1. The TreeNode.rewrite() / TreeNodeRewriter is very inconsistent with other TreeNode apply / visit functions:

    • TreeNodeRewriter.pre_visit() currently returns RewriteRecursion so as to control the recursion but that enum is very inconsistent with the VisitRecursion enum that visit functions use.
    • RewriteRecursion::Stop functionality is equal to VisitRecursion::Skip, that is to skip (prune) the subtree (don't recurse into children and don't run post-order functions).
    • RewriteRecursion::Skip doesn't prevent recursing into children. This is different to VisitRecursion::Skip where recursion into children is not executed.
      VisitRecursion::Stop fully stops recursion which functionality isn't available with RewriteRecursion.
    • The transformation defined in TreeNodeRewriter.mutate() is sometimes a top-down (pre-order) but other times a bottom-up (post-order) function depending on TreeNodeRewriter.pre_visit() result.

    This PR removes RewriteRecursion and renames VisitRecursion to a common TreeNodeRecursion to control all (apply / visit / rewrite / transform) APIs. Previous VisitRecursion::Skip got renamed to Jump and its functionality is now extended to bottom-up and combined traversals as follows:

    pub enum TreeNodeRecursion {
        /// Continue recursion with the next node.
        Continue,
    
        /// In top-down traversals, skip recursing into children but continue with the next
        /// node, which actually means pruning of the subtree.
        ///
        /// In bottom-up traversals, bypass calling bottom-up closures till the next leaf node.
        ///
        /// In combined traversals, if it is "f_down" (pre-order) phase, execution "jumps" to
        /// next "f_up" (post_order) phase by shortcutting its children. If it is "f_up" (pre-order)
        /// phase, execution "jumps" to next "f_down" (pre_order) phase by shortcutting its parent
        /// nodes until the first parent node having unvisited children path.
        Jump,
    
        /// Stop recursion.
        Stop,
    }

    and changes TreeNodeRewriter to incorporate an f_down() and a nf_up() methods that both can change the node and both can use the common Transformed struct (containing a TreeNodeRecursion enum) to control how to proceed with recursion:

    pub trait TreeNodeRewriter: Sized {
        /// The node type which is rewritable.
        type Node: TreeNode;
    
        /// Invoked while traversing down the tree before any children are rewritten.
        /// Default implementation returns the node unmodified and continues recursion.
        fn f_down(&mut self, node: Self::Node) -> Result<Transformed<Self::Node>> {
            Ok(Transformed::no(node))
        }
    
        /// Invoked while traversing up the tree after all children have been rewritten.
        /// Default implementation returns the node unmodified.
        fn f_up(&mut self, node: Self::Node) -> Result<Transformed<Self::Node>> {
            Ok(Transformed::no(node))
        }
    }

    This solution is capable to replace all RewriteRecursion functionalities and make the TreeNodeRewriters not just cleaner but more versatile.

  2. This PR adds a new transform_down_up() method as a short form of rewrite() / TreeNodeRewriter to provide a combined traversal:

    fn transform_down_up<FD, FU>(
        self,
        f_down: &mut FD,
        f_up: &mut FU,
    ) -> Result<Transformed<Self>>
    where
        FD: FnMut(Self) -> Result<Transformed<Self>>,
        FU: FnMut(Self) -> Result<Transformed<Self>>,
    • Sometimes we don't need a full blown TreeNodeRewriter but defining 2 closures for top-down and bottom-up transformations is just enough.
  3. This PR changes the Transformed enum the following struct:

    pub struct Transformed<T> {
        pub data: T,
        pub transformed: bool,
        pub tnr: TreeNodeRecursion,
    }
  4. This PR modifies the transform_down() and transform_up() methods to use the new Transformed struct.

  5. This PR modifies the TreeNodeVisitor trait to have its f_down() and f_up() alligned with TreeNodeRewriter trait.

Please note that this PR implements 2. from #7942 and fixes the Transformed enum issue: #8835

What changes are included in this PR?

This PR:

  • Removes RewriteRecursion,
  • Renames VisitRecursion to TreeNodeRecursion. The previous Skip element is renamed to Jump, its top-down functionality is adjusted and its bottom-up functionality is added.
  • Modifies Transformed enum to a new struct.
  • Modifies existing TreeNode transform methods to use the newTransformed struct.
  • Modifies TreeNodeRewriter to incorporate an f_down() and f_up() methods and use Transformed / TreeNodeRecursion,
  • Adds a new TreeNode.transform_down_up() method.
  • Adjusts TreeNodeVisitor to have f_down() and f_up() methods aligned with TreeNodeRewriter.
  • Alligns all apply / visit / transform / rewrite APIs to use f, f_down, f_up closure parameter names.

Are these changes tested?

Existng tests.

Are there any user-facing changes?

Yes.

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Physical Expressions optimizer Optimizer rules core Core DataFusion crate labels Jan 17, 2024
@peter-toth peter-toth changed the title Refactor TreeNode::rewrite() Refactor TreeNode.rewrite() Jan 17, 2024
@peter-toth
Copy link
Contributor Author

cc @alamb, @andygrove, @yahoNanJing, @berkaysynnada this PR is another step to cleaner TreeNode APIs.

@berkaysynnada
Copy link
Contributor

This PR:

Removes RewriteRecursion,
Renames VisitRecursion to TreeNodeRecursion,
Modifies TreeNodeRewriter to incorporate an f_down and f_up methods and use TreeNodeRecursion

These changes seem reasonable to me and, since they propose a similar approach to other recursive flows, they make understanding easier.

However, I am not sure about this change:

Modifies TreeNode.transform() to accept an f_down and f_up closures and use TreeNodeRecursion.

transform() is a simple method for traversing the graph with given self-concerned rules, being not interested in the type of traversal. This new version forces awareness of the traversal order, and there is also an implicit convention of traversing first bottom-up, then top-down.

@peter-toth
Copy link
Contributor Author

peter-toth commented Jan 17, 2024

Thanks for the feedback @berkaysynnada.

transform() is a simple method for traversing the graph with given self-concerned rules, being not interested in the type of traversal. This new version forces awareness of the traversal order, and there is also an implicit convention of traversing first bottom-up, then top-down.

I feel that a simple (simpler than than TreeNodeRewriter) API that is also capable to do both direction transformation in 1 traversal would be useful but I agree, it doesn't necessarily need to replace the existing transform().

Also, just to add a bit more details to this part of the PR: While I was local testing this PR I accidentally changed all transform() to transform_down() (so not to its current aliased transform_up() implementation) and run into stack overflow errors in some tests. This suggested me that despite transform() should be used only with closures that don't require any specific direction, some invocation do require transform to do its currently defined bottom-up traversal. I'm sure that most of the transform() usescases are direction agnostic, but I feel it would be less error prone if we required explicit direction from API users.

and there is also an implicit convention of traversing first bottom-up, then top-down.

I'm not sure I get this part.

@berkaysynnada
Copy link
Contributor

berkaysynnada commented Jan 17, 2024

I'm not sure I get this part.

The docstring you have updated for transform() states that the order is:

1) f_down(ParentNode)
2) f_down(ChildNode1)
3) f_up(ChildNode1)
4) f_down(ChildNode2)
5) f_up(ChildNode2)
6) f_up(ParentNode)

I meant what if we intend to do such:

1) f_up(ChildNode1)
2) f_up(ChildNode2)
3) f_up(ParentNode)
4) f_down(ParentNode)
5) f_down(ChildNode1)
6) f_down(ChildNode2)

If we want to offer such a use, I think we may need to consider this way as well.

@peter-toth
Copy link
Contributor Author

peter-toth commented Jan 17, 2024

I'm not sure I get this part.

The docstring you have updated for transform() states that the order is:

1) f_down(ParentNode)
2) f_down(ChildNode1)
3) f_up(ChildNode1)
4) f_down(ChildNode2)
5) f_up(ChildNode2)
6) f_up(ParentNode)

This is correct. This is because the proposed new transform() does only one traversal. It is exactly the same order how the old and new TreeNode::rewrite / TreeNodeRewriter worked and works.

If someone needs f_down to process all nodes atop-down nd then f_up to process all nodes bottom-up (or vice versa) then they need 2 traversals with subsequent explicit transform_down() / transform_up() calls.

@berkaysynnada
Copy link
Contributor

This is correct. This is because the proposed new transform() does only one traversal. It is exactly the same order how the old and new TreeNode::rewrite / TreeNodeRewriter worked and works.

I checked again. Yes, the order is correct. What I really meant is that when we use transform(), it is assumed that we first pass from top to bottom and then from bottom to top. But if we want to do the opposite, i.e. first from bottom to top and then from top to bottom, we cannot use this transform(), am I right?.

Since the nodes access their children, first top-down and then bottom-up traversals seem to be a single pass, doing it in reverse order is like 2 passes, but as a behavior, it may be desirable to have reverse order and should we provide a way for this?

@peter-toth
Copy link
Contributor Author

This is correct. This is because the proposed new transform() does only one traversal. It is exactly the same order how the old and new TreeNode::rewrite / TreeNodeRewriter worked and works.

I checked again. Yes, the order is correct. What I really meant is that when we use transform(), it is assumed that we first pass from top to bottom and then from bottom to top. But if we want to do the opposite, i.e. first from bottom to top and then from top to bottom, we cannot use this transform(), am I right?.

That's correct. I don't think that botom-up and then top-down can't be achieved in one run.

Since the nodes access their children, first top-down and then bottom-up traversals seem to be a single pass, doing it in reverse order is like 2 passes, but as a behavior, it may be desirable to have reverse order and should we provide a way for this?

API users can explicitly call transform_up() and then transform_down() (2 pass) if this is needed. Do you have a real use case for this?

@berkaysynnada
Copy link
Contributor

API users can explicitly call transform_up() and then transform_down() (2 pass) if this is needed. Do you have a real use case for this?

Since we set out to simplify and provide easy to use TreeNode API and its related implementations as much as possible, IMO we need to reach a simple state at the first step. As far as I observed, there is no use for this version of transform(), and it mixes the VisitRecursion logic with transform logic (early return option is a newly introduced feature for transform() methods). It may make sense to add such things when they are needed.

@peter-toth
Copy link
Contributor Author

Since we set out to simplify and provide easy to use TreeNode API and its related implementations as much as possible, IMO we need to reach a simple state at the first step. As far as I observed, there is no use for this version of transform(), and it mixes the VisitRecursion logic with transform logic (early return option is a newly introduced feature for transform() methods). It may make sense to add such things when they are needed.

Although I used the new transform only once in my PR to replace and simplify a rewrite() (https://github.com/apache/arrow-datafusion/pull/8891/files#diff-f69e720234d9da6cb3a4a178d3a5575fcd34f68191335a8c2465a172e8c4e6f1R653) I plan to use it in the future. But I'm ok with droping 2. from this PR. Let's hear some more feedback on this.

@alamb
Copy link
Contributor

alamb commented Jan 18, 2024

Hi @peter-toth -- thank you for this PR and the various others that are trying to improve the quality of code in DataFusion

I think one thing that is challenging for me is to understand how to evaluate the benefits of this and related PRs in an objective manner.

This PR, for example, may improve the consistency of APIs (which I happen to think it does) but making this change will also come at a cost for DataFusion users:

  1. If they use the TreeNode API they will have to change their code on upgrade
  2. If they were familiar with the existing TreeNode API, they will have to learn the new API

For other breaking API changes, there are often other benefits that come with the cost, such as new features or improved performance.

I wonder if you can help us articulate how this (and the related refactoring PRs) help DataFusion users -- for example do they

  1. permit us to implement Expression or plan node rewrites more efficiently?
  2. Or will they allow us to consolidate / reduce code to reduce the maintenance burden to DataFusion maintainers

If we can connect this work to a benefit I suspect we would be able to drive it forward more quickly

@peter-toth
Copy link
Contributor Author

peter-toth commented Jan 19, 2024

Thanks for the question @alamb! I'm new to DataFusion so this discussion could help me a lot to understand what ways are viable in terms of improving the existing code. I came from a different open-source query engine but I think TreeNode APIs are fundamental for each engine. While I was getting familiar with DataFusion I stumbled upon quite a few inconsistencies and missing features regarding these APIs. I was trying to throw in some ideas with my PRs that could help improving the situation. Some of these PRs are simply trying to improve consistency some others are performance related.

Here are a few things I found weird with the current APIs. These can cause a newcommer (like me) some confusion:

  • The apply / visit functions are capable to prune (skip) subtrees or return immediately (stop) but the transform functions are not. To make it more confusion rewrite is also capable to to prune, but instead of using a common way to control recursion, visit and rewrite use different enums with conflicting semantics.
    See this PR (Consolidate TreeNode transform and rewrite APIs #8891) for details on VisitRecursion vs RewriteRecursion.

  • There is this Transformed enum whose purpose is to explicitely define if any change was made to a node. This enum is used in transform clousres but for some reason it is missing from rewrite. Moreover this explicit information is neither propogatged up to API callee nor it is used for detecting changes in nodes.
    See details in Remove Transformed enum #8835. I believe the current state simply doesn't make sense and just causes confusion.

  • rewrite also seems to be inconsistent with transform_up and transform_down. rewrite probably organically ended up in its current state and capable to do its current job, but what I would expect from such a function is that it simply incorporates 2 closures from transform_up and transform_down to define:

    • what transformation should be done before (pre-order) and after (post-order) transformation of the node's children
    • and how the recursion should continue .

    See this PR (Consolidate TreeNode transform and rewrite APIs #8891) for the details. IMO the current TreeNodeRewriter.pre_visit() seems like a mess and its logic is hard to grasp.

What features I miss from the current API:

  • Pruning capability of transform functions:
    Pruning would be very important as many of the tranformations wouldn't require traversing on the whole tree and could improve performance a lot.
  • Payload propagation:
    I think the the reason why there are so many (4 base + 9 derived) tree implementations (with many code duplications) in DataFusion is that there is no API to describe a transformation while also propagating state. In Transform with payload #8664 with the help of transform_with_payload I not just eleminated 7 derived trees but also improved some of the algorightms to require only one traversal (also a perf improvement).

So to recap my ideal TreeNode transformation API would look like this:

    /// Transforms the tree using `f_down` and `f_up` closures. `f_down` is applied on a
    /// node while traversing the tree top-down (pre-order, before the node's children are
    /// visited) while `f_up` is applied on a node while traversing the tree bottom-up
    /// (post-order, after the the node's children are visited).
    ///
    /// The `f_down` closure takes
    /// - a `PD` type payload from its parent
    /// and returns a tuple made of:
    /// - a possibly modified node,
    /// - a `PC` type payload to pass to `f_up`,
    /// - a `Vec<FD>` type payload to propagate down to its children
    ///    (one `FD` element is propagated down to each child),
    /// - a `TreeNodeRecursion` enum element to control recursion flow.
    ///
    /// The `f_up` closure takes
    /// - a `PC` type payload from `f_down` and
    /// - a `Vec<PU>` type payload collected from its children
    /// and returns a tuple made of:
    /// - a possibly modified node,
    /// - a `FU` type payload to propagate up to its parent,
    /// - a `TreeNodeRecursion` enum element to control recursion flow.
    fn transform_with_payload<FD, PD, PC, FU, PU>(
        self,
        f_down: &mut FD,
        payload_down: PD,
        f_up: &mut FU,
    ) -> Result<(Self, PU)>
    where
        FD: FnMut(Self, PD) -> Result<(Self, Vec<PD>, PC, TreeNodeRecursion)>,
        FU: FnMut(Self, PC, Vec<PU>) -> Result<(Self, PU, TreeNodeRecursion)>,

Obviously the closure return types can be extracted to a type alias or strucs like in the case of f_down could be:

pub struct TransformDownResult<N, PD, PC> {
    pub transformed_node: N,
    pub payload_to_children: Vec<PD>,
    pub payload_to_f_up: PC,
    pub recursion_control: TreeNodeRecursion,
}

All the other functions should be just a specialization of the above:

  • apply (visit_down): Only f_down closure that doesn't return a modified node and payload (only retrurns TreeNodeRecursion)
  • visit / TreeNodeVisitor: Both f_down and f_up closures needed. The closures can be incorporated into a TreeNodeVisitor object but they don't return a modified node and payload
  • transform_down: Only f_down closure that doesn't return payload
  • transform_up: Only f_up closure that doesn't return payload
  • transform: Both f_down and f_up closures needed, but they don't return payloads
  • rewrite / TreeNodeRewriter: Both f_down and f_up are incorporated into a TreeNodeRewriter object, but they don't return a payload
  • transform_down_with_payload: Only f_down closure
  • transform_up_with_payload: Only f_up closure

I believe all these PRs would take us forward to a simple but still versarile APIs that have/are:

Now, as you mentioned some of these proposed changes have conflicting semantics to current APIs. Honestly I can't decide how much turbulance they would cause and so I'm seeking feedback from you and the community. But please note that I don't insist on any of the above functionaly or namings. These are just ideas and I can imagine implementations where:

  • we don't touch any existing API (maybe deprecate old ones)
  • or modify existing APIs to make them consistent, but also keep old API alternatives to ease the transition.

@alamb
Copy link
Contributor

alamb commented Jan 19, 2024

I see -- thank you @peter-toth. Your explanation makes sense.

As a historical note, the current TreeNode API came out of unifying different in inconsistent APIs across the Exprs, LogicalPlan , ExectionPlan and PhysicalExprs. It was a large step forward at the time, but also just getting a single API was a major step forward.

Here are a few things I found weird with the current APIs. These can cause a newcommer (like me) some confusion:

I think this is a good insight -- basically that the rationale for this change is to make it easier for new contributors to contribute to DataFusion as they APIs are more uniform.

Now, as you mentioned some of these proposed changes have conflicting semantics to current APIs. Honestly I can't decide how much turbulance they would cause and so I'm seeking feedback from you and the community. But please note that I don't insist on any of the above functionaly or namings. These are just ideas and I can imagine implementations where:

we don't touch any existing API (maybe deprecate old ones)
or modify existing APIs to make them consistent, but also keep old API alternatives to ease the transition.

I think implementing the "one true API" as you have proposed above and then migrating the rest of the codebase to use it (possibly deprecating the old APIs or changing them one at at time) would be the least disruptive.

To that end, what I will do is to create a new ticket describing this work so it is not lost in this ticket, and then we can use that ticket to make the overall plan visible

@alamb alamb mentioned this pull request Jan 19, 2024
9 tasks
@alamb
Copy link
Contributor

alamb commented Jan 19, 2024

I filed #8913 -- let me know what you think. What do you think about creating a PR with transform_with_payload and then a PR showing how to use it to improve one of the existing passes (or implement TreeNode::rewrite in terms of it or something?)

@peter-toth
Copy link
Contributor Author

peter-toth commented Jan 20, 2024

I filed #8913 -- let me know what you think. What do you think about creating a PR with transform_with_payload and then a PR showing how to use it to improve one of the existing passes (or implement TreeNode::rewrite in terms of it or something?)

Thanks @alamb! I will share my thoughts regarding transform_with_payload there. Actually I wanted to wait with transform_with_payload (#8664) until @ozankabak finishes #8817 as those 2 PRs are conflicting (#8664 (comment)) so it is good that we have dedicated ticket to discuss the 2 solutions.

The API inconsitency of the rewrite API (see 1. in PR description) and the poposed new transform semantics (see 2.) don't requite transform_with_payload as a base API and this PR doesn't conflict with #8664 or #8817 so I think this PR is ready for review and feedback is welcome.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TLDR I think this PR is a significant step forward in getting a consistent API and the unification of pre_visit / post_visit -> up/down and a single TreeNodeRecursion is a significant improvement and I think we could merge this PR

Thank you for all the work on this PR @peter-toth

However I do not fully understand how the code in this PR and the code in #8817 from @ozankabak interact / conflict (anyone else following along can find details on #8913 (comment))

So basically I trust you two to come up with something good. I would be happy to merge this PR, or wait for #8817 and evaluate first.

Again, trying to improve things is very much apprecaited

datafusion/common/src/tree_node.rs Show resolved Hide resolved
@ozankabak
Copy link
Contributor

So basically I trust you two to come up with something good. I would be happy to merge this PR, or wait for #8817 and evaluate first.

Yes -- let's first get #8817 in and then I and @berkaysynnada will review and go through this in detail and the others in detail

@peter-toth
Copy link
Contributor Author

So basically I trust you two to come up with something good. I would be happy to merge this PR, or wait for #8817 and evaluate first.

Yes -- let's first get #8817 in and then I and @berkaysynnada will review and go through this in detail and the others in detail

Sounds good to me. I left a comment #8817 (comment) on #8817. It can help with further steps of the epic: #8913
Thanks @alamb, @ozankabak, @berkaysynnada.

@berkaysynnada
Copy link
Contributor

Thanks everyone. I will carefully review this PR after #8817 is merged and the related updates are done. Thank you again @peter-toth for your efforts!

@peter-toth peter-toth force-pushed the refactor-treenode-rewrite branch from f94e193 to 729c9d2 Compare January 28, 2024 07:58
@peter-toth
Copy link
Contributor Author

peter-toth commented Mar 1, 2024

I've updated the PR description and added .data() to Result<Transformed<T>>. Let me know if you have any more suggestions.

One thing came into my mind that we could keep transform() as an alias to transform_up() as the new combined traversal method is now called transform_down_up().

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks great and is ready to go. Are there any other comments prior to us merging this?

@ozankabak
Copy link
Contributor

I haven't gone through it yet - I will this weekend. I will let you guys know if I have any concerns. Given that I have been following, I don't expect anything major but I'd like to be prudent on this one as it is a somewhat foundational change.

@ozankabak
Copy link
Contributor

I am 80% done reviewing, should be finished tomorrow

Copy link
Contributor

@ozankabak ozankabak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @peter-toth for the great work. I went through all changes carefully and see nothing to hold this up.

I created a simplification PR, can you please merge it to this PR and then I will go ahead and merge it to main.

Two work items for follow-on PRs:

  1. The code in expr/src/tree_node/expr.rs became a little too complex for n-ary (binary and above) expressions. Can you think of a way to simplify that code?
  2. I agree with @alamb that rewrite().data().rewrite().data()....rewrite().data() chain looks a little weird. It would be great if we can find a neat way to simplify that flow. But this matters less than (1).

Just give me mention when you finalize by incorporating the simplification PR and I will merge quickly to avoid attracting merge conflicts (which I see happens somewhat frequently as this PR touches many files).

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Mar 4, 2024
@ozankabak
Copy link
Contributor

@alamb, this is now good to go from my perspective as well. Any reason to wait before merging?

@peter-toth
Copy link
Contributor Author

@ozankabak, thanks for your PR, I've just merged it.

I will try to think about those follow-ups. Expr.map_children() is indeed a bit too complex at first sight. Maybe we can simplify it with some clever macros...

@alamb
Copy link
Contributor

alamb commented Mar 4, 2024

@alamb, this is now good to go from my perspective as well. Any reason to wait before merging?

I don't think so -- this one has been opened for a long time and has had many chances to review. 🚀

Thanks again @peter-toth @berkaysynnada @ozankabak and everyone else who participated in the discussion

@alamb alamb merged commit a84e5f8 into apache:main Mar 4, 2024
24 checks passed
@alamb
Copy link
Contributor

alamb commented Mar 4, 2024

Epic work

@peter-toth
Copy link
Contributor Author

Thanks for the review and feedback @berkaysynnada, @alamb, @ozankabak!

wiedld pushed a commit to wiedld/arrow-datafusion that referenced this pull request Mar 21, 2024
* refactor `TreeNode::rewrite()`

* use handle_tree_recursion in `Expr`

* use macro for transform recursions

* fix api

* minor fixes

* fix

* don't trust `t.transformed` coming from transformation closures, keep the old way of detecting if changes were made

* rephrase todo comment, always propagate up `t.transformed` from the transformation closure, fix projection pushdown closure

* Fix `TreeNodeRecursion` docs

* extend Skip (Prune) functionality to Jump as it is defined in https://synnada.notion.site/synnada/TreeNode-Design-Proposal-bceac27d18504a2085145550e267c4c1

* fix Jump and add tests

* jump test fixes

* fix clippy

* unify "transform" traversals using macros, fix "visit" traversal jumps, add visit jump tests, ensure consistent naming `f` instead of `op`, `f_down` instead of `pre_visit` and `f_up` instead of `post_visit`

* fix macro rewrite

* minor fixes

* minor fix

* refactor tests

* add transform tests

* add apply, transform_down and transform_up tests

* refactor tests

* test jump on both a and e nodes in both top-down and bottom-up traversals

* better transform/rewrite tests

* minor fix

* simplify tests

* add stop tests, reorganize tests

* fix previous merges and remove leftover file

* Review TreeNode Refactor (#1)

* Minor changes

* Jump doesn't ignore f_up

* update test

* Update rewriter

* LogicalPlan visit update and propagate from children flags

* Update tree_node.rs

* Update map_children's

---------

Co-authored-by: Mustafa Akur <[email protected]>

* fix

* minor fixes

* fix f_up call when f_down returns jump

* simplify code

* minor fix

* revert unnecessary changes

* fix `DynTreeNode` and `ConcreteTreeNode` `transformed` and `tnr` propagation

* introduce TransformedResult helper

* fix docs

* restore transform as alias to trassform_up

* restore transform as alias to trassform_up 2

* Simplifications and comment improvements (#2)

---------

Co-authored-by: Berkay Şahin <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate core Core DataFusion crate logical-expr Logical plan and expressions optimizer Optimizer rules physical-expr Physical Expressions sql SQL Planner sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants