Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move all pruning code to the pruner and make it an immutable visitor #32817

Merged
merged 1 commit into from
Jan 20, 2024

Conversation

roji
Copy link
Member

@roji roji commented Jan 14, 2024

NOTE: This is based on top of #32815, review the last commit only

  • As planned, this moves all pruning code out of SelectExpression and into the SqlTreePruner, which is now completely self-contained (except for am unimportant TPC-related exception).
  • In addition, the pruner was reimplemented to be a standard visitor, returning a copy rather than mutating anything.
  • This has important consequences beyond just cleaning up the pruner; since the pruner now treats SelectExpression as an actual immutable expression, any node that gets referenced multiple times gets duplicated if any change is done inside it. This exposed various places further down the pipeline where reference identity was assumed, but now we have distinct instances which are structurally equal instead.
  • SelectExpression.Update() was optimized, and all usages of it cleaned up for much terser code.

Fixes #31276

@roji roji requested a review from a team January 14, 2024 20:11
tables.Add(newTable);
}

var projections = this.VisitAndConvert(selectExpression.Projection);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This both switches to using VisitAndConvert (which visits lists and returns the original if it didn't change), and removes tracking of whether anything changed; SelectExpression.Update() does that now - as is standard with expressions - and returns the original SelectExpression if nothing changed.

@@ -197,7 +197,7 @@ protected override Expression VisitDelete(DeleteExpression deleteExpression)
&& selectExpression.Orderings.Count == 0
&& selectExpression.GroupBy.Count == 0
&& selectExpression.Tables.Count == 1
&& selectExpression.Tables[0] == deleteExpression.Table
&& selectExpression.Tables[0].Equals(deleteExpression.Table)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note places like this where reference identity was assumed; but if the tables were replaced for some reason (e.g. alias rewriting), they now get duplicated (immutable visitors); structural equality is used instead.

@@ -64,20 +64,22 @@ public override ExpressionType NodeType
protected override Expression VisitChildren(ExpressionVisitor visitor)
{
var selectExpression = (SelectExpression)visitor.Visit(SelectExpression);
var table = (TableExpression)visitor.Visit(Table);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a bug where the TableExpression on Delete/UpdateExpression wasn't being visited (this apparently wasn't necessary until now).

throw new InvalidOperationException(RelationalStrings.SelectExpressionUpdateNotSupportedWhileMutable);
}

if (projections == Projection
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* SelectExpression can appear at 2 levels,
* 1. Top most level which is always same reference when translation phase, post-translation it can change in
* ShapedQueryExpression where it would cause reconstruction. Reconstruction is cheaper than computing whole Equals
* 2. Nested level component inside top level SelectExpression where it could change the reference and reconstruct SQL tree.
Copy link
Member Author

@roji roji Jan 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth understanding and thinking about this a bit - the mutable/immutable/equality situation was very squishy up until now.

First, this comment isn't entirely accurate; we sometimes have the same expression node referenced from multiple places in the tree; if non-mutating visitor visits that, the single node gets visited twice and duplicated; and if this happens in post-processing, that could produce two nodes with the same alias. Note also that for almost all other SQL expression nodes, Equals really does implement structural equality - only SelectExpression (and TableExpression) were an exception.

Since this PR turns the pruner into a non-mutating visitor, this now happens more often and exposes issues. For example, SqlNullabilityProcessor had logic which did Equals on two nodes; these happened to be previously the same instance, but pruning can now make them two structurally equal instances instead.

Whether/when we should performing potentially expensive recursive equality - i.e. is it worth it for the aforementioned null semantics optimization - is another question... Similarly, performing a fully recursive hash code calculation isn't required and is probably needlessly expensive (see #19859 and #19860).

@roji roji merged commit def705a into dotnet:main Jan 20, 2024
7 checks passed
@roji roji deleted the CleanupPruning branch January 20, 2024 11:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants