Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve support for correlated subqueries with GROUP BY or LIMIT #8554

Merged
merged 1 commit into from
Jul 28, 2021

Conversation

kasiafi
Copy link
Member

@kasiafi kasiafi commented Jul 14, 2021

This change introduces a new policy of determining which subquery
symbols are constant.
Before, only symbols equal to some correlation symbols were recognized
as constant.
After this change, also deterministic combinations of correlation
symbols and coerced subquery symbols are recognized as constant.

Recognizing more subquery symbols as constant helps decorrelate
or improves plans for correlated queries with: GROUP BY, LIMIT,
ORDER BY + LIMIT.

@cla-bot cla-bot bot added the cla-signed label Jul 14, 2021
@kasiafi kasiafi force-pushed the 260MoreConstants branch from 462a369 to 8515704 Compare July 14, 2021 14:04
@kasiafi kasiafi requested review from martint and sopel39 July 14, 2021 14:04
}

// based on `UnwrapCastInComparison.Visitor.hasInjectiveImplicitCoercion()`
private boolean hasInjectiveCoercion(Type source, Type target)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to move this to TypeCoercion

// they are derived from the filter predicate, e.g.
// a = corr --> a is constant
// b = f(corr_1, corr_2, ...) --> b is constant provided that f is deterministic
// cast(c AS ...) = corr --> c is constant provided that cast is injective
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this case could be handled by expanding the UnwrapCastInComparison optimizer to deal with correlation symbols on the right hand side. In that case, the expression would be turned to something like:

c = cast(corr as ...)

and the second rule would apply.

That way, we don't need the decorrelator to deal with cast and checking whether it's injective at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The transformation which is done in UnwrapCastInComparison is based on knowing the actual value of the right-hand-side expression. Here, we only know that it's constant. If apply the cast to corr, it might turn out to be a truncating cast, and the result of the comparison might change. Or we could get a cast exception.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. In the future, we might want to add an optimization that works when the right side is constant -- it would be a weaker form of UnwrapCastInComparison.

@kasiafi kasiafi force-pushed the 260MoreConstants branch from eaf0e15 to 92ba3c4 Compare July 23, 2021 13:29
@kasiafi kasiafi requested a review from martint July 23, 2021 17:27
@@ -128,6 +128,30 @@ public boolean isTypeOnlyCoercion(Type source, Type result)
return false;
}

// based on `UnwrapCastInComparison.Visitor.hasInjectiveImplicitCoercion()`
public boolean isInjectiveCoercion(Type source, Type result)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be used by UnwrapCastInComparison, too

@kasiafi kasiafi force-pushed the 260MoreConstants branch from 92ba3c4 to cd8e0b6 Compare July 27, 2021 07:42
This change introduces a new policy of determining which subquery
symbols are constant.
Before, only symbols equal to some correlation symbols were recognized
as constant.
After this change, also deterministic combinations of correlation
symbols and coerced subquery symbols are recognized as constant.

Recognizing more subquery symbols as constant helps decorrelate
or improves plans for correlated queries with: GROUP BY, LIMIT,
ORDER BY + LIMIT.
@kasiafi kasiafi force-pushed the 260MoreConstants branch from cd8e0b6 to 90ff89c Compare July 27, 2021 09:50
@kasiafi kasiafi merged commit 5450e78 into trinodb:master Jul 28, 2021
@kasiafi kasiafi mentioned this pull request Jul 28, 2021
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

2 participants