-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove redundant table scan predicate during predicate pushdown #1516
Remove redundant table scan predicate during predicate pushdown #1516
Conversation
|
||
// table scans with none domain should be converted to ValuesNode | ||
checkState(node.getEnforcedConstraint().getDomains().isPresent()); | ||
Map<ColumnHandle, Domain> enforcedColumnDomains = node.getEnforcedConstraint().getDomains().get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a TODO similar to the one in EffectivePredicateExtractor: // TODO: replace with metadata.getTableProperties() when table layouts are fully removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I'm planning to restore using of metadata.getTableProperties()
in predicate pushdown.
Previously, Metadata#getTableProperties
was called in PushPredicateIntoTableScan#pushFilterIntoTableScan
and people didn't complain.
So I think that predate pushdown should use enforced predicate
when called before PushPredicateIntoTableScan
optimization and Metadata#getTableProperties
after PushPredicateIntoTableScan
optimization
if (predicateDomain.getDomains().isPresent()) { | ||
Map<ColumnHandle, Domain> predicateColumnDomains = predicateDomain.getDomains().get(); | ||
|
||
// table scans with none domain should be converted to ValuesNode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But there's no guarantee this will happen before this optimizer runs, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think only PushPredicateIntoTableScan#pushFilterIntoTableScan
can narrow down enforced predicate (and it replaces table scan with values when predicate is none). Is this correct?
@@ -1329,9 +1341,58 @@ public PlanNode visitSample(SampleNode node, RewriteContext<Expression> context) | |||
public PlanNode visitTableScan(TableScanNode node, RewriteContext<Expression> context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we doing this in PredicatePushDown? It seems like something we could just add as a separate rule that matches Filter(TableScan)
and simplifies the filter based on the "guaranteed predicate from the table scan"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's separate rule, then we need to fire such rule always after PredicatePushDown
.
However, it should be easier to test all edge cases.
ea7fb8d
to
bb2b3bb
Compare
ac @martint |
eaa32a1
to
e21563a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments. Also, I'd squash "Extract PushPredicateIntoTableScan#createResultingPredicate" and "Expose PushPredicateIntoTableScan#arePlansSame" into "Remove redundant table scan predicate during predicate pushdown", since they are in support of that change and don't really "stand out" on their own.
context.getSymbolAllocator().getTypes(), | ||
context.getIdAllocator()); | ||
|
||
if (arePlansSame(filterNode, tableScan, rewritten)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems overkill. You could just check whether the new filter is the same as the one in filterNode
.
Map<ColumnHandle, Domain> enforcedColumnDomains = node.getEnforcedConstraint().getDomains().get(); | ||
|
||
ImmutableMap.Builder<ColumnHandle, Domain> unenforcedColumnDomains = ImmutableMap.builder(); | ||
for (ColumnHandle columnHandle : predicateColumnDomains.keySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Iterate over the entrySet()
, since it's immediately calling .get(...)
to get the associated value.
Map<ColumnHandle, Domain> enforcedColumnDomains = node.getEnforcedConstraint().getDomains().get(); | ||
|
||
ImmutableMap.Builder<ColumnHandle, Domain> unenforcedColumnDomains = ImmutableMap.builder(); | ||
for (ColumnHandle columnHandle : predicateColumnDomains.keySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm misreading this, if the table scan domain for the column is ALL
, this will do:
intersect(ALL, f) -> f
ALL.contains(f) -> true
Which will result in f
being dropped.
I think what you want is intersect(F, P) == P? => drop F
(where F
is the filter, P
is the table scan predicate). See below:
Putting it all together:
for (Map.Entry<ColumnHandle, Domain> entry : predicateColumnDomains.entrySet()) {
ColumnHandle column = entry.getKey();
Domain filter = entry.getValue();
Domain columnPredicate = enforcedColumnDomains.getOrDefault(column, Domain.all(filter.getType()));
if (!filter.intersect(columnPredicate).equals(columnPredicate)) {
unenforcedColumnDomains.put(column, filter);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed there was a bug there. I've adjusted the code and added test.
In your snipped you don't intersect filter
with enforcedColumnDomain
.
e21563a
to
32dcec6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ac & added some tests
Map<ColumnHandle, Domain> enforcedColumnDomains = node.getEnforcedConstraint().getDomains().get(); | ||
|
||
ImmutableMap.Builder<ColumnHandle, Domain> unenforcedColumnDomains = ImmutableMap.builder(); | ||
for (ColumnHandle columnHandle : predicateColumnDomains.keySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed there was a bug there. I've adjusted the code and added test.
In your snipped you don't intersect filter
with enforcedColumnDomain
.
I usually try to keep mechanical changes separate from logic changes so that reviewer has easier job tracking complicated changes |
depends on: #1515