Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add check for used Cassandra partition columns #11082

Merged

Conversation

s2lomon
Copy link
Member

@s2lomon s2lomon commented Feb 17, 2022

Description

Thanks to this we don't drop partition Predicates, that does not
match in Cassandra, totally out of the query. As it is today results in
returning some rows when none should be returned.

Related issues, pull requests, and links

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
(x) Release notes entries required with the following suggested text:

# Section

* Fixes issue 'Cassandra driver incorrectly applying partition column filtering'. ({https://github.com/trinodb/trino/issues/11001}`11001`)

@cla-bot cla-bot bot added the cla-signed label Feb 17, 2022
@s2lomon s2lomon requested a review from ebyhr February 17, 2022 13:25
@@ -69,8 +71,13 @@ public CassandraPartitionResult getPartitions(CassandraTableHandle cassandraTabl
remainingTupleDomain = tupleDomain;
}
else {
List<ColumnHandle> partitionColumns = ImmutableList.copyOf(partitionKeys);
remainingTupleDomain = tupleDomain.filter((column, domain) -> !partitionColumns.contains(column));
ImmutableSet<ColumnHandle> usedPartitionColumns = partitions.stream()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Declare as Set

remainingTupleDomain = tupleDomain.filter((column, domain) -> !partitionColumns.contains(column));
ImmutableSet<ColumnHandle> usedPartitionColumns = partitions.stream()
.flatMap(partition -> Optional.ofNullable(partition.getTupleDomain()).stream()
.flatMap(tD -> tD.getDomains().stream())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tD -> tupleDomain/domain

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it clashes with the other variables, but domain is ok.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got another idea, take a look now please.

@s2lomon s2lomon force-pushed the s2lomon/cassandra-aggregation-pushdown-bug branch from dba44d1 to 078b187 Compare February 17, 2022 16:33
@s2lomon s2lomon requested a review from losipiuk February 17, 2022 16:36
Thank to this we don't drop partition Predicates, that does not
match in Cassandra, totally out of the query. This prediction droping
results in returning to many rows.
@s2lomon s2lomon force-pushed the s2lomon/cassandra-aggregation-pushdown-bug branch from 078b187 to 369a251 Compare February 18, 2022 15:34
@ebyhr ebyhr merged commit a123f44 into trinodb:master Feb 21, 2022
@ebyhr
Copy link
Member

ebyhr commented Feb 21, 2022

Merged, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Cassandra driver incorrectly applying partition column filtering
3 participants