Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Iceberg dereference pushdown when column has a comment #11104

Merged

Conversation

BlueStalker
Copy link
Contributor

@BlueStalker BlueStalker commented Feb 18, 2022

When Query iceberg table with predicate column including comments, and deference pushdown enabled. Exception occurred if the predicates containing both nested column and non_nested, for example
select count(*) from table where col1 = 1 and col2.n1 = 3
will show exception like

Caused by: java.lang.IllegalArgumentException: Multiple entries with same key: .... 
at com.google.common.collect.ImmutableMap.conflictException(ImmutableMap.java:376) 
at com.google.common.collect.ImmutableMap.checkNoConflict(ImmutableMap.java:370) 
at com.google.common.collect.RegularImmutableMap.checkNoConflictInKeyBucket(RegularImmutableMap.java:153) 
at com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:115) 
at com.google.common.collect.ImmutableMap$Builder.buildOrThrow(ImmutableMap.java:574) 
at io.trino.plugin.iceberg.IcebergPageSourceProvider.getParquetTupleDomain(IcebergPageSourceProvider.java:834) 
at io.trino.plugin.iceberg.IcebergPageSourceProvider.createParquetPageSource(IcebergPageSourceProvider.java:646)

Also refer the testing cases.

@cla-bot cla-bot bot added the cla-signed label Feb 18, 2022
@BlueStalker
Copy link
Contributor Author

@alexjo2144

@alexjo2144
Copy link
Member

Good catch, thanks!

@BlueStalker BlueStalker force-pushed the deference_pushdown_column_comments branch from c808db2 to f4a1146 Compare February 18, 2022 21:33
Copy link
Member

@alexjo2144 alexjo2144 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@findepi mind taking a look?

@findepi
Copy link
Member

findepi commented Feb 21, 2022

@BlueStalker @alexjo2144 the fix is quite indirect. What's the root cause of the failure?

@BlueStalker
Copy link
Contributor Author

@BlueStalker @alexjo2144 the fix is quite indirect. What's the root cause of the failure?

The original code (applyProject), put an new Assignment with column (without comment) into ProjectionApplicationResult, and I assume later in will be available in TableScan Node, When do 'applyFilter' from PushPredicateIntoTableScan, it will contain duplicate (IcebergColumnHandle) one with comment and the other without, which will finally lead to exception of (Multiple entries with same key: .... ), also @phd3 said if applyProjection can just bypass the non nested column, it should also solve the problem.

@BlueStalker BlueStalker force-pushed the deference_pushdown_column_comments branch from f4a1146 to a60ddfd Compare February 22, 2022 20:04
@BlueStalker BlueStalker force-pushed the deference_pushdown_column_comments branch from a60ddfd to a8eaa09 Compare February 25, 2022 06:41
@BlueStalker
Copy link
Contributor Author

@findepi checked it again?

Copy link
Member

@phd3 phd3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@phd3 phd3 force-pushed the deference_pushdown_column_comments branch from a8eaa09 to 41f6761 Compare February 25, 2022 14:37
@phd3
Copy link
Member

phd3 commented Feb 25, 2022

making a couple more small changes here.

@findepi findepi changed the title Fix query iceberg column with comments Fix Iceberg dereference pushdown when column has a comment Feb 25, 2022
@findepi
Copy link
Member

findepi commented Feb 25, 2022

pushed commit title change (and renamed the PR)

@findepi findepi force-pushed the deference_pushdown_column_comments branch from 41f6761 to 15548ab Compare February 25, 2022 14:52
@phd3 phd3 force-pushed the deference_pushdown_column_comments branch from 15548ab to 4066e2b Compare February 25, 2022 15:07
@findepi findepi merged commit 9d30624 into trinodb:master Feb 25, 2022
@findepi findepi mentioned this pull request Feb 25, 2022
@github-actions github-actions bot added this to the 372 milestone Feb 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants