Apply the dereference pushdown at the physical level on parquet in Iceberg #17387
What's the difference from #17133?
@ebyhr this PR provides a way to dramatically reduce the amount of Parquet data read from physical storage when selecting sub-fields from nested fields. See #17145 for details. #17133 is another very useful improvement: it pushes down filters on Parquet so that entire row groups get skipped when reading. cc @leetcode-1533
Rebasing on |
lgtm % minor comments
Description
Fixes the Parquet part of #17156 (dereference pushdown at the physical level).
The ORC part is still open.
Implementation overview for dereference pushdown at physical level in Parquet
The PoC is the existing implementation for Hive:
trino/plugin/trino-hive/src/main/java/io/trino/plugin/hive/parquet/ParquetPageSourceFactory.java
Lines 316 to 330 in adef5f4
What happens in the code:
`SELECT parent.child, child` only `parent` field will actually be selected) will be selected
If `nested.nested1level1.field4` gets selected and the column type hierarchy looks like this:
for creating the Parquet schema, the Parquet type with the following hierarchy will be used:
`Message` that corresponds only to the nested columns which are actually selected from the nested rows.
Additional context and related issues
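As a rough illustration of the pruning idea from the implementation overview, here is a minimal, self-contained sketch. It is not the code from this PR and does not use the Trino or Parquet APIs: the class `DereferencePruningSketch`, the `Node` type, and the methods `pruneSubsumed`, `projectedTree`, and `render` are all hypothetical names. It assumes that a dereference path is a list of field names and that a selected parent field subsumes any selected sub-field of that parent.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these names do not exist in Trino or Parquet.
final class DereferencePruningSketch {
    // A projected field; no children means "read this field completely".
    static final class Node {
        final Map<String, Node> children = new LinkedHashMap<>();
    }

    // Drops duplicates and every path whose proper prefix is also selected,
    // because reading the parent already covers the sub-field.
    static List<List<String>> pruneSubsumed(List<List<String>> paths) {
        List<List<String>> kept = new ArrayList<>();
        for (List<String> candidate : paths) {
            boolean subsumed = false;
            for (List<String> other : paths) {
                if (other.size() < candidate.size()
                        && candidate.subList(0, other.size()).equals(other)) {
                    subsumed = true;
                    break;
                }
            }
            if (!subsumed && !kept.contains(candidate)) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    // Merges the remaining paths into one tree that mirrors the nested
    // structure a reader would need to request from the file.
    static Node projectedTree(List<List<String>> paths) {
        Node root = new Node();
        for (List<String> path : pruneSubsumed(paths)) {
            Node node = root;
            for (String name : path) {
                node = node.children.computeIfAbsent(name, key -> new Node());
            }
        }
        return root;
    }

    // Renders the tree as text, e.g. for logging or assertions.
    static String render(Node node) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Node> entry : node.children.entrySet()) {
            if (out.length() > 0) {
                out.append(", ");
            }
            out.append(entry.getKey());
            if (!entry.getValue().children.isEmpty()) {
                out.append(" {").append(render(entry.getValue())).append("}");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<List<String>> paths = List.of(
                List.of("nested", "nested1level1", "field4"),
                List.of("parent", "child"),
                List.of("parent"));
        System.out.println(render(projectedTree(paths)));
    }
}
```

In this sketch only the `field4` branch of `nested` survives in the projected tree, while `parent` is kept whole because it was selected directly. A real implementation would map such a tree onto Parquet group and primitive types rather than plain strings.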
Corresponding tests, which also check that the dereference pushdown is effective on the storage layer, already exist in `BaseConnectorTest`.
See 6a4e483e#diff-6be05909e810c0224c1951c4102cad6256e9e088feb91e4c260756bfde89b6d5
Look up the usages of
trino/testing/trino-testing/src/main/java/io/trino/testing/TestingConnectorBehavior.java
Line 69 in adef5f4
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: