Avoid reading min/max/null statistics for planning iceberg inserts #23757
Conversation
Force-pushed from 9cb5219 to 6bc6c3f.
Force-pushed from d2bbc44 to 7e669a2.
LGTM
My only concern is that we should guard the code against potential future regressions on the write path that could be affected by min/max stats.
@@ -346,6 +347,7 @@
 import static org.apache.iceberg.SnapshotSummary.DELETED_RECORDS_PROP;
 import static org.apache.iceberg.SnapshotSummary.REMOVED_EQ_DELETES_PROP;
 import static org.apache.iceberg.SnapshotSummary.REMOVED_POS_DELETES_PROP;
+import static org.apache.iceberg.SnapshotSummary.TOTAL_RECORDS_PROP;
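The new TOTAL_RECORDS_PROP import hints at the cheap replacement: the snapshot summary carries aggregate counts, so table emptiness can be decided without opening any manifests. A minimal sketch of that kind of check, under the assumption that this is how the new code uses the property (the helper name is hypothetical):

import org.apache.iceberg.Snapshot;

import static org.apache.iceberg.SnapshotSummary.TOTAL_RECORDS_PROP;

// Decide whether a table is empty from the snapshot summary alone,
// without opening any manifest files; the summary is part of the
// snapshot metadata, so this is a constant-cost lookup
private static boolean isTableEmpty(Snapshot snapshot)
{
    if (snapshot == null) {
        // No snapshot yet: the table has never been written to
        return true;
    }
    String totalRecords = snapshot.summary().get(TOTAL_RECORDS_PROP);
    // The property may be absent on snapshots written by engines that do
    // not populate the summary; treat that as unknown rather than empty
    return totalRecords != null && Long.parseLong(totalRecords) == 0;
}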
The calling code in getStatisticsCollectionMetadataForWrite was not using all the statistics and is simplified to only fetch NDVs.
For regression prevention: can we ensure through a test that the calling code does not depend on min/max stats retrieved from io.trino.plugin.iceberg.TableStatisticsReader#makeTableStatistics?
I remember raising the idea promoted by this PR a while ago, and @findepi was rather cautious about making this change because of potential regressions.
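A sketch of what the requested guard test could look like, assuming the Trino SPI shape (TableStatisticsMetadata exposing ColumnStatisticMetadata entries); the fixture names icebergMetadata, session, and tableMetadata are placeholders:

// Pin down that INSERT planning requests only NDV statistics, so a future
// change cannot quietly reintroduce min/max/null collection on the write path
@Test
public void testInsertRequestsOnlyNdvStatistics()
{
    TableStatisticsMetadata statsMetadata =
            icebergMetadata.getStatisticsCollectionMetadataForWrite(session, tableMetadata);
    assertThat(statsMetadata.getColumnStatistics())
            .allMatch(column -> column.getStatisticType() == ColumnStatisticType.NUMBER_OF_DISTINCT_VALUES);
}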
The important thing here is avoiding going through all manifest files when planning inserts, not the use of min/max/null stats as such. There are tests which assert on manifest file accesses on the filesystem. If this code were getting those stats through some other, cheaper means, then we wouldn't care.
At worst, there are a couple of ways of making a mistake in this code:
- We generate NDV stats even though we don't know the existing NDVs, and end up undercounting NDVs by recording only the NDVs collected on write. The other min/max/null stats would still be correct, so the CBO may give worse plans, but it's not the end of the world. Eventually the stats will become more accurate, or a call to ANALYZE will fix the whole thing.
- We fail to detect that the table is empty or that NDVs are known, and skip generating them on write. Again, we possibly get worse plans from the CBO while the other non-NDV stats are still intact, and a call to ANALYZE will fix the problem.
Either way, these are much more tolerable problems than the one caused by the current code, which can bottleneck INSERT queries for minutes during planning.
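For context on those filesystem-access tests, the diff below removes one expected operation; condensed, the assertion style looks roughly like this (helper and constant names follow the test shown in the diff, but treat them as illustrative):

// Assert the exact multiset of filesystem operations an INSERT performs.
// If planning ever starts reading manifests again, the extra
// MANIFEST/InputFile.newStream access makes this assertion fail
assertFileSystemAccesses(
        "INSERT INTO test_table VALUES (1)",
        ImmutableMultiset.<FileOperation>builder()
                .add(new FileOperation(STATS, "InputFile.newStream"))
                .add(new FileOperation(SNAPSHOT, "OutputFile.create"))
                .add(new FileOperation(MANIFEST, "OutputFile.create"))
                .build());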
@@ -234,7 +233,6 @@ public void testInsert()
         .add(new FileOperation(STATS, "InputFile.newStream"))
         .add(new FileOperation(SNAPSHOT, "OutputFile.create"))
         .add(new FileOperation(MANIFEST, "OutputFile.create"))
-        .add(new FileOperation(MANIFEST, "InputFile.newStream"))
Maybe we should note in the commit message that the manifest list corresponding to the table snapshot is no longer read during writes.
For large tables, going through statistics for all files can be slow. The calling code in getStatisticsCollectionMetadataForWrite was not using all the statistics and is simplified to only fetch NDVs.
Force-pushed from 7e669a2 to c98d78b.
Description
For large tables, going through statistics for all files can be slow. The calling code in getStatisticsCollectionMetadataForWrite was not using all the statistics and is simplified to only fetch NDVs.
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: