-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Iceberg $history table when using REST Catalog #17470
Fix Iceberg $history table when using REST Catalog #17470
Conversation
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/HistoryTable.java
Show resolved
Hide resolved
6fe8047
to
f28ca0d
Compare
@@ -77,14 +76,13 @@ public RecordCursor cursor(ConnectorTransactionHandle transactionHandle, Connect | |||
|
|||
Set<Long> ancestorIds = ImmutableSet.copyOf(SnapshotUtil.currentAncestorIds(icebergTable)); | |||
TimeZoneKey timeZoneKey = session.getTimeZoneKey(); | |||
for (HistoryEntry historyEntry : icebergTable.history()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current version of Iceberg library, when not all the
snapshots are initially loaded (for the REST session catalog),
thesnapshotsLog
does not get reinitialized, causing to
expose only the current snapshots in thehistory()
method
of the Iceberg table.
is this a bug in the library?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a bug in the library?
Yes, please see the discussion https://apache-iceberg.slack.com/archives/C03LG1D563F/p1683871245426989
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@findinpath unfortunately the discussion is no longer visible in slack (thread is a bit old).
We recently hit an issue here because we made the assumption that $history
table only contained snapshots that were at some point or another the current snapshot of a table.
I wonder what is the difference now between $snapshots
and $history
? Since I can get pretty much anything from $history
using only $snapshots
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gustavoatt i don't see the replies anymore on the slack thread and lost context :(
@@ -595,6 +595,32 @@ public void testDropTableWithNonExistentTableLocation() | |||
assertFalse(getQueryRunner().tableExists(getSession(), tableName)); | |||
} | |||
|
|||
@Test | |||
public void testMetadataTables() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a code comment why this is in SmokeTest.
Table.snapshots()
method to build the history cursor
@findinpath consider changing the commit message to something like |
In the current version of Iceberg library, when not all the snapshots are initially loaded (for the REST session catalog), the `snapshotsLog` does not get reinitialized, causing to expose only the current snapshots in the `history()` method of the Iceberg table.
f28ca0d
to
ea3b9e0
Compare
Description
In the current version of Iceberg library, when not all the snapshots are initially loaded (for the REST session catalog), the
snapshotsLog
does not get reinitialized, causing to expose only the current snapshots in thehistory()
method of the Iceberg table.Related Iceberg code
https://github.com/apache/iceberg/blob/8865be48bcb20ce07972bf5176955d03a1347d27/core/src/main/java/org/apache/iceberg/TableMetadata.java#L497-L516
Additional context and related issues
Release notes
(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: