-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify getRawSystemTable in Delta Lake #18157
Conversation
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeMetadata.java
Show resolved
Hide resolved
1dba326
to
bb46e79
Compare
@@ -409,6 +409,9 @@ private void testCorruptedTableLocation(String tableName, Path tableLocation, bo | |||
assertQueryReturnsEmptyResult("SELECT column_name, data_type FROM information_schema.columns WHERE table_schema = CURRENT_SCHEMA AND table_name LIKE 'bad\\_person\\_%' ESCAPE '\\'"); | |||
assertQueryReturnsEmptyResult("SELECT column_name, data_type FROM system.jdbc.columns WHERE table_cat = CURRENT_CATALOG AND table_schem = CURRENT_SCHEMA AND table_name LIKE 'bad\\_person\\_%' ESCAPE '\\'"); | |||
|
|||
// $history system table returns an empty result for corrupted table | |||
assertQueryReturnsEmptyResult("SELECT * FROM \"" + tableName + "$history\""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When retrieving the commit entries io.trino.plugin.deltalake.DeltaLakeMetadata#getCommitInfoEntries
(or even more low-level io.trino.plugin.deltalake.transactionlog.checkpoint.TransactionLogTail#loadNewTail(io.trino.filesystem.TrinoFileSystem, java.lang.String, java.util.Optional<java.lang.Long>)
) we should check whether we have at least one commit entry.
If the table doesn't have any commit entry, something is wrong with it and should be signaled in this kind of query.
If the table has at least one commit entry, but they contain corrupted information (e.g. : protocol entry not found in the transaction log), it would still be a bit controversial to return the history entries of the "corrupted table" successfully.
I find value in doing the extra work by getting the table handle and eventually signaling whether there is an issue.
plugin/trino-delta-lake/src/test/java/io/trino/plugin/deltalake/TestDeltaLakeBasic.java
Outdated
Show resolved
Hide resolved
bb46e79
to
4347376
Compare
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeMetadata.java
Show resolved
Hide resolved
String tableLocation = table.get().location(); | ||
TableSnapshot tableSnapshot = getSnapshot(new SchemaTableName(systemTableName.getSchemaName(), tableName), tableLocation, session); | ||
try { | ||
transactionLogAccess.getMetadataEntry(tableSnapshot, session); | ||
} | ||
SchemaTableName systemTableName = new SchemaTableName(tableName.getSchemaName(), DeltaLakeTableName.tableNameWithType(name, tableType.get())); | ||
catch (TrinoException e) { | ||
if (e.getErrorCode().equals(DELTA_LAKE_INVALID_SCHEMA.toErrorCode())) { | ||
return Optional.empty(); | ||
} | ||
throw e; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know the purpose of this code because I have context to the PR.
For maintenance purposes I'd appreciate if these checks would be astracted
String tableLocation = table.get().location(); | |
TableSnapshot tableSnapshot = getSnapshot(new SchemaTableName(systemTableName.getSchemaName(), tableName), tableLocation, session); | |
try { | |
transactionLogAccess.getMetadataEntry(tableSnapshot, session); | |
} | |
SchemaTableName systemTableName = new SchemaTableName(tableName.getSchemaName(), DeltaLakeTableName.tableNameWithType(name, tableType.get())); | |
catch (TrinoException e) { | |
if (e.getErrorCode().equals(DELTA_LAKE_INVALID_SCHEMA.toErrorCode())) { | |
return Optional.empty(); | |
} | |
throw e; | |
} | |
String tableLocation = table.get().location(); | |
if (isTableCorrupted(new SchemaTableName(systemTableName.getSchemaName(), tableName), tableLocation, session)) { | |
return Optional.empty(); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure where isTableCorrupted
method comes from. Can you elaborate on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. I didn't express myself clearly.
I intended to suggest extracting the code to a method to encapsulate the logic related to seeing whether the table is or not corrupt.
Description
Simplify getRawSystemTable in Delta Lake. After this change, we can avoid a new field for
DeltaLakeTableHandle
in #17592Release notes
(x) This is not user-visible or docs only and no release notes are required.