-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Iceberg default warehouse location config #9614
Conversation
In Iceberg, I can define |
d14bf0f
to
cff436a
Compare
There are 2 reasons for this config:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a test covering creation of tables without location, within schema without location set.
This is what becomes possible with this change.
return this; | ||
} | ||
|
||
public String getCatalogWarehouse() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make the property Optional<String>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
@@ -51,6 +53,19 @@ public IcebergConfig setCatalogType(CatalogType catalogType) | |||
return this; | |||
} | |||
|
|||
@Config("iceberg.catalog.warehouse") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure about the name. something like "default storage location"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to match https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/CatalogProperties.java#L31. Please let me know if other name is preferred, I can update to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For schemas and tables we use "location". So probably we should use it also here. iceberg.default-schema-location
?
@@ -39,6 +39,7 @@ | |||
private final CatalogType catalogType; | |||
private final boolean isUniqueTableLocation; | |||
private final boolean isUsingSystemSecurity; | |||
private final String warehouse; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be named more meaningfully. Let's figure config property name first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I updated to catalogWarehouse
for now
"please either set 'location' when creating the database, or set 'iceberg.catalog.warehouse' " + | ||
"to allow a default location at 'warehousePath/databaseName.db'", database.getDatabaseName())); | ||
} | ||
database = Database.builder(database) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not build a fake Database object.
Instead, let's modify getTableDefaultLocation
(which already has related logic).
"to allow a default location at 'warehousePath/databaseName.db'", database.getDatabaseName())); | ||
} | ||
database = Database.builder(database) | ||
.setLocation(Optional.of(format("%s/%s.db", warehouse, schemaTableName.getSchemaName()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why .db
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to match https://github.com/apache/iceberg/blob/master/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java#L433-L440, not sure if Trino has any preference for this
@findepi test is actually quite difficult to add, the file metastore has a hard-coded database location, so that code path would never be exercised. Not sure if there is any other good way to trigger that. |
We can also choose to not have this feature for Hive catalog and only support it with Hadoop and Glue catalog, because based on the current logic of |
cff436a
to
06b11fa
Compare
close in favor of #10151 |
The link no longer seems to be adequate. @jackye1995 can you link to a file in a commit, instead of a branch? |
Support using a configurable warehouse location to determine the default table location.
This is to match the Iceberg side catalog behavior: https://github.com/apache/iceberg/blob/master/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java#L452-L459.
@losipiuk @findepi