Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add extra_properties to iceberg table properties #20393

Merged
merged 3 commits into from
Oct 31, 2024

Conversation

ndrluis
Copy link
Member

@ndrluis ndrluis commented Jan 16, 2024

Description

This PR allows us to pass through additional properties to Iceberg when creating a table in Trino.
The additional properties can be provided in the following format

extra_properties = MAP(ARRAY['extra.property.one', 'extra.property.two'], ARRAY['one', 'two'])

The extra properties will not be exposed in SHOW CREATE.

The purpose of this feature is to add properties that can be used by catalogs or other tools. For example, in Tabular, there is a property to optimize tables. For our use case, we want to create our tables with the optimizer.enabled value set to False, and we cannot achieve this without a way to handle arbitrary properties.

Additional context and related issues

Fixes #17427

Release notes

(x) Release notes are required, with the following suggested text:

# Iceberg
* Add support for `extra_properties` table properties. ({issue}`17427`)

@cla-bot cla-bot bot added the cla-signed label Jan 16, 2024
@github-actions github-actions bot added docs iceberg Iceberg connector labels Jan 16, 2024
@ndrluis ndrluis force-pushed the iceberg-extra-properties branch from 88f8b64 to 2c582fe Compare January 16, 2024 22:00
@ndrluis ndrluis requested review from mosabua and findepi January 16, 2024 22:12
Copy link
Member

@mosabua mosabua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs changes are fine. Java code seems to be using wrong style. Please update - see https://trino.io/development/#code-style

Copy link
Member

@oneonestar oneonestar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this PR will also allow ALTER TABLE SET PROPERTIES extra_properties. Please consider adding this to the documentation.

@findinpath
Copy link
Contributor

regarding maven-checks failure 🔴

do the following before submitting the PR to ensure that the code you are changing follows the code conventions:

./mvnw clean install -P errorprone-compiler -DskipTests -nsu -pl :trino-iceberg

@ndrluis ndrluis force-pushed the iceberg-extra-properties branch from 2c582fe to bc9453d Compare January 17, 2024 18:09
@github-actions github-actions bot added tests:hive hive Hive connector labels Jan 17, 2024
@ndrluis
Copy link
Member Author

ndrluis commented Jan 17, 2024

@findinpath @oneonestar @mosabua I made the requested changes!

@ndrluis ndrluis force-pushed the iceberg-extra-properties branch 2 times, most recently from e9b9e2c to bcc1ed3 Compare January 17, 2024 19:20
@ndrluis ndrluis removed hive Hive connector tests:hive labels Jan 17, 2024
@ndrluis ndrluis force-pushed the iceberg-extra-properties branch 2 times, most recently from ce2d3bf to 1e6de9c Compare January 17, 2024 21:26
@ndrluis ndrluis requested a review from findinpath January 17, 2024 21:26
@github-actions github-actions bot removed the stale label Oct 1, 2024
@mosabua
Copy link
Member

mosabua commented Oct 8, 2024

@ebyhr @findinpath is this ready for merge?

@peterklingelhofer
Copy link

peterklingelhofer commented Oct 27, 2024

Hopeful this can be merged somewhat soon. Being able to access these properties is essential if you're trying to do parallel UPSERT (MERGE INTO in Iceberg), i.e aws/aws-sdk-pandas#2651 (comment)

@sopel39 sopel39 force-pushed the iceberg-extra-properties branch from 6a91d7c to c15b81f Compare October 29, 2024 11:55
@sopel39
Copy link
Member

sopel39 commented Oct 29, 2024

I've rebased and addressed some of the comments. @ebyhr could you PTAL?

@ebyhr
Copy link
Member

ebyhr commented Oct 29, 2024

Sure, let me review tomorrow.

@sopel39 sopel39 force-pushed the iceberg-extra-properties branch 2 times, most recently from 9cd7df1 to 981c6a7 Compare October 29, 2024 14:18
@sopel39
Copy link
Member

sopel39 commented Oct 29, 2024

@ebyhr I've added iceberg.allowed-extra-properties to allow setting only specified extra properties.

@sopel39 sopel39 force-pushed the iceberg-extra-properties branch from 981c6a7 to 71dd6dc Compare October 29, 2024 18:51
@sopel39 sopel39 force-pushed the iceberg-extra-properties branch from 71dd6dc to 329d45c Compare October 30, 2024 11:48
Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ac

@sopel39 sopel39 requested a review from ebyhr October 30, 2024 11:49
Some policies require to add extra properties to Iceberg tables
(e.g. for tracking table origin).

Co-authored-by: Priyansh121096 <[email protected]>
@sopel39 sopel39 force-pushed the iceberg-extra-properties branch from 329d45c to b93155e Compare October 30, 2024 12:48
@sopel39
Copy link
Member

sopel39 commented Oct 30, 2024

@ebyhr ptal

@sopel39 sopel39 force-pushed the iceberg-extra-properties branch from b93155e to f66de17 Compare October 31, 2024 13:19
@sopel39 sopel39 merged commit 9403964 into trinodb:master Oct 31, 2024
9 of 26 checks passed
@github-actions github-actions bot added this to the 465 milestone Oct 31, 2024
@ndrluis ndrluis deleted the iceberg-extra-properties branch October 31, 2024 14:04
@lozbrown
Copy link
Contributor

@ndrluis would this also allow users to set iceberg identifier fields?
https://iceberg.apache.org/spec/#identifier-field-ids

@ndrluis
Copy link
Member Author

ndrluis commented Nov 14, 2024

@lozbrown No, this applies only to the properties of a table.

https://iceberg.apache.org/spec/#table-metadata

properties A string to string map of table properties. This is used to control settings that affect reading and writing and is not intended to be used for arbitrary metadata. For example, commit.retry.num-retries is used to control the number of commit retries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

Add extra_properties to Iceberg table properties