Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Iceberg refs system table #15649

Merged
merged 1 commit into from
Mar 15, 2023
Merged

Conversation

jackye1995
Copy link
Member

Description

Support refs system table in Iceberg

Additional context and related issues

Users can query the system table with:

SELECT * FROM "table$refs"

The schema is the same as the one in Iceberg repo: https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/RefsTable.java

Release notes

( ) This is not user-visible or docs only and no release notes are required.
(x) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

Copy link
Member

@alexjo2144 alexjo2144 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about calling the table REFERENCES spelled out?

This also needs to be added to iceberg.rst for the docs here: https://trino.io/docs/current/connector/iceberg.html#metadata-tables

@jackye1995
Copy link
Member Author

How do you feel about calling the table REFERENCES spelled out?

I am using REFS because that is what Iceberg core library uses, which means Spark and Flink system tables are also names as that. But if you think REFERENCES makes more sense in Trino, happy to name it that way.

@findinpath
Copy link
Contributor

@jackye1995
Copy link
Member Author

@alexjo2144 updated everything except the name REFS, let me know if I need to name it to REFERENCES

}

if (ref.minSnapshotsToKeep() != null) {
pagesBuilder.appendInteger(ref.minSnapshotsToKeep());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could add in PageListBuilder new methods:

  • appendInteger(Integer)
  • appendBigint(Long)

to encapsulate the null check and make the code in this class more concise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added new methods as suggested

@alexjo2144
Copy link
Member

How do you feel about calling the table REFERENCES spelled out?

I am using REFS because that is what Iceberg core library uses, which means Spark and Flink system tables are also names as that. But if you think REFERENCES makes more sense in Trino, happy to name it that way.

@ebyhr or @findepi either of you have thoughts on this? Matching the other engines vs spelling the whole word out.

Copy link
Member

@alexjo2144 alexjo2144 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, pending input from maintainers on the table name.

Comment on lines 1249 to 1250
``$refs`` table
^^^^^^^^^^^^^^^^^^^^^
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these are supposed to match width. @mosabua might want to take a look at the docs addition, but thanks for adding it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to match the width

@github-actions
Copy link

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

@github-actions github-actions bot added the stale label Feb 20, 2023
@bitsondatadev
Copy link
Member

@jackye1995, would you mind replying to the last comments here please?

@ebyhr %comments, is this ready to be merged?

@jackye1995
Copy link
Member Author

@ebyhr sorry did not notice you have one more comment, updated. Let me know if there is anything else needed.

@bitsondatadev
Copy link
Member

bitsondatadev commented Feb 23, 2023

@ebyhr sorry did not notice you have one more comment, updated. Let me know if there is anything else needed.

Thanks @jackye1995! I think there are a few other small naming fixes in #15646 if you can address those as well please! nvm you fixed it already!

After that @ebyhr should take a look shortly! You rock!

@jackye1995 jackye1995 force-pushed the ref-table branch 2 times, most recently from 3b429eb to 72e6606 Compare February 23, 2023 19:20
@electrum
Copy link
Member

I think it's fine to match the other implementations, especially since the intent here is to have the same schema.

@jackye1995
Copy link
Member Author

@electrum @ebyhr @findepi is there any plan to merge this and the related PRs for Iceberg reference support? We are about to cut Iceberg 1.2 branch, and have achieved all the same supports in Spark and Flink, would be great to have feature parity in Trino.

@ebyhr
Copy link
Member

ebyhr commented Mar 8, 2023

Could you rebase on master to resolve conflicts?

@jackye1995
Copy link
Member Author

Can I get a confirmation if this PR is ever gonna be merged once rebased? If not I will just go ahead to apply the patch internally and you can close this PR instead of me continuing to rebase.

@bitsondatadev
Copy link
Member

bitsondatadev commented Mar 9, 2023

@jackye1995 I'm meeting with @electrum tomorrow I'll ask him to respond. I really want this one to land!

@bitsondatadev
Copy link
Member

@jackye1995 would you also be able to do this change suggested by David to go back to REFS instead of REFERENCES?

https://trinodb.slack.com/archives/CP1MUNEUX/p1678319193961979?thread_ts=1677280433.219989&channel=CP1MUNEUX&message_ts=1678319193.961979

I will also update you on what David says about how close this is to merge.

@ebyhr
Copy link
Member

ebyhr commented Mar 9, 2023

would you also be able to do this change suggested by David to go back to REFS instead of REFERENCES?

The current implementation is already refs.

@bitsondatadev
Copy link
Member

would you also be able to do this change suggested by David to go back to REFS instead of REFERENCES?

The current implementation is already refs.

Ooops 🫣🫠

@jackye1995
Copy link
Member Author

Close PR, this will be available in AWS Athena by 3/31 in all AWS regions.

@jackye1995 jackye1995 closed this Mar 9, 2023
@electrum electrum reopened this Mar 9, 2023
@electrum
Copy link
Member

electrum commented Mar 9, 2023

Sorry for the confusion on this. I'll merge after the test passes. Thanks for your work on this!

@github-actions github-actions bot added the iceberg Iceberg connector label Mar 9, 2023
@bitsondatadev
Copy link
Member

@electrum looks like the tests passed! Are we ready to merge this?

@electrum
Copy link
Member

There was another conflict. Rebased again.

@electrum electrum merged commit 1ad04ba into trinodb:master Mar 15, 2023
@github-actions github-actions bot added this to the 411 milestone Mar 15, 2023
@findepi
Copy link
Member

findepi commented Mar 16, 2023

Per @ebyhr 's #16576 it seems the PR broke master builds.
@jackye1995 can you please help us and remind to /test-with-secrets?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

7 participants