Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase max writer version to 4 for delta lake #14573

Conversation

homar
Copy link
Member

@homar homar commented Oct 11, 2022

Description

As we have checks that blocks writing to tables with specific features like CDF and generated columns we can safely update accepted writer version to 4.
I also refactored a little TestDeltaLakeDatabricksInsertCompatibility class

Non-technical explanation

Release notes

(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Delta Lake
* Add support for writing to tables using [version 4 of the writer](https://docs.delta.io/latest/versioning.html#features-by-protocol-version). 
  This does not yet include support for change data feeds and generated columns. ({issue}`14573 `)

@cla-bot cla-bot bot added the cla-signed label Oct 11, 2022
@homar homar requested review from findepi and alexjo2144 October 11, 2022 18:10
@homar homar force-pushed the homar/increase_delta_lake_supported_writer_version_to_4 branch from f6dd86b to 863ed73 Compare October 11, 2022 23:27
@homar homar requested a review from ebyhr October 12, 2022 08:03
@homar homar force-pushed the homar/increase_delta_lake_supported_writer_version_to_4 branch from be43764 to 4ca39a2 Compare October 12, 2022 23:01
@ebyhr
Copy link
Member

ebyhr commented Oct 13, 2022

Currently, the connector uses hard-coded reader & writer version when appending a protocol entry. If we changes metadata (e.g. COMMENT ON COLUMN) on tables having generated columns, it lowers the writer version and Databricks starts ignoring generated columns definition. I think we should copy the version in case of existing tables. #14611

@homar
Copy link
Member Author

homar commented Oct 13, 2022

Currently, the connector uses hard-coded reader & writer version when appending a protocol entry. If we changes metadata (e.g. COMMENT ON COLUMN) on tables having generated columns, it lowers the writer version and Databricks starts ignoring generated columns definition. I think we should copy the version in case of existing tables. #14611

Yes i saw that, great you noticed that and made a fix. But does that affect this pr ? I haven't noticed a failure but there is so much flakiness that I may have missed something

@ebyhr
Copy link
Member

ebyhr commented Oct 13, 2022

#14612 is required as a preparatory commit, but there's no need to change this PR. I found the above issue when I was trying to add some tests based on this PR.

@ebyhr ebyhr force-pushed the homar/increase_delta_lake_supported_writer_version_to_4 branch from 914d469 to e631917 Compare October 14, 2022 03:49
@ebyhr
Copy link
Member

ebyhr commented Oct 14, 2022

Rebased on upstream to resolve conflicts.

@ebyhr ebyhr merged commit b9287f6 into trinodb:master Oct 14, 2022
@ebyhr
Copy link
Member

ebyhr commented Oct 14, 2022

Merged, thanks!

@github-actions github-actions bot added this to the 401 milestone Oct 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants