Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Iceberg statistics accumulation after schema evolution #11091

Merged
merged 1 commit into from
Feb 25, 2022

Conversation

BlueStalker
Copy link
Contributor

This fixes the issues when there is a schema change (drop or change column) on iceberg tables, then the partitions schema table query failed with verification error. With this fix, it will return the rest columns and escape the failed ones.

Before the change, the query on "$partitions" will fail if there are column dropped, now it is just escape the column

@cla-bot
Copy link

cla-bot bot commented Feb 17, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Curt.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@BlueStalker BlueStalker reopened this Feb 17, 2022
@BlueStalker BlueStalker reopened this Feb 17, 2022
@BlueStalker BlueStalker force-pushed the escape_deprecated_column branch from c15b82d to 413f25f Compare February 17, 2022 21:35
@cla-bot cla-bot bot added the cla-signed label Feb 17, 2022
@BlueStalker BlueStalker requested a review from findepi February 17, 2022 21:36
Copy link
Member

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexjo2144 alexjo2144 self-requested a review February 18, 2022 14:53
@BlueStalker BlueStalker force-pushed the escape_deprecated_column branch from 413f25f to 9c8fb86 Compare February 18, 2022 17:32
@vgankidi
Copy link

Currently we loop through idToMetricMap from the datafile and look it up in idToTypeMapping from the table schema. Does it make sense to lookup the ids from idToTypeMapping in idToMetricMap instead?
Since we don't care about ids in data file which are not in table schema as schema evolves.

@BlueStalker BlueStalker force-pushed the escape_deprecated_column branch from 9c8fb86 to 903e526 Compare February 19, 2022 00:12
@BlueStalker
Copy link
Contributor Author

@alexjo2144 Any more comments, can we merge this?

Copy link
Member

@alexjo2144 alexjo2144 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes look good, just need one tweak to the test still.

We need a maintainer to take a look though, I can't merge things into Trino. @findepi ?

@BlueStalker BlueStalker force-pushed the escape_deprecated_column branch from 903e526 to cedd3ec Compare February 22, 2022 19:23
@BlueStalker
Copy link
Contributor Author

Code changes look good, just need one tweak to the test still.

We need a maintainer to take a look though, I can't merge things into Trino. @findepi ?

updated

@BlueStalker
Copy link
Contributor Author

@findepi Any other suggestions?

Copy link
Member

@alexjo2144 alexjo2144 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I'd probably just word the commit/title a little differently.

Maybe something like "Fix Iceberg statistics accumulation after schema evolution"

@findepi findepi changed the title Escape the deprecated data file from Iceberg Fix Iceberg statistics accumulation after schema evolution Feb 25, 2022
@findepi findepi force-pushed the escape_deprecated_column branch from cedd3ec to 0c346b5 Compare February 25, 2022 20:37
@findepi findepi merged commit a585a57 into trinodb:master Feb 25, 2022
@findepi findepi mentioned this pull request Feb 25, 2022
@findepi
Copy link
Member

findepi commented Feb 25, 2022

Merged, thank you @BlueStalker for your PR!
and thank you @alexjo2144 for your review

@github-actions github-actions bot added this to the 372 milestone Feb 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants