Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip unsupported variant type in Delta Lake #22310

Merged
merged 1 commit into from
Jun 10, 2024

Conversation

ebyhr
Copy link
Member

@ebyhr ebyhr commented Jun 7, 2024

Description

Skip unsupported variant type in Delta Lake.
Databricks doesn't allow the type in partition columns.

Release notes

(x) Release notes are required, with the following suggested text:

# Delta Lake
* Fix failure when reading tables with `variant` type. ({issue}`22310`)

@cla-bot cla-bot bot added the cla-signed label Jun 7, 2024
@github-actions github-actions bot added the delta-lake Delta Lake connector label Jun 7, 2024
@ebyhr ebyhr force-pushed the ebi/delta-variant branch 2 times, most recently from fecd850 to c02b7b0 Compare June 7, 2024 04:41
@ebyhr ebyhr requested review from wendigo and pajaks June 7, 2024 08:06
@findinpath findinpath requested review from findepi and alexjo2144 and removed request for alexjo2144 June 7, 2024 11:25
columns.add(mapColumn(typeManager, nodes.next(), mappingMode));
}
catch (UnsupportedTypeException e) {
log.debug("Skip unsupported column type: %s", e.type());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can ignore on read, but can't ignore on writes, if the column has constraints. Which we won't know if we skip it completely.

What about adding a direct support for variant instead, mapping it to json?

Copy link
Member Author

@ebyhr ebyhr Jun 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The write operations are protected by the unsupported writer feature variantType. Why are we caring about this column skipping logic?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if writes are blocked then fine

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a code comment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can ignore on read, but can't ignore on writes

I haven't seen yet this "best effort" practice.
I find it misleading from user perspective to see only parts of the columns.

At best, such a functionality should be enabled explicitly by the user through a session property.
This strategy is however not worth the effort.

I'm advocating either for supporting reading variants and succeed the read operation or fail the read operation otherwise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@findinpath Skipping unsupported column types is the default behavior in the existing connectors.
I can't find a reason we want to change the stance only in Delta Lake connector.

columns.add(mapColumn(typeManager, nodes.next(), mappingMode));
}
catch (UnsupportedTypeException e) {
log.debug("Skip unsupported column type: %s", e.type());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if writes are blocked then fine

Comment on lines +112 to +114
public static final String VARIANT_TYPE_FEATURE_NAME = "variantType";
public static final String VARIANT_TYPE_PREVIEW_FEATURE_NAME = "variantType-preview";
Copy link
Contributor

@findinpath findinpath Jun 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. We should add a product test once we get the environment. #22342 is the follow-up issue.

@ebyhr ebyhr force-pushed the ebi/delta-variant branch 2 times, most recently from a574ac5 to 50856d8 Compare June 10, 2024 00:22
@ebyhr ebyhr force-pushed the ebi/delta-variant branch from 50856d8 to 83adcde Compare June 10, 2024 00:25
@ebyhr ebyhr merged commit 73f3104 into trinodb:master Jun 10, 2024
24 checks passed
@ebyhr ebyhr deleted the ebi/delta-variant branch June 10, 2024 00:59
@github-actions github-actions bot added this to the 450 milestone Jun 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed delta-lake Delta Lake connector
Development

Successfully merging this pull request may close these issues.

4 participants