Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip unsupported type widening columns #22458

Merged

Conversation

SemionPar
Copy link
Contributor

@SemionPar SemionPar commented Jun 20, 2024

Description

A change from failing to skipping is for flexibility, because Databricks supports wider variety of type changes.

Draft todos:

  • add case for nested types
  • add case for partition columns
  • UnsupportedTypeException.type -> message?
  • Fix failures when all columns are unsupported or specifically selecting only unsupported columns - these failure messages are expected in such cases
assertQueryFails("SELECT * FROM " + tableName, "(.*)SELECT \\* not allowed from relation that has no columns");
assertQueryFails("SELECT col FROM " + tableName, "(.*)Column 'col' cannot be resolved");

Additional context and related issues

#22142, fixes #22433

Release notes

# Delta Lake
* Add support for reading partitioned tables which the partition column type is changed by [type widening](https://docs.delta.io/latest/delta-type-widening.html). ({issue}`22458`)

@cla-bot cla-bot bot added the cla-signed label Jun 20, 2024
@github-actions github-actions bot added the delta-lake Delta Lake connector label Jun 20, 2024
@SemionPar SemionPar force-pushed the semionpar/delta-type-widening-skipping branch from fb27697 to ffb5ee2 Compare June 20, 2024 16:54
@SemionPar
Copy link
Contributor Author

While testing nested types I discovered that for struct, metadata object is misplaced:

{
  "type": "struct",
  "fields": [
    {
      "name": "s",
      "type": {
        "type": "struct",
        "fields": [
          {
            "name": "field",
            "type": "double",
            "nullable": true,
            "metadata": {
              "delta.typeChanges": [
                {
                  "toType": "short",
                  "fromType": "byte",
                  "tableVersion": 2
                },
                {
                  "toType": "double",
                  "fromType": "short",
                  "tableVersion": 7
                }
              ]
            }
          }
        ]
      },
      "nullable": true,
      "metadata": {}
    },

VS array which has metadata where we look for it now:

    {
      "name": "a",
      "type": {
        "type": "array",
        "elementType": "double",
        "containsNull": true
      },
      "nullable": true,
      "metadata": {
        "delta.typeChanges": [
          {
            "toType": "short",
            "fromType": "byte",
            "tableVersion": 5,
            "fieldPath": "element"
          },
          {
            "toType": "double",
            "fromType": "short",
            "tableVersion": 10,
            "fieldPath": "element"
          }
        ]
      }
    }

I will initially implement skipping of the whole struct if one of the fields type widening is not supported.

@SemionPar SemionPar force-pushed the semionpar/delta-type-widening-skipping branch 2 times, most recently from 1a16d7f to d3fddef Compare July 1, 2024 15:56
@SemionPar SemionPar force-pushed the semionpar/delta-type-widening-skipping branch 3 times, most recently from 419ea35 to 3059148 Compare July 12, 2024 16:30
@SemionPar SemionPar requested a review from ebyhr July 12, 2024 16:33
@SemionPar SemionPar marked this pull request as ready for review July 12, 2024 16:33
@SemionPar SemionPar changed the title [draft] Skip unsupported type widening columns Skip unsupported type widening columns Jul 12, 2024
@SemionPar SemionPar requested a review from nineinchnick July 15, 2024 11:05
@SemionPar SemionPar force-pushed the semionpar/delta-type-widening-skipping branch from 3059148 to 3feada2 Compare July 15, 2024 11:56
@SemionPar SemionPar force-pushed the semionpar/delta-type-widening-skipping branch 2 times, most recently from 0a5d7b4 to 04facb8 Compare July 16, 2024 16:11
@SemionPar SemionPar force-pushed the semionpar/delta-type-widening-skipping branch from 04facb8 to e5d2cab Compare July 17, 2024 16:51
@SemionPar SemionPar requested review from ebyhr and findinpath July 18, 2024 14:19
@SemionPar SemionPar force-pushed the semionpar/delta-type-widening-skipping branch from ba9c62a to a04cd3b Compare July 19, 2024 14:54
@SemionPar
Copy link
Contributor Author

Squashed commits.

A change from failing to skipping is for flexibility, because Databricks
supports wider variety of type changes.
Now it also signifies unsupported type widening errors.
@SemionPar SemionPar force-pushed the semionpar/delta-type-widening-skipping branch from a04cd3b to f084b1f Compare July 23, 2024 12:01
@ebyhr ebyhr merged commit c4a7ec3 into trinodb:master Jul 24, 2024
24 checks passed
@github-actions github-actions bot added this to the 453 milestone Jul 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed delta-lake Delta Lake connector
Development

Successfully merging this pull request may close these issues.

Add support for more type widening in Delta Lake connector
4 participants