Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK][TEST-ONLY] Add more tests for Identity Column #3526

Conversation

zhipengmao-db
Copy link
Contributor

@zhipengmao-db zhipengmao-db commented Aug 12, 2024

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

This PR is part of #1959 .
It add more tests for Identity Column to test

  • logging identity column properties and stats
  • reading table should not see identity column properties
  • compatibility with table of older protocols
  • identity value generation starting at range boundaries of long data type

How was this patch tested?

Test only change.

Does this PR introduce any user-facing changes?

No.

Comment on lines 207 to 215
val commands: Map[Int, () => Dataset[_]] = Map(
1 -> (() => spark.table(tblName)),
2 -> (() => sql(s"SELECT * FROM $tblName")),
3 -> (() => sql(s"SELECT * FROM delta.`$path`")),
4 -> (() => spark.read.format("delta").load(path)),
5 -> (() => spark.read.format("delta").table(tblName)),
6 -> (() => spark.readStream.format("delta").load(path)),
7 -> (() => spark.readStream.format("delta").table(tblName))
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We either don't need the indices here or we just make it the parameter (Int) => Dataset[]. They're just used for "test number". Let's simplify this.

@zhipengmao-db zhipengmao-db requested a review from c27kwan August 12, 2024 11:54
Copy link
Contributor

@c27kwan c27kwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New tests lgtm, thanks for adding them! Don't forget to reference the parent github issue which tracks all Identity Column PRs.

@c27kwan
Copy link
Contributor

c27kwan commented Aug 12, 2024

The PR description could be more descriptive. The list of tests is pretty short so it's not verbose to do it in this case :)

@zhipengmao-db zhipengmao-db changed the title [SPARK] Add more tests for Identity Column [SPARK][TEST-ONLY] Add more tests for Identity Column Aug 12, 2024
@scottsand-db
Copy link
Collaborator

@felipepessoto
Copy link
Contributor

@zhipengmao-db do we have any tests for Identity + MERGE? I'm not sure if the implementation is already done though

@zhipengmao-db
Copy link
Contributor Author

Failed test https://github.com/delta-io/delta/actions/runs/10351172194/job/28649104059?pr=3526 PTAL. Flaky?

The failed test is org.apache.spark.sql.delta.CloneIcebergByNameSuite, which should not be related to this change as the change is test only.

@zhipengmao-db
Copy link
Contributor Author

@zhipengmao-db do we have any tests for Identity + MERGE? I'm not sure if the implementation is already done though

MERGE support has not been done yet but it is on the plan

@scottsand-db scottsand-db merged commit 48def61 into delta-io:master Aug 14, 2024
13 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants