feat: Added support to write iceberg tables #5989

Merged
devinrsmith merged 64 commits into deephaven:main on Nov 22, 2024

Conversation

@malhotrashivam (Contributor) commented Aug 26, 2024

Closes: #6125

Also moves the existing Iceberg tests from JUnit 4 to JUnit 5.

@malhotrashivam added the parquet, DocumentationNeeded, ReleaseNotesNeeded, s3, and iceberg labels Aug 26, 2024
@malhotrashivam added this to the 0.37.0 milestone Aug 26, 2024
@malhotrashivam self-assigned this Aug 26, 2024
@malhotrashivam marked this pull request as draft September 6, 2024 18:27
@malhotrashivam changed the title from "feat: [DO NOT MERGE] Added support to write iceberg tables" to "feat: Added support to write iceberg tables" Sep 6, 2024
@malhotrashivam marked this pull request as ready for review October 1, 2024 17:25
@jmao-denver (Contributor) left a comment

Some minor docstring issues.

py/server/deephaven/experimental/iceberg.py
        """
        self.j_object.append(instructions.j_object)

    def overwrite(self, instructions: IcebergParquetWriteInstructions):
Member

@jmao-denver This works, but I have some concerns that it is not as pythonic as putting the args here.

Member

This is more of a question than a request.
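
For readers weighing the question above, here is a minimal sketch of the two API shapes being compared: a method that takes one bundled instructions object (as in the snippet under review) versus a method that takes the options as keyword arguments. The class names, the `tables` and `compression` fields, and the recorded tuples are all hypothetical stand-ins for illustration; none of this is the Deephaven API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class WriteInstructions:
    """Hypothetical stand-in for a bundled instructions object."""
    tables: List[str]
    compression: str = "snappy"  # assumed option, for illustration only


class InstructionStyleWriter:
    """Shape used in the snippet under review: one instructions argument."""

    def __init__(self) -> None:
        self.appended: List[Tuple[Tuple[str, ...], str]] = []

    def append(self, instructions: WriteInstructions) -> None:
        # Record what would be written; a real writer would delegate to Java.
        self.appended.append((tuple(instructions.tables), instructions.compression))


class KeywordStyleWriter:
    """Alternative shape: options passed directly as keyword arguments."""

    def __init__(self) -> None:
        self.appended: List[Tuple[Tuple[str, ...], str]] = []

    def append(self, tables: List[str], *, compression: str = "snappy") -> None:
        # Same effect as above, without the intermediate instructions object.
        self.appended.append((tuple(tables), compression))


bundled = InstructionStyleWriter()
bundled.append(WriteInstructions(tables=["trades"], compression="zstd"))

direct = KeywordStyleWriter()
direct.append(["trades"], compression="zstd")
```

Both calls record the same request. The bundled form makes a set of options reusable across calls, while the keyword form keeps a single call shorter and lets signature help show the options at the call site.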

Member

iceberg.py has changed significantly since I last looked at this PR, but this file has not. Are we testing all methods, or does the new stuff need tests?

Contributor Author

I have added tests for the instructions, but I still don't have tests for actual appending, since we don't currently have support for unit testing Iceberg. We have a ticket open for that: #5656

jmao-denver previously approved these changes Nov 22, 2024

@jmao-denver (Contributor) left a comment

Python changes LGTM

chipkent previously approved these changes Nov 22, 2024
jmao-denver previously approved these changes Nov 22, 2024
* Return a {@link TableDefinition} which contains only the non-partitioning columns from the provided table
* definition.
*/
private static List<ColumnDefinition<?>> nonPartitioningColumnDefinitions(
Member

We should just return new TableDefinition(nonPartitioningColumns); that way we could take advantage of table.checkMutualCompatibility, which automatically throws a nicely formatted error message. Mutual compatibility is almost the same as equals, but it allows the columns to be in different orders. (We could advocate for a stricter table.checkEquals, but I'm not too worried about being completely strict like that.)
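
To illustrate the distinction drawn in this comment, here is a minimal Python sketch of an order-insensitive compatibility check next to strict equality. It models a table definition as a list of (name, type) pairs; the function names and the error wording are assumptions made for illustration and are not the Deephaven implementation of checkMutualCompatibility.

```python
from typing import List, Tuple

Column = Tuple[str, str]  # (column name, type name); a simplified model


def strictly_equal(a: List[Column], b: List[Column]) -> bool:
    """Order-sensitive comparison: same columns in the same positions."""
    return list(a) == list(b)


def check_mutual_compatibility(a: List[Column], b: List[Column]) -> None:
    """Order-insensitive comparison that raises with a readable message."""
    a_map, b_map = dict(a), dict(b)
    problems = []
    for name in sorted(a_map.keys() - b_map.keys()):
        problems.append(f"column '{name}' is missing from the second definition")
    for name in sorted(b_map.keys() - a_map.keys()):
        problems.append(f"column '{name}' is missing from the first definition")
    for name in sorted(a_map.keys() & b_map.keys()):
        if a_map[name] != b_map[name]:
            problems.append(
                f"column '{name}' has type {a_map[name]} vs {b_map[name]}"
            )
    if problems:
        raise ValueError("Table definitions are not compatible: " + "; ".join(problems))


first = [("Sym", "str"), ("Price", "double")]
reordered = [("Price", "double"), ("Sym", "str")]

# Reordered columns fail strict equality but pass the compatibility check.
same_order = strictly_equal(first, reordered)
check_mutual_compatibility(first, reordered)
```

In this sketch the reordered definition is accepted, while a missing or retyped column raises a single error listing every mismatch, which is the kind of message a caller could surface directly.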

@devinrsmith devinrsmith merged commit ecdc8e7 into deephaven:main Nov 22, 2024
17 checks passed
@deephaven-internal
Contributor

Labels indicate documentation is required. Issues for documentation have been opened:

Community: deephaven/deephaven-docs-community#367

Labels
DocumentationNeeded, iceberg, parquet, ReleaseNotesNeeded, s3
Development

Successfully merging this pull request may close these issues.

Add support to write deephaven tables to iceberg
7 participants