Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow importing parquet files with arbitrary root column names. #8646

Merged
merged 1 commit into from
Dec 6, 2024

Conversation

nicktobey
Copy link
Contributor

Parquet represents schemas as trees with a root node, and columns as paths from the root to a leaf.

parquet-go, the library Dolt uses to export tables as parquet files, uses parquet_go_root as the default name of the root node, but other names are allowed.

Prior to this PR, Dolt was assuming that the parquet file always had a root node named parquet_go_root, and would fail if any other name was used. This PR changes the behavior to have Dolt read the name from the file instead.

I also added a test that imports a simple parquet file that was not created with parquet-go.

@nicktobey nicktobey requested a review from jennifersp December 6, 2024 03:17
@coffeegoddd
Copy link
Contributor

@nicktobey DOLT

comparing_percentages
100.000000 to 100.000000
version result total
87615ac ok 5937457
version total_tests
87615ac 5937457
correctness_percentage
100.0

Copy link
Contributor

@jennifersp jennifersp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you!

@nicktobey nicktobey changed the title Import parquet files regardless of their root column name. Allow importing parquet files with arbitrary root column names. Dec 6, 2024
@nicktobey nicktobey merged commit 9984308 into main Dec 6, 2024
34 of 35 checks passed
@nicktobey nicktobey deleted the nicktobey/parquet branch December 6, 2024 18:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants