Add column field ID control in parquet writer #10504
Codecov Report
@@ Coverage Diff @@
## branch-22.06 #10504 +/- ##
================================================
+ Coverage 86.33% 86.37% +0.03%
================================================
Files 140 142 +2
Lines 22289 22356 +67
================================================
+ Hits 19244 19310 +66
- Misses 3045 3046 +1
Continue to review full report at Codecov.
Looks fine. Apart from @jlowe's concerns, this should be good to go.
Looks like the Java build is failing on a chunked write with "Optional has no value". Is there a place that was missed for handling chunked writes?
@res-life Thanks! I just gave you write access to my repo. Can you please add the JNI bindings and tests to this PR?
Signed-off-by: Chong Gao <[email protected]>
@jlowe, please help review the JNI part.
Almost there!
…f into parquet-field-id-writing
     * Set a simple child meta data
     * @return this for chaining.
     */
    public T withColumns(boolean nullable, String name, int parquetFieldId) {
This should be `withColumn` since it's only adding a single column.
…f into parquet-field-id-writing
@gpucibot merge
Closes #10375
Closes #10376
This PR enables column `field_id` control in the parquet writer. When writing a parquet file, users can specify a column's `field_id` via `column_in_metadata.set_parquet_field_id()`. JNI bindings and unit tests are added as well.
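To illustrate the idea of an optional per-column `field_id`, here is a minimal, self-contained sketch. It does not use cuDF: `column_meta_sketch` and `describe_schema_entry` are hypothetical names invented for this example, and only the setter name `set_parquet_field_id` is taken from the description above. The sketch assumes the ID is stored as an optional value that must be checked before it is read, which is one way a writer path could avoid reading an ID that was never set.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a per-column metadata object (NOT cuDF's class).
class column_meta_sketch {
 public:
  // Setters return *this so calls can be chained.
  column_meta_sketch& set_name(std::string name) {
    name_ = std::move(name);
    return *this;
  }

  column_meta_sketch& set_parquet_field_id(std::int32_t field_id) {
    field_id_ = field_id;
    return *this;
  }

  // True only if a field ID was explicitly provided.
  bool is_parquet_field_id_set() const { return field_id_.has_value(); }

  // Throws if no field ID was set; callers should check first.
  std::int32_t get_parquet_field_id() const {
    if (!field_id_) {
      throw std::logic_error("parquet field_id is not set for column " + name_);
    }
    return *field_id_;
  }

  const std::string& name() const { return name_; }

 private:
  std::string name_;
  std::optional<std::int32_t> field_id_;  // empty means "no field_id requested"
};

// Hypothetical helper: renders what a writer might record for one column,
// including the field ID only when one was set.
inline std::string describe_schema_entry(column_meta_sketch const& col) {
  std::string entry = col.name();
  if (col.is_parquet_field_id_set()) {
    entry += " (field_id=" + std::to_string(col.get_parquet_field_id()) + ")";
  }
  return entry;
}
```

In this sketch, a column configured with `set_name("a").set_parquet_field_id(7)` is described with its ID, while a column that never had an ID set is described by name only and never triggers the throwing getter.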