Closes #2581 - Strings.get_null_indicies Not Reporting Correct Results in Parquet #2582

Merged

Ethan-DeBandi99
Contributor

Closes #2581

This PR updates the workflow for writing Strings objects to Parquet. Previously, even empty string segments were written with a definition level of 1. On read, those segments reported values_read=1 even though the value length was 0, and because values_read=1 was seen, the null indices were not set properly. To get values_read=0 as expected, the write workflow now checks the length of each value; if the length is 0, the definition level is set to 0. Using the test provided by @pierce314159, I verified that this now works as expected in both the existing test and the updated test.
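To illustrate the rule described above, here is a minimal sketch in plain Python. It is not the code changed in this PR; the function names `definition_levels` and `null_indices` are hypothetical and only model the idea that a zero-length value gets definition level 0 on write, so a reader can treat it as null.

```python
from typing import List


def definition_levels(values: List[str]) -> List[int]:
    """Model of the write-side check: level 0 for empty strings, 1 otherwise.

    Hypothetical helper for illustration only; not part of Arkouda.
    """
    return [0 if len(v) == 0 else 1 for v in values]


def null_indices(levels: List[int]) -> List[int]:
    """Model of the read side: positions whose definition level is 0 are null."""
    return [i for i, level in enumerate(levels) if level == 0]


# Example string segments, two of which are empty.
segments = ["first", "", "third", ""]
levels = definition_levels(segments)
nulls = null_indices(levels)
```

With the pre-fix behavior modeled the same way, every level would be 1 and `null_indices` would find nothing, which matches the symptom reported in #2581.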

@stress-tess stress-tess enabled auto-merge July 17, 2023 16:24
@stress-tess stress-tess added this pull request to the merge queue Jul 17, 2023
Merged via the queue into Bears-R-Us:master with commit eebf624 Jul 17, 2023
@Ethan-DeBandi99 Ethan-DeBandi99 deleted the 2581_get_null_indicies_bug branch July 18, 2023 16:14