Fix an issue reading struct-of-list types in Parquet. #11910

Merged: 1 commit, Oct 13, 2022

nvdbaranec
Contributor

Fixes NVIDIA/spark-rapids#6718

A bug was recently introduced in #11752: the check for whether an input column contains repetition information was insufficient, which could cause incorrect results for column hierarchies with a struct at the root.
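
To illustrate the class of problem, here is a minimal sketch of why inspecting only the root of a column hierarchy can miss repetition information. It is not the libcudf code and does not reproduce the actual check changed in this PR; the `Column` class and the functions `root_only_check` and `full_check` are hypothetical names made up for the example. The only assumption taken from the Parquet format is that a leaf carries repetition levels when at least one list appears anywhere on its path.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Column:
    """Hypothetical node in a nested column hierarchy."""
    name: str
    kind: str  # one of "struct", "list", "leaf"
    children: List["Column"] = field(default_factory=list)


def leaf_paths(col: Column, prefix: Tuple[Column, ...] = ()) -> Iterator[Tuple[Column, ...]]:
    """Yield the root-to-leaf path for every leaf under `col`."""
    path = prefix + (col,)
    if not col.children:
        yield path
        return
    for child in col.children:
        yield from leaf_paths(child, path)


def max_repetition_level(path: Tuple[Column, ...]) -> int:
    """Count the list nodes on a path; each one adds a repetition level."""
    return sum(1 for c in path if c.kind == "list")


def root_only_check(root: Column) -> bool:
    """Insufficient (hypothetical) check: looks at the root node only."""
    return root.kind == "list"


def full_check(root: Column) -> bool:
    """Walks every root-to-leaf path and reports any repetition."""
    return any(max_repetition_level(p) > 0 for p in leaf_paths(root))


# struct<a: list<int>>: the root is a struct, but its child is a list.
struct_of_list = Column("s", "struct", [Column("a", "list", [Column("element", "leaf")])])

# list<int>: the root itself is a list.
list_of_int = Column("l", "list", [Column("element", "leaf")])

print(root_only_check(struct_of_list), full_check(struct_of_list))
print(root_only_check(list_of_int), full_check(list_of_int))
```

For `list_of_int` both checks agree. For `struct_of_list` the root-only check reports no repetition even though the nested list means the leaf does carry it, which is the kind of mismatch a struct-at-root hierarchy can expose.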

@nvdbaranec nvdbaranec added bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. cuIO cuIO issue non-breaking Non-breaking change labels Oct 12, 2022
@nvdbaranec nvdbaranec requested a review from a team as a code owner October 12, 2022 15:12
Contributor

@jbrennan333 jbrennan333 left a comment

+1 lgtm

@codecov

codecov bot commented Oct 12, 2022

Codecov Report

Base: 87.40% // Head: 88.11% // Increases project coverage by +0.70% 🎉

Coverage data is based on head (ea9b295) compared to base (f72c4ce).
Patch coverage: 86.42% of modified lines in pull request are covered.

❗ Current head ea9b295 differs from pull request most recent head 489237d. Consider uploading reports for the commit 489237d to get more accurate results

Additional details and impacted files
@@               Coverage Diff                @@
##           branch-22.12   #11910      +/-   ##
================================================
+ Coverage         87.40%   88.11%   +0.70%     
================================================
  Files               133      133              
  Lines             21833    21881      +48     
================================================
+ Hits              19084    19281     +197     
+ Misses             2749     2600     -149     
| Impacted Files | Coverage Δ | |
|---|---|---|
| python/cudf/cudf/core/udf/__init__.py | 97.05% <ø> (+47.05%) | ⬆️ |
| python/cudf/cudf/io/orc.py | 92.94% <ø> (-0.09%) | ⬇️ |
| python/cudf/cudf/utils/ioutils.py | 79.47% <ø> (ø) | |
| ...thon/dask_cudf/dask_cudf/tests/test_distributed.py | 18.86% <ø> (+4.94%) | ⬆️ |
| python/cudf/cudf/core/_base_index.py | 82.20% <43.75%> (-3.35%) | ⬇️ |
| python/cudf/cudf/io/text.py | 91.66% <66.66%> (-8.34%) | ⬇️ |
| python/strings_udf/strings_udf/__init__.py | 86.27% <76.00%> (-10.61%) | ⬇️ |
| python/cudf/cudf/core/index.py | 92.91% <95.16%> (+0.28%) | ⬆️ |
| python/cudf/cudf/__init__.py | 90.69% <100.00%> (ø) | |
| python/cudf/cudf/core/column/categorical.py | 89.34% <100.00%> (ø) | |
| ... and 12 more | | |


@nvdbaranec
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit fb0922f into rapidsai:branch-22.12 Oct 13, 2022

Successfully merging this pull request may close these issues.

[BUG] test_iceberg_parquet_read_round_trip FAILED "TypeError: object of type 'NoneType' has no len()"