Fix struct scatter to correctly cascade null_mask to children columns #8176

Merged: 6 commits, May 7, 2021

ttnghia commented May 6, 2021

This fixes a bug in scatter.cuh where a struct column failed to cascade its null_mask to its children. Normally, this cascade happens automatically during struct construction. However, the current scatter function constructs the structs column first and only updates the parent null_mask afterwards, so the children's null_masks are never updated.
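
To illustrate the ordering problem described above, here is a minimal host-side sketch. It does not use libcudf: `ToyColumn`, `cascade_nulls`, `make_struct`, `set_mask_only`, and `set_mask_and_cascade` are hypothetical names invented for this illustration, and validity is modelled with a `std::vector<bool>` rather than a device bitmask. The actual change in scatter.cuh may well be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-in for a column (hypothetical, NOT the libcudf column class):
// one validity flag per row, plus children for struct-like columns.
struct ToyColumn {
  std::vector<bool> valid;          // true = row is non-null
  std::vector<ToyColumn> children;  // empty for leaf columns
};

// Push a parent's nulls into a child: a child row stays valid only if it was
// already valid AND the parent row is valid. Recurses into nested structs.
void cascade_nulls(std::vector<bool> const& parent_valid, ToyColumn& child) {
  assert(parent_valid.size() == child.valid.size());
  for (std::size_t i = 0; i < child.valid.size(); ++i) {
    child.valid[i] = child.valid[i] && parent_valid[i];
  }
  for (auto& grandchild : child.children) {
    cascade_nulls(child.valid, grandchild);
  }
}

// Construction cascades the mask that is known at construction time.
ToyColumn make_struct(std::vector<bool> valid, std::vector<ToyColumn> children) {
  ToyColumn result;
  result.valid    = std::move(valid);
  result.children = std::move(children);
  for (auto& child : result.children) {
    cascade_nulls(result.valid, child);
  }
  return result;
}

// Mirrors the buggy ordering: the parent mask is replaced after construction,
// and the children are left untouched.
void set_mask_only(ToyColumn& col, std::vector<bool> new_valid) {
  col.valid = std::move(new_valid);
}

// Mirrors the intended behaviour: replace the parent mask, then cascade it.
void set_mask_and_cascade(ToyColumn& col, std::vector<bool> new_valid) {
  col.valid = std::move(new_valid);
  for (auto& child : col.children) {
    cascade_nulls(col.valid, child);
  }
}
```

In this toy model, applying a parent mask of {valid, null, valid} after construction with `set_mask_only` leaves row 1 of the child marked valid, whereas `set_mask_and_cascade` marks it null.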

ttnghia added the bug (Something isn't working), libcudf (Affects libcudf (C++/CUDA) code.), and non-breaking (Non-breaking change) labels May 6, 2021
ttnghia self-assigned this May 6, 2021
ttnghia requested a review from a team as a code owner May 6, 2021 16:45

codecov bot commented May 6, 2021

Codecov Report

Merging #8176 (0184e81) into branch-0.20 (51336df) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@               Coverage Diff               @@
##           branch-0.20    #8176      +/-   ##
===============================================
- Coverage        82.88%   82.88%   -0.01%     
===============================================
  Files              103      104       +1     
  Lines            17668    17894     +226     
===============================================
+ Hits             14645    14832     +187     
- Misses            3023     3062      +39     
Impacted Files                            Coverage Δ
python/cudf/cudf/core/tools/datetimes.py  80.42% <0.00%> (-4.11%) ⬇️
python/cudf/cudf/core/column/decimal.py   91.04% <0.00%> (-1.89%) ⬇️
python/cudf/cudf/core/column/datetime.py  88.03% <0.00%> (-1.88%) ⬇️
python/cudf/cudf/core/column/struct.py    94.73% <0.00%> (-1.56%) ⬇️
python/cudf/cudf/utils/dtypes.py          82.20% <0.00%> (-1.24%) ⬇️
python/dask_cudf/dask_cudf/groupby.py     91.28% <0.00%> (-0.88%) ⬇️
python/cudf/cudf/core/series.py           91.17% <0.00%> (-0.56%) ⬇️
python/cudf/cudf/core/index.py            92.52% <0.00%> (-0.55%) ⬇️
python/cudf/cudf/core/column/column.py    88.20% <0.00%> (-0.44%) ⬇️
python/cudf/cudf/core/column/lists.py     86.98% <0.00%> (-0.43%) ⬇️
... and 28 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8ae73d5...0184e81.

mythrocks left a comment

LGTM. I have verified that this resolves #8162 and sorts out the problems I see on #8135.

@mythrocks

@ttnghia, could we please change the PR title to be more descriptive?
"Fix child column null-masks for scattered struct columns", or something that describes what is being fixed.

ttnghia changed the title from "Fix struct scatter" to "Fix struct scatter to correctly cascade null_mask to children columns" May 6, 2021

ttnghia commented May 7, 2021

@gpucibot merge

rapids-bot bot merged commit db21232 into rapidsai:branch-0.20 May 7, 2021
ttnghia deleted the fix_struct_scatter branch May 27, 2021 13:33
Development

Successfully merging this pull request may close these issues.

[BUG] scatter() on struct columns does not set child null masks correctly
3 participants