Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure all of device bitmask is initialized in from_arrow #12668

Conversation

wence-
Copy link
Contributor

@wence- wence- commented Feb 1, 2023

Description

libcudf bitmasks are allocated to a multiple of 64 bytes, in contrast the arrow spec only requires of a column with a null mask that "the validity bitmap must be large enough to have at least 1 bit for each array slot". When the number of rows is not a multiple of 64, the trailing part of the device allocation (which doesn't contribute to actually masking anything) is left uninitialized. While probably benign, this produces errors when running with compute-sanitizer in initcheck mode (since those data are touched and are uninitialized).

To fix this, memset the trailing allocation to zero.

Closes #8873.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

libcudf bitmasks are allocated to a multiple of 64 bytes, in contrast
the arrow spec only requires of a column with a null mask that "the
validity bitmap must be large enough to have at least 1 bit for each
array slot". When the number of rows is not a multiple of 64, the
trailing part of the device allocation (which doesn't contribute to
actually masking anything) is left uninitialized. While probably
benign, this produces errors when running with compute-sanitizer in
initcheck mode (since those data are touched and _are_ uninitialized).

To fix this, memset the trailing allocation to zero.

Closes rapidsai#8873.
@wence- wence- requested a review from a team as a code owner February 1, 2023 16:50
@wence- wence- requested review from harrism and mythrocks February 1, 2023 16:50
@wence- wence- added libcudf Affects libcudf (C++/CUDA) code. bug Something isn't working non-breaking Non-breaking change labels Feb 1, 2023
@wence- wence- self-assigned this Feb 1, 2023
@wence-
Copy link
Contributor Author

wence- commented Feb 1, 2023

This is actually @harrism's code from #8873 (hence the author spoofing).

cpp/src/interop/from_arrow.cu Outdated Show resolved Hide resolved
cpp/src/interop/from_arrow.cu Outdated Show resolved Hide resolved
cpp/src/interop/from_arrow.cu Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Feb 1, 2023

Codecov Report

❗ No coverage uploaded for pull request base (branch-23.04@6a67e8f). Click here to learn what that means.
Patch has no changes to coverable lines.

Additional details and impacted files
@@               Coverage Diff               @@
##             branch-23.04   #12668   +/-   ##
===============================================
  Coverage                ?   85.82%           
===============================================
  Files                   ?      158           
  Lines                   ?    25184           
  Branches                ?        0           
===============================================
  Hits                    ?    21614           
  Misses                  ?     3570           
  Partials                ?        0           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@harrism
Copy link
Member

harrism commented Feb 1, 2023

Nice. For future reference, how did you make it look like I authored a commit on this PR?

@wence-
Copy link
Contributor Author

wence- commented Feb 2, 2023

Nice. For future reference, how did you make it look like I authored a commit on this PR?

git commit --author='Mark Harris <email>'

(Although I use magit so it was fewer keystrokes because I got autocompletion of the existing committer names)

Copy link
Contributor

@davidwendt davidwendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, I don't think this change is worth the possible performance impact just to workaround compute-sanitizer.

cpp/src/interop/from_arrow.cu Show resolved Hide resolved
@wence-
Copy link
Contributor Author

wence- commented Feb 3, 2023

FWIW, this is a minimal thing to ensure that loading a bitmask fully initialises things, but isn't enough to make most operations initcheck clean. When we copy a column with a mask from the python side, we end up doing bools_to_mask which doesn't zero out the trailing allocation (for example).

I still need to do some benchmarking of this change before/after to satisfy that it doesn't obviously regress. I'll open up a discussion issue about the broader initcheck story, to see if it is worthwhile to do any more (and if so, how).

@wence-
Copy link
Contributor Author

wence- commented Mar 13, 2023

I still need to do some benchmarking of this change before/after to satisfy that it doesn't obviously regress. I'll open up a discussion issue about the broader initcheck story, to see if it is worthwhile to do any more (and if so, how).

Just to note, I did that in #12668 (comment), and no regression was identified.

I think this is still good, and could do with one more C++ review (@bdice or @mythrocks ?)

Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks good and the benchmarks appear to have no significant loss of performance.

@wence-
Copy link
Contributor Author

wence- commented Mar 13, 2023

/merge

@rapids-bot rapids-bot bot merged commit 0723f3f into rapidsai:branch-23.04 Mar 13, 2023
@wence- wence- deleted the wence/fix/from-arrow-bitmask-initialization branch March 13, 2023 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Uninitialized memory reads in fillna
4 participants