Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove all references to UNKNOWN_NULL_COUNT in Python #13345

Merged
merged 8 commits into from
May 16, 2023

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented May 12, 2023

Description

Part of #11968

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vyasr vyasr added 3 - Ready for Review Ready for review by team Python Affects Python cuDF API. improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels May 12, 2023
@vyasr vyasr self-assigned this May 12, 2023
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label May 12, 2023
@vyasr vyasr marked this pull request as ready for review May 12, 2023 21:19
@vyasr vyasr requested review from a team as code owners May 12, 2023 21:19
Copy link
Contributor

@shwina shwina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, but do you think it makes sense for libcudf to provide a utility for computing the null count given a base mask and an offset? (and could it potentially do that without having to copy any data?)

@vyasr
Copy link
Contributor Author

vyasr commented May 16, 2023

This looks good to me, but do you think it makes sense for libcudf to provide a utility for computing the null count given a base mask and an offset? (and could it potentially do that without having to copy any data?)

Great question! I tried to stick to the simplest rewrite of the existing code path as possible, but in fact the copying is not at all necessary. The API you want is what I already exposed in this PR, I just need to use it. Making that change now.

Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two small questions/comments. Resolve as you see fit. :)

cpp/src/bitmask/null_mask.cu Show resolved Hide resolved
python/cudf/cudf/_lib/column.pyx Outdated Show resolved Hide resolved
@vyasr
Copy link
Contributor Author

vyasr commented May 16, 2023

/merge

@rapids-bot rapids-bot bot merged commit fc43b7e into rapidsai:branch-23.06 May 16, 2023
@vyasr vyasr deleted the feat/python_null_count branch May 16, 2023 21:40
rapids-bot bot pushed a commit that referenced this pull request May 18, 2023
Fixes #13353.
Depends on #13345.
    
In preparation for #11968, this change ensures that columns constructed from CUDF JNI do not have their null counts set to `UNKNOWN_NULL_COUNT` (i.e. `-1`). In cases where the caller invokes JNI functions with `UNKNOWN_NULL_COUNT`, the JNI layer computes the concrete null count from the validity mask, and sets this value in the column.

The current Java API remains unchanged; there should be no impact to user code.
    
The option to specify an optional null count through the Java API will likely be removed at a later date.
    
Signed-off-by: MithunR <[email protected]>

Authors:
  - MithunR (https://github.com/mythrocks)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Jason Lowe (https://github.com/jlowe)
  - Nghia Truong (https://github.com/ttnghia)

URL: #13355
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants