Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify null count checking in column equality comparator #13312

Merged

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented May 9, 2023

Description

This PR removes an extra code path used for checking the equality of the null count when verifying if columns are equivalent (not equal). The purpose of this code path was to verify a specific definition of equivalence for columns containing unsanitized nulls, i.e. by ignoring the stored null count and directly verifying the validity of the underlying null mask. This is no longer necessary because we required sanitized null masks to be output from all libcudf APIs now (see the "libcudf expects nested types to have sanitized null masks" section in the developer guide), and this requirement will be enforced with the merge of #14559.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vyasr vyasr added 2 - In Progress Currently a work in progress tests Unit testing for project libcudf Affects libcudf (C++/CUDA) code. improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels May 9, 2023
@vyasr vyasr self-assigned this May 9, 2023
@vyasr
Copy link
Contributor Author

vyasr commented May 10, 2023

This looks to be safe. As discussed offline with @nvdbaranec, this special function for null counting is to handle non-empty nulls, which we no longer support. The remaining todo here is to add a test with an EXPECT_FAIL for a column containing non-empty nulls to ensure that our assertion functions like expect_columns_equivalent will fail when provided such a column, ensuring that we won't accidentally have tests passing on such inputs in the future.

EDIT: This testing is now explicitly handled by those assertion functions as of #14559

@vyasr vyasr force-pushed the chore/simplify_test_comparator branch from e5265d8 to 3375c37 Compare December 18, 2023 17:34
Copy link

copy-pr-bot bot commented Dec 18, 2023

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@vyasr vyasr changed the base branch from branch-23.06 to branch-24.02 December 18, 2023 17:34
@vyasr vyasr force-pushed the chore/simplify_test_comparator branch from 3375c37 to 0e1a778 Compare December 18, 2023 17:38
@vyasr vyasr marked this pull request as ready for review December 18, 2023 17:39
@vyasr vyasr requested a review from a team as a code owner December 18, 2023 17:40
@vyasr vyasr added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Dec 18, 2023
@vyasr vyasr force-pushed the chore/simplify_test_comparator branch from 0e1a778 to ad5265d Compare December 18, 2023 22:51
@vyasr
Copy link
Contributor Author

vyasr commented Dec 18, 2023

/merge

@rapids-bot rapids-bot bot merged commit 8b695e3 into rapidsai:branch-24.02 Dec 19, 2023
70 checks passed
@vyasr vyasr deleted the chore/simplify_test_comparator branch December 19, 2023 05:48
@GregoryKimball
Copy link
Contributor

@vyasr do you expect a performance impact from this simplification?

@GregoryKimball
Copy link
Contributor

GregoryKimball commented Jan 10, 2024

@vyasr do you expect a performance impact from this simplification?

No, this is only a test change

@vyasr
Copy link
Contributor Author

vyasr commented Jan 10, 2024

Confirming the last comment, which I discussed with Greg offline.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change tests Unit testing for project
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants