[BUG] `cudf::structs::detail::superimpose_parent_nulls` does not purge nulls for the children #12027
ttnghia added the bug and Needs Triage labels on Oct 28, 2022
ttnghia changed the title on Oct 28, 2022 from "[BUG] `cudf::structs::detail::superimpose_parent_nulls` does not purge null lists" to "[BUG] `cudf::structs::detail::superimpose_parent_nulls` does not purge nulls for the children"
This was referenced Nov 22, 2022
rapids-bot bot pushed a commit that referenced this issue on Nov 30, 2022
There are several overloads of these functions which work differently. They are classified into 2 groups:
* `superimpose_parent_nulls(null_mask, null_count, column)`: Superimposes nulls from somewhere else onto the input column.
* `superimpose_parent_nulls(column_view/table_view)`: Superimposes the nulls of the input column(s) onto their children columns; the input root column(s) are not affected.

That is confusing. They should have different names to reflect their purposes. This PR renames these groups into more meaningful names: `superimpose_nulls` and `push_down_nulls`, respectively. No implementation has been changed.

This also supports #12027.

Authors:
- Nghia Truong (https://github.com/ttnghia)

Approvers:
- David Wendt (https://github.com/davidwendt)
- MithunR (https://github.com/mythrocks)

URL: #12230
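The difference between the two groups can be sketched with a toy model. This is not libcudf code: `ToyColumn`, `toy_push_down_nulls`, and `toy_superimpose_nulls` are hypothetical names, validity is stored as one `bool` per row instead of a bitmask, and every child is assumed to have the same row count as its parent (as with struct children). Pushing the external mask further down into the descendants is also an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy column, not the libcudf type: one validity flag per row
// (true = valid) plus struct-like children with the same number of rows.
struct ToyColumn {
  std::vector<bool> validity;
  std::vector<ToyColumn> children;
};

// Models the `push_down_nulls` group: the column's own nulls are ANDed into
// every descendant. The root column's validity is left untouched.
void toy_push_down_nulls(ToyColumn& col) {
  for (auto& child : col.children) {
    assert(child.validity.size() == col.validity.size());
    for (std::size_t i = 0; i < col.validity.size(); ++i) {
      child.validity[i] = child.validity[i] && col.validity[i];
    }
    toy_push_down_nulls(child);
  }
}

// Models the `superimpose_nulls` group: a mask from somewhere else is ANDed
// into the column itself. Propagating it to the descendants afterwards is an
// assumption of this sketch.
void toy_superimpose_nulls(std::vector<bool> const& mask, ToyColumn& col) {
  assert(mask.size() == col.validity.size());
  for (std::size_t i = 0; i < mask.size(); ++i) {
    col.validity[i] = col.validity[i] && mask[i];
  }
  toy_push_down_nulls(col);
}
```

In this model, `toy_push_down_nulls` never changes the root, while `toy_superimpose_nulls` does. That is the distinction the rename is meant to make visible.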
rapids-bot bot pushed a commit that referenced this issue on Dec 16, 2022
…12239)

The current implementation of `cudf::structs::detail::superimpose_nulls` and `cudf::structs::detail::push_down_nulls` does not do null sanitization. In particular, they only apply the given null mask on the input columns, or push down the parent null mask into the children columns. If there are lists/strings being superimposed by a null bit, they are not sanitized (i.e., they still remain non-empty).

This PR fixes that behavior, sanitizing all non-empty nulls for the output column(s) of these functions. Since there are some changes in the function signatures, this PR is flagged as a breaking change.

No new unit tests are needed because the existing tests can already detect the sanitization issue. They were just using `CUDF_TEST_EXPECT_COLUMNS_EQUIVALENT`, which allows non-empty nulls to be compared as equal to empty nulls. Changing to `CUDF_TEST_EXPECT_COLUMNS_EQUAL` in these existing tests should be sufficient.

Last but not least, the breaking changes in this PR alter the output of the `cudf::make_lists_column` function, which breaks some other tests such as the row bit count tests.

Closes:
* #12027

Depends on:
* #12230

Authors:
- Nghia Truong (https://github.com/ttnghia)

Approvers:
- David Wendt (https://github.com/davidwendt)
- MithunR (https://github.com/mythrocks)

URL: #12239
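To illustrate what sanitizing means here and why an "equivalent" comparison can hide the problem, the following is a minimal sketch on a toy lists column. It does not use libcudf: `ToyListsColumn`, `purge_nonempty_nulls_sketch`, `rows_equivalent`, and `buffers_equal` are hypothetical names, and the two comparison helpers only approximate the spirit of the two test macros.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy lists column, not the libcudf type: row i holds
// values[offsets[i] .. offsets[i + 1]) and is null when validity[i] is false.
struct ToyListsColumn {
  std::vector<int> offsets;
  std::vector<bool> validity;
  std::vector<int> values;
};

// True if any null row still spans a non-empty range of child values.
bool has_nonempty_nulls(ToyListsColumn const& col) {
  for (std::size_t i = 0; i < col.validity.size(); ++i) {
    if (!col.validity[i] && col.offsets[i + 1] > col.offsets[i]) { return true; }
  }
  return false;
}

// Rebuilds offsets and values so that every null row has length zero.
ToyListsColumn purge_nonempty_nulls_sketch(ToyListsColumn const& col) {
  assert(col.offsets.size() == col.validity.size() + 1);
  ToyListsColumn out;
  out.validity = col.validity;
  out.offsets.push_back(0);
  for (std::size_t i = 0; i < col.validity.size(); ++i) {
    if (col.validity[i]) {
      out.values.insert(out.values.end(),
                        col.values.begin() + col.offsets[i],
                        col.values.begin() + col.offsets[i + 1]);
    }
    out.offsets.push_back(static_cast<int>(out.values.size()));
  }
  return out;
}

// Row-wise comparison that ignores whatever a null row still stores.
bool rows_equivalent(ToyListsColumn const& a, ToyListsColumn const& b) {
  if (a.validity != b.validity) { return false; }
  for (std::size_t i = 0; i < a.validity.size(); ++i) {
    if (!a.validity[i]) { continue; }
    std::vector<int> const row_a(a.values.begin() + a.offsets[i],
                                 a.values.begin() + a.offsets[i + 1]);
    std::vector<int> const row_b(b.values.begin() + b.offsets[i],
                                 b.values.begin() + b.offsets[i + 1]);
    if (row_a != row_b) { return false; }
  }
  return true;
}

// Exact comparison of all underlying buffers.
bool buffers_equal(ToyListsColumn const& a, ToyListsColumn const& b) {
  return a.offsets == b.offsets && a.validity == b.validity && a.values == b.values;
}
```

In this model, a column with a non-empty null and its purged counterpart pass `rows_equivalent` but fail `buffers_equal`, which mirrors why switching the existing tests to the stricter macro exposes unsanitized output.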
As the title says, the current implementation of `cudf::structs::detail::superimpose_parent_nulls` only sets the null mask for the children columns. If a child column is a lists/structs/strings/dictionary column, it also needs to be sanitized by `purge_nonempty_nulls`.
Reference:
cudf/cpp/src/structs/utilities.cpp
Lines 214 to 218 in 07eb723
cudf/cpp/src/structs/utilities.cpp
Lines 261 to 262 in 07eb723
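A minimal sketch of how the problem arises, using a toy model rather than libcudf: `ToyListsChild`, `push_down_mask_only`, and `count_nonempty_nulls` are hypothetical names, and validity is stored as one `bool` per row. When only the validity of a lists child is updated, its offsets are left as they were, so a newly nulled row can still point at child elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy lists child of a struct column, not the libcudf type.
struct ToyListsChild {
  std::vector<int> offsets;    // row i spans [offsets[i], offsets[i + 1])
  std::vector<bool> validity;  // true = valid
};

// Models the mask-only behaviour described in the issue: the parent's nulls
// are ANDed into the child's validity, and the offsets are not touched.
void push_down_mask_only(std::vector<bool> const& parent_validity, ToyListsChild& child) {
  assert(parent_validity.size() == child.validity.size());
  for (std::size_t i = 0; i < parent_validity.size(); ++i) {
    child.validity[i] = child.validity[i] && parent_validity[i];
  }
}

// Counts null rows whose offsets still describe a non-empty list.
std::size_t count_nonempty_nulls(ToyListsChild const& child) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < child.validity.size(); ++i) {
    if (!child.validity[i] && child.offsets[i + 1] > child.offsets[i]) { ++count; }
  }
  return count;
}
```

A row that was already empty stays harmless after being nulled. Only rows that had elements become non-empty nulls and need purging.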