Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] result indices in group_argmin was not initialized to -1 as comment says #16542

Closed
thirtiseven opened this issue Aug 13, 2024 · 0 comments · Fixed by #17591
Closed

[BUG] result indices in group_argmin was not initialized to -1 as comment says #16542

thirtiseven opened this issue Aug 13, 2024 · 0 comments · Fixed by #17591
Assignees
Labels
bug Something isn't working

Comments

@thirtiseven
Copy link
Contributor

thirtiseven commented Aug 13, 2024

Describe the bug
In sorted-based group_argmin 's comment it says that the indices was initialized to ARGMIN_SENTINEL but it was not in the current code.

// The functor returns the index of minimum in the sorted values.
// We need the index of minimum in the original unsorted values.
// So use indices to gather the sort order used to sort `values`.
// The values in data buffer of indices corresponding to null values was
// initialized to ARGMIN_SENTINEL. Using gather_if.
// This can't use gather because nulls in gathered column will not store ARGMIN_SENTINEL.

Steps/Code to reproduce bug
Current code will contains null value with mask set and value of 0 (instead of -1). So the following gather_if seems to be unnecessary because the predicate will always be true:

[] __device__(auto i) { return (i != cudf::detail::ARGMIN_SENTINEL); });

It will also affect the logic of min which uses argmin

// We make a view of ARG(MIN/MAX) result without a null mask and gather
// using this map. The values in data buffer of ARG(MIN/MAX) result
// corresponding to null values was initialized to ARG(MIN/MAX)_SENTINEL
// which is an out of bounds index value (-1) and causes the gathered
// value to be null.
column_view null_removed_map(

It didn't show any correctness issues under my tests and I don't think it did the wrong thing but the comment is confusing.

Expected behavior
We can remove the comment and update the related code, like using gather instead of gather_if, if the initialization is really not needed. Or if it's a bug we need to fix it.

Additional context
group_argmax also has the same issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants