-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix null check when comparing structs in min
and max
reduction/groupby operations
#9994
Conversation
Reference: NVIDIA/spark-rapids#4434 (that work is on spark-rapids plugin implementing |
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me outside of the copyrights.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - @ttnghia can you please update the PR description to cross-reference the issues, PRs, and/or comments that led to discovering this bug and its fix? You referenced NVIDIA/spark-rapids#4434 but I don't immediately see the relationship to this PR.
edit: I think this comment is the one that explains it. This PR fixes the difference in CPU/GPU behavior from this thread: #8974 (comment)
@gpucibot merge |
min
and max
reduction/groupby operationsmin
and max
reduction/groupby operations
When comparing structs, we need to flatten its view into a
table_view
and compare the table's rows. Nulls check for the comparison needs to be done by checking nulls in the input structs column at all levels.This PR fixes a bug that checks for nulls only at the top level. Unit tests designed specifically to detect this bug have also been added.