[BUG] cudf::hash_join accepts null equality parameter at probe time #9155

Closed
jlowe opened this issue Aug 31, 2021 · 6 comments · Fixed by #10260
Labels
bug Something isn't working libcudf Affects libcudf (C++/CUDA) code.

Comments

@jlowe
Member

jlowe commented Aug 31, 2021

When constructing a cudf::hash_join instance, the caller must specify whether nulls compare as equal, because that choice is baked into the hash table that gets built. Later, the probe-side join methods accept the null equality setting again, but the join only produces correct results if that value matches the one given at construction.

cudf::hash_join instances should remember the setting they were constructed with and reuse it when the probing methods are called later. That would prevent caller bugs where the values passed at construction and at probe time differ, which can lead to data corruption.
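To illustrate the requested behavior, here is a minimal host-side sketch. It is not the libcudf implementation: `toy_hash_join`, `key_column`, `index_pairs`, and the local `null_equality` enum are hypothetical stand-ins invented for this example, and the real class operates on GPU tables. The only point is that the null policy is stored at construction, so the probe method has no parameter that could disagree with it.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the libcudf types.
enum class null_equality { EQUAL, UNEQUAL };
using key_column  = std::vector<std::optional<int>>;
using index_pairs = std::vector<std::pair<std::size_t, std::size_t>>;  // (probe row, build row)

class toy_hash_join {
 public:
  // The null policy is fixed here, at the moment the build side is hashed.
  toy_hash_join(key_column const& build, null_equality compare_nulls)
    : compare_nulls_{compare_nulls}
  {
    for (std::size_t i = 0; i < build.size(); ++i) {
      if (build[i].has_value()) {
        rows_by_key_[*build[i]].push_back(i);
      } else if (compare_nulls_ == null_equality::EQUAL) {
        null_rows_.push_back(i);
      }
      // With UNEQUAL, null build rows can never match, so they are not stored.
    }
  }

  // No null_equality parameter: probing reuses the build-time setting.
  index_pairs inner_join(key_column const& probe) const
  {
    index_pairs out;
    for (std::size_t p = 0; p < probe.size(); ++p) {
      if (probe[p].has_value()) {
        auto it = rows_by_key_.find(*probe[p]);
        if (it == rows_by_key_.end()) { continue; }
        for (auto b : it->second) { out.emplace_back(p, b); }
      } else if (compare_nulls_ == null_equality::EQUAL) {
        for (auto b : null_rows_) { out.emplace_back(p, b); }
      }
    }
    return out;
  }

 private:
  null_equality compare_nulls_;
  std::unordered_map<int, std::vector<std::size_t>> rows_by_key_;
  std::vector<std::size_t> null_rows_;
};

// Usage: the same probe gives different results depending only on the
// setting chosen when the join object was built.
inline void example()
{
  key_column build{1, std::nullopt, 2};
  key_column probe{std::nullopt, 2};
  assert(toy_hash_join(build, null_equality::EQUAL).inner_join(probe).size() == 2);
  assert(toy_hash_join(build, null_equality::UNEQUAL).inner_join(probe).size() == 1);
}
```

Because the setting lives in the object, a caller cannot build with one policy and probe with another; the mismatch described above is ruled out by the interface rather than by convention.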

@jlowe jlowe added bug Something isn't working Needs Triage Need team to review and classify libcudf Affects libcudf (C++/CUDA) code. labels Aug 31, 2021
@jrhemstad jrhemstad removed the Needs Triage Need team to review and classify label Sep 2, 2021
@github-actions

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@jlowe
Member Author

jlowe commented Nov 15, 2021

I would still like this to be fixed. Low priority.

@jlowe
Member Author

jlowe commented Dec 15, 2021

Still relevant, low priority.

@jlowe
Member Author

jlowe commented Jan 20, 2022

Still relevant, low priority.

rapids-bot bot pushed a commit that referenced this issue Feb 10, 2022
Closes #9155

This PR removes the probe-time `cudf::null_equality` parameter in `cudf::hash_join` to avoid potential mismatching bugs between building and probing a hash join object.

Authors:
  - Yunsong Wang (https://github.com/PointKernel)
  - Jason Lowe (https://github.com/jlowe)

Approvers:
  - Conor Hoekstra (https://github.com/codereport)
  - Robert (Bobby) Evans (https://github.com/revans2)

URL: #10260