-
Notifications
You must be signed in to change notification settings - Fork 916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Define and implement non equi join #3628
[WIP] Define and implement non equi join #3628
Conversation
#2792 is a pretty significant undertaking. I think we should spend some time talking about how it's going to be implemented before we attempt to define an API. |
Codecov Report
@@ Coverage Diff @@
## branch-0.14 #3628 +/- ##
============================================
Coverage 87.39% 87.39%
============================================
Files 49 49
Lines 9380 9380
============================================
Hits 8198 8198
Misses 1182 1182 Continue to review full report at Codecov.
|
cudf::size_type left_column_idx; ///< index of left column to compare | ||
cudf::size_type right_column_idx; ///< index of right column to compare | ||
bool only_left_in_output; ///< if true, only output the left column | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style comment : I really like this style (justification of member names). I wish this was more common :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do specify this in our style guide for enums, but we should add it for struct members (work in progress): https://docs.google.com/document/d/10Ff4VZlSsizHdF5r9Zrw7dIwlPD0yWWSmNJ6qter7r8/edit#heading=h.u9cu9e14hobq
// the existing API for equi join and the new internal mechanism for non-equi join? Perhaps | ||
// I implement this public API toward that purpose and we can decide whether to deprecate the | ||
// old interface? | ||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these functions are supersets of inner/outer join, I would think this would replace their implementations entirely. So their interfaces in this header might still exist, but all they'd be doing is calling into this function with an EQUAL operation. Caveat : I am in no way an expert on database stuff so this comment may be moot :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll have to see how the implementations unfold. I was actually imagining the opposite... that perhaps the interfaces would combine but if the ONLY comparison operator is EQUAL then we would call the existing implementations. It is possible when all is said and done that this implementation would be comparable in performance to the original, but I'm not convinced at this point.
* join_ops: { {GREATER_THAN, 0, 0, false} } | ||
* Result: { a: {2}, b: {1}, c: {5} } | ||
* | ||
* @throws cudf::logic_error if either table is empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like there are some other possibilities here: Out of bounds indices in columns_in_common, out of bounds indices in the join_ops structs.
Can one of the admins verify this patch? |
This PR has been marked rotten due to no recent activity in the past 90d. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. |
This is really stale. Closing, but we can reopen or cherry pick from it if needed. |
Closes #2792
This should be merged after #3224 and #3538
At this point the PR only contains the API definition. I included the entire join.hpp file from PR 3538 for now. Once the dependent PRs are merged I'll merge any final updates.