-
Notifications
You must be signed in to change notification settings - Fork 928
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW] Port unary to libcudf++ #3214
Conversation
293c698
to
8826ad8
Compare
Codecov Report
@@ Coverage Diff @@
## branch-0.11 #3214 +/- ##
============================================
Coverage 87.35% 87.35%
============================================
Files 49 49
Lines 9329 9329
============================================
Hits 8149 8149
Misses 1180 1180 Continue to review full report at Codecov.
|
[ISSUE] 2950 Port null_ops.cu
8826ad8
to
14a3f63
Compare
@codereport please avoid force pushes (e.g. use merge rather than rebase) once you have an open PR associated with a feature branch. Github doesn't handle force pushes well. It's most important once people have reviewed the code, because force pushing disassociates comments from the current state of the code. |
Will avoid these in the future! Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like there was some miscommunication and porting these APIs was already done here: #3182
@jrhemstad Unary ops encompasses a lot more than just null_ops! But yes, null_ops is a new feature added by @rgsl888prabhu in #2962 and immediately ported to libcudf in #3182. @codereport I don't think you need to touch null_ops, just everything else in unary.hpp |
@jrhemstad @harrism : @rgsl888prabhu did mention he had code changes for I will focus on the 3 other files. |
null_ops did not exist when I originally created #2950. |
@harrism Sorry Mark! I wasn't clear - I meant that if Ram had have let me know what the PR #'s were I could have checked. I will just make sure in the future to ask for them :) |
I apologize for any inconvenience caused by me. |
Remove port of null_ops.cu It is duplicate work that has been done in 3182
GitHub has a pretty good PR search function. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing actually for ops to review. Maybe from a merge or force push, but approving to prevent blocking the PR.
[ISSUE] 2950 Initial port math_ops.cu & unary_ops.cuh
@codereport just as a general note, I think the old implementation could be vastly improved by replacing the custom kernel with a |
Thanks @jrhemstad! I will definitely do that |
@harrism I am just in the process of adding a couple extra tests, will mark as [REVIEW] once I have added these. |
Change .begin/end to .cbegin/cend
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nearly there. Some test suggestions.
cpp/src/unary/math_ops.cu
Outdated
} | ||
|
||
template <typename T, | ||
typename std::enable_if_t<std::is_same<T, cudf::experimental::bool8>::value>* = nullptr> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about factoring out the casting so that all operators can just call a single function rather than maintaining two copies of every operator? Something that calls the actual op with casts only for types that need it, like this. Could be part of a base class that all device_ops inherit.
template<typename T, typename Op, std::enable_if_t<!std::is_same<T, cudf::experimental::bool8>::value>* = nullptr>
__device__ T normalized_unary_op()(Op op, T data) { return Op(data); }
template<typename T, std::enable_if_t<std::is_same<T, cudf::experimental::bool8>::value>* = nullptr>
__device__ T normalized_unary_op()(T data) { return static_cast<T>(Op(static_cast<float>(data)); }
- update CHANGELOG msg - refactor math_ops SFINAE to reduce code bloat - add some "empty" unit tests - increase test/vector size
cpp/src/unary/math_ops.cu
Outdated
} | ||
|
||
template <typename T, | ||
typename std::enable_if_t<std::is_same<T, cudf::experimental::bool8>::value>* = nullptr> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be resolved in the bool8
type and not externally in functions like this. I'm not sure what the right combinations of implicit/explicit conversion and operators is necessary, but it's not sustainable to have to do things like what was done here in providing two versions of every operator.
cpp/src/unary/math_ops.cu
Outdated
|
||
if (input.size() == 0) return output; | ||
|
||
auto output_view = output->mutable_view(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pattern of passing in the output as a mutable_view prevents any future support of string or other non fixed-width type columns. Granted, most of these unary ops don't make sense on strings. However, I believe these should be modified to return their output rather than pass via a mutable view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dispatcher / launcher is updated to return std::unique_ptr<cudf::column_view>
- additional empty column test - remove stream from public (non-detail) API
- add tests to cover failure cases
[ISSUE] 2950
Port math_ops.cu
Port unary_ops.cuh
null_ops was done in: 3182
cast_ops was done in: 3397