-
Notifications
You must be signed in to change notification settings - Fork 920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Review] replace.hpp strings functionality #3469
Conversation
Also, the strings |
Codecov Report
@@ Coverage Diff @@
## branch-0.11 #3469 +/- ##
==============================================
Coverage ? 87.38%
==============================================
Files ? 49
Lines ? 9376
Branches ? 0
==============================================
Hits ? 8193
Misses ? 1183
Partials ? 0 Continue to review full report at Codecov.
|
cudf::column_view const& values_to_replace, | ||
cudf::column_view const& replacement_values, | ||
rmm::mr::device_memory_resource* mr, | ||
cudaStream_t stream = 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you are still working on this but I think it would be good to refactor this into a strings::detail
function and source/header file like we do for gather, scatter, copy, fill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that this would get much re-use if factored out that way, am I missing a use case for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just trying to keep all the code that builds strings column in some central location in case we need to change the children types/layout. I suppose I can split the code out in a later PR.
cpp/src/replace/replace.cu
Outdated
CUDA_TRY(cudaMemcpy(&size, offsets_data + offsets_view.size() - 1, sizeof(int32_t), cudaMemcpyDefault)); | ||
|
||
// Allocate chars array and output null mask | ||
std::unique_ptr<cudf::column> output_chars = cudf::make_numeric_column(cudf::data_type(cudf::type_id::INT8), size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the correct way to create the chars column. There are some missing edge cases. Please use the create_chars_child_column
method in cpp/include/cudf/strings/detail/utilities.hpp
if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to be more careful and consistent with stream semantics. The kernels are also unnecessarily complex and branchy.
cpp/src/replace/replace.cu
Outdated
output_is_valid = replacement.is_valid_nocheck(i); | ||
} | ||
|
||
if (phase == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplify lines 581-617 to something like this, it's a lot shorter and clearer.
cudf::string_view out;
if (input_is_valid) {
out = input.element<cudf::string_view>(i);
} else if (output_is_valid) {
out = replacement.element<cudf::string_view>(i);
}
bool nonzero_output = (input_is_valid || output_is_valid);
if (phase == 0) {
offsets[i] = nonzero_output ? out.size_bytes() : 0;
} else if (phase == 1) {
if (nonzero_output)
std::memcpy(chars + offsets[i], out.data(), out.size_bytes())
uint32_t bitmask = __ballot_sync(active_mask, output_is_valid);
if (0 == lane_id) {
output_valid[cudf::word_index(i)] = bitmask;
valid_sum += __popc(bitmask);
}
}
Note I think you can also replace that out.size_bytes()
in phase 1 with offsets[i+1]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
cpp/src/replace/replace.cu
Outdated
offsets_view.data<cudf::size_type>(), | ||
nullptr, | ||
valid_count); | ||
cudaStreamSynchronize(stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to synchronize here.
cudaStreamSynchronize(stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
cpp/src/replace/replace.cu
Outdated
offsets_data + offsets_view.size(), | ||
offsets_data); | ||
int32_t size; | ||
CUDA_TRY(cudaMemcpy(&size, offsets_data + offsets_view.size() - 1, sizeof(int32_t), cudaMemcpyDefault)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CUDA_TRY(cudaMemcpy(&size, offsets_data + offsets_view.size() - 1, sizeof(int32_t), cudaMemcpyDefault)); | |
CUDA_TRY(cudaMemcpyAsync(&size, offsets_data + offsets_view.size() - 1, sizeof(int32_t), cudaMemcpyDefault, stream)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
cpp/src/replace/replace.cu
Outdated
offsets_view.data<cudf::size_type>(), | ||
output_chars_view.data<char>(), | ||
valid_count); | ||
cudaStreamSynchronize(stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cudaStreamSynchronize(stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
cpp/src/replace/replace.cu
Outdated
bool const input_is_valid = input.is_valid_nocheck(i); | ||
output_is_valid = input_is_valid; | ||
bitmask = __ballot_sync(active_mask, input_is_valid); | ||
if (input_is_valid) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a ton of duplication in this complex chain of nested if/else/elseifs. Factor out the calculation of output and memcpy to reduce duplication. See my comments on the other kernel for similar guidance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
cpp/src/replace/replace.cu
Outdated
cudf::column_device_view& input, | ||
cudf::column_device_view& values_to_replace, | ||
cudf::column_device_view& replacement_values) { | ||
cudf::string_view inputString = input.element<cudf::string_view>(idx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cudf::string_view inputString = input.element<cudf::string_view>(idx); | |
cudf::string_view input_string = input.element<cudf::string_view>(idx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
cpp/src/replace/replace.cu
Outdated
cudf::string_view inputString = input.element<cudf::string_view>(idx); | ||
int match = -1; | ||
for (int i = 0; i < values_to_replace.size(); i++) { | ||
cudf::string_view valueString = values_to_replace.element<cudf::string_view>(i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cudf::string_view valueString = values_to_replace.element<cudf::string_view>(i); | |
cudf::string_view value_string = values_to_replace.element<cudf::string_view>(i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
cpp/src/replace/replace.cu
Outdated
|
||
// Allocate chars array and output null mask | ||
std::unique_ptr<cudf::column> output_chars = | ||
cudf::strings::detail::create_chars_child_column(input_col.size(), 0, size, mr, stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The null-count parameter here must be accurate for the output strings column and not the chars column in order for this to work properly. This may mean you need to compute the nulls first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nearly there.
bool input_is_valid = true; | ||
if (input_has_nulls) | ||
input_is_valid = input.is_valid_nocheck(i); | ||
if (input_is_valid){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion, delete lines 118-120 and
if (input_is_valid){ | |
if (input_has_nulls && input.is_valid_nocheck(i)){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't take care of the case where input_has_nulls is false, I don't think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. This would, I think.
if (input_is_valid){ | |
if (!input_has_nulls || input.is_valid_nocheck(i)){ |
Co-Authored-By: Mark Harris <[email protected]>
Co-Authored-By: Mark Harris <[email protected]>
Co-Authored-By: Mark Harris <[email protected]>
Co-Authored-By: Mark Harris <[email protected]>
Co-Authored-By: Mark Harris <[email protected]>
rerun tests |
} | ||
return output; | ||
} | ||
|
||
template<typename col_type, | ||
std::enable_if_t<not cudf::is_fixed_width<col_type>()>* = nullptr> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strings may currently be the only non fixed-width type, but that won't be true in the future. This code path should only be enabled from string_view
and not for all non fixed-width types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated all of these to have an explicit specialization for string_view and the general non-fixed-width type case will throw an exception.
cpp/src/replace/replace.cu
Outdated
auto device_indices = cudf::mutable_column_device_view::create(indices_view); | ||
|
||
int valid_chars = cudf::bitmask_allocation_size_bytes(input_col.size()); | ||
rmm::device_buffer valid_bits(valid_chars, stream, mr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use create_null_mask
instead https://github.com/rapidsai/cudf/blob/branch-0.11/cpp/include/cudf/null_mask.hpp#L78
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
cpp/src/replace/replace.cu
Outdated
// Compute the offsets from the sizes | ||
int32_t * offsets_data = offsets_view.data<int32_t>(); | ||
thrust::exclusive_scan(rmm::exec_policy(stream)->on(stream), | ||
offsets_data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
offsets_data, | |
offsets_view.begin<int32_t>(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
cpp/src/replace/replace.cu
Outdated
int32_t * offsets_data = offsets_view.data<int32_t>(); | ||
thrust::exclusive_scan(rmm::exec_policy(stream)->on(stream), | ||
offsets_data, | ||
offsets_data + offsets_view.size(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
offsets_data + offsets_view.size(), | |
offsets_view.end<int32_t>(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
cpp/src/replace/replace.cu
Outdated
thrust::exclusive_scan(rmm::exec_policy(stream)->on(stream), | ||
offsets_data, | ||
offsets_data + offsets_view.size(), | ||
offsets_data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
offsets_data); | |
offsets_view.begin<int32_t>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -87,6 +90,108 @@ __device__ auto get_new_value(cudf::size_type idx, | |||
return thrust::make_pair(new_value, output_is_valid); | |||
} | |||
|
|||
__device__ int get_new_string_value(cudf::size_type idx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_new_value
should be updated and then there is no need for two different functions. It should look like:
template <typename Element, bool replacement_has_nulls>
__device__ auto get_new_value(size_type i, column_device_view input, cudf::column_device_view values_to_replace, column_device_view replacement_values){
auto found = thrust::find(thrust::seq, values_to_replace.begin<Element>(), values_to_replace.end<Element>(), input.element<Element>());
Element new_value;
bool output_is_valid{true};
if(found != values_to_replace.end<Element>()){
auto d = thrust::distance(values_to_replace.begin<Element>(), found);
new_value = replacement_values.element<Element>(d);
if(replacement_has_nulls){
output_is_valid = replacement_values.is_valid(d);
}
} else {
new_value = input.element<Element>(i);
}
return thrust::make_pair(new_value, output_is_valid);
}
Otherwise separate function is just extra baggage that needs to be maintained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for having two functions is that the one for strings returns the index of the found value, rather than the value itself. This prevents the second pass kernel from needing to do a search again.
@jrhemstad one more time... |
Fixes #2944 |
This depends on PR 3187.
This is a work in progress PR and is not yet complete.
This is adding string support to the functions find_and_replace_all(), and replace_nulls() defined in replace.hpp.