
Generalized null support in user defined functions #8213

Merged: 148 commits merged into rapidsai:branch-21.08 on Jul 16, 2021

Conversation

@brandon-b-miller (Contributor) commented on May 11, 2021

Draft

  • Adds DataFrame.apply similar to Pandas
  • Adds support for automatically including the validity of the operand columns in the computation of the result
  • Adds support for referring to cudf.NA explicitly inside user-defined functions

This PR creates the following API:

@nulludf
def func_gdf(x, y):
    if x is cudf.NA:
        return y
    else:
        return x + y


gdf = cudf.DataFrame({
    'a': [1, None, 3, None],
    'b': [4, 5, None, None]
})
gdf.apply(lambda row: func_gdf(row['a'], row['b']), axis=1)

# 0       5
# 1       5
# 2    <NA>
# 3    <NA>
# dtype: int64

python/cudf/cudf/core/dataframe.py: outdated review comments (resolved)
@@ -53,6 +53,12 @@ std::unique_ptr<column> transform(
bool is_ptx,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

std::unique_ptr<column> generalized_masked_op(
table_view data_view,
Contributor

Typically we pass table_view const&, since copying a table_view may involve recursively copying its child column_views, which is more expensive.
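
A minimal sketch of the suggested change (only the first parameter is taken from the diff; the remaining parameters are hypothetical, for illustration):

std::unique_ptr<column> generalized_masked_op(
  table_view const& data_view,  // by const reference, so child column_views are not copied
  std::string const& udf,       // hypothetical parameter, not from the diff
  data_type output_type,        // hypothetical parameter, not from the diff
  rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());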

Contributor

This may need to be modified to use table_view const& (not just here, but in other places too).

* limitations under the License.
*/

// Include Jitify's cstddef header first
Contributor

Why? The convention in cudf is to include from "near" to "far": you include <transform/...> first, then <cudf/...>, then <cuda/...>, and finally the std headers.

Contributor Author

I think the problem here is that when this file is runtime-compiled later, transform/jit/operation-udf.hpp gets string-replaced by an actual function definition that might contain types from the std headers. So I think at least the order of those two headers is critical.
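
A rough illustration of that constraint (the GENERIC_OP body below is a made-up placeholder, not the actual generated code): after the runtime string replacement, the JIT-compiled source effectively reads

#include <cstddef>  // Jitify's cstddef header, included first

// originally: #include <transform/jit/operation-udf.hpp>
// at runtime that include is replaced by the generated UDF, e.g. something like
__device__ void GENERIC_OP(std::size_t size, double* out, double x, bool x_valid)
{
  // ... uses std::size_t, which is why <cstddef> must already have been included
}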

@@ -15,6 +15,7 @@
*/

#include <jit_preprocessed_files/transform/jit/kernel.cu.jit.hpp>
#include <jit_preprocessed_files/transform/jit/masked_udf_kernel.cu.jit.hpp>
Contributor

I believe that jit headers should be included after cudf headers.

Contributor Author

👍

Comment on lines 76 to 83
template_types.reserve(data_view.num_columns() + 1);

template_types.push_back(cudf::jit::get_type_name(outcol_view.type()));
for (auto const& col : data_view) {
template_types.push_back(cudf::jit::get_type_name(col.type()) + "*");
template_types.push_back(mskptr_type);
template_types.push_back(offset_type);
}
Contributor

Wait, I see that push_back is called 3*num_cols() + 1 times here, not the num_cols() + 1 that is reserved.

Contributor Author

Nice catch, this was unsafe. Fixed.
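
For concreteness, a minimal sketch of the kind of fix implied here (assuming the reservation count was the problem): reserve one slot for the output type plus three per input column, matching the push_back calls in the loop quoted above.

template_types.reserve(3 * data_view.num_columns() + 1);  // output type + (data*, mask*, offset) per column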

cpp/src/transform/transform.cpp: outdated review comment (resolved)
Comment on lines 133 to 139
rmm::cuda_stream_view generic_stream;
cudf::jit::get_program_cache(*transform_jit_masked_udf_kernel_cu_jit)
.get_kernel(generic_kernel_name,
{},
{{"transform/jit/operation-udf.hpp", generic_cuda_source}},
{"-arch=sm_."}) //
->configure_1d_max_occupancy(0, 0, 0, generic_stream.value()) //
Contributor

Why is generic_stream used without initialization? Are you using the default stream? If so, use the default stream directly.

Contributor Author

This should be fixed.

Comment on lines 124 to 131
data_ptrs.push_back(cudf::jit::get_data_ptr(col));
mask_ptrs.push_back(col.null_mask());
offsets.push_back(col.offset());

kernel_args.push_back(&data_ptrs[col_idx]);
kernel_args.push_back(&mask_ptrs[col_idx]);
kernel_args.push_back(&offsets[col_idx]);
}
Contributor

Can we use some kind of std::transform instead? Using raw loops is discouraged.

Contributor

This is difficult due to the 1->3 transform going on here. I kept trying to do the same, but couldn't get anything that was cleaner.

@ttnghia (Contributor) commented on Jul 14, 2021

How about using thrust::zip_iterator (host callable)? You can write out 3 values at the same time.

Contributor Author

I managed to use zip_iterator to replace about half the logic here. There is one loop, though, that I did not see how to simplify; open to suggestions here.
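
For reference, a rough host-side sketch of the zip_iterator half (based on the loop quoted above, not the exact code merged in this PR):

#include <thrust/iterator/zip_iterator.h>
#include <thrust/tuple.h>

#include <algorithm>
#include <vector>

auto const num_columns = data_view.num_columns();
std::vector<void const*> data_ptrs(num_columns);
std::vector<cudf::bitmask_type const*> mask_ptrs(num_columns);
std::vector<cudf::size_type> offsets(num_columns);

// A single pass over the table writes all three per-column values through a
// zipped output iterator. The remaining loop, which pushes the addresses
// &data_ptrs[i], &mask_ptrs[i], &offsets[i] into kernel_args, stays a raw loop.
auto outputs = thrust::make_zip_iterator(
  thrust::make_tuple(data_ptrs.begin(), mask_ptrs.begin(), offsets.begin()));

std::transform(data_view.begin(), data_view.end(), outputs, [](auto const& col) {
  return thrust::make_tuple(cudf::jit::get_data_ptr(col), col.null_mask(), col.offset());
});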

cpp/src/transform/transform.cpp: review comment (resolved)
mutable_column_view outmsk_view,
rmm::mr::device_memory_resource* mr)
{
std::vector<std::string> template_types = make_template_types(outcol_view, data_view);
@ttnghia (Contributor) commented on Jul 13, 2021

One more thing I want to note: you can use auto const to declare almost everything, instead of writing out lengthy types like this. E.g.:

auto const template_types =...

Contributor Author

Fixed.

@brandon-b-miller (Contributor Author)

I think this is ready for another look cc @ttnghia


cpp/src/transform/transform.cpp: outdated review comments (resolved)
@brandon-b-miller (Contributor Author)

rerun tests

@brandon-b-miller added the "5 - Ready to Merge" label and removed the "3 - Ready for Review" label on Jul 16, 2021
@brandon-b-miller (Contributor Author)

@gpucibot merge

@rapids-bot (bot) merged commit 7ff4724 into rapidsai:branch-21.08 on Jul 16, 2021
rapids-bot (bot) pushed a commit that referenced this pull request on Dec 1, 2021
This PR removes the C++ side of the original masked UDF code introduced in #8213. Those kernels had some limitations and are now superseded by the numba-generated versions we moved to in #9174. As far as I can tell, cuDF Python was the only thing consuming this API for the short time it existed. However, I am marking this breaking just in case.

Authors:
  - https://github.com/brandon-b-miller

Approvers:
  - Mark Harris (https://github.com/harrism)
  - David Wendt (https://github.com/davidwendt)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #9792
Labels
  • 4 - Needs Review: Waiting for reviewer to review or respond
  • 5 - Ready to Merge: Testing and reviews complete, ready to merge
  • CMake: CMake build issue
  • feature request: New feature or request
  • libcudf: Affects libcudf (C++/CUDA) code.
  • non-breaking: Non-breaking change
  • Python: Affects Python cuDF API.
9 participants