[TASK][JNI] Investigate train of `null_count` after `explode` #11923
Comments
I'd like to cross-reference this issue with #11968. It's likely that the |
I'm not following. cudf/cpp/include/cudf/lists/explode.hpp Line 72 in 7d173c9
Are you then constructing a If so, then yeah, you're going to have a problem with computing the null count of each of those To make that efficient, you'd have to do what we do in cudf/cpp/include/cudf/detail/null_mask.hpp Lines 186 to 204 in 9c06330
While analyzing an nsys trace for a Spark job with deeply nested tables, we see an `explode` kernel call that is followed by a train of `null_count` calls, which ends in `is_valid`.

After we call `cudf::explode` we build up a table and construct Java `ColumnVector` objects. I think the construction of these objects is triggering it.

This task is to confirm that the columns that are missing a null count are coming from the `explode` kernels. If they are coming from `explode`, it would be great if `explode` could compute the null count as part of that kernel.

In the attached screenshot, it is the ~20ms at the end after `explode`.
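
To make the suspected mechanism concrete, here is a minimal, self-contained sketch of how a lazily computed null count could produce one counting pass per nested column. It is a toy model, not libcudf or cudf-java code: `ToyColumn`, `make_column`, `count_nulls`, `materialize_null_counts`, `g_count_passes`, and the `UNKNOWN_NULL_COUNT` sentinel as defined here are all hypothetical names introduced for illustration, and the pass counter only stands in for the separate kernel launches seen in the trace.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical sentinel meaning "null count not computed yet" (toy model only).
constexpr int UNKNOWN_NULL_COUNT = -1;

// Counts how many separate counting passes ran; stands in for kernel launches.
int g_count_passes = 0;

// One full pass over a validity mask, analogous to one null-count launch.
int count_nulls(const std::vector<bool>& validity) {
  ++g_count_passes;
  int nulls = 0;
  for (bool valid : validity) {
    if (!valid) { ++nulls; }
  }
  return nulls;
}

// Toy nested column: a validity mask, a cached null count, and child columns.
struct ToyColumn {
  std::vector<bool> validity;
  mutable int cached_null_count;
  std::vector<ToyColumn> children;

  // Lazily computes the null count the first time it is requested.
  int null_count() const {
    if (cached_null_count == UNKNOWN_NULL_COUNT) {
      cached_null_count = count_nulls(validity);
    }
    return cached_null_count;
  }
};

ToyColumn make_column(std::vector<bool> validity, int null_count,
                      std::vector<ToyColumn> children = {}) {
  assert(null_count == UNKNOWN_NULL_COUNT || null_count >= 0);
  return ToyColumn{std::move(validity), null_count, std::move(children)};
}

// Mimics a wrapper that asks every column and nested child for its null count
// while it is being constructed. Returns the sum of all null counts.
int materialize_null_counts(const ToyColumn& col) {
  int total = col.null_count();
  for (const ToyColumn& child : col.children) {
    total += materialize_null_counts(child);
  }
  return total;
}
```

In this model, a parent with two children that all carry the sentinel costs three counting passes on first access and none afterwards, while the same columns built with known counts cost none at all. That is the behavior this task asks to confirm for the output of `explode`: if the producer supplies the counts, the consumer has nothing left to compute.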