-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable prefetching in cudf.pandas.install() #16439
Conversation
if (data_ptr == nullptr) { | ||
// Do not call chars_size if the data_ptr is nullptr. | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bug. We can't call chars_size
on string column views that are not fully initialized like this one:
cudf/cpp/src/column/column_factories.cu
Lines 60 to 62 in 8def2ec
// Since we are setting every row to the scalar, the fill() never needs to access | |
// any of the children in the strings column which would otherwise cause an exception. | |
column_view sc{value.type(), size, nullptr, nullptr, 0}; |
Errors from calling chars_size()
looked like this:
terminate called after throwing an instance of 'cudf::logic_error'
what(): CUDF failure at: /home/coder/cudf/cpp/src/strings/strings_column_view.cpp:34: strings column has no children
Aborted (core dumped)
// Don't try to prefetch nullptrs or empty data. Sometimes libcudf has column | ||
// views that use nullptrs with a nonzero size as an optimization. | ||
if (ptr == nullptr) { | ||
if (prefetch_config::instance().debug) { | ||
std::cerr << "Skipping prefetch of nullptr" << std::endl; | ||
} | ||
return cudaSuccess; | ||
} | ||
if (size == 0) { | ||
if (prefetch_config::instance().debug) { | ||
std::cerr << "Skipping prefetch of size 0" << std::endl; | ||
} | ||
return cudaSuccess; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really sure if this is necessary or not, but it seems like it would improve our library safety to avoid calling the cudaMemPrefetchAsync
API unless we know we have a non-null pointer and non-zero size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving C++ changes.
Verified that this works in local testing. |
Description
This PR enables
cudf.pandas
managed memory prefetching incudf.pandas.install()
, to ensure that prefetching is enabled for all methods of enablingcudf.pandas
.I also fixed a bug in libcudf's prefetching logic, where it tried to compute the number of characters in a strings column view even if the string column view's data is
nullptr
. This errors, so we must avoid thechars_size()
call and stop the prefetch attempt early.Checklist