[FEA] Allow input split on CUDF too long exceptions #7869
Labels
feature request
New feature or request
reliability
Features to improve reliability or bugs that severely impact the reliability of the plugin
Is your feature request related to a problem? Please describe.
Once #7866 goes in, we will have the ability to split input data when executing expressions. But one limitation of CUDF is that it uses a 32-bit int as the index in offsets. This caps the maximum size of any column, including strings and lists/arrays, and it is especially problematic for deeply nested types.
We should have a way to catch an exception from CUDF and split/retry the operation if that exception indicates that the output was too long for CUDF to support. We will likely have to change CUDF and the CUDF JNI APIs so that it is easy to tell when this happens.
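To make the request concrete, here is a rough sketch of the split/retry shape being asked for. Everything in it is hypothetical: `ColumnTooLongException` stands in for whatever dedicated exception CUDF and the JNI layer might eventually expose, `withSplitRetry` is not an existing plugin API, and the toy `doubleEach` operation with a small `limit` only simulates an output that outgrows the int offset range. Plain Java lists are used in place of real columnar batches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class SplitRetrySketch {
    /** Hypothetical stand-in for a dedicated "output too long" exception; not a real CUDF class. */
    static final class ColumnTooLongException extends RuntimeException {
        ColumnTooLongException(String msg) { super(msg); }
    }

    /**
     * Runs op on the whole input. If op reports that its output is too long,
     * splits the input in half by rows and retries each half recursively.
     * Returns one result per input piece that finally succeeded, in input order.
     */
    static <T, R> List<R> withSplitRetry(List<T> rows, Function<List<T>, R> op) {
        try {
            List<R> out = new ArrayList<>();
            out.add(op.apply(rows));
            return out;
        } catch (ColumnTooLongException e) {
            if (rows.size() <= 1) {
                throw e; // a single row cannot be split any further
            }
            int mid = rows.size() / 2;
            List<R> out = new ArrayList<>(withSplitRetry(rows.subList(0, mid), op));
            out.addAll(withSplitRetry(rows.subList(mid, rows.size()), op));
            return out;
        }
    }

    /**
     * Toy operation: doubles every string. It fails when the total output length
     * exceeds limit, which plays the role of the int offset ceiling here.
     */
    static List<String> doubleEach(List<String> rows, long limit) {
        long total = 0; // accumulate in a long so the check itself cannot overflow
        for (String s : rows) {
            total += 2L * s.length();
        }
        if (total > limit) {
            throw new ColumnTooLongException(
                "output of " + total + " chars exceeds limit " + limit);
        }
        List<String> out = new ArrayList<>();
        for (String s : rows) {
            out.add(s + s);
        }
        return out;
    }
}
```

With a limit of 10, the input `["aaaa", "bbbb", "cc", "d"]` fails as a whole, fails again as `["aaaa", "bbbb"]`, and ends up as three output batches. The key requirement from this issue is the `catch` clause: the retry only works if the "too long" condition can be told apart from every other failure.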