[FEA] Allow input split on CUDF too long exceptions #7869

Open
revans2 opened this issue Mar 9, 2023 · 3 comments
Labels
feature request: New feature or request
reliability: Features to improve reliability or bugs that severely impact the reliability of the plugin

Comments

@revans2
Collaborator

revans2 commented Mar 9, 2023

Is your feature request related to a problem? Please describe.
Once #7866 goes in, we will have the ability to split input data when executing expressions. But one limitation of CUDF is that it uses an int (32 bits) as the index in offsets. This caps the maximum size of any column, including strings and lists/arrays, and it is especially problematic for deeply nested types.
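To make the limit concrete, here is a rough back-of-the-envelope sketch. It assumes offsets are 32-bit signed ints, so the final offset (the total child size) cannot exceed Integer.MAX_VALUE. The class and method names are hypothetical and only for illustration.

```java
class OffsetLimitSketch {
  // Assumption: offsets are 32-bit signed ints, so the last offset
  // (the total child size) can be at most Integer.MAX_VALUE.
  static final long MAX_OFFSET = Integer.MAX_VALUE;

  // True if a column of `rows` rows averaging `avgChildSize` child items per
  // row (bytes for strings, elements for lists) still fits in the offset range.
  static boolean fits(long rows, long avgChildSize) {
    // multiplyExact throws instead of wrapping if the product overflows a long
    return Math.multiplyExact(rows, avgChildSize) <= MAX_OFFSET;
  }
}
```

For example, 100 million strings averaging 20 bytes fit (2,000,000,000 bytes), but the same row count averaging 30 bytes does not (3,000,000,000 bytes), even though the row count itself is far below the limit.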

We should have a way to catch an exception from cudf and split/retry the operation if that exception is one that indicates that the output was too long for CUDF to support. We are likely going to have to make changes to CUDF and to the CUDF JNI APIs so it is easy to tell when this happens.
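A minimal sketch of the catch and split/retry idea, in plain Java. OutputTooLongException is a hypothetical stand-in for whatever exception CUDF/JNI would end up exposing, and the list-based batch and helper names are assumptions for illustration only, not the plugin's actual retry framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical exception: stands in for whatever CUDF/JNI would throw
// when an output column exceeds the size limit.
class OutputTooLongException extends RuntimeException {
  OutputTooLongException(String msg) { super(msg); }
}

class SplitRetrySketch {
  // Runs op on the batch. If the output is reported as too long, splits the
  // input in half and retries each half, collecting partial outputs in order.
  static <T, R> List<R> withSplitRetry(List<T> batch, Function<List<T>, R> op) {
    List<R> results = new ArrayList<>();
    try {
      results.add(op.apply(batch));
    } catch (OutputTooLongException e) {
      if (batch.size() <= 1) {
        throw e; // a single row cannot be split any further
      }
      int mid = batch.size() / 2;
      results.addAll(withSplitRetry(batch.subList(0, mid), op));
      results.addAll(withSplitRetry(batch.subList(mid, batch.size()), op));
    }
    return results;
  }

  // Toy operation: concatenates the rows but refuses outputs over `limit` chars.
  static Function<List<String>, String> concatWithLimit(int limit) {
    return rows -> {
      long total = 0;
      for (String s : rows) {
        total += s.length();
      }
      if (total > limit) {
        throw new OutputTooLongException("output of " + total + " chars exceeds " + limit);
      }
      return String.join("", rows);
    };
  }
}
```

The key point is that the retry only triggers on the dedicated exception type, so any other failure still propagates unchanged.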

@revans2 added the feature request, ? - Needs Triage, and reliability labels on Mar 9, 2023
@ttnghia
Collaborator

ttnghia commented Mar 10, 2023

I think we can implement a specific exception class in the Java JNI layer. Then we can add a dedicated API in JNI to check the column size. Something like:

static void throwIfSizeExceedCudfLimit(...) {
  // Compare as long so the check itself cannot overflow
  if (inputColumn.numRows() > (long) Integer.MAX_VALUE) {
    throw new ColumnSizeExceedCudfLimitException(...);
  }
}

The plugin can then just try+catch that specific exception to split+retry.

That throwIfSizeExceedCudfLimit check would probably be called in the ColumnVector (ColumnView?) constructor, so that every new column is checked.

@revans2
Collaborator Author

revans2 commented Mar 10, 2023

The problem is not with the input, but the output of operations. CUDF is the one that hopefully finds that the output is too large and throws an exception.

https://github.com/rapidsai/cudf/blob/2969b241c0654a11d1a61e29664bcaecd7bc4a15/cpp/include/cudf/strings/detail/strings_children.cuh#L82-L84

But that is not guaranteed in all cases because of overflow. Ideally, when a string or any other column type would exceed the CUDF limits, CUDF would throw an exception class specific to this case, so we could know that it happened and retry the operation with a smaller input to avoid the issue.
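To illustrate why overflow can defeat such a check, here is a small sketch in plain Java. It is not the CUDF code linked above; the class and method names are hypothetical. It contrasts a size check done on a 32-bit running total with one done on a 64-bit total.

```java
class OffsetOverflowSketch {
  // Naive: accumulate in a 32-bit int and treat a negative total as "too long".
  // A sum that wraps all the way around to a non-negative value goes unnoticed.
  static boolean naiveTooLong(int[] rowSizes) {
    int total = 0;
    for (int size : rowSizes) {
      total += size; // wraps silently on overflow
    }
    return total < 0;
  }

  // Safer: accumulate in a 64-bit long and compare against the 32-bit limit.
  static boolean checkedTooLong(int[] rowSizes) {
    long total = 0;
    for (int size : rowSizes) {
      total += size;
    }
    return total > Integer.MAX_VALUE;
  }
}
```

With four rows of 2^30 bytes each, the 32-bit total wraps around to exactly 0, so the naive check reports nothing wrong, while the 64-bit check correctly flags the output as too long.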

@ttnghia
Collaborator

ttnghia commented Mar 10, 2023

I see. Recently cudf introduced several new exception classes (rapidsai/cudf#12426) to throw in case of an invalid input type. It should be reasonable to add a new exception type for our specific need.

I've filed a related issue: rapidsai/cudf#12925.

@mattahrens removed the ? - Needs Triage label on Mar 21, 2023