-
Notifications
You must be signed in to change notification settings - Fork 919
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] Diversify libcudf's exceptions to simplify exception handling #10200
Comments
Note that RMM's version already has this: https://github.com/rapidsai/rmm/blob/0515ca47bc6abdc4619a099976bdbf059cbde624/include/rmm/detail/error.hpp#L87-L120
I totally support this so long as we have a precise list of the exact exceptions you'd want libcudf to throw and precisely when you'd want them to be thrown.
I do not support this. The
This is the correct approach. If you want to know about a situation that occurred that doesn't correspond to one of the exceptions Cython already knows about, then libcudf should introduce a custom exception type that cudf knows to catch that type. |
This issue has been labeled |
I have been looking into both C++ and Python exceptions. Here is a brief summary of the investigation. C++ Exceptions: https://en.cppreference.com/w/cpp/error/exception There are 26 C++ exceptions and 42 Python exception (+ or - depending on how you count them). Before looking at the Cython mappings, I managed to line up 7 of their 11 mappings. I missed both These are the recommended Cython mappings: And this is an "exploded" version to highlight the missing C++ exceptions:
Probably the place to start is just by tacking a single pairing, such as |
I think the place to start is to find where C++ exception |
One consideration that recently popped up on the Spark side is differentiating errors that are recoverable by reattempting the task vs. errors that are fatal to the process. For example, CUDA's Failing that, it would be nice to have an exception specific to CUDA errors that includes the CUDA error code so the application can discern these "fatal" errors when appropriate. Today CUDA errors are encoded in the exception string rather than accessible directly within the exception. |
This would be nice, but it is harder than it sounds because CUDA doesn't really document all that well which errors are like this and I believe it can depend on the context in which they occurred.
Definitely. I could have sworn we already did that, but I guess not. |
Hm, one way we could hack around this is to call A heavier hammer would be to do a So like this:
|
My observations of this kind of parsing is why I suggested starting with the |
We do have |
I filed #10553 separately as it is orthogonal to the rest of the exception improvements. |
This issue has been labeled |
This issue has been labeled |
We should not be encouraging users to rely specific error messages. Anywhere that is currently doing so is likely an indication that libcudf should be throwing a more specific type of exception instead of just a `cudf::logic_error`. This PR removes the testing utilities that were previously used for this purpose and reworks the relevant tests. Related to #10200. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Nghia Truong (https://github.com/ttnghia) - David Wendt (https://github.com/davidwendt) - Bradley Dice (https://github.com/bdice) - Karthikeyan (https://github.com/karthikeyann) URL: #12076
This change enables libcudf APIs throwing exceptions other than `cudf::logic_error`. This code was adapted from rmm. This contributes to #10200. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Yunsong Wang (https://github.com/PointKernel) - Lawrence Mitchell (https://github.com/wence-) URL: #12078
This PR removes support for checking exception messages from `assert_exceptions_equal`. See #12076 for the same change made in C++, #10200 for more context, and #7917 for the change in cudf Python's developer documentation where we officially changed our policy to not consider exception payloads part of the stable API. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Ashwin Srinath (https://github.com/shwina) URL: #12424
Partial progress towards rapidsai#10200, this will enable catching and re-raising a useful overflow message in to_csv if the requested dataframe write cannot be converted to a single string column without overflow.
Since writing to CSV files is implemented by converting all columns in a dataframe to strings, and then concatenating those columns, when we attempt to write a large dataframe to CSV without specifying a chunk size, we can easily overflow the maximum column size. Currently the error message is rather inscrutable: that the requested size of a string column exceeds the column size limit. To help the user, catch this error and provide a useful error message that points them towards setting the `chunksize` argument. So that we don't produce false positive advice, tighten the scope by only catching `OverflowError`, to do this, make partial progress towards resolving #10200 by throwing `std::overflow_error` when checking for overflow of string column lengths. Closes #12690. Authors: - Lawrence Mitchell (https://github.com/wence-) - Karthikeyan (https://github.com/karthikeyann) Approvers: - David Wendt (https://github.com/davidwendt) - Ashwin Srinath (https://github.com/shwina) - Nghia Truong (https://github.com/ttnghia) - Karthikeyan (https://github.com/karthikeyann) URL: #12705
Is your feature request related to a problem? Please describe.
Currently the vast majority of exceptions thrown by libcudf are
cudf::logic_error
due to the use of theCUDF_EXPECTS
andCUDF_FAIL
macros. These are in large part used for input validation. Aside fromcudf::logic_error
, the only other exceptions that are thrown with any frequency arestd::bad_alloc
orcuda_error
. While the exception payloads are typically sufficiently informative for direct users of the C++ API, they are less useful for users of higher-level language libraries that rely on libcudf. In particular, many cuDF Python functions rely on parsing the exception payloads to translate C++ exceptions into suitable exception types on the Python side.Describe the solution you'd like
It would be helpful for libcudf to make use of a wider range of standard exceptions to provide more informative exceptions for higher-level libraries to parse. From the Python perspective, it would be especially valuable if libcudf threw exceptions that matched the exceptions that Cython already knows how to translate into Python. While it's not clear that all of these would be useful, at least a few of them certainly could be. For example, a number of cases would be nicely handled if libcudf threw a
std::invalid_argument
rather than acudf::logic_error
since Cython will automatically translate the former into a PythonValueError
. One way to accomplish this would be to allowCUDF_EXPECTS
to accept a third parameter indicating the exception type to throw. An example of this already exists in rmm with the correspondingRMM_EXPECTS
macro.For cases where no existing mapping exists, it would be helpful if exception payloads were standardized in some way so that the messages could be parsed more easily. For instance, a very simple approach might be for
CUDF_EXPECTS
to stringify and embed the name of the exception into the message. It is also possible for the Python layer to introduce a custom exception handler, so we could do that if cudf introduces additional custom exceptions. I am open to systematically embedding additional information in the message as well if others think of useful ways to do so.Describe alternatives you've considered
libcudf could also use some sort of logging framework rather than relying entirely on exceptions. Such a framework would be significantly more work to implement, but it would have the benefit of also offering additional information. cuDF Python uses warnings in some non-failing cases to provide information to users, and a C++ logging framework could provide information that we could translate to Python warnings.
Additional context
I'm not familiar with how our JNI code currently handles this or whether there are any utilities to simplify translation of C++ exceptions into Java. Ideally libcudf would adopt an exception handling pattern that is suitable for use with all of the higher level libraries that rely on it, so it would be valuable to get some feedback from someone with more experience with cudf's Java interface.
The text was updated successfully, but these errors were encountered: