-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace message parsing with throwing more specific exceptions #12426
Replace message parsing with throwing more specific exceptions #12426
Conversation
The longer term plan here is to update all of libcudf's exception handling to throw something more useful than a
|
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## branch-23.04 #12426 +/- ##
===============================================
Coverage ? 85.81%
===============================================
Files ? 158
Lines ? 25110
Branches ? 0
===============================================
Hits ? 21548
Misses ? 3562
Partials ? 0 Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
For the Java side of things I see the general pattern to be that each time you add in a new exception type in C++, it would be good to add a corresponding subclass of CudfException, and set up a mapping between the two in CATCH_STD. This would maintain backwards compatibility with all existing code, but allow us to add in much more specific exception types. If you want to you can also keep the code as is for now and we can handle adding in the mappings we want to add in. The important part for us is to be able to distinguish between different types of errors so we can start to understand if there are better ways to recover from the error besides just crashing and letting Spark retry it. |
It sounds like inevitably, there's going to be cases where libcudf will throw the wrong kind of error. I'd say that it's a success if we're able to get rid of exception message parsing for the most part, while still having a few stragglers. |
Retargeted to 23.04 |
69cc04e
to
eb87e13
Compare
eb87e13
to
88ac435
Compare
@davidwendt I requested your review here since you had thoughts in the last meeting, let me know what you think of the changes in this PR. As you requested, I can rename |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think data_type_error
instead of dtype_error
would fit better with libcudf naming.
Co-authored-by: David Wendt <[email protected]>
/merge |
Description
This PR identifies places where cuDF Python is parsing exception messages from libcudf in order to throw an appropriate Python exception and modifies the associated libcudf error macros (
CUDF_EXPECTS
orCUDF_FAIL
) to throw a more descriptive exception. In order to fully support this behavior with arbitrary libcudf exceptions, this PR also introduces a custom Cython exception handler that can be extended to catch and handle any new type of exception thrown by libcudf. To cases where issues with the dtype of an argument is treated as a TypeError rather than a ValueError in pandas, a new exception typedtype_error : public logic_error
is defined and mapped to a TypeError in Python. This PR does not aim to exhaustively improve the choice of thrown exceptions throughout libcudf, only in places that are immediately relevant to removing message parsing in cudf Python.Resolves #10632
Checklist