Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace message parsing with throwing more specific exceptions #12426

Merged
merged 27 commits into from
Feb 25, 2023

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented Dec 20, 2022

Description

This PR identifies places where cuDF Python is parsing exception messages from libcudf in order to throw an appropriate Python exception and modifies the associated libcudf error macros (CUDF_EXPECTS or CUDF_FAIL) to throw a more descriptive exception. In order to fully support this behavior with arbitrary libcudf exceptions, this PR also introduces a custom Cython exception handler that can be extended to catch and handle any new type of exception thrown by libcudf. To cases where issues with the dtype of an argument is treated as a TypeError rather than a ValueError in pandas, a new exception type dtype_error : public logic_error is defined and mapped to a TypeError in Python. This PR does not aim to exhaustively improve the choice of thrown exceptions throughout libcudf, only in places that are immediately relevant to removing message parsing in cudf Python.

Resolves #10632

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vyasr vyasr added 2 - In Progress Currently a work in progress code quality libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API. improvement Improvement / enhancement to an existing function breaking Breaking change labels Dec 20, 2022
@vyasr vyasr self-assigned this Dec 20, 2022
@vyasr
Copy link
Contributor Author

vyasr commented Dec 20, 2022

The longer term plan here is to update all of libcudf's exception handling to throw something more useful than a cudf::logic_error. I'd like to use this PR as a test case for that to sort through some questions. There are a few issues that need to be discussed to move this work forward:

  • @jlowe @revans2 I'm not sure how changes like this impact Java. My understanding is that all exceptions get mapped to CudfException or CudaException or so according to the logic defined in jni_utils.hpp. Based on what I see there, any exception that is not one of the "known types" in CATCH_STD (basically an rmm OOM or a CUDA error) gets mapped to a CudfException. If that is true, then AFAICT changing the exception types in C++ would have no effect at all on the JNI in its current state because even any std::* exception would just get mapped to a CudfException. Is that correct?
  • @jlowe @revans2 Aside from making sure that this change is non-breaking for the JNI, are there ways in which this change to libcudf could be more useful in the Java plugin?
  • @shwina @jrhemstad currently this PR is incomplete and does not remove all cases where cuDF Python is parsing the exception message. Many of the remaining examples are in lists.py. The problem is that there are cases where the Cython/C++ exception mapping is not just incomplete, but rather disagrees with what pandas (and therefore cudf) wants to throw. The most common example is that there are places where pandas throws a TypeError based on the dtype of an input. Here is a case where libcudf is checking the input parameters for some equality condition. I would expect this to throw a std::invalid_argument. However, in Python a std::invalid_argument is translated to a ValueError, whereas what cudf actually wants to throw here is a TypeError. I am not sure what the best way to handle this kind of issue is. If there is only one way that a particular exception can be thrown by a libcudf API, then it's not a problem; we can just catch a ValueError and reraise it as a TypeError. The much more troublesome case is when there are multiple parameters, each of which could be invalid, so a std::invalid_exception from libcudf could be coming from any of them. We could try defining additional exception types in libcudf that are more specific in order to differentiate these cases, but I suspect that some of these case are not fundamentally different, they're just viewed as different by the pandas API. As a result, we would be leaking pandas details into the types defined in libcudf. Any thoughts on the best way to handle this?

@codecov
Copy link

codecov bot commented Dec 20, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-23.04@eb4da93). Click here to learn what that means.
Patch has no changes to coverable lines.

❗ Current head bdc5ef8 differs from pull request most recent head 7b5131b. Consider uploading reports for the commit 7b5131b to get more accurate results

Additional details and impacted files
@@               Coverage Diff               @@
##             branch-23.04   #12426   +/-   ##
===============================================
  Coverage                ?   85.81%           
===============================================
  Files                   ?      158           
  Lines                   ?    25110           
  Branches                ?        0           
===============================================
  Hits                    ?    21548           
  Misses                  ?     3562           
  Partials                ?        0           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@revans2
Copy link
Contributor

revans2 commented Dec 20, 2022

For the Java side of things I see the general pattern to be that each time you add in a new exception type in C++, it would be good to add a corresponding subclass of CudfException, and set up a mapping between the two in CATCH_STD. This would maintain backwards compatibility with all existing code, but allow us to add in much more specific exception types.

If you want to you can also keep the code as is for now and we can handle adding in the mappings we want to add in. The important part for us is to be able to distinguish between different types of errors so we can start to understand if there are better ways to recover from the error besides just crashing and letting Spark retry it.

@shwina
Copy link
Contributor

shwina commented Jan 5, 2023

As a result, we would be leaking pandas details into the types defined in libcudf. Any thoughts on the best way to handle this?

It sounds like inevitably, there's going to be cases where libcudf will throw the wrong kind of error. I'd say that it's a success if we're able to get rid of exception message parsing for the most part, while still having a few stragglers.

@shwina shwina changed the base branch from branch-23.02 to branch-23.04 January 26, 2023 16:45
@shwina
Copy link
Contributor

shwina commented Jan 26, 2023

Retargeted to 23.04

@vyasr vyasr force-pushed the feat/no_message_parsing branch from 69cc04e to eb87e13 Compare February 6, 2023 19:07
@vyasr vyasr force-pushed the feat/no_message_parsing branch from eb87e13 to 88ac435 Compare February 6, 2023 22:36
@github-actions github-actions bot added the CMake CMake build issue label Feb 6, 2023
@vyasr vyasr added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Feb 6, 2023
@vyasr vyasr requested a review from davidwendt February 16, 2023 00:29
@vyasr
Copy link
Contributor Author

vyasr commented Feb 16, 2023

@davidwendt I requested your review here since you had thoughts in the last meeting, let me know what you think of the changes in this PR. As you requested, I can rename dtype_error to data_type_error or so.

Copy link
Contributor

@davidwendt davidwendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think data_type_error instead of dtype_error would fit better with libcudf naming.

cpp/tests/strings/json_tests.cpp Show resolved Hide resolved
cpp/tests/copying/concatenate_tests.cu Outdated Show resolved Hide resolved
cpp/include/cudf/utilities/error.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/utilities/error.hpp Outdated Show resolved Hide resolved
@vyasr vyasr requested a review from ttnghia February 24, 2023 00:48
@vyasr
Copy link
Contributor Author

vyasr commented Feb 25, 2023

/merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team breaking Breaking change CMake CMake build issue improvement Improvement / enhancement to an existing function Java Affects Java cuDF API. libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Remove Parsing of C++ Exception String
6 participants