[aot] C-API error handling mechanism #5847
Nice work! Can we also add some tests for these errors? ;)
I see. That will be in another PR, probably after TiQueue and before finalizing TiTexture.
Co-authored-by: Ailing <[email protected]>
The C-API interfaces essentially run C++ code under the hood, so we probably have to prevent any C++ exception from being thrown directly (throw, TI_ERROR, ...).
The easiest but rather cumbersome way I can think of is to wrap each C-API implementation in a big "try...catch", so as to make sure it returns a C-style error code?
@jim19930609 That's right. But one problem with exceptions is the missing call stack, which basically makes them undebuggable. So if viable, I even want the runtime library to be built without C++ exceptions, so it becomes a run-or-crash situation.
Let's implement this in a separate PR, probably along with the failure tests PR? We can also add some test cases that purposely trigger C++ failures and make sure the C-API is able to capture them.
Oh, I think the key idea is to make sure our C-API functions always return instead of throwing halfway. Since C does not have exception handling, C programmers rely heavily on return codes to process exceptions/errors. Otherwise, if they integrate our C-API into an online system written in C, the program can crash at any time inside Taichi and they can do nothing about it. We can write the stack info (the original exception message) to standard output. It is fine as long as it does not exit immediately.
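For reference, the "wrap every entry point in try...catch" idea from this thread could look roughly like the sketch below. This is not code from the PR: DemoError, guard_c_api, launch_kernel_impl and demo_launch_kernel are hypothetical names, the numeric values are made up, and DEMO_ERROR_UNKNOWN is an invented catch-all that is not among the error codes this PR defines.

```cpp
#include <cstdio>
#include <exception>
#include <stdexcept>

// Hypothetical error codes for illustration only; values are made up.
enum DemoError : int {
  DEMO_ERROR_SUCCESS = 0,
  DEMO_ERROR_INVALID_ARGUMENT = -1,
  DEMO_ERROR_UNKNOWN = -2,  // invented catch-all for any other C++ exception
};

// Runs `body` and converts every escaping C++ exception into an error code,
// so that nothing propagates across the C boundary.
template <typename F>
DemoError guard_c_api(F&& body) noexcept {
  try {
    body();
    return DEMO_ERROR_SUCCESS;
  } catch (const std::invalid_argument& e) {
    // The original exception message is reported instead of being lost.
    std::fprintf(stderr, "[demo] invalid argument: %s\n", e.what());
    return DEMO_ERROR_INVALID_ARGUMENT;
  } catch (const std::exception& e) {
    std::fprintf(stderr, "[demo] internal error: %s\n", e.what());
    return DEMO_ERROR_UNKNOWN;
  } catch (...) {
    std::fprintf(stderr, "[demo] unknown internal error\n");
    return DEMO_ERROR_UNKNOWN;
  }
}

// Hypothetical C++ internals that may throw.
static void launch_kernel_impl(int arg_count) {
  if (arg_count < 0) {
    throw std::invalid_argument("arg_count must be non-negative");
  }
  if (arg_count > 8) {
    throw std::runtime_error("too many arguments");
  }
}

// Hypothetical C entry point: it always returns and never throws.
extern "C" int demo_launch_kernel(int arg_count) {
  return guard_c_api([&] { launch_kernel_impl(arg_count); });
}
```

A C caller only ever sees the integer return value. The trade-off raised above still applies: the call stack of the original throw is gone by the time the catch block runs, so only the exception message can be reported.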
This PR proposes an error handling mechanism for the Taichi Runtime C-API. There are many choices for error reporting and handling in C interfaces, including GetLastError as in Win32 programming. This PR implements a hybrid of options 1 and 4: the error code is reported as the return value, and an optional textual message can be extracted through the arguments.
The problem with mere error codes is that, without any textual message, the user cannot tell what concretely has gone wrong without checking every piece of data fed to the API. If error codes are returned on a per-call basis (every fallible API returns an error code), the user has to check every call to make sure no error occurred, which adds extra logic. With GetLastError, the user can anchor the error handling logic at any point in the execution and still track any error, and users can decide not to check for errors if they are certain that nothing will go wrong.
Error Codes
The provided error codes are:
TI_ERROR_INCOMPLETE: The message buffer passed to ti_get_last_error is not large enough to store the entire error message.
TI_ERROR_SUCCESS: Nothing's wrong.
TI_ERROR_NOT_SUPPORTED: Feature, backend or API is not supported yet.
TI_ERROR_CORRUPTED_DATA: Data is corrupted. This error is given when the AOT module cannot be loaded from the filesystem.
TI_ERROR_NAME_NOT_FOUND: Kernel or compute graph cannot be found by the provided name.
TI_ERROR_INVALID_ARGUMENT: Argument value doesn't follow its valid usage.
TI_ERROR_ARGUMENT_NULL: A required argument has a null value.
TI_ERROR_ARGUMENT_OUT_OF_RANGE: Argument value is out of the valid range. Usually caused by an invalid enum value.
TI_ERROR_ARGUMENT_NOT_FOUND: A variable list of arguments doesn't have all the necessary entries. There are not enough arguments to launch a kernel/compute graph.
TI_ERROR_INVALID_INTEROP: Import/export is not allowed because the object doesn't allow such usage.
Thread Safety
In the implementation, there is an error cache in each thread, so parallel use of the APIs won't interfere with error handling. Although we don't support parallel use of Taichi right now, we might allow it in the future with the TiQueue APIs.
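To make the per-thread error cache and the GetLastError-style query more concrete, here is a minimal sketch. It is not the PR's implementation: every name (SketchError, sketch_set_last_error, sketch_get_last_error, sketch_use_pointer, run_batch_ok) is hypothetical, the numeric values are made up, and both the truncation rule and the "error stays until overwritten" behaviour are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical error codes; the values are made up for this sketch.
enum SketchError : int {
  SKETCH_ERROR_INCOMPLETE = 1,
  SKETCH_ERROR_SUCCESS = 0,
  SKETCH_ERROR_ARGUMENT_NULL = -1,
};

namespace {
struct ErrorCache {
  SketchError error = SKETCH_ERROR_SUCCESS;
  std::string message;
};
// One cache per thread, so threads never observe each other's errors.
thread_local ErrorCache last_error;
}  // namespace

// Records an error for the calling thread, overwriting the previous one.
extern "C" void sketch_set_last_error(SketchError error, const char* message) {
  last_error.error = error;
  last_error.message = (message != nullptr) ? message : "";
}

// Returns the cached error code. If a buffer is given, the message is copied
// into it (NUL-terminated); when the buffer is too small the message is
// truncated and SKETCH_ERROR_INCOMPLETE is returned instead (assumed rule).
extern "C" SketchError sketch_get_last_error(std::uint64_t message_size,
                                             char* message) {
  if (message != nullptr && message_size > 0) {
    const std::size_t n = std::min<std::size_t>(
        last_error.message.size(), static_cast<std::size_t>(message_size - 1));
    std::memcpy(message, last_error.message.data(), n);
    message[n] = '\0';
    if (n < last_error.message.size()) {
      return SKETCH_ERROR_INCOMPLETE;
    }
  }
  return last_error.error;
}

// Hypothetical fallible API: reports through the cache, not a return value.
extern "C" void sketch_use_pointer(const void* pointer) {
  if (pointer == nullptr) {
    sketch_set_last_error(SKETCH_ERROR_ARGUMENT_NULL,
                          "pointer must not be null");
    return;
  }
  (void)pointer;  // the real work would happen here
}

// Caller-side pattern: issue several calls, then check once at the end.
inline bool run_batch_ok(const void* a, const void* b) {
  sketch_use_pointer(a);
  sketch_use_pointer(b);
  return sketch_get_last_error(0, nullptr) == SKETCH_ERROR_SUCCESS;
}
```

Because the cache is thread_local, an error recorded on one thread is invisible to a query made on another. In this sketch a later successful call does not clear an earlier error, which is what lets run_batch_ok defer its check; a caller that wants a clean slate resets it with sketch_set_last_error(SKETCH_ERROR_SUCCESS, nullptr).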