[aot] C-API error handling mechanism #5847

Merged 11 commits on Aug 24, 2022

PENGUINLIONG
Member

This PR proposes an error handling mechanism for the Taichi Runtime C-API. There are several common choices for error reporting and handling in C interfaces:

  1. GetLastError as in Win32 programming;
  2. Error codes (in return values) as in many graphics APIs;
  3. Error codes (via nullable params) as in OpenCL;
  4. Textual error messages via a global buffer.

This PR implements a hybrid of options 1 and 4. The error code is reported as the return value, and an optional extra textual message can be retrieved through the arguments.

The problem with mere error codes is that, without any textual message, the user cannot tell what concretely has gone wrong without the effort of checking every piece of data fed to the API. If the error code is returned on a per-call basis (so that every fallible API returns an error code), the user procedure has to check every call to ensure that no error has occurred, which adds extra logic. With GetLastError, the user can anchor the error handling logic at any point in the execution and any error can still be tracked; users can also decide not to check for errors if they are certain that nothing will go wrong.
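As a rough illustration of the "anchor the check anywhere" idea, here is a minimal sketch. Every name in it (`DemoError`, `demo_set_size`, `demo_launch`, `demo_get_last_error`, `run_pipeline`) is hypothetical and is not part of the Taichi C-API; resetting the error on read is also just a choice made for this sketch.

```cpp
#include <cstdio>

// Hypothetical stand-ins for illustration only; not the real Taichi C-API.
enum DemoError { DEMO_SUCCESS = 0, DEMO_INVALID_ARGUMENT = -1 };

// The most recent error recorded by any call on this thread.
static thread_local DemoError demo_last_error = DEMO_SUCCESS;

// Fallible calls return nothing; on failure they only record the error.
void demo_set_size(int size) {
  if (size < 0) {
    demo_last_error = DEMO_INVALID_ARGUMENT;
  }
}

void demo_launch() {
  // Assumed to always succeed in this sketch.
}

// Reports the recorded error and resets it, so a later check starts clean.
DemoError demo_get_last_error() {
  DemoError error = demo_last_error;
  demo_last_error = DEMO_SUCCESS;
  return error;
}

// The caller issues several calls and checks once, at a point it chooses.
bool run_pipeline(int size) {
  demo_set_size(size);
  demo_launch();
  DemoError error = demo_get_last_error();
  if (error != DEMO_SUCCESS) {
    std::fprintf(stderr, "pipeline failed with error code %d\n",
                 static_cast<int>(error));
    return false;
  }
  return true;
}
```

With per-call return codes, `run_pipeline` would need one check after each call; here a single check after the whole sequence is enough, and a caller who trusts its inputs can skip the check entirely.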

Error Codes

The provided error codes are:

  • TI_ERROR_INCOMPLETE: The message buffer passed to ti_get_last_error is too small to store the entire error message.
  • TI_ERROR_SUCCESS: Nothing's wrong.
  • TI_ERROR_NOT_SUPPORTED: Feature, backend or API is not supported yet.
  • TI_ERROR_CORRUPTED_DATA: Data is corrupted. This error is given when the AOT module cannot be loaded from the filesystem.
  • TI_ERROR_NAME_NOT_FOUND: Kernel or compute graph cannot be found by the provided name.
  • TI_ERROR_INVALID_ARGUMENT: Argument value doesn't follow its valid usage.
  • TI_ERROR_ARGUMENT_NULL: A required argument has a null value.
  • TI_ERROR_ARGUMENT_OUT_OF_RANGE: Argument value is out of a valid range. Usually caused by an invalid enum value.
  • TI_ERROR_ARGUMENT_NOT_FOUND: A variable-length list of arguments doesn't have all the necessary entries, e.g. there are not enough arguments to launch a kernel or compute graph.
  • TI_ERROR_INVALID_INTEROP: Import/export is not allowed because the object doesn't allow such usage.
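
To make the TI_ERROR_INCOMPLETE case more concrete, the following is a minimal sketch of how a message could be copied into a caller-provided buffer and truncation reported. The `SketchError` enum, its numeric values, and `copy_error_message` are assumptions made for illustration; they are not the actual definitions or signatures from the Taichi headers. Treating a missing buffer as "message not requested" is likewise a choice of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical codes mirroring TI_ERROR_SUCCESS and TI_ERROR_INCOMPLETE;
// the numeric values are assumptions.
enum SketchError {
  SKETCH_ERROR_INCOMPLETE = 1,
  SKETCH_ERROR_SUCCESS = 0,
};

// Copies `message` into `buffer`, which holds `buffer_size` bytes including
// the terminating NUL. Returns SKETCH_ERROR_INCOMPLETE if the message had to
// be truncated to fit.
SketchError copy_error_message(const std::string& message,
                               char* buffer,
                               std::size_t buffer_size) {
  if (buffer == nullptr || buffer_size == 0) {
    // The caller did not ask for the text, so nothing is truncated.
    return SKETCH_ERROR_SUCCESS;
  }
  std::size_t count = std::min(message.size(), buffer_size - 1);
  std::memcpy(buffer, message.data(), count);
  buffer[count] = '\0';
  return count < message.size() ? SKETCH_ERROR_INCOMPLETE
                                : SKETCH_ERROR_SUCCESS;
}
```

A caller that receives the "incomplete" code can retry with a larger buffer if the full text matters.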

Thread Safety

In the implementation, each thread has its own error cache, so parallel use of the APIs won't interfere with error handling. Although we don't support parallel use of Taichi right now, we might allow users to do so in the future with the TiQueue APIs.
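
A per-thread error cache can be sketched with C++ `thread_local` storage. The struct and function names below are hypothetical and only show the isolation property; they are not the code merged in this PR.

```cpp
#include <string>
#include <thread>

// Hypothetical per-thread error record; not the actual Taichi implementation.
struct ErrorCache {
  int code = 0;  // 0 stands for "no error" in this sketch
  std::string message;
};

// `thread_local` gives every thread its own independent instance.
static thread_local ErrorCache thread_error_cache;

void record_error(int code, const std::string& message) {
  thread_error_cache.code = code;
  thread_error_cache.message = message;
}

int last_error_code() {
  return thread_error_cache.code;
}

// Records an error on a worker thread and returns the code that worker saw.
int fail_on_worker_thread() {
  int seen = 0;
  std::thread worker([&seen] {
    record_error(-1, "failure on worker thread");
    seen = last_error_code();
  });
  worker.join();
  return seen;
}
```

After `fail_on_worker_thread()` returns, the calling thread's own cache is still untouched, because the worker wrote only to its private copy.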

Contributor

@ailzhang ailzhang left a comment

Nice work! Can we also add some tests for these errors? ;)

c_api/src/taichi_core_impl.cpp Outdated Show resolved Hide resolved
@PENGUINLIONG
Member Author

I see. Will be in another PR. Probably after TiQueue and before finalizing TiTexture

Contributor

@jim19930609 jim19930609 left a comment

The C-API interfaces essentially run C++ code under the hood, so we probably have to prevent any C++ exception from being thrown directly (throw, TI_ERROR, ...).

The easiest but rather cumbersome way I can think of is to wrap each of the C-API implementations in a big "try...catch", so as to make sure it returns a C-style error code?

@PENGUINLIONG
Member Author

@jim19930609 That's right. But one problem with exceptions is the missing call stack, which basically makes them undebuggable. So, if viable, I even want the runtime library to be built without C++ exceptions, so that it becomes a run-or-crash situation.

@jim19930609
Contributor

Let's implement this in a separate PR, probably along with the failure tests PR? We can also add some test cases to purposely trigger some C++ failures and make sure C-API is able to capture them.

@jim19930609
Contributor

jim19930609 commented Aug 24, 2022

Oh, I think the key idea is to make sure our C-API functions always return instead of throwing halfway. Since C does not have exception handling, C programmers rely heavily on return codes to process exceptions and errors. Otherwise, if they integrate our C-API into an online system written in C, the program can crash inside Taichi at any time and they can do nothing about it.

We can print the stack info (the original exception message) to standard output. That is fine as long as the program does not exit immediately.
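
The "always return, never throw" idea from this thread could be sketched roughly as follows. This is only an illustration under stated assumptions: `GuardError`, `guarded_call`, `launch_impl`, and `demo_launch_kernel` are hypothetical names, not functions from the Taichi code base, and the sketch logs to stderr, which is one possible choice.

```cpp
#include <cstdio>
#include <exception>
#include <stdexcept>

// Hypothetical error codes for this sketch.
enum GuardError {
  GUARD_SUCCESS = 0,
  GUARD_INVALID_ARGUMENT = -1,
  GUARD_UNKNOWN = -2,
};

// Runs `body` and converts any escaping C++ exception into an error code,
// so that nothing propagates across the C boundary.
template <typename Body>
GuardError guarded_call(Body&& body) noexcept {
  try {
    body();
    return GUARD_SUCCESS;
  } catch (const std::invalid_argument& e) {
    std::fprintf(stderr, "invalid argument: %s\n", e.what());
    return GUARD_INVALID_ARGUMENT;
  } catch (const std::exception& e) {
    std::fprintf(stderr, "unexpected failure: %s\n", e.what());
    return GUARD_UNKNOWN;
  } catch (...) {
    std::fputs("unexpected non-standard exception\n", stderr);
    return GUARD_UNKNOWN;
  }
}

// C++ internals that may throw.
static void launch_impl(int arg_count) {
  if (arg_count < 0) {
    throw std::invalid_argument("arg_count must be non-negative");
  }
}

// C-callable entry point: it always returns a code and never throws.
extern "C" int demo_launch_kernel(int arg_count) {
  return guarded_call([&] { launch_impl(arg_count); });
}
```

Keeping the try/catch in a single helper avoids repeating the same block in every entry point, which addresses the "cumbersome" part of the suggestion while still preserving the original exception message in the log.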

@PENGUINLIONG PENGUINLIONG merged commit 52db011 into taichi-dev:master Aug 24, 2022