-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
No clear separation between "fast API" (unsafe) and "safe API" #61
Comments
See issue #31 "The C API is weakly typed" which is related. |
One practical problem is the lack of a real "debug mode" to debug bugs or crashes in the unsafe fast API: see issue #36 "Debug mode that can be easily activated without recompilation" for example. |
Issue #31 asks also to raise TypeError rather than SystemError in functions like PyTuple_Size(), PyList_Append() or PyDict_GetItemWithError() if the first parameter has an unexpected type (these functions expect respectively: tuple, list, dict; or subtypes of these types). That's another difference: passing the wrong type to a fast API is treated as a bug in the C extension by calling Whereas generic safe API like PyObject_GetItem() raises TypeError if the argument is not a mapping (nor a sequence). That's not "a bug" but the "normal expected behavior". In Python, you get the same exception when using |
CPython, Cython and pybind11 would be clear consumers of a fast API: they implement their own efficient checks in the calling site, so calling unsafe API is perfectly fine. Cython is used to use private APIs to get best performance; it even uses the internal API in some places. For example, CPython uses such fast-path for the dict type:
This code cannot fail because the first parameter has the wrong type, since the caller does check the type. Consumers of the "Safe API" are all C extensions which use directly the C API, without using higher level API like Cython. This code is sometimes called "hand written" code. Using directly the C API is hard: bugs are common, especially related to reference counting. Any help to reduce the error rate is welcomed. It can be an opt-in debug mode to run a test suite in slower mode to catch some common known bugs. Since there is no clear separation between the "Fast API" and the "Safe API", this category of users sometimes pick the "Fast API" "by mistake": either because they "believe" that it can make their extension way faster (it's worth it!), or because they didn't notice that a safer API is available. When I write code with the Python C API, honestly, it's common that I just copy/paste code around without fully understanding its purpose. Since this code "just works", I'm happy that my new code also works. Sometimes, I make tiny changes, and oops, suddenly it does crash or leak a reference :-) |
Just a note: if you call it "fast API", everyone will instinctively try to use it. If you call it "intrusive API", people will be more cautious. You could also call it "unsafe API", but it's only unsafe if you're using it wrong, of course.
Do we care about non-conforming compilers? |
About the naming, IMO Unsafe is more a better name than "Fast" or "Unchecked". For example, I don't think that PyUnicodeFastCopyCharacters() was a clever choice. The other existing name is "Unchecked": PyThreadStateUncheckedGet(). "Unsafe" means that "it is unsafe unless you check assumptions in the caller". In many cases, it's easy to check these assumptions. For example, PyDict_GetItem() makes the assumption that you pass a dictionary. (Well, this one currently that the argument type is a dict.) It's common that function creates a dict or store a dict, and so know the argument type. I prefer to mark unsafe functions as "Unsafe" (in their name) to help people discovering the API that: oh, maybe they should avoid these functions until they fully understand their usage. The "PyUnstable" API seems to have a different purpose. |
I don't think that you should pretend that putting "Fast" in a function name makes it "faster" :-) I'm not convinced that removing a few pre-conditions and avoiding error handling in general has a significant performance difference. Maybe avoiding indirections by using PyDict_GetItem() instead of PyObject_GetItem() makes a larger difference. I don't know. People just have believes and nobody runs benchmarks :-) |
Well, if you're accessing all (or most) elements in a list, avoiding checks every time probably makes a difference. If you're doing a single access, then it probably doesn't matter. |
Follow-up issue: capi-workgroup/api-evolution#45. I close this issue since there is no concrete action which can be triggered to enhance the situation. If I manage to design a more concrete action, I will open a new issue. |
For historical reasons, the Python C API is a mixture of various kinds of functions:
In Python 3.7, a new
Include/internal/
subdirectory was added: internal API which must not be used outside CPython have been moved slowly (one by one) there. Most of these functions are no longer exported: so technically it's no longer possible to use them outside Python. Moreover, thePy_BUILD_CORE
macro must be defined to access it: it's an opt-in option (similar to issue #54 idea) to clearly point out that: hey, you're in the danger zone!Recently, in Python 3.12, the concept of "Unstable API" was added: PEP 689 – Unstable C API tier. This one is a middleground for function that we want to expose as "public", but are known to be unstable between Python versions, like PyCore_New() which became: **PyUnstable_**Code_New().
There is still a gray area between "fast API" and "safe API".
Examples of unsafe fast APIs vs safe APIs:
PyTuple_GET_ITEM()
vsPyTuple_GetItem()
(runtime type check, check index bounds)_PyThreadState_UncheckedGet()
(can return NULL) vsPyThreadState_Get()
(fatal error instead of returning NULL)Differences between "fast API" and "safe API":
It's more common for the "fast API" to rely on the compiler to produce efficient code (ex: inline code). Sadly, it may cause more portability issues (compiler not fully implementing a C standard or a specific function).
The text was updated successfully, but these errors were encountered: