-
Notifications
You must be signed in to change notification settings - Fork 933
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Catch and propagate rmm out of memory errors during copy operations up to Python #13743
Comments
This is a little tricky. The reason that copy fails but concat succeeds (in the sense of the error being propagated to Python) is that concat calls a C++ function that is annotated in cython as
In cython, this is spelt: As a consequence, some cython code like this: from libcpp.utility cimport move
from libcpp.memory cimport make_unique, unique_ptr
from libcpp.vector cimport vector
def some_function(int n):
cdef unique_ptr[vector[double]] vec
with nogil:
vec = move(make_unique[vector[double]](n))
return vec.get().size() Is transpiled to: /*try:*/ {
/* "cython_except.pyx":10
*
* with nogil:
* vec = move(make_unique[vector[double]](n)) # <<<<<<<<<<<<<<
*
* return vec.get().size()
*/
__pyx_v_vec = cython_std::move<std::unique_ptr<std::vector<double> > >(std::make_unique<std::vector<double> >(__pyx_v_n));
} Notice how the (potentially throwing) The initial reason to not annotate with try {
tmp = std::move(std::make_unique<...>(...));
} catch (...) {
...
}
result = std::move(tmp); But, when the PR was introduced, instead we would get: try {
tmp = std::move(std::make_unique<...>(...));
} catch (...) {
...
}
result = tmp; // no std::move here, bad! But, that was seven years ago, let's just try adding the try {
__pyx_t_1 = std::make_unique<std::vector<double> >(__pyx_v_n);
} catch(...) {
#ifdef WITH_THREAD
PyGILState_STATE __pyx_gilstate_save = __Pyx_PyGILState_Ensure();
#endif
__Pyx_CppExn2PyErr();
#ifdef WITH_THREAD
__Pyx_PyGILState_Release(__pyx_gilstate_save);
#endif
__PYX_ERR(0, 10, __pyx_L4_error)
}
__pyx_v_vec = cython_std::move<std::unique_ptr<std::vector<double> > >(__PYX_STD_MOVE_IF_SUPPORTED(__pyx_t_1)); (That was with Cython 3, with Cython < 3 I have): try {
__pyx_t_2 = std::make_unique<cudf::column>(__pyx_v_input_column_view);
} catch(...) {
#ifdef WITH_THREAD
PyGILState_STATE __pyx_gilstate_save = __Pyx_PyGILState_Ensure();
#endif
__Pyx_CppExn2PyErr();
#ifdef WITH_THREAD
__Pyx_PyGILState_Release(__pyx_gilstate_save);
#endif
__PYX_ERR(0, 86, __pyx_L4_error)
}
__pyx_v_c_result = cython_std::move<std::unique_ptr<cudf::column> >(__pyx_t_2); Which also looks fine ( And with that, I get the python-level memory error for copy. |
@wence- Could we shift this logic to C++ and call make_unique in C++ instead? Seems like a viable workaround to just get back a unique_ptr from a factory in C++ (where we control exception behavior of its Cython binding) that calls make_unique for us. (if the Cython fix is delayed, for example, this could be a short term solution) |
We could also consider writing some custom C++ that is inlined in the Cython, if adding a function to libcudf here introduces too much overhead (in terms of human time) or is too intrusive. |
make_unique in Cython's libcpp headers is not annotated with `except +`. As a consequence, if the constructor throws, we do not catch it in Python. To work around this (see cython/cython#5560 for details), provide our own implementation. Due to the way assignments occur to temporaries, we need to now explicitly wrap all calls to `make_unique` in `move`, but that is arguably preferable to not being able to catch exceptions. - Closes rapidsai#13743
make_unique in Cython's libcpp headers is not annotated with `except +`. As a consequence, if the constructor throws, we do not catch it in Python. To work around this (see cython/cython#5560 for details), provide our own implementation. Due to the way assignments occur to temporaries, we need to now explicitly wrap all calls to `make_unique` in `move`, but that is arguably preferable to not being able to catch exceptions. - Closes rapidsai#13743
make_unique in Cython's libcpp headers is not annotated with `except +`. As a consequence, if the constructor throws, we do not catch it in Python. To work around this (see cython/cython#5560 for details), provide our own implementation. Due to the way assignments occur to temporaries, we need to now explicitly wrap all calls to `make_unique` in `move`, but that is arguably preferable to not being able to catch exceptions. - Closes rapidsai#13743
make_unique in Cython's libcpp headers is not annotated with `except +`. As a consequence, if the constructor throws, we do not catch it in Python. To work around this (see cython/cython#5560 for details), provide our own implementation. Due to the way assignments occur to temporaries, we need to now explicitly wrap all calls to `make_unique` in `move`, but that is arguably preferable to not being able to catch exceptions, and will not be necessary once we move to Cython 3. - Closes rapidsai#13743
`make_unique` in Cython's libcpp headers is not annotated with `except +`. As a consequence, if the constructor throws, we do not catch it in Python. To work around this (see cython/cython#5560 for details), provide our own implementation. Due to the way assignments occur to temporaries, we need to now explicitly wrap all calls to `make_unique` in `move`, but that is arguably preferable to not being able to catch exceptions, and will not be necessary once we move to Cython 3. - Closes #13743 Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Ashwin Srinath (https://github.com/shwina) URL: #13746
Can this be closed? |
Yes, it wasn't auto-closed because we didn't merge to 23.08 |
I'd like for out-of-memory errors caused by copying data to be caught and propagated up to Python like other out-of-memory errors.
If I run out of memory during a
copy
operation, the error is uncaught and my kernel crashes:But, for example, when I try to concatenate and run out of memory, the error is caught and I can continue using my IPython kernel.
Environment:
The text was updated successfully, but these errors were encountered: