## The problem

The Cython example below demonstrates an attempt to use CUDA Python to interact with some external C++ code. Note that the "external code" is included inline in the Cython source.

The external code is a function `foo` that accepts a `cudaMemAllocationHandleType`. We attempt to invoke that function from Cython by passing in a `cuda.ccudart.cudaMemAllocationHandleType`, but this fails with an error like:
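The reproducer itself did not survive in this copy of the issue; below is a hedged reconstruction of what such a `foo.pyx` might look like. The inline C++ body, the verbatim-include style, and the exact call site are assumptions; only the names `foo`, `cuda.ccudart.cudaMemAllocationHandleType`, and `cudaMemHandleTypeNone` come from the issue text and error message.

```cython
# distutils: language = c++
# Hypothetical reconstruction: the "external" C++ code is supplied inline
# via Cython's verbatim-C block, and we try to pass CUDA Python's enum
# member to it.
cimport cuda.ccudart

cdef extern from *:
    """
    #include <driver_types.h>
    void foo(cudaMemAllocationHandleType handle_type) { (void)handle_type; }
    """
    void foo(cuda.ccudart.cudaMemAllocationHandleType handle_type)

# Fails to compile: Cython's redefined enum is a distinct (name-mangled)
# C++ type, not the runtime's cudaMemAllocationHandleType.
foo(cuda.ccudart.cudaMemHandleTypeNone)
```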
```
error: cannot convert '__pyx_t_4cuda_7ccudart_cudaMemAllocationHandleType' to 'cudaMemAllocationHandleType'
 4857 |   foo(__pyx_e_4cuda_7ccudart_cudaMemHandleTypeNone);
```
To reproduce the problem, save the example above to a file `foo.pyx`, then run `cythonize -i foo.pyx`.
## Why this happens
This happens because the function `foo` expects the `cudaMemAllocationHandleType` that is defined in the CUDA runtime library. But CUDA Python "rewrites" the runtime library at the Cython layer and has its own `cudaMemAllocationHandleType` (which ends up with a mangled name when transpiled from Cython to C++). The two types are not interchangeable.
## A potential solution
A potential solution, proposed by @leofang in an offline discussion, is to use `extern` declarations for types in `ccudart.pxd`, rather than redefining them. For example:
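The example that followed is not preserved here; this is a minimal sketch of the idea, assuming the enum is pulled from the runtime header `driver_types.h` (the member names are those in recent CUDA toolkit headers):

```cython
# Hypothetical sketch for ccudart.pxd: declare the enum via "cdef extern"
# so that Cython reuses the CUDA runtime's own C type instead of emitting
# a mangled duplicate in the generated C++.
cdef extern from "driver_types.h" nogil:
    ctypedef enum cudaMemAllocationHandleType:
        cudaMemHandleTypeNone
        cudaMemHandleTypePosixFileDescriptor
        cudaMemHandleTypeWin32
        cudaMemHandleTypeWin32Kmt
```

With a declaration like this, a `cuda.ccudart.cudaMemAllocationHandleType` in user Cython code is the runtime's type, so passing it to external C++ such as `foo` above would compile.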
## Gotcha

Currently, we ship a single version of CUDA Python that is built with the latest CUDA toolkit, and we expect it to work for older minor versions of the CUDA toolkit by leveraging CUDA enhanced compatibility.
Historically, there have been cases when the runtime API has changed across minor versions of the CUDA toolkit. In particular, the names/ordering of enum members have changed between minor versions. For example, in CUDA 10.1, there was a typo in the enum member `cudaErrorDeviceUninitilialized` that was fixed in 10.2.
It's not clear how we would handle the situation if something like that were to happen again. In the example above, we would have to have separate extern declarations for 10.1 and 10.2 somehow.
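One conceivable shape for such version-specific handling, sketched here purely as an assumption rather than a settled design, is to let the C preprocessor reconcile the two spellings inside a verbatim block keyed on `CUDART_VERSION`, so the `.pxd` exposes a single stable name:

```cython
# Hypothetical sketch: alias the CUDA 10.1 misspelling to the corrected
# 10.2 name at the C level, so Cython code only ever sees one spelling.
cdef extern from *:
    """
    #include <cuda_runtime_api.h>
    #if CUDART_VERSION < 10020
    /* 10.1 shipped the misspelled member; alias it to the 10.2 name */
    #define cudaErrorDeviceUninitialized cudaErrorDeviceUninitilialized
    #endif
    """
    ctypedef enum cudaError:
        cudaErrorDeviceUninitialized
```

This keeps the per-version divergence confined to one preprocessor block, at the cost of maintaining such shims whenever a minor release renames or reorders members.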