There's a race condition between wrapper lookup and wrapper deallocation where a Python wrapper may be returned that's in the process of being deallocated. I have a reproducer for the free threading build. I think the problem can also affect the default (GIL-enabled) build, but I don't have a reproducer yet.
nb_type_put looks up an existing Python wrapper object in inst_c2p. The inst_dealloc function removes the wrapper from inst_c2p when the wrapper is deallocated.
During the nb_type_put call, it's possible that the found wrapper has a reference count of 0 and is in the process of being deallocated, but has not yet been removed from inst_c2p. In the free threading build, this can happen because nb_type_put can run concurrently with inst_dealloc up to the acquisition of the shard lock. I think this can also happen in the default (GIL-enabled) build, because things like Py_CLEAR(*dict) can call arbitrary code that may temporarily release the GIL.
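To make the window concrete, here is a minimal single-threaded model of it. It is a sketch only: `Wrapper`, `Registry`, `during_dealloc`, and `stale_wrapper_observed` are hypothetical names invented for illustration, not nanobind's types, and the callback stands in for "arbitrary code" that runs while a wrapper is being deallocated.

```cpp
#include <cassert>
#include <functional>
#include <unordered_map>

// Hypothetical stand-ins; none of these are nanobind's real types.
struct Wrapper {
    long refcount;
    const void *cpp_ptr;
};

struct Registry {
    std::unordered_map<const void *, Wrapper *> c2p;  // models inst_c2p
    std::function<void()> during_dealloc;             // models code run mid-deallocation

    // Unguarded lookup: returns whatever is registered, even a dying wrapper.
    Wrapper *lookup(const void *p) const {
        auto it = c2p.find(p);
        return it == c2p.end() ? nullptr : it->second;
    }

    void decref(Wrapper *w) {
        if (--w->refcount == 0) {
            // Window: the count is already 0, but the map entry is still present.
            if (during_dealloc) during_dealloc();
            c2p.erase(w->cpp_ptr);  // removal happens only now
            delete w;
        }
    }
};

// Returns true if a lookup inside the deallocation window saw a wrapper
// whose reference count was already 0.
bool stale_wrapper_observed() {
    int value = 0;
    Registry reg;
    Wrapper *w = new Wrapper{1, &value};
    reg.c2p[&value] = w;

    bool stale = false;
    reg.during_dealloc = [&] {
        Wrapper *found = reg.lookup(&value);
        stale = (found != nullptr && found->refcount == 0);
    };
    reg.decref(w);
    assert(reg.lookup(&value) == nullptr);  // entry is gone once deallocation finishes
    return stale;
}
```

In this model, a caller that increments the count of the wrapper returned inside the window would end up holding a pointer to an object that is freed immediately afterwards.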
Suggested fix
nb_type_put should only incref and return a wrapper if the reference count is not zero. In the GIL-enabled build, this is roughly:
    if (Py_REFCNT(seq.inst) > 0) {
        Py_INCREF(seq.inst);
        return seq.inst;
    }
In the free threading build, we'll want to use PyUnstable_TryIncref when it's available, or implement that logic like we're doing in pybind11.
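The "increment only if nonzero" idea can be sketched with a compare-and-swap loop. This is a simplified model, not how CPython or pybind11 implement it: the free threading build uses biased reference counting with a different object layout, and `AtomicWrapper` and `try_incref` are hypothetical names used only for illustration.

```cpp
#include <atomic>

// Hypothetical model of a wrapper with a single atomic reference count.
struct AtomicWrapper {
    std::atomic<long> refcount{1};
};

// Increment the count only if it is currently nonzero.
// Returns false when the object is already being deallocated.
bool try_incref(AtomicWrapper &w) {
    long n = w.refcount.load(std::memory_order_relaxed);
    while (n > 0) {
        // On failure, compare_exchange_weak reloads n with the current value,
        // so the loop re-checks n > 0 before retrying.
        if (w.refcount.compare_exchange_weak(n, n + 1,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed))
            return true;
    }
    return false;
}
```

A lookup built on this would return the found wrapper only when `try_incref` succeeds, and otherwise behave as if no wrapper had been registered.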
This is the counterpart of the pybind11 bug:
See also
_Py_TryIncref public as an unstable API as PyUnstable_TryIncref(), python/cpython#128844
Reproducible example code