Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setting state of an exception object with a dic crashes Python 3.8.14 #97591

Closed
xiaxinmeng opened this issue Sep 27, 2022 · 17 comments
Closed

Setting state of an exception object with a dic crashes Python 3.8.14 #97591

xiaxinmeng opened this issue Sep 27, 2022 · 17 comments
Labels
3.10 only security fixes 3.11 only security fixes 3.12 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@xiaxinmeng
Copy link

Crash report

The following programs defined a class C. In C, we perform dic clear() operation. When we set state of exception object with the dic, it causes segmentation fault on main branch (68c46ae) and latest stable Python 3.8.14. But it does not trigger any crashing on Python-3.11.0rc2 and Python 3.9.0.

class C(str):
    def __hash__(self):
        d.clear()
        return 0

d = {}
d[C()] = C()

e = Exception()
e.__setstate__(d)

Error messages

Segmentation Fault (Core dumped)

Your environment

  • CPython versions tested on: Python 3.8.14 and main branch (68c46ae)
  • Operating system and architecture: [GCC 7.5.0] on linux
@xiaxinmeng xiaxinmeng added the type-crash A hard crash of the interpreter, possibly with a core dump label Sep 27, 2022
@ronaldoussoren ronaldoussoren added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Sep 27, 2022
@gvanrossum
Copy link
Member

I ran this in main under the debugger and it looks like the crash actually occurs during final cleanup, in ClearWeakRefs called from free_keys_object. Not sure if I have the guts to continue debugging.

@ofey404
Copy link
Contributor

ofey404 commented Sep 30, 2022

Reproduced this in python 3.10.2, and latest main (83a3de4)

I would try to dig a little deeper but can't guarantee there will be any output (my knowledge about the python runtime is limited).

@ofey404
Copy link
Contributor

ofey404 commented Sep 30, 2022

I debugged with following code and found:

  • The cause is the d.clear() in Key.__hash__().
  • Any value type will trigger the segfault, except some types (str and dict won't cause crush, but list would.)
import gc

class Key(str):
    def __hash__(self):
        d.clear()
        return 0

class Value(str):
    pass

d = {}
d[Key()] = Value()

e = Exception()
e.__setstate__(d)

gc.collect() # Segfault happens here

According the stacktrace, it dies in deduce_unreachable in GC module.

cpython/Modules/gcmodule.c

Lines 1095 to 1103 in ff54dd9

deduce_unreachable(PyGC_Head *base, PyGC_Head *unreachable) {
validate_list(base, collecting_clear_unreachable_clear);
/* Using ob_refcnt and gc_refs, calculate which objects in the
* container set are reachable from outside the set (i.e., have a
* refcount greater than 0 when all the references within the
* set are taken into account).
*/
update_refs(base); // gc_prev is used for gc_refs
subtract_refs(base);

And finally if (Py_TYPE(v)->tp_repr == NULL) triggered the segfault.

cpython/Objects/object.c

Lines 409 to 411 in ff54dd9

if (Py_TYPE(v)->tp_repr == NULL)
return PyUnicode_FromFormat("<%s object at %p>",
Py_TYPE(v)->tp_name, v);

@ofey404
Copy link
Contributor

ofey404 commented Sep 30, 2022

It seems that crash happening during garbage collector subtracting internal references:

cpython/Modules/gcmodule.c

Lines 465 to 482 in ff54dd9

/* Subtract internal references from gc_refs. After this, gc_refs is >= 0
* for all objects in containers, and is GC_REACHABLE for all tracked gc
* objects not in containers. The ones with gc_refs > 0 are directly
* reachable from outside containers, and so can't be collected.
*/
static void
subtract_refs(PyGC_Head *containers)
{
traverseproc traverse;
PyGC_Head *gc = GC_NEXT(containers);
for (; gc != containers; gc = GC_NEXT(gc)) {
PyObject *op = FROM_GC(gc);
traverse = Py_TYPE(op)->tp_traverse;
(void) traverse(op,
(visitproc)visit_decref,
op);
}
}

While traversing through the objects, it meets an object, and the _PyObject_ASSERT failed:

  • The parent is not freed, but op is freed, so it triggers the assertion.

cpython/Modules/gcmodule.c

Lines 448 to 450 in ff54dd9

visit_decref(PyObject *op, void *parent)
{
_PyObject_ASSERT(_PyObject_CAST(parent), !_PyObject_IsFreed(op));

Then it wants to dump the parent (the dict in this scenario) with message, but segfault happens during the process.

(During printing the representation of the value. Strange.)

s = PyObject_Repr(value);

@ofey404
Copy link
Contributor

ofey404 commented Sep 30, 2022

What confused me is that, if I comment out the e.__setstate__ line, nothing happens during garbage collection.

Does the __setstate__ call put the dict into an inconsistent state? Or the problem lays in the d.clear() in key's hash function.

@ofey404
Copy link
Contributor

ofey404 commented Sep 30, 2022

The problem lies on the value's refcount: it becomes -2459565876494606881.

image

@ofey404
Copy link
Contributor

ofey404 commented Sep 30, 2022

Find the cause of wrongly set refcount:

cpython/Objects/typeobject.c

Lines 4017 to 4032 in 9f2f1dd

/* Internal API to look for a name through the MRO, bypassing the method cache.
This returns a borrowed reference, and might set an exception.
'error' is set to: -1: error with exception; 1: error without exception; 0: ok */
static PyObject *
find_name_in_mro(PyTypeObject *type, PyObject *name, int *error)
{
Py_hash_t hash;
if (!PyUnicode_CheckExact(name) ||
(hash = _PyASCIIObject_CAST(name)->hash) == -1)
{
hash = PyObject_Hash(name);
if (hash == -1) {
*error = -1;
return NULL;
}
}

The hash = PyObject_Hash(name); is exactly the line which touch the refcount.

image


So the problem is in _PyType_Lookup at _PyObject_GenericSetAttrWithDict. Finally it will call __hash__ method again (call d.clear()), and the refcount of value would be wrongly toggled.

cpython/Objects/object.c

Lines 1370 to 1390 in ff54dd9

_PyObject_GenericSetAttrWithDict(PyObject *obj, PyObject *name,
PyObject *value, PyObject *dict)
{
PyTypeObject *tp = Py_TYPE(obj);
PyObject *descr;
descrsetfunc f;
int res = -1;
if (!PyUnicode_Check(name)){
PyErr_Format(PyExc_TypeError,
"attribute name must be string, not '%.200s'",
Py_TYPE(name)->tp_name);
return -1;
}
if (tp->tp_dict == NULL && PyType_Ready(tp) < 0)
return -1;
Py_INCREF(name);
Py_INCREF(tp);
descr = _PyType_Lookup(tp, name);

Full stack trace when the refcount is changed.

libc.so.6!__memset_avx2_unaligned_erms (Unknown Source:0)
_PyMem_DebugRawFree(void * ctx, void * p) (\home\ofey\Code\cpython-source-reading\cpython\Objects\obmalloc.c:2575)
_PyMem_DebugFree(void * ctx, void * ptr) (\home\ofey\Code\cpython-source-reading\cpython\Objects\obmalloc.c:2709)
PyObject_Free(void * ptr) (\home\ofey\Code\cpython-source-reading\cpython\Objects\obmalloc.c:742)
PyObject_GC_Del(void * op) (\home\ofey\Code\cpython-source-reading\cpython\Modules\gcmodule.c:2366)
unicode_dealloc(PyObject * unicode) (\home\ofey\Code\cpython-source-reading\cpython\Objects\unicodeobject.c:1557)
subtype_dealloc(PyObject * self) (\home\ofey\Code\cpython-source-reading\cpython\Objects\typeobject.c:1573)
_Py_Dealloc(PyObject * op) (\home\ofey\Code\cpython-source-reading\cpython\Objects\object.c:2383)
Py_DECREF(PyObject * op, int lineno, const char * filename) (\home\ofey\Code\cpython-source-reading\cpython\Include\object.h:549)
Py_XDECREF(PyObject * op) (\home\ofey\Code\cpython-source-reading\cpython\Include\object.h:625)
free_keys_object(PyDictKeysObject * keys) (\home\ofey\Code\cpython-source-reading\cpython\Objects\dictobject.c:672)
dictkeys_decref(PyDictKeysObject * dk) (\home\ofey\Code\cpython-source-reading\cpython\Objects\dictobject.c:324)
PyDict_Clear(PyObject * op) (\home\ofey\Code\cpython-source-reading\cpython\Objects\dictobject.c:2070)
dict_clear(PyDictObject * mp, PyObject * _unused_ignored) (\home\ofey\Code\cpython-source-reading\cpython\Objects\dictobject.c:3379)
method_vectorcall_NOARGS(PyObject * func, PyObject * const * args, size_t nargsf, PyObject * kwnames) (\home\ofey\Code\cpython-source-reading\cpython\Objects\descrobject.c:454)
_PyObject_VectorcallTstate(PyThreadState * tstate, PyObject * callable, PyObject * const * args, size_t nargsf, PyObject * kwnames) (\home\ofey\Code\cpython-source-reading\cpython\Include\internal\pycore_call.h:92)
PyObject_Vectorcall(PyObject * callable, PyObject * const * args, size_t nargsf, PyObject * kwnames) (\home\ofey\Code\cpython-source-reading\cpython\Objects\call.c:300)
_PyEval_EvalFrameDefault(PyThreadState * tstate, _PyInterpreterFrame * frame, int throwflag) (\home\ofey\Code\cpython-source-reading\cpython\Python\ceval.c:4173)
_PyEval_EvalFrame(int throwflag, struct _PyInterpreterFrame * frame, PyThreadState * tstate) (\home\ofey\Code\cpython-source-reading\cpython\Include\internal\pycore_ceval.h:95)
_PyEval_Vector(PyThreadState * tstate, PyFunctionObject * func, PyObject * locals, PyObject * const * args, size_t argcount, PyObject * kwnames) (\home\ofey\Code\cpython-source-reading\cpython\Python\ceval.c:5791)
_PyFunction_Vectorcall(PyObject * func, PyObject * const * stack, size_t nargsf, PyObject * kwnames) (\home\ofey\Code\cpython-source-reading\cpython\Objects\call.c:396)
_PyObject_VectorcallTstate(PyObject * kwnames, size_t nargsf, PyObject * const * args, PyObject * callable, PyThreadState * tstate) (\home\ofey\Code\cpython-source-reading\cpython\Include\internal\pycore_call.h:92)
PyObject_CallOneArg(PyObject * func, PyObject * arg) (\home\ofey\Code\cpython-source-reading\cpython\Objects\call.c:378)
call_unbound_noarg(int unbound, PyObject * func, PyObject * self) (\home\ofey\Code\cpython-source-reading\cpython\Objects\typeobject.c:1759)
slot_tp_hash(PyObject * self) (\home\ofey\Code\cpython-source-reading\cpython\Objects\typeobject.c:8044)
PyObject_Hash(PyObject * v) (\home\ofey\Code\cpython-source-reading\cpython\Objects\object.c:769)
find_name_in_mro(PyTypeObject * type, PyObject * name, int * error) (\home\ofey\Code\cpython-source-reading\cpython\Objects\typeobject.c:4027)
_PyType_Lookup(PyTypeObject * type, PyObject * name) (\home\ofey\Code\cpython-source-reading\cpython\Objects\typeobject.c:4097)
_PyObject_GenericSetAttrWithDict(PyObject * obj, PyObject * name, PyObject * value, PyObject * dict) (\home\ofey\Code\cpython-source-reading\cpython\Objects\object.c:1390)
PyObject_GenericSetAttr(PyObject * obj, PyObject * name, PyObject * value) (\home\ofey\Code\cpython-source-reading\cpython\Objects\object.c:1463)
PyObject_SetAttr(PyObject * v, PyObject * name, PyObject * value) (\home\ofey\Code\cpython-source-reading\cpython\Objects\object.c:1022)
BaseException_setstate(PyObject * self, PyObject * state) (\home\ofey\Code\cpython-source-reading\cpython\Objects\exceptions.c:170)
method_vectorcall_O(PyObject * func, PyObject * const * args, size_t nargsf, PyObject * kwnames) (\home\ofey\Code\cpython-source-reading\cpython\Objects\descrobject.c:482)
_PyObject_VectorcallTstate(PyThreadState * tstate, PyObject * callable, PyObject * const * args, size_t nargsf, PyObject * kwnames) (\home\ofey\Code\cpython-source-reading\cpython\Include\internal\pycore_call.h:92)
PyObject_Vectorcall(PyObject * callable, PyObject * const * args, size_t nargsf, PyObject * kwnames) (\home\ofey\Code\cpython-source-reading\cpython\Objects\call.c:300)
_PyEval_EvalFrameDefault(PyThreadState * tstate, _PyInterpreterFrame * frame, int throwflag) (\home\ofey\Code\cpython-source-reading\cpython\Python\ceval.c:4173)
_PyEval_EvalFrame(int throwflag, struct _PyInterpreterFrame * frame, PyThreadState * tstate) (\home\ofey\Code\cpython-source-reading\cpython\Include\internal\pycore_ceval.h:95)
_PyEval_Vector(PyThreadState * tstate, PyFunctionObject * func, PyObject * locals, PyObject * const * args, size_t argcount, PyObject * kwnames) (\home\ofey\Code\cpython-source-reading\cpython\Python\ceval.c:5791)
PyEval_EvalCode(PyObject * co, PyObject * globals, PyObject * locals) (\home\ofey\Code\cpython-source-reading\cpython\Python\ceval.c:591)
run_eval_code_obj(PyThreadState * tstate, PyCodeObject * co, PyObject * globals, PyObject * locals) (\home\ofey\Code\cpython-source-reading\cpython\Python\pythonrun.c:1713)
run_mod(mod_ty mod, PyObject * filename, PyObject * globals, PyObject * locals, PyCompilerFlags * flags, PyArena * arena) (\home\ofey\Code\cpython-source-reading\cpython\Python\pythonrun.c:1734)
pyrun_file(FILE * fp, PyObject * filename, int start, PyObject * globals, PyObject * locals, int closeit, PyCompilerFlags * flags) (\home\ofey\Code\cpython-source-reading\cpython\Python\pythonrun.c:1629)
_PyRun_SimpleFileObject(FILE * fp, PyObject * filename, int closeit, PyCompilerFlags * flags) (\home\ofey\Code\cpython-source-reading\cpython\Python\pythonrun.c:439)
_PyRun_AnyFileObject(FILE * fp, PyObject * filename, int closeit, PyCompilerFlags * flags) (\home\ofey\Code\cpython-source-reading\cpython\Python\pythonrun.c:78)
pymain_run_file_obj(PyObject * program_name, PyObject * filename, int skip_source_first_line) (\home\ofey\Code\cpython-source-reading\cpython\Modules\main.c:360)
pymain_run_file(const PyConfig * config) (\home\ofey\Code\cpython-source-reading\cpython\Modules\main.c:379)
pymain_run_python(int * exitcode) (\home\ofey\Code\cpython-source-reading\cpython\Modules\main.c:610)
Py_RunMain() (\home\ofey\Code\cpython-source-reading\cpython\Modules\main.c:689)
pymain_main(_PyArgv * args) (\home\ofey\Code\cpython-source-reading\cpython\Modules\main.c:719)
Py_BytesMain(int argc, char ** argv) (\home\ofey\Code\cpython-source-reading\cpython\Modules\main.c:743)
main(int argc, char ** argv) (\home\ofey\Code\cpython-source-reading\cpython\Programs\python.c:15)

@ofey404
Copy link
Contributor

ofey404 commented Sep 30, 2022

So... What should we do with it?

I think I need some guidance/reference now, about what _PyType_Lookup does, and what's the MRO.

@gvanrossum
Copy link
Member

Thanks for the deep research!

I think the root cause is something that happens in BaseException_setstate(), in Python/exceptions.c. In particular these four lines:

        while (PyDict_Next(state, &i, &d_key, &d_value)) {
            if (PyObject_SetAttr(self, d_key, d_value) < 0)
                return NULL;
        }

are likely to end up missing an INCREF when the dict is cleared by the hash call. I suspect it's happening in the PyObject_SetAttr() call, since PyDict_Next() has no need for the hash.

Looking at the code for PyObject_SetAttr(), it does an INCREF(name) / DECREF(name) around the important bits, but it doesn't do the same for the value. Possibly it should -- or in this case the caller should?

@ofey404
Copy link
Contributor

ofey404 commented Oct 1, 2022

I tried to play with it, adding INCREF(value) / DECREF(value) pair around some suspicious place, but the Makefile crushed before I can run and debug what's happending inside.

I wonder how could I build and inspect INCREF and DECREF behavior correctly, and I would post this question in the Core-Mentorship list.

make -s -j
# make: *** [Makefile:1306: Python/deepfreeze/deepfreeze.c] Segmentation fault (core dumped)

@kumaraditya303
Copy link
Contributor

It is indeed missing an incref before calling tp_hash slot in PyObject_SetAttr. See similar issue #92930

The following patch fixes it:

diff --git a/Objects/exceptions.c b/Objects/exceptions.c
index 3703fdcda4..3680c6068b 100644
--- a/Objects/exceptions.c
+++ b/Objects/exceptions.c
@@ -167,8 +167,14 @@ BaseException_setstate(PyObject *self, PyObject *state)
             return NULL;
         }
         while (PyDict_Next(state, &i, &d_key, &d_value)) {
-            if (PyObject_SetAttr(self, d_key, d_value) < 0)
+            Py_INCREF(d_key);
+            Py_INCREF(d_value);
+            int res = PyObject_SetAttr(self, d_key, d_value);
+            Py_DECREF(d_key);
+            Py_DECREF(d_value);
+            if (res < 0) {
                 return NULL;
+            }
         }
     }
     Py_RETURN_NONE;

@ofey404
Copy link
Contributor

ofey404 commented Oct 1, 2022

Wow, I'll check the issue. With this patch, the bug disappeared. Should I make a PR for the patch? @kumaraditya303


I guess it's fixed before, but the patch is reversed for some reason. I would look into the line history.

In the latest main, the inc/dec ref pair is still missing:

static PyObject *
BaseException_setstate(PyObject *self, PyObject *state)
{
PyObject *d_key, *d_value;
Py_ssize_t i = 0;
if (state != Py_None) {
if (!PyDict_Check(state)) {
PyErr_SetString(PyExc_TypeError, "state is not a dictionary");
return NULL;
}
while (PyDict_Next(state, &i, &d_key, &d_value)) {
if (PyObject_SetAttr(self, d_key, d_value) < 0)
return NULL;
}
}
Py_RETURN_NONE;
}

@kumaraditya303
Copy link
Contributor

Wow, I'll check the issue. With this patch, the bug disappeared. Should I make a PR for the patch?

Feel free to create a PR with tests.

@kumaraditya303 kumaraditya303 added 3.11 only security fixes 3.10 only security fixes 3.12 bugs and security fixes labels Oct 1, 2022
@ofey404
Copy link
Contributor

ofey404 commented Oct 1, 2022

I check the line history but didn't find much useful information. These lines seem to be unchanged since the cpython repo migrated from SVN in 2006? This is the commit.

git log -L '/PyObject_SetAttr(self, d_key, d_value)/,+5:Objects/exceptions.c'

  commit 4d70c3d9dded0f0fa7a73c67217a71111d05df4d
  Author: Thomas Wouters <[email protected]>
  Date:   Thu Jun 8 14:42:34 2006 +0000

....

  diff --git a/Objects/exceptions.c b/Objects/exceptions.c
  --- a/Objects/exceptions.c
  +++ b/Objects/exceptions.c
  @@ -167,0 +170,5 @@
  +            if (PyObject_SetAttr(self, d_key, d_value) < 0)
  +                return NULL;
  +        }
  +    }
  +    Py_RETURN_NONE;

@ofey404
Copy link
Contributor

ofey404 commented Oct 1, 2022

Feel free to create a PR with tests.

I would create a test triggering this behavior in Lib/test/test_exception.py ;)

@ofey404
Copy link
Contributor

ofey404 commented Oct 1, 2022

PR is created.

Lib/test/test_baseexception.py is a better place for this test to go, since:

class ExceptionClassTests(unittest.TestCase):
"""Tests for anything relating to exception objects themselves (e.g.,
inheritance hierarchy)"""

miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 2, 2022
…es before calling `tp_hash` slot (pythonGH-97700)

(cherry picked from commit d639438)

Co-authored-by: Ofey Chan <[email protected]>
gvanrossum pushed a commit that referenced this issue Oct 2, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 2, 2022
…es before calling `tp_hash` slot (pythonGH-97700)

(cherry picked from commit d639438)

Co-authored-by: Ofey Chan <[email protected]>
@gvanrossum
Copy link
Member

Thanks for your investigation and patch!

miss-islington added a commit that referenced this issue Oct 2, 2022
…ore calling `tp_hash` slot (GH-97700)

(cherry picked from commit d639438)

Co-authored-by: Ofey Chan <[email protected]>
miss-islington added a commit that referenced this issue Oct 2, 2022
…ore calling `tp_hash` slot (GH-97700)

(cherry picked from commit d639438)

Co-authored-by: Ofey Chan <[email protected]>
serhiy-storchaka pushed a commit to serhiy-storchaka/cpython that referenced this issue Oct 2, 2022
carljm added a commit to carljm/cpython that referenced this issue Oct 3, 2022
* main: (2069 commits)
  pythongh-96512: Move int_max_str_digits setting to PyConfig (python#96944)
  pythongh-94808: Coverage: Check picklablability of calliter (python#95923)
  pythongh-94808: Add test coverage for PyObject_HasAttrString (python#96627)
  pythongh-94732: Fix KeyboardInterrupt race in asyncio run_forever() (python#97765)
  Fix typos in `bltinmodule.c`. (pythonGH-97766)
  pythongh-94808: `_PyLineTable_StartsLine` was not used (pythonGH-96609)
  pythongh-97681: Remove Tools/demo/ directory (python#97682)
  Fix typo in unittest docs (python#97742)
  pythongh-97728: Argument Clinic: Fix uninitialized variable in the Py_UNICODE converter (pythonGH-97729)
  pythongh-95913: Fix PEP number in PEP 678 What's New ref label (python#97739)
  pythongh-95913: Copyedit/improve New Modules What's New section (python#97721)
  pythongh-97740: Fix bang in Sphinx C domain ref target syntax (python#97741)
  pythongh-96819: multiprocessing.resource_tracker: check if length of pipe write <= 512 (python#96890)
  pythongh-97706: multiprocessing tests: Delete unused variable `rand` (python#97707)
  pythonGH-85447: Clarify docs about awaiting future multiple times (python#97738)
  [docs] Update logging cookbook with recipe for using a logger like an output… (pythonGH-97730)
  pythongh-97607: Fix content parsing in the impl-detail reST directive (python#97652)
  pythongh-95975: Move except/*/finally ref labels to more precise locations (python#95976)
  pythongh-97591: In `Exception.__setstate__()` acquire strong references before calling `tp_hash` slot (python#97700)
  pythongh-95588: Drop the safety claim from `ast.literal_eval` docs. (python#95919)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.10 only security fixes 3.11 only security fixes 3.12 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

5 participants