Crash Due to Exception in threading._shutdown() #113148

Open
ericsnowcurrently opened this issue Dec 14, 2023 · 1 comment
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-subinterpreters type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@ericsnowcurrently
Member

ericsnowcurrently commented Dec 14, 2023

One of the first things that happens during interpreter finalization is waiting for all non-daemon threads to finish. This is implemented by calling threading._shutdown(). If an exception is raised there (e.g. in a threading "atexit" function), it gets ignored and we end up with non-daemon threads still running. In subinterpreters that results in a fatal error almost immediately after, since we make sure there's only one thread state left at that point.
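
To make the control flow concrete, here is a small pure-Python model of the failure mode. It is only a sketch, not the code in Lib/threading.py: `naive_shutdown`, `callbacks`, `workers` and `failing_callback` are hypothetical names standing in for the interpreter's shutdown bookkeeping. The point it illustrates is that an exception raised while running the callbacks skips the join step entirely, so the worker thread outlives the "shutdown".

```python
import threading

# Hypothetical stand-ins for the interpreter's shutdown bookkeeping.
callbacks = []   # functions run first, like threading "atexit" functions
workers = []     # non-daemon threads that shutdown is supposed to wait for
stop = threading.Event()


def naive_shutdown():
    # Run the callbacks, then join the threads.
    for cb in callbacks:
        cb()              # an exception here skips everything below
    for t in workers:
        t.join()


def failing_callback():
    raise RuntimeError("boom")


worker = threading.Thread(target=stop.wait)
workers.append(worker)
callbacks.append(failing_callback)
worker.start()

try:
    naive_shutdown()
except RuntimeError:
    pass                  # the caller ignores the error, as described above

still_running = worker.is_alive()   # the join never happened
stop.set()                # clean up so this sketch itself can exit
worker.join()
```

In this model `still_running` ends up true, which corresponds to the lingering thread that Py_EndInterpreter later complains about.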

Reproducer:
$ cat > /tmp/crash-interp-lingering-thread.py << EOF
import threading
from test.support import interpreters

interp = interpreters.create()
interp.exec_sync(f"""if True:
    import threading
    import time

    done = False

    def notify_fini():
        global done
        done = True
        raise Exception  # <-------
        t.join()
    threading._register_atexit(notify_fini)

    def task():
        while not done:
            time.sleep(0.1)
    t = threading.Thread(target=task)
    t.start()
""")
interp.close()


EOF
$ ./python /tmp/crash-interp-lingering-thread.py
Exception ignored on threading shutdown:
Traceback (most recent call last):
  File "./cpython/Lib/threading.py", line 1623, in _shutdown
    atexit_call()
  File "<string>", line 10, in notify_fini
Exception:
Fatal Python error: Py_EndInterpreter: not the last thread
Python runtime state: initialized

Current thread 0x00007fc8bf76a100 (most recent call first):
  <no Python frame>

Thread 0x00007fc8bd3d5700 (most recent call first):
  File "<string>", line 16 in task
  File "./cpython/Lib/threading.py", line 1020 in run
  File "./cpython/Lib/threading.py", line 1083 in _bootstrap_inner
  File "./cpython/Lib/threading.py", line 1040 in _bootstrap
Aborted (core dumped)

The solution? Make sure threading._shutdown() finishes its fundamental job: stop all non-daemon threads and wait for them to finish. At the very least, this means ignoring exceptions from functions registered with threading._register_atexit().
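
One possible shape for that, sketched in plain Python rather than as a patch to threading._shutdown(): wrap each callback individually so a failure is recorded but never prevents the join loop from running. The names `robust_shutdown`, `notify` and `worker` are made up for illustration, and collecting the exceptions in a list is an assumption of this sketch (the real code could just as well report them as unraisable).

```python
import threading


def robust_shutdown(callbacks, workers):
    """Run every callback, remember failures, and always join the workers."""
    errors = []
    for cb in callbacks:
        try:
            cb()
        except Exception as exc:
            # Record the failure and keep going; shutdown must still finish.
            errors.append(exc)
    for t in workers:
        t.join()
    return errors


stop = threading.Event()


def notify():
    stop.set()                    # tell the worker to finish...
    raise RuntimeError("boom")    # ...then fail, like notify_fini() above


worker = threading.Thread(target=stop.wait)
worker.start()

errors = robust_shutdown([notify], [worker])
```

Here the worker is joined even though `notify` raised, and the exception is still available in `errors` for reporting.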

Linked PRs

@ericsnowcurrently ericsnowcurrently added type-bug An unexpected behavior, bug, or error interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-subinterpreters 3.12 bugs and security fixes 3.13 bugs and security fixes labels Dec 14, 2023
@ericsnowcurrently ericsnowcurrently added type-crash A hard crash of the interpreter, possibly with a core dump and removed type-bug An unexpected behavior, bug, or error labels Feb 27, 2024
@devdanzin
Contributor

On code generated by fuzzing, I get the same error as above 99.9% of the time, but there's a rare segfault coming from the same code (in a GILfull build). Here's the backtrace:

Thread 49 "Thread-43 (init" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffb6c006c0 (LWP 1768452)]
tstate_delete_common (tstate=tstate@entry=0xdddddddddddddddd, release_gil=release_gil@entry=0) at Python/pystate.c:1769
1769        assert(tstate->_status.cleared && !tstate->_status.finalized);
(gdb) bt
#0  tstate_delete_common (tstate=tstate@entry=0xdddddddddddddddd, release_gil=release_gil@entry=0) at Python/pystate.c:1769
#1  0x0000555555827626 in zapthreads (interp=interp@entry=0x7fffcc001b80) at Python/pystate.c:1840
#2  0x0000555555827de7 in PyInterpreterState_Delete (interp=interp@entry=0x7fffcc001b80) at Python/pystate.c:978
#3  0x0000555555822b03 in finalize_interp_delete (interp=0x7fffcc001b80) at Python/pylifecycle.c:1905
#4  0x00005555558248e6 in Py_EndInterpreter (tstate=tstate@entry=0x7fffd01b26c0) at Python/pylifecycle.c:2414
#5  0x00005555558281e2 in _PyInterpreterState_IDDecref (interp=<optimized out>) at Python/pystate.c:1287
#6  0x00007ffff6f308e7 in interp_decref (self=self@entry=0x7ffff6419130, args=args@entry=0x7ffff64d8370, kwds=kwds@entry=0x0) at ./Modules/_interpretersmodule.c:1365
#7  0x00005555556bc106 in cfunction_call (func=func@entry=0x7ffff6451190, args=args@entry=0x7ffff64d8370, kwargs=kwargs@entry=0x0) at Objects/methodobject.c:551
#8  0x0000555555666499 in _PyObject_MakeTpCall (tstate=tstate@entry=0x555556cfe740, callable=callable@entry=0x7ffff6451190, args=args@entry=0x7ffff6dfb2c0, 
    nargs=<optimized out>, keywords=keywords@entry=0x0) at Objects/call.c:242
#9  0x00005555556666d9 in _PyObject_VectorcallTstate (tstate=0x555556cfe740, callable=callable@entry=0x7ffff6451190, args=args@entry=0x7ffff6dfb2c0, 
    nargsf=<optimized out>, kwnames=kwnames@entry=0x0) at ./Include/internal/pycore_call.h:165
#10 0x000055555566672f in PyObject_Vectorcall (callable=callable@entry=0x7ffff6451190, args=args@entry=0x7ffff6dfb2c0, nargsf=<optimized out>, 
    kwnames=kwnames@entry=0x0) at Objects/call.c:327
#11 0x00005555557a4239 in _PyEval_EvalFrameDefault (tstate=0x555556cfe740, frame=0x7ffff6dfb248, throwflag=0) at Python/generated_cases.c.h:3573
#12 0x00005555557b99d7 in _PyEval_EvalFrame (tstate=tstate@entry=0x555556cfe740, frame=frame@entry=0x7ffff6dfb1b0, throwflag=throwflag@entry=0)
    at ./Include/internal/pycore_ceval.h:116
#13 0x00005555557b9bdb in _PyEval_Vector (tstate=0x555556cfe740, func=0x7ffff64559d0, locals=locals@entry=0x0, args=0x7fffb6bff928, argcount=1, kwnames=0x0)
    at Python/ceval.c:1746
#14 0x00005555556662ea in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:413
#15 0x000055555566967f in _PyObject_VectorcallTstate (tstate=tstate@entry=0x555556cfe740, callable=callable@entry=0x7ffff64559d0, args=args@entry=0x7fffb6bff928, 
    nargsf=nargsf@entry=1, kwnames=kwnames@entry=0x0) at ./Include/internal/pycore_call.h:167
#16 0x00005555556698e4 in method_vectorcall (method=<optimized out>, args=0x555555b4dee0 <_PyRuntime+104992>, nargsf=0, kwnames=0x0) at Objects/classobject.c:72
#17 0x0000555555667f3b in _PyVectorcall_Call (tstate=tstate@entry=0x555556cfe740, func=0x555555669707 <method_vectorcall>, callable=callable@entry=0x7ffff64d0d10, 
    tuple=tuple@entry=0x555555b4dec8 <_PyRuntime+104968>, kwargs=kwargs@entry=0x7ffff64d1730) at Objects/call.c:273
#18 0x000055555566825b in _PyObject_Call (tstate=0x555556cfe740, callable=callable@entry=0x7ffff64d0d10, args=args@entry=0x555555b4dec8 <_PyRuntime+104968>, 
    kwargs=kwargs@entry=0x7ffff64d1730) at Objects/call.c:348
#19 0x0000555555668297 in PyObject_Call (callable=callable@entry=0x7ffff64d0d10, args=args@entry=0x555555b4dec8 <_PyRuntime+104968>, 
    kwargs=kwargs@entry=0x7ffff64d1730) at Objects/call.c:373
#20 0x00005555557a056d in _PyEval_EvalFrameDefault (tstate=0x555556cfe740, frame=0x7ffff6dfb128, throwflag=0) at Python/generated_cases.c.h:2277
#21 0x00005555557b99d7 in _PyEval_EvalFrame (tstate=tstate@entry=0x555556cfe740, frame=frame@entry=0x7ffff6dfb020, throwflag=throwflag@entry=0)
    at ./Include/internal/pycore_ceval.h:116
#22 0x00005555557b9bdb in _PyEval_Vector (tstate=0x555556cfe740, func=0x7ffff7682450, locals=locals@entry=0x0, args=0x7fffb6bffd48, argcount=1, kwnames=0x0)
    at Python/ceval.c:1746
#23 0x00005555556662ea in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:413
#24 0x000055555566967f in _PyObject_VectorcallTstate (tstate=tstate@entry=0x555556cfe740, callable=callable@entry=0x7ffff7682450, args=args@entry=0x7fffb6bffd48, 
    nargsf=nargsf@entry=1, kwnames=kwnames@entry=0x0) at ./Include/internal/pycore_call.h:167
#25 0x00005555556698e4 in method_vectorcall (method=<optimized out>, args=0x555555b4dee0 <_PyRuntime+104992>, nargsf=0, kwnames=0x0) at Objects/classobject.c:72
#26 0x0000555555667f3b in _PyVectorcall_Call (tstate=tstate@entry=0x555556cfe740, func=0x555555669707 <method_vectorcall>, callable=callable@entry=0x7ffff6472210, 
    tuple=tuple@entry=0x555555b4dec8 <_PyRuntime+104968>, kwargs=kwargs@entry=0x0) at Objects/call.c:273
#27 0x000055555566825b in _PyObject_Call (tstate=0x555556cfe740, callable=0x7ffff6472210, args=0x555555b4dec8 <_PyRuntime+104968>, kwargs=0x0) at Objects/call.c:348
#28 0x0000555555668297 in PyObject_Call (callable=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at Objects/call.c:373
#29 0x00005555558b0832 in thread_run (boot_raw=boot_raw@entry=0x555556c99640) at ./Modules/_threadmodule.c:354
#30 0x000055555584077a in pythread_wrapper (arg=<optimized out>) at Python/thread_pthread.h:242
#31 0x00007ffff7c9caa4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#32 0x00007ffff7d29c3c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

Do you think it's the same issue, or should it be filed as a new one?

Since the segfault is very rare, it's hard to reduce the test case :(

@picnixz picnixz added the 3.14 new features, bugs and security fixes label Feb 23, 2025
sergey-miryanov added a commit to sergey-miryanov/cpython that referenced this issue Feb 26, 2025
sergey-miryanov added a commit to sergey-miryanov/cpython that referenced this issue Feb 26, 2025
3 participants