-
-
Notifications
You must be signed in to change notification settings - Fork 30.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-87135: Hang non-main threads that attempt to acquire the GIL during finalization #105805
Conversation
… during finalization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, as I wrote in issue #87135, I dislike switching to this behavior by default :-(
I would prefer to keep the current behavior by default, but give the ability to hang for people impacted by pthread_exit() issues.
My problem with that and reason why we need this PR is that the answer to "who is going to use that ability" is unclear. End users of Python applications don't have a way to debug random thread crashes and know that they need it. Extension module authors do not know that they need it. A few big application owners will simply blindly turn it on for everything because they happen to have people aware of the consequences of not going it. But there is no good point to turn such a conditional of "hang threads rather than allow them to randomly crash or refuse to run C++ finalizers despite releasing the associated memory meaning other threads can crash because multiple other things are using the previously allocated and then freed memory". Those concurrency scribbling messes are far worse things to attempt to debug than a simple hung thread, hung within a clear C symbol saying "hi there, i'm going to hang this thread". So I'm all for undoing our mistaken thread exiting API and use of it in favor of this least bad option of hanging threads that would otherwise have failed to cleanup and left ambiguious process state behind as we have had in the past. I do still think we want APIs to allow error handling upon re-entering an interpreter or acquiring the GIL and actually failing instead of this... but we don't have those. and most of the world's Python C API using code won't be using such new APIs for a looong time while the random crashing problem would persist. So this remains the least bad option that makes a viable improvement. |
@colesbury and/or @mpage, and @ericsnowcurrently - can you take a look at this PR? (FWIW, Eric's already gone over it with my on screen at the core dev sprint here) |
Blocking daemon threads after the VM has been finalized seems less error prone than forcing the threads to exit. I believe this is also how the JVM handles the same situation, so there's some precedence for this approach. While we're here, I think this might be a good time to also fix the issues with threads referencing their |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if any more, i'll just skip the test on wasi.
Linking to the filed pre-existing known issue.
That goes beyond the scope of what I want to accomplish with this PR - which I believe might mostly be a candidate for backporting as a bug fix (I'll see what the RMs think). I think it is a good followup issue though - #124878 for starters but that might turn into a couple of different things to chase down as well. |
@Yhg1s what's your opinion on the viability of backporting this as a bugfix to 3.13 and 3.12 patch releases? People who actually encounter this in the form of strange crashes from threads during shutdown have been wanting it for years. I wouldn't call it common, just a thorn in some classes of applications side. (In terms of impact I'd elide marking the deprecated C API as deprecated in the backports - otherwise I expect most of this to apply relatively easily given that it was originally authored on a |
Sorry, @jbms and @gpshead, I could not cleanly backport this to
|
Sorry, @jbms and @gpshead, I could not cleanly backport this to
|
I expected the backport automation to fail and need edits regardless even if it hadn't. I'll be taking care of them. :) |
This splits off the change from #28525 to hang threads that attempt to acquire the GIL during interpreter shutdown, but does not introduce any new public APIs.
📚 Documentation preview 📚: https://cpython-previews--105805.org.readthedocs.build/