gh-126914: Store the Preallocated Thread State's Pointer in a PyInterpreterState Field #126989
I'll get to reviewing in an hour or so. At a glance:
Some comments. I'll yield to you on whether or not it's worth trying to do this locklessly :)
Let's circle back to this later. My initial thought is that backporting would make sense. What do you mean about making debuggers unhappy?
Perhaps. I don't anticipate there being enough simultaneous attempts that the overhead of any contention would matter.
I wanted to start off with this. I'm not convinced that an actual freelist of thread states is worth it. Conceptually it made sense as a solution for the pre-allocated thread state. I debated between a dedicated field for just that one and the freelist approach. Clearly I went with the latter, but now I'm thinking the illusion of a freelist isn't the best thing. I may switch it back.
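For readers following the "dedicated field" option discussed above, here is a minimal sketch of the idea, not the PR's actual code. All names (`thread_state`, `interp_state`, `alloc_thread_state`, `free_thread_state`) are hypothetical stand-ins rather than CPython's real structs or functions. It assumes one statically preallocated thread state per interpreter, whose pointer sits in an atomic field and is claimed with a single atomic exchange, with a heap allocation as the fallback.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT CPython's real structs. */
typedef struct thread_state {
    int id;
} thread_state;

typedef struct interp_state {
    thread_state initial;                  /* preallocated storage */
    _Atomic(thread_state *) preallocated;  /* NULL while the slot is in use */
} interp_state;

static void interp_init(interp_state *interp)
{
    /* The slot starts out available. */
    atomic_init(&interp->preallocated, &interp->initial);
}

static thread_state *alloc_thread_state(interp_state *interp)
{
    /* Take the preallocated slot if it is free; exactly one caller
       can observe the non-NULL pointer. */
    thread_state *ts = atomic_exchange(&interp->preallocated, NULL);
    if (ts == NULL) {
        /* Slot already taken: fall back to the heap. */
        ts = malloc(sizeof(*ts));
    }
    return ts;
}

static void free_thread_state(interp_state *interp, thread_state *ts)
{
    assert(ts != NULL);
    if (ts == &interp->initial) {
        /* Hand the preallocated slot back instead of freeing it. */
        atomic_store(&interp->preallocated, ts);
    }
    else {
        free(ts);
    }
}
```

In this sketch no lock is needed to hand out the slot, because the exchange both reads and clears the field in one step. Whether that matters in practice depends on how contended thread-state creation is, which is the open question in this thread.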
Some low-level debuggers and profilers (something like PyStack, py-spy, and probably some others that I don't know about) might get a nasty surprise if we change where the main thread is stored. I don't think it's that important, as they're already aware of the maintenance burden that comes with relying on private implementation details, but it's not too convincing for a backport. (It's probably easier for downstream if we backport GH-126915, and then just put this on main for going forward, but I'm definitely biased 😄)
I'm not too worried about the subinterpreter case, but rather about normal multithreading under one interpreter. This will add overhead to calling something like
Ok, it looks like you've removed the freelist now. (I would change the title for the commit message.)
This looks pretty good, and I think this is small enough to not worry about giving debuggers too much of a headache. I'll do one final pass a little later today once you've figured out what you want to do about resets and then let's merge :)
It is still found at
I'd be surprised if lock/atomic contention here were more than insignificant relative to any of the other operations at play. Also keep in mind that
I noticed! I like this fix better than mine with the extra field, it's clever. In fact, we could reuse the
Yeah, contention on those functions is probably not great right now anyways because of the
LGTM
I'll add the backport to 3.13 label once gh-126995 is merged. Otherwise sorting out the ABI data file becomes a much bigger pain.
In hindsight, we probably should have run buildbots before merging. I'm not sure that's related though.
I'm fairly sure it's unrelated.
…yInterpreterState Field (pythongh-126989) This approach eliminates the originally reported race. It also gets rid of the deadlock reported in pythongh-96071, so we can remove the workaround added then.
…PyInterpreterState Field (gh-127114) This approach eliminates the originally reported race. It also gets rid of the deadlock reported in gh-96071, so we can remove the workaround added then. This is mostly a cherry-pick of 1c0a104 (AKA gh-126989). The difference is we add PyInterpreterState.threads_preallocated at the end of PyInterpreterState, instead of adding PyInterpreterState.threads.preallocated. That avoids ABI disruption.
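The ABI point in the backport message can be illustrated with a small layout check. This is a sketch with hypothetical, simplified structs (`interp_v1`, `interp_nested`, `interp_appended`), not the real `PyInterpreterState`. It assumes the concern is that growing a nested struct shifts the offsets of every member after it, while appending a member at the end of the outer struct leaves existing offsets unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified layouts; NOT the real PyInterpreterState. */

/* Original layout, as existing binaries would expect it. */
struct interp_v1 {
    int id;
    struct { void *head; void *main; } threads;
    long other;
};

/* New member added inside the nested struct: `threads` grows,
   so `other` moves to a different offset. */
struct interp_nested {
    int id;
    struct { void *head; void *main; void *preallocated; } threads;
    long other;
};

/* New member appended at the end of the outer struct:
   every pre-existing member keeps its offset. */
struct interp_appended {
    int id;
    struct { void *head; void *main; } threads;
    long other;
    void *threads_preallocated;
};

/* Compile-time check that appending preserves existing offsets. */
static_assert(offsetof(struct interp_appended, other)
              == offsetof(struct interp_v1, other),
              "appending must not move existing members");

/* Returns 1 if the nested change moved `other`. */
static int nested_layout_shifts_fields(void)
{
    return offsetof(struct interp_nested, other)
           != offsetof(struct interp_v1, other);
}

/* Returns 1 if the appended layout kept `threads` and `other` in place. */
static int appended_layout_preserves_fields(void)
{
    return offsetof(struct interp_appended, threads)
               == offsetof(struct interp_v1, threads)
           && offsetof(struct interp_appended, other)
               == offsetof(struct interp_v1, other);
}
```

Code compiled against the old layout that reads `other` by offset would still find it in the appended variant, but not in the nested one.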