potential undefined behavior with subprocess using vfork() on Linux? #91401
Using vfork in bpo-35823 is VERY tricky. Please comment out vfork() usage for now. Yes, we can (should) use vfork(), but we have to rewrite the child code. https://bugzilla.kernel.org/show_bug.cgi?id=215813 I would say it's URGENT. |
Rewriting it in a way that guarantees no stack (or heap) usage, because the stack is shared between child and parent. It seems there is no cross-platform way to do that. Happily, we can use code like the one I linked in the first message, or stick to posix_spawn, which is cross-platform.
cpython/Modules/_posixsubprocess.c Line 717 in 4a08c4c
---------------- Py_NO_INLINE static void ------------- So calling child_exec is GUARANTEED to push to the stack. This is the bug. In fact everything works, but it is too fragile and may break at any moment. Again, please comment out the call to vfork() for now and re-implement the child part later.
Solution: https://github.com/bminor/glibc/blob/master/sysdeps/unix/sysv/linux/spawni.c#L309 In short: do not use vfork(). Use clone(CLONE_VM | CLONE_VFORK) and do something with the stack.
Can you provide a reproducible way to demonstrate evidence of a problem in CPython's use of the Linux libc vfork() implementation? A test case that causes a CPython parent or child process on Linux built with HAVE_VFORK to fail to function properly would help prioritize this, because in practice nobody has reported problems in 3.10. (We've deployed the subprocess vfork code into thousands of production Python programs at work in the past year without issues being reported, though we have a constrained environment that uses only a couple of libc versions and a limited set of kernels on a few very common architectures.)

General thinking (possibly dated and incorrect, and against what https://man7.org/linux/man-pages/man2/vfork.2.html claims with its "or calls any other function" text): Pushing additional data onto the stack in the child process _should_ not be a problem. That by definition lands in previously unused, pre-allocated stack space. If that page faults, it could map a new page into the process state shared by both the paused parent and the running child. But this page mapping should be fine: the child exec that resumes the parent means the parent is the only one who sees it. When the parent process resumes, sure, that data will be in that memory on the unallocated portion of the stack, but critically the *stack pointer* in the parent process (a register) never changes. As far as I understand things, registers are not shared between vfork()ed processes. So the parent only sees some temporary data written to the unused region of the stack by the child process, which has since been replaced by exec(). No big deal.

**Untrue wishful thinking**: If a new stack were needed on a given platform for use in the vfork()ed child, I'd like it to be the job of libc to take care of that for us. glibc sources do no such thing (every vfork-supporting architecture has vfork.S code that appears to make the syscall and jump back to the caller without stack manipulation).
Immediate action item: Add a way for people to disable vfork at runtime by setting a flag in the subprocess module, just in case. This can be backported to 3.10 - It'd provide an escape hatch for anyone without a need to rebuild Python to disable use of vfork() without resorting to LD_PRELOAD hacks. This is not an urgent issue unless actual practical problems are being observed in the field. |
Our current assumptions around the use of vfork() are very much glibc specific. Another useful reference for reasoning, comments, and history is https://github.com/golang/go/blob/master/src/syscall/exec_linux.go#L146 |
In short: both this bug report and [1] are invalid. The reason why doing syscall(SYS_vfork) is illegal is explained by Florian Weimer in [2]:
This is off-topic here because CPython calls vfork() via libc, but I'll still expand on Florian's comment. Suppose one wants to write a my_vfork() wrapper over the vfork syscall. When the vfork syscall returns the first time, my_vfork() has to return to its caller. This necessarily involves knowing the return address. On some architectures this return address is stored on the stack by the caller (e.g. x86). The problem is that any data in the my_vfork() stack frame can be overwritten by its caller once it returns the first time. Then, when the vfork syscall returns the second time, my_vfork() could be unable to return to its caller because the data it fetches from its (now invalid) stack frame is garbage. This is precisely what happens when one implements my_vfork() as syscall(SYS_vfork). To avoid this, the most common strategy is to store the return address in a register that the OS ABI guarantees is preserved across the syscall. For example, the x86-64 musl implementation [3] stores the return address in rdx (which is preserved across the syscall) and then restores it after the syscall (on both the first and the second return of the syscall).

Now back to CPython. The real problem with stack sharing between the child and the parent could be due to compiler bugs, e.g. if a variable stored on the stack is modified in the child branch of "if (vfork())", but the compiler assumes that some other variable sharing the stack slot with the first one is *not* modified in the parent branch (i.e. after vfork() returns the second time). But all production compilers handle vfork() (and setjmp(), which has a similar issue) in a special way to avoid this, and GCC has __attribute__((returns_twice)) that a programmer could use for custom functions behaving in this way (my_vfork() would have to use this attribute).

Regarding a separate stack for the child and clone(CLONE_VM|CLONE_VFORK), I considered this in bpo-35823, but it has its own problems. The most important one is that CPython would be in the business of choosing the correct size for the child's stack, but CPython is *not* in control of all the code executing in the child because it calls into libc. In practice, people use various LD_PRELOAD-based software that wraps libc functions (e.g. the Scratchbox 2 build environment for Sailfish OS is an LD_PRELOAD-based sandbox), so even common-sense assumptions like "execve() in libc can't use a lot of stack!" might turn out to be wrong. CPython *could* work around that by using direct syscalls for everything in the child, or by using some "large" size that "should be enough for everybody", but I wouldn't want to see that unless we have a real problem with vfork(). Note that vfork()-based posix_spawn() implementations in C libraries do *not* have this problem because they fully control all code in the child (e.g. they would use a non-interposable execve() symbol or a direct syscall).
I don't think any action is needed at all, and I think this issue should be closed.
Could you clarify what glibc-specific assumptions you mean? In bpo-35823 I tried to use as few assumptions as possible. [1] https://bugzilla.kernel.org/show_bug.cgi?id=215813 |
Thanks! I agree with you that this is probably not an actual problem on Linux. _I did look at the various glibc architecture vfork.s implementations: Cute tricks used on some where they need to avoid a stack modifying traditional return from vfork()._ As for glibc specifics, I'm mostly thinking of the calls we do in the child. According to the "Standard Description (POSIX.1)" calls to anything other than Some of the calls we do from our child_exec() code... many are likely "just" syscall shims and thus fine - but that is technically up to libc. A few others are Py functions that go elsewhere in CPython and while they may be fine for practical reasons today with dangerous bits on conditional branches that technically should not be possible to hit given the state by the time we're at this point in _posixsubprocess, pose a future risk - anyone touching implementations of those is likely unaware of vfork'ed child limitations that must be met. For example if one of the potential code paths that trigger an indirect Py_FatalError() is hit... that fatal exit code is definitely not post-vfork-child safe. The pre-exec child dying via that could screw up the vfork parent process's state. |
If we're talking about the kernel side of things, sure, we rely on Linux being "sane" here, though I suppose on *BSDs the situation is similar.
Yes, but I wouldn't say that "being just syscall shims" is specific to glibc. It's just a "natural" property that just about any libc is likely to possess. (Yeah, I know, those are vague words, but in my experience "glibc-specific" is usually applied to some functionality or bug present in glibc and absent in other libcs, and I don't think we rely on anything like that.) Of course, there are also LD_PRELOAD things that could be called instead of libc, but the good news here is that we don't create new constraints for them (CPython is not the only software that uses vfork()), and they're on their own otherwise.
We already have an async-signal-safety requirement for all such code because of fork(). The requirements of vfork() are a bit stricter, but at least the set of functions we have to watch for dangerous changes is the same. And I suspect that most practical violations of vfork()-safety also violate async-signal-safety.
Yeah, and it can break the fork parent too, at least because it uses exit() (not _exit()), so stdio buffers will be flushed twice, in the child and in the parent. |
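To make the double flush concrete, here is a minimal Python sketch of the plain fork() case. It is an analogy, not CPython's child code: the function name `file_contents_after_fork` is made up for illustration, a Python file object stands in for a C stdio buffer, and an explicit `flush()` in the child stands in for what exit() would do before terminating. It assumes a POSIX system where `os.fork()` is available.

```python
import os
import tempfile


def file_contents_after_fork(child_flushes: bool) -> str:
    """Write buffered data, fork, and report what ends up in the file.

    child_flushes=True mimics a child that leaves via exit(), which
    flushes userspace buffers; False mimics _exit(), which does not.
    (Hypothetical helper, for illustration only.)
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        f = open(path, "w")
        f.write("hello\n")  # stays in the userspace buffer for now
        pid = os.fork()     # the child gets its own copy of that buffer
        if pid == 0:
            try:
                if child_flushes:
                    f.flush()  # what exit() would do to stdio buffers
            finally:
                os._exit(0)    # leave without running any other cleanup
        os.waitpid(pid, 0)
        f.close()  # the parent flushes its own copy of the buffer
        with open(path) as check:
            return check.read()
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(repr(file_contents_after_fork(child_flushes=False)))
    print(repr(file_contents_after_fork(child_flushes=True)))
```

The child and parent share the file offset, so when both flush, the same buffered bytes land in the file one after the other.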
So, finally:
What specifically do you propose to fix? There is no problem with GIL if the child dies because the GIL is locked and unlocked only by the parent and the child never touches it. Similarly, only Py_* calls known to be safe are used. As for "pointers to strings", it's not clear to me what you mean, but if you mean allocations, they are already done before (v)fork(), since the child code is required to be async-signal-safe even if plain fork() is used. |
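The "prepare everything before (v)fork(), do almost nothing in the child" shape can be sketched in Python roughly as follows. This is only an analogy under stated assumptions: `spawn_and_wait` is a hypothetical helper, it uses `os.fork()` rather than vfork(), and Python-level code is of course not async-signal-safe the way the C child code in _posixsubprocess has to be. It assumes a POSIX system and Python 3.9+ for `os.waitstatus_to_exitcode()`.

```python
import os
import sys


def spawn_and_wait(argv):
    """Fork, exec argv in the child, and return the child's exit code.

    Hypothetical helper: every object the child needs is built before
    fork(); the child only execs or exits.
    """
    # Build the argument list up front, so the child allocates nothing new.
    path = argv[0]
    args = list(argv)

    pid = os.fork()
    if pid == 0:
        try:
            os.execv(path, args)  # on success this never returns
        finally:
            # exec failed: leave at once and skip all normal cleanup,
            # so buffers and handlers inherited from the parent are untouched.
            os._exit(127)

    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    print(spawn_and_wait([sys.executable, "-c", "import sys; sys.exit(3)"]))
```

The exit code 127 for a failed exec is a shell convention borrowed here for illustration, not something taken from this thread.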
I have studied assembler output of _posixsubprocess.o compilation. Yes, everything seems safe. So, I'm closing the bug. |
Just in case there is ever an issue with _posixsubprocess's use of vfork() due to the complexity of using it properly and potential directions that Linux platforms where it defaults to on could take, this adds a failsafe so that users can disable its use entirely by setting a global flag. No known reason to disable it exists. But it'd be a shame to encounter one and not be able to use CPython without patching and rebuilding it. See the linked issue for some discussion on reasoning.
I'm reopening this to track adding a failsafe so people have a big-red-button style way to disable Python's use of vfork without recompiling, just in case of future problems. That arguably should've been there from the start in 3.10. Though we still have no known need to use it, by the time you do need it, it has to already exist. I'll do a manual, more conservative backport of the disable_vfork_reason button to 3.10 for 3.10.5.
I wish posix_spawn() could be used in more cases on Linux: see #86904. |
Just in case there is ever an issue with _posixsubprocess's use of vfork() due to the complexity of using it properly and potential directions that Linux platforms where it defaults to on could take, this adds a failsafe so that users can disable its use entirely by setting a global flag. No known reason to disable it exists. But it'd be a shame to encounter one and not be able to use CPython without patching and rebuilding it. See the linked issue for some discussion on reasoning. Also documents the existing way to disable posix_spawn.
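For anyone who wants to see what using the failsafe looks like, a minimal sketch follows. It assumes the flag is the private module attribute `subprocess._USE_VFORK`, with `subprocess._USE_POSIX_SPAWN` as the existing posix_spawn knob; both are underscore-prefixed and may change or stop having any effect in other Python versions. The context manager `vfork_disabled` is a hypothetical convenience wrapper, not part of the stdlib.

```python
import contextlib
import subprocess
import sys


@contextlib.contextmanager
def vfork_disabled():
    """Temporarily set the private vfork/posix_spawn flags to False.

    Hypothetical helper. The flags are private module globals; on
    versions that ignore them, setting them is simply a no-op.
    """
    names = ("_USE_VFORK", "_USE_POSIX_SPAWN")
    missing = object()
    saved = {name: getattr(subprocess, name, missing) for name in names}
    for name in names:
        setattr(subprocess, name, False)
    try:
        yield
    finally:
        # Put the module back exactly as we found it.
        for name, value in saved.items():
            if value is missing:
                delattr(subprocess, name)
            else:
                setattr(subprocess, name, value)


def run_hello():
    """Run a trivial child process with the flags disabled."""
    with vfork_disabled():
        result = subprocess.run(
            [sys.executable, "-c", "print('hello')"],
            capture_output=True, text=True, check=True,
        )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_hello())
```

In a real application the simpler form is to set the flag once at startup and leave it; the save-and-restore wrapper is only there so the sketch has no lasting side effects.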
On the contrary, I wish posix_spawn() were completely disabled on Linux. Using different backends depending on Popen() arguments only adds the possibility of hard-to-understand behavioral differences. And posix_spawn() in glibc is still buggy (#91307 (comment)).
@gpshead: It looks like you added the big-red-button; can this be closed? |
Hah, joy, but I'm not surprised; that at least proves that failsafe knobs to let modern things be disabled have their uses. The 3.10 PR being merged should close this issue out.
…1932) This does not alter the `_posixsubprocess.fork_exec()` private API to avoid issues for anyone relying on that (bad idea) or for anyone whose `subprocess.py` and `_posixsubprocess.so` upgrades may not become visible to existing Python 3.10 processes at the same time. Backports the concept of cd5726f. Provides a fail-safe way to disable vfork for #91401. I didn't backport the documentation as I don't actually expect this to be used and `.. versionadded: 3.10.5` always looks weird in docs. It's being done more to have a fail-safe in place for people just in case.
This flag was added as an escape hatch in gh-91401 and backported to Python 3.10. The flag broke at some point between its addition and now. As there are currently no publicly known environments that require it, remove it rather than work on fixing it. This leaves the flag in the subprocess module so as not to break code that may have used or checked the flag itself. discussion: https://discuss.python.org/t/subprocess-use-vfork-escape-hatch-broken-fix-or-remove/56915/2
Hi, sorry to dig up this old issue. We are trying to track down a kernel crash in our Python application and have found that it always comes from sys_vfork. Python: 3.11.2 I have not tried to unset
This issue is closed. I suggest you open a new issue.