gh-97514: Authenticate the forkserver control socket. #99309
Conversation
This adds authentication to the forkserver control socket. In the past, only filesystem permissions protected this socket from code injection into the forkserver process by limiting access to the same UID; that protection didn't exist when Linux abstract namespace sockets were used (see the issue), meaning that any process in the same system network namespace could inject code. This reuses the HMAC-based shared key auth already used on multiprocessing sockets for other purposes. Doing this is useful so that filesystem permissions are not relied upon and trust isn't implied by default between all processes running as the same UID.
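For readers unfamiliar with the distinction, here is a minimal, hedged sketch (Linux-only, names made up for the example) of why abstract namespace sockets are not covered by filesystem permissions: a name beginning with a NUL byte never creates a filesystem entry, so there are no mode bits to check, and any process in the same network namespace that knows the name may connect.

```python
import socket

# Filesystem-backed AF_UNIX socket: connecting is gated by the mode bits
# on the socket file (and its parent directories).
fs_listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
fs_listener.bind("/tmp/forkserver-demo.sock")  # hypothetical path
fs_listener.listen()

# Abstract-namespace AF_UNIX socket (Linux only): the leading NUL byte means
# no filesystem entry exists, so no permission bits apply; only network
# namespace membership limits who can connect.
abstract_listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
abstract_listener.bind("\0forkserver-demo")  # hypothetical abstract name
abstract_listener.listen()
```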
From the buildbots... tests leak some file descriptors. Not too surprising given the bit of code the test pokes into; I'll see what can be done to manage those.
I can't add new test cases to test_multiprocessing_forkserver itself; I had to put them within an existing _test_multiprocessing test class. I don't know why, but refleak checks are fragile and that test suite is... ridiculously complicated with all that it does.
I'm not sure _why_ the hang happened: the forkserver process wasn't exiting when the alive_w fd was closed in the parent during tearDownModule(); instead it remained in its selector() loop. Regardless, the part of the test this removes fixes it, and the hang only happened on macOS.
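A hedged aside for context on the expected shutdown path (an illustrative, POSIX-only sketch of an "alive pipe" selector loop, not the actual forkserver code): once every writer of the pipe has closed its end, the read end becomes readable at EOF and the loop should exit.

```python
import os
import selectors

def serve(alive_r: int) -> None:
    # Watch the read end of the "alive" pipe; once every writer has closed
    # it, the fd becomes readable and read() returns b"" (EOF), which is
    # the signal to shut down.
    with selectors.DefaultSelector() as selector:
        selector.register(alive_r, selectors.EVENT_READ)
        while True:
            for key, _events in selector.select():
                if key.fd == alive_r and os.read(alive_r, 1) == b"":
                    return  # all writers gone: exit the loop

alive_r, alive_w = os.pipe()
if os.fork() == 0:        # child plays the role of the long-lived server
    os.close(alive_w)
    serve(alive_r)
    os._exit(0)
os.close(alive_r)
os.close(alive_w)         # closing the last write end lets the child exit
os.wait()
```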
This mostly looks good. The tests helped me understand a bit better. Sorry if some of my comments demonstrate ignorance about the multiprocessing implementation. Thanks for working on this.
Looks good! Thanks for addressing my comments.
LGTM
gh-97514: Authenticate the forkserver control socket. (GH-99309)

This adds authentication to the forkserver control socket. In the past only filesystem permissions protected this socket from code injection into the forkserver process by limiting access to the same UID, which didn't exist when Linux abstract namespace sockets were used (see issue), meaning that any process in the same system network namespace could inject code. We've since stopped using abstract namespace sockets by default, but protecting our control sockets regardless of type is a good idea.

This reuses the HMAC-based shared key auth already used by `multiprocessing.connection` sockets for other purposes. Doing this is useful so that filesystem permissions are not relied upon and trust isn't implied by default between all processes running as the same UID with access to the unix socket.

### pyperformance benchmarks

No significant changes, including `concurrent_imap`, which exercises `multiprocessing.Pool.imap` in that suite.

### Microbenchmarks

This does _slightly_ slow down forkserver use. How much so appears to depend on the platform. Modern platforms and simple platforms are less impacted. This PR adds additional IPC round trips to the control socket to tell forkserver to spawn a new process. Systems with potentially high latency IPC are naturally impacted more. Typically a 1-4% slowdown on a very targeted process creation microbenchmark, with a worst case overloaded system slowdown of 20%. No evidence that these slowdowns appear in a practical sense. See the PR for details.
This adds authentication. In the past, only filesystem permissions protected this socket from code injection into the forkserver process by limiting access to the same UID; that protection didn't exist when Linux abstract namespace sockets were used (see the issue), meaning that any process in the same system network namespace could inject code. We've since stopped using abstract namespace sockets by default, but protecting our control sockets regardless of type seems desirable.
This reuses the HMAC-based shared key auth already used by `multiprocessing.connection` sockets for other purposes. Doing this is useful so that filesystem permissions are not relied upon and trust isn't implied by default between all processes running as the same UID with access to the unix socket.
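As a hedged illustration of the mechanism being reused (the address, key, and payload below are made up; the forkserver wiring itself is internal), `multiprocessing.connection.Listener` and `Client` run an HMAC challenge/response handshake whenever an `authkey` is supplied, raising `AuthenticationError` for a peer that does not know the key:

```python
from multiprocessing.connection import Client, Listener
import threading

SECRET = b"shared-secret-key"   # hypothetical key; forkserver uses its own
ADDRESS = ("localhost", 6001)   # hypothetical address for the demo

# With an authkey, each side sends a random challenge and checks that the
# peer returns HMAC(authkey, challenge); a peer without the key is rejected.
listener = Listener(ADDRESS, authkey=SECRET)

def serve() -> None:
    with listener.accept() as conn:      # handshake happens inside accept()
        print("server received:", conn.recv())

t = threading.Thread(target=serve)
t.start()

with Client(ADDRESS, authkey=SECRET) as conn:   # and inside Client()
    conn.send("hello")

t.join()
listener.close()
```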
Tasks remaining

`multiprocessing.Pool` worker processes are long lived by default, so spawning new ones via the start method is infrequent compared to the number of tasks they are given. Only applications using `maxtasksperchild=` with a very low value might notice, but even then the amount of work done in a worker process should far exceed any additional overhead this security measure adds to requesting that forkserver spawn new processes.
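To make that concrete, here is a hedged sketch of the only shape of workload likely to notice (the numbers are arbitrary): a forkserver `Pool` with a very low `maxtasksperchild`, which forces a control-socket round trip for nearly every task.

```python
import multiprocessing as mp

def work(x: int) -> int:
    return x * x

if __name__ == "__main__":
    # forkserver is Unix-only; other platforms would use a different start method.
    ctx = mp.get_context("forkserver")
    # maxtasksperchild=1 forces a new worker (and thus a round trip to the
    # forkserver control socket) for every task -- the worst case for the
    # extra authentication overhead.  Typical pools reuse workers and pay
    # the cost only once per worker.
    with ctx.Pool(processes=4, maxtasksperchild=1) as pool:
        print(sum(pool.map(work, range(100))))
```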
### pyperformance benchmarks

No significant changes, including `concurrent_imap`, which exercises `multiprocessing.Pool.imap` in that suite.

### Microbenchmarks
This does slightly slow down forkserver use. How much so appears to depend on the platform. Modern platforms and simple platforms are less impacted. This PR adds additional IPC round trips to the control socket to tell forkserver to spawn a new process. Systems with potentially high latency IPC are naturally impacted more.
Using my multiprocessing `process-creation-benchmark.py`:

I switched between this PR branch and `main` via a simple `git checkout` after my build, as the changes are pure Python so no rebuild is needed.
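The actual `process-creation-benchmark.py` is not reproduced here; the following is a rough, hypothetical sketch of what such a measurement does (cap the number of live processes, start trivial children in a loop, report Procs/sec), not the script that produced the numbers below.

```python
import multiprocessing as mp
import time

def child() -> None:
    pass  # trivial body: measure pure process start/stop cost

def measure(total: int, max_active: int) -> float:
    """Return processes created per second for `total` trivial processes."""
    ctx = mp.get_context("forkserver")  # Unix-only start method
    start = time.perf_counter()
    active = []
    for _ in range(total):
        if len(active) >= max_active:
            active.pop(0).join()        # keep at most max_active alive
        p = ctx.Process(target=child)
        p.start()
        active.append(p)
    for p in active:
        p.join()
    return total / (time.perf_counter() - start)

if __name__ == "__main__":
    print(f"{measure(1024, 7):,.2f} Procs/sec")
```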
On an AMD zen4 system:

889 Procs/sec dropped to 874. 1.5% slower. Insignificant.
AMD 7800X3D single-CCD 8 cores.
Baseline fork (2659 Procs/sec) and spawn (268) measurements:
```
% ../b/python ~/Downloads/process-creation-benchmark.py 13 fork
Process Creation Microbenchmark (max 7 active processes) (13 iterations)
 multiprocessing start method: fork
 sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
   Total   Procs/sec   Time (s)    StdDev
--------------------------------------------------------------------------------
      32    2,300.78      0.014     78.91
     128    2,391.11      0.054    114.68
     384    2,650.31      0.145     13.23
    1024    2,646.28      0.387     16.47
    2048    2,641.08      0.775     13.65
    5120    2,659.42      1.925     11.82

% ../b/python ~/Downloads/process-creation-benchmark.py 13 spawn
Process Creation Microbenchmark (max 7 active processes) (13 iterations)
 multiprocessing start method: spawn
 sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
   Total   Procs/sec   Time (s)    StdDev
--------------------------------------------------------------------------------
      32      235.96      0.136     13.91
     128      259.53      0.493      0.79
     384      267.62      1.435      1.00
    1024      267.89      3.822      0.35
```

On an Intel Broadwell Xeon E5-2698 v4 system:
828 Procs/sec dropped to 717, ~15% slower. Significant. But if I drop the active processes from 19 to 9, the difference is far smaller: 414 dropped to 398, ~4% slower. Moderate.
20 cores, 2 ring buses, 4 memory controllers, single socket. A large-die Broadwell Xeon is complicated. At high parallelism counts, interprocess communication latencies add up. I predict similar results from multi-core-complex-die Zen/EPYC parts and multi-socket systems, and probably also on big.LITTLE mixed power/perf core arrangements.
Baseline: fork 1265 Procs/sec, spawn 233.
On a Raspberry Pi 5:
126 Procs/sec dropped to 121, a ~4% slowdown. Moderate.

Raspberry Pi 5 running 32-bit Raspbian.
Baseline: fork 973 Procs/sec, spawn 32.