-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Add CommandExt::{exec, before_exec} #1359
Conversation
Add two methods to the `std::os::unix::process::CommandExt` trait to provide more control over how processes are spawned on Unix, specifically: ```rust fn exec(&mut self) -> io::Error; fn before_exec<F: FnOnce() + Send + Sync + 'static>(&mut self, f: F) -> &mut Self; ``` [Rendered][link] [link]: https://github.com/alexcrichton/rfcs/blob/process-ext/text/0000-process-ext-unix.md
/// Multiple closures can be registered and they will be called in order of | ||
/// their registration. If a closure returns `Err` then no further closures will | ||
/// be called and the spawn operation will immediately return with a failure. | ||
fn before_exec<F>(&mut self, f: F) -> &mut Self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this might want to be unsafe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question! I would be ok marking it as such, but I was unable to come up with a concrete reason as to why, in which case I left it as safe.
Do you have an unsafe use-case in mind though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Malloc is in an unspecified, but possibly unusable state" seems like a reasonable one? :P
More generally, it seems like it'll be possible to end up in a state where all of your other threads disappearing a the "wrong" time breaks invariants that some type's safety depends on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm I may want to tone down the wording there, but the only known problem to me at least is that you can easily deadlock. Concretely I don't know how to get into a memory unsafe situation, and I'd reason along the lines of if you witness any other memory state of any other thread it's as if it was a concurrent situation anyway, you just happen to always witness the same situation and no progress is made.
I can imagine this causing weird bugs, but not memory safety ones at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exec
and before_exec
should be unsafe and the inline doc should warn more the user why it's important to not use the heap at all to avoid inconsistent memory layout (e.g. when malloc/free is interrupted). Cf. reentrant closure.
It would be nice to have a lint to check unsafe code here :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like with what @sfackler was saying, though, I don't think these should be unsafe unless they actually lead to memory unsafety. The only problem I know with heap allocation is that it deadlocks, which is not unsafe. I'm also not quite sure what this has to do with reentrancy as it's not as dangerous as, for example, a signal handler.
Could you clarify the situations in which you think running safe arbitrary code here could lead to memory unsafety?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it depend on the other threads of the process. They can deadlock but some code could probably corrupt your data as well. This article worth a read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right this is certainly only a problem if there are other threads in play here, but to reiterate, deadlock is not related to memory safety. Corrupting data certainly is, but I'm just trying to get a handle on how that's possible.
For example, if you clone the address space with a locked mutex, then I'd assume that any future attempts to lock that mutex in the child would simply deadlock. This does not corrupt any data. Similarly if you call a function like malloc
it was either locked or unlocked in the parent process, and in both cases data corruption wouldn't be possible.
The article you linked mentions "Moreover: state of mutexes is undefined" but I don't think that's true because the pthread_atfork
documentation literally recommends grabbing mutexes before a fork and releasing them afterwards. Taking a look again at the fork
documentation it indicates:
Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
It does not explain why, "async-signal-safe" is the only method to be safe after a fork. For example malloc
is not async-signal-safe but to the best of my knowledge basically all libraries use pthread_atfork
to make it safe to call after the fork anyway. This may mean it's just not meant to work everywhere, however.
I'm uncomfortable saying this is unsafe "just because standard X says so" without concretely understanding why, but it's probably good enough for now.
Excellent! The |
@l0kod wouldn't that imply that the callback runs after exec though? Perhaps you meant |
Yes, of course :) |
This kind of sounds like an attractive alternative. I admit that I don't know much about the uses cases here (like, at all), but why not go this route instead? It seems like it might give others more flexibility? |
🔔 This RFC is now entering its week-long final comment period 🔔 |
@BurntSushi while plausible, I'd personally prefer to push more for this in the standard library itself. While getters will be attractive in their own right, implementing an actually spawn is actually as massive pain (e.g. |
The libs team discussed this during triage today and the conclusion was to merge, so I will do so shortly! |
This commit implements the `exec` function proposed in [RFC 1359][rfc] which is a function on the `CommandExt` trait to execute all parts of a `Command::spawn` without the `fork` on Unix. More details on the function itself can be found in the comments in the commit. [rfc]: rust-lang/rfcs#1359 cc rust-lang#31398
These commits are an implementation of rust-lang/rfcs#1359 which is tracked via #31398. The `before_exec` implementation fit easily with the current process spawning framework we have, but unfortunately the `exec` implementation required a bit of a larger refactoring. The stdio handles were all largely managed as implementation details of `std::process` and the `exec` function lived in `std::sys`, so the two didn't have access to one another. I took this as a sign that a deeper refactoring was necessary, and I personally feel that the end result is cleaner for both Windows and Unix. The commits should be separated nicely for reviewing (or all at once if you're feeling ambitious), but the changes made here were: * The process spawning on Unix was refactored in to a pre-exec and post-exec function. The post-exec function isn't allowed to do any allocations of any form, and management of transmitting errors back to the parent is managed by the pre-exec function (as it's the one that actually forks). * Some management of the exit status was pushed into platform-specific modules. On Unix we must cache the return value of `wait` as the pid is consumed after we wait on it, but on Windows we can just keep querying the system because the handle stays valid. * The `Stdio::None` variant was renamed to `Stdio::Null` to better reflect what it's doing. * The global lock on `CreateProcess` is now correctly positioned to avoid unintended inheritance of pipe handles that other threads are sending to their child processes. After a more careful reading of the article referenced the race is not in `CreateProcess` itself, but rather the property that handles are unintentionally shared. * All stdio management now happens in platform-specific modules. This provides a cleaner implementation/interpretation for `FromFraw{Fd,Handle}` for each platform as well as a cleaner transition from a configuration to what-to-do once we actually need to do the spawn. With these refactorings in place, implementing `before_exec` and `exec` ended up both being pretty trivial! (each in their own commit)
Add two methods to the
std::os::unix::process::CommandExt
trait to providemore control over how processes are spawned on Unix, specifically:
Rendered