-
-
Notifications
You must be signed in to change notification settings - Fork 649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Give interactive processes a chance to gracefully shutdown #14580
Give interactive processes a chance to gracefully shutdown #14580
Conversation
# Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
@@ -78,7 +130,7 @@ impl DerefMut for ManagedChild { | |||
impl Drop for ManagedChild { | |||
fn drop(&mut self) { | |||
if !self.killed.load(Ordering::SeqCst) { | |||
let _ = self.kill_pgid(); | |||
let _ = self.graceful_shutdown(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stuhood in all of my testing I was unable to trigger this codepath -- it seems that the process will actually hang if this path is removed. Maybe we can change the drop to simply call kill instead of graceful shutdown to avoid blocking in drop. Not sure if that is better though, since it could be a race-condition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, yea. Possibly. On the other hand, the symmetry of using the same method here is a little bit better, and reduces the number of unique codepaths. So IMO, let's leave it.
We don't have direct unit tests of this code (mea culpa: sorry), but it's possible that we could cause this path to execute that way? Not a blocker for landing though, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great: thanks a lot!
@@ -78,7 +130,7 @@ impl DerefMut for ManagedChild { | |||
impl Drop for ManagedChild { | |||
fn drop(&mut self) { | |||
if !self.killed.load(Ordering::SeqCst) { | |||
let _ = self.kill_pgid(); | |||
let _ = self.graceful_shutdown(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, yea. Possibly. On the other hand, the symmetry of using the same method here is a little bit better, and reduces the number of unique codepaths. So IMO, let's leave it.
We don't have direct unit tests of this code (mea culpa: sorry), but it's possible that we could cause this path to execute that way? Not a blocker for landing though, I think.
cleanup_wait_time=2, | ||
) | ||
|
||
def test_pantsd_graceful_shutdown_deadline(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙌
A very impressive and high-impact first contribution 🙌 Thank you! I was at first thinking this would be good to cherry-pick to 2.10.0rc2, but given that signal handling can be finicky, I think it'd be better that we dogfood this for some time. Then we could cherry-pick to 2.10.1. Wdyt? |
Co-authored-by: Stu Hood <[email protected]>
# Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
# Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the contribution!
Overview
This PR aims to add a graceful shutdown feature for interactive processes.
Currently, when
pants run
receives aSIGINT
, it send aSIGKILL
to child processes which causes them to shutdown without an opportunity to run cleanup procedures (for instance,try .. finally
and context managers in python). This can cause file handles to be left open, streams to not be flushed, etc.This PR changes the behavior to first send a SIGINT and wait for the process to gracefully terminate within a pre-determined amount of time. If the process does not shut itself down gracefully, a SIGKILL is sent.
This problem is outlined in greater detail in #13230
Approach
This PR takes approach number (2) from #13230 (comment). This is was the lowest-lift implementation. It is slightly sub-optimal since it blocks in the
Drop
implementation, however, this should not cause deadlocks since it blocks for a maximum of a bounded time. Furthermore, in my benchmarking, this drop rarely runs, as most of the time the cancellation happens before drop and the drop becomes essentially a noop.An alternative is to run our own reaping process using a separate routine launched with
tokio::spawn
. The main loop could await this reaping process before shutting down. This would be more complex but would allow us to avoid using blocking calls where we should be awaiting.Long term a solution for
AsyncDrop
would solve this problem the most elegantly.Fixes #13230.