-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add graceful termination for custom builders #2886
Add graceful termination for custom builders #2886
Conversation
…l the process after a grace period
Codecov Report
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @cedrickring thanks for fixing this! Would you mind writing a unit test for this feature?
@cedrickring, i feel the better way to do this is by defining a different cleanUpCommand.
And then, skaffold to execute the |
Hey @tejal29, thanks for the feedback. The (up to) 30 seconds wait is not exactly for cleanup imo, but for the command to exit gracefully. Right now the process gets killed immediately after pressing Ctrl+C and doesn't have time to exit normally (temp files aren't cleaned up etc). Usually a process shouldn't take the whole 30 seconds to terminate. I think a cleanup script could be an additional option for this, but not a replacement. |
@cedrickring, thanks for clarifying. I see, you are implementing what golang/go#22757 proposes. I think, if we implement this solution, we should reduce the timeout for 2 seconds like |
fmt.Println("Got interrupt") | ||
}() | ||
|
||
fmt.Println("Sleeping 25 seconds") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am little bit concerned with these tests adding sleeps. Can you just add a bash command like sleep 5
in the build script?
Also, run it multiple times (100) to make sure its reliable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imo 100 repeats are a bit much for a unit test. Also inlining a bash script required some extra changes in code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the tests require the interrupt signal to be trapped. Otherwise they will always error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, i meant running it 100 times via go run -n -test.run <>
. I will do it myself before reviewing.
} | ||
|
||
func (b *ArtifactBuilder) handleGracefulTermination(ctx context.Context, cmd *exec.Cmd) error { | ||
done := make(chan struct{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really great. Just one question, i am assuming you have the done channel to return immediately if the process is already done? Lets put a defer close as soon as you create the channel
We can also shorten this code block further
func (b *ArtifactBuilder) handleGracefulTermination(ctx context.Context, cmd *exec.Cmd) error {
done := make(chan struct{})
defer(close(done))
go func() {
select {
case <- ctx.Done():
// Send interrupt signal if not windows.
return
case <- time.After(termination*time.Second):
// Send kill signal
return
case <-done
return
}
}()
err := cmd.Wait()
return err
}
}
I think this is it. And then we can merge this in!
Thank you for your patience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to your suggestion, the process will only run for termination
seconds and will then be killed. i think the nested select
is still necessary to wait for the context to be cancelled...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Running kokoro tests now
Fixes #2656
I added a 2 seconds grace period for the process to complete after receiving SIGINT.