-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Correct SIGTERM handling. Fixes #10518 #10337 #10033 #10490 #10520
Conversation
…rgoproj#10033 argoproj#10490 Signed-off-by: Alex Collins <[email protected]>
…rgoproj#10033 argoproj#10490 Signed-off-by: Alex Collins <[email protected]>
@@ -32,20 +30,13 @@ func waitContainer(ctx context.Context) error { | |||
defer stats.LogStats() | |||
stats.StartStatsTicker(5 * time.Minute) | |||
|
|||
// use a function to constrain the scope of ctx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we consider using the cmd.Context()
as the root-context ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been working with this, and I'm not sure my PR will fix the issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hope my work can help you 😄
Signed-off-by: Alex Collins <[email protected]>
func() { | ||
// this allows us to gracefully shutdown, capturing artifacts | ||
ctx, cancel := signal.NotifyContext(ctx, syscall.SIGTERM) | ||
defer cancel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sxllwx I think you're suggesting:
NotifyContext
catches a SIGTERM (probably sent from the controller).- This block will exit.
stop()
will be invoked, clearing the signal handler.- Another SIGTERM occurs (controller again), but without a handler we get default behaviour.
- Default behaviour is to exit (does this mean crash?).
If that is what you think is correct, then my fix should work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I tried to look at the default processing method of Golang's runtime, and the relevant code:
https://github.com/golang/go/blob/master/src/runtime/signal_unix.go#L1068-L1074
https://github.com/golang/go/blob/master/src/runtime/signal_unix.go#L869-L875
Here's a test workflow I've been using for the past weekend, available for your use.
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
generateName: etcd-code-artifact-
spec:
schedule: "*/1 * * * *"
concurrencyPolicy: "Forbid"
startingDeadlineSeconds: 0
workflowSpec:
entrypoint: run
imagePullSecrets:
- name: ccr-secret
templates:
- name: run
inputs:
artifacts:
- name: repo-artifact
path: /src
git:
repo: "https://github.com/etcd-io/etcd.git"
container:
image: "python:3.9-buster"
command: [ sh, -c ]
args: [ "git status && ls && pwd" ]
workingDir: /src
outputs:
artifacts:
- name: repo-artifact
path: /src
Closing as I've created #10523 which will push images for testing. |
Signed-off-by: Alex Collins [email protected]
This MIGHT fix these:
Fixes #10518
Fixes #10337
Fixes #10033
Fixes #10490
Please do not open a pull request until you have checked ALL of these:
make pre-commit -B
to fix codegen and lint problems.If changes were requested, and you've made them, dismiss the review to get it reviewed again.