fix: Correct SIGTERM handling. Fixes #10518 #10337 #10033 #10490 #10520

Closed
alexec wants to merge 4 commits from sigterm

alexec (Contributor) commented Feb 12, 2023

Signed-off-by: Alex Collins [email protected]

This MIGHT fix these:

Fixes #10518
Fixes #10337
Fixes #10033
Fixes #10490

Please do not open a pull request until you have checked ALL of these:

  • Create the PR as draft.
  • Run make pre-commit -B to fix codegen and lint problems.
  • Sign off your commits (otherwise the DCO check will fail).
  • Use a conventional commit message (otherwise the commit message check will fail).
  • "Fixes #" is in both the PR title (for release notes) and this description (to automatically link and close the issue).
  • Add unit or e2e tests. Say how you tested your changes. If you changed the UI, attach screenshots.
  • GitHub checks are green.
  • Once required tests have passed, mark your PR "Ready for review".

If changes were requested, and you've made them, dismiss the review to get it reviewed again.

alexec changed the title from "fix: Correct SIGTERM handling. Fixes ##10518 #10337 #10033 #10490" to "fix: Correct SIGTERM handling. Fixes #10518 #10337 #10033 #10490" Feb 12, 2023
alexec marked this pull request as ready for review February 12, 2023 22:29
alexec enabled auto-merge February 12, 2023 22:29
alexec disabled auto-merge February 13, 2023 03:36
@@ -32,20 +30,13 @@ func waitContainer(ctx context.Context) error {
	defer stats.LogStats()
	stats.StartStatsTicker(5 * time.Minute)

	// use a function to constrain the scope of ctx
Contributor

Have we considered using cmd.Context() as the root context?
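
A minimal sketch of what deriving the SIGTERM-aware context from a root context could look like. Here root is only a stand-in for whatever cmd.Context() returns, and withSIGTERM and derivedErrAfterRootCancel are hypothetical names used for illustration, not code from this PR. It shows that cancelling the root also cancels the derived context.

```go
package main

import (
	"context"
	"fmt"
	"os/signal"
	"syscall"
)

// withSIGTERM derives a context from root that is also cancelled when
// SIGTERM arrives. root stands in for the command's root context
// (an assumption for this sketch).
func withSIGTERM(root context.Context) (context.Context, context.CancelFunc) {
	return signal.NotifyContext(root, syscall.SIGTERM)
}

// derivedErrAfterRootCancel cancels the root context and returns the
// error reported by the derived context once it is done.
func derivedErrAfterRootCancel() error {
	root, cancelRoot := context.WithCancel(context.Background())
	ctx, stop := withSIGTERM(root)
	defer stop() // unregister the SIGTERM handler when we are finished

	cancelRoot() // cancelling the root propagates to the derived context
	<-ctx.Done()
	return ctx.Err()
}

func main() {
	fmt.Println("derived context error:", derivedErrAfterRootCancel())
}
```

With this shape, anything that cancels the root context stops the wait as well, in addition to SIGTERM.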

Contributor Author

I've been working with this, and I'm not sure my PR will fix the issues.

Contributor

Hope my work can help you 😄

#10490 (comment)

Signed-off-by: Alex Collins <[email protected]>
	func() {
		// this allows us to gracefully shutdown, capturing artifacts
		ctx, cancel := signal.NotifyContext(ctx, syscall.SIGTERM)
		defer cancel()
Contributor Author

@sxllwx I think you're suggesting:

  1. NotifyContext catches a SIGTERM (probably sent from the controller).
  2. This block will exit.
  3. stop() (the function NotifyContext returns, named cancel in the code above) will be invoked, clearing the signal handler.
  4. Another SIGTERM occurs (controller again), but without a handler we get default behaviour.
  5. Default behaviour is to exit (does this mean crash?).

If that is what you think is correct, then my fix should work.
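
The sequence above can be sketched as follows, assuming a Unix-like system (syscall.Kill is not available on Windows). The names waitForSIGTERM and demo are hypothetical, and the long-lived handler registered in demo is only one possible mitigation, not necessarily what this PR does. It keeps SIGTERM caught after the scoped handler is stopped, so the second SIGTERM is delivered to a channel instead of triggering the default behaviour.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// waitForSIGTERM registers a scoped SIGTERM handler, calls onReady, and
// reports whether SIGTERM arrived before the timeout. The deferred stop()
// unregisters the scoped handler on return (step 3).
func waitForSIGTERM(timeout time.Duration, onReady func()) bool {
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM)
	defer stop()
	onReady()
	select {
	case <-ctx.Done():
		return true
	case <-time.After(timeout):
		return false
	}
}

// demo sends this process two SIGTERMs: one while the scoped handler is
// active and one after it has been stopped. The long-lived handler is an
// assumption of this sketch; without it, the second SIGTERM would hit the
// default behaviour and terminate the process.
func demo() (scopedCaught bool, lateCount int) {
	late := make(chan os.Signal, 2) // room for both signals
	signal.Notify(late, syscall.SIGTERM)
	defer signal.Stop(late)

	sendTERM := func() { _ = syscall.Kill(os.Getpid(), syscall.SIGTERM) }

	scopedCaught = waitForSIGTERM(2*time.Second, sendTERM) // steps 1 to 3
	sendTERM()                                             // step 4

	// Count how many SIGTERMs the long-lived handler receives.
	deadline := time.After(2 * time.Second)
	for lateCount < 2 {
		select {
		case <-late:
			lateCount++
		case <-deadline:
			return scopedCaught, lateCount
		}
	}
	return scopedCaught, lateCount
}

func main() {
	scoped, late := demo()
	fmt.Println("scoped handler saw SIGTERM:", scoped)
	fmt.Println("signals seen by long-lived handler:", late)
}
```

Removing the signal.Notify call in demo should reproduce step 5: the second SIGTERM arrives with no handler registered and the process is terminated by the signal.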

Contributor

Yes. I looked at how the Go runtime handles signals by default; the relevant code is:

https://github.com/golang/go/blob/master/src/runtime/signal_unix.go#L1068-L1074

https://github.com/golang/go/blob/master/src/runtime/signal_unix.go#L869-L875

Here's a test workflow I've been using over the past weekend; feel free to use it.

apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  generateName: etcd-code-artifact-
spec:
  schedule: "*/1 * * * *"
  concurrencyPolicy: "Forbid"
  startingDeadlineSeconds: 0
  workflowSpec:
    entrypoint: run
    imagePullSecrets:
      - name: ccr-secret
    templates:
      - name: run
        inputs:
          artifacts:
            - name: repo-artifact
              path: /src
              git:
                repo: "https://github.com/etcd-io/etcd.git"
        container:
          image: "python:3.9-buster"
          command: [ sh, -c ]
          args: [ "git status && ls && pwd" ]
          workingDir: /src
        outputs:
          artifacts:
            - name: repo-artifact
              path: /src

alexec (Contributor Author) commented Feb 13, 2023

Closing as I've created #10523 which will push images for testing.

alexec closed this Feb 13, 2023
alexec deleted the sigterm branch February 13, 2023 03:51
agilgur5 added the solution/duplicate label (This issue or PR is a duplicate of an existing one) Jan 15, 2024
agilgur5 added the solution/superseded label (This PR or issue has been superseded by another one; slightly different from a duplicate) and removed the solution/duplicate label Feb 17, 2024
Labels: area/executor, solution/superseded
3 participants