trhooks: Add TaskStopHook interface to services #5821
We currently run the cleanup service hooks only when a task is either Killed or Exited. However, because of how the task runner is implemented, a task is only Exited if it ever started running correctly, which is not the case when an error occurs early in the task start flow, such as failing to pull secrets from Vault.

This updates the service hook to also call the Consul deregistration routines during the task Stop lifecycle event, to ensure that any registered checks and services are cleared in such cases.

Fixes #5770
The change lgtm.
It's quite reasonable to have Stop here deregister as a catch-all case, as Stop() is called after alloc is marked as dead in nomad/client/allocrunner/taskrunner/task_runner.go, lines 525 to 531 in ee7803d:
// Mark the task as dead
tr.UpdateState(structs.TaskStateDead, nil)

// Run the stop hooks
if err := tr.stop(); err != nil {
	tr.logger.Error("stop failed", "error", err)
}
PreKill and Exited cases.
It would be great to have an integration test that ensures the service is deregistered when a prestart hook fails.
// Removing canary and non-canary entries on stop
require.Equal(t, "remove", consulOps[1].Op)
require.Equal(t, "remove", consulOps[2].Op)
Do you mean 6 and 7 here?
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.