task runner: fix goroutine leak in prestart hook (#11741)
The task runner prestart hooks take a `joincontext` so they have the
option to exit early if either of two contexts is canceled: the kill
context (the task is killed) or the shutdown context (the client shuts
down). Some tasks exit without being shut down by the server, so
neither of the joined contexts ever gets canceled and we leak the
`joincontext` (48 bytes) and its internal goroutine. This primarily
impacts batch jobs and any task that fails or completes early, such as
non-sidecar prestart lifecycle tasks.

Cancel the `joincontext` after the prestart call exits to fix the
leak.
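
The leak and the fix described above can be illustrated with a small, self-contained sketch. It does not use the real `joincontext` package: `join` is a hypothetical stand-in built on the standard library, and `runHooks`, `leakedAfter`, `noop`, and the `watchers` counter are names invented for this illustration. The only assumption carried over from the commit is the shape of the problem: a joined context owns a goroutine that exits only when a parent is canceled or the join's own cancel function is called.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// watchers counts the goroutines started by join that are still running.
var watchers int64

// join is a hypothetical stand-in for joincontext.Join. The returned context
// is canceled when either parent is canceled or when cancel is called. Its
// watcher goroutine lives until one of those three events happens.
func join(a, b context.Context) (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithCancel(context.Background())
	atomic.AddInt64(&watchers, 1)
	go func() {
		defer atomic.AddInt64(&watchers, -1)
		select {
		case <-a.Done():
		case <-b.Done():
		case <-ctx.Done():
		}
		cancel()
	}()
	return ctx, cancel
}

// noop is a hook that returns immediately, like a task that completes early.
func noop(context.Context) error { return nil }

// runHooks mimics the prestart loop. With fixed=false it joins inside the
// loop and drops the cancel func (the old behavior). With fixed=true it joins
// once and defers the cancel (the behavior after this commit).
func runHooks(killCtx, shutdownCtx context.Context, hooks []func(context.Context) error, fixed bool) error {
	if fixed {
		joined, cancel := join(killCtx, shutdownCtx)
		defer cancel()
		for _, h := range hooks {
			if err := h(joined); err != nil {
				return err
			}
		}
		return nil
	}
	for _, h := range hooks {
		joined, _ := join(killCtx, shutdownCtx) // cancel func discarded: watcher never exits
		if err := h(joined); err != nil {
			return err
		}
	}
	return nil
}

// leakedAfter runs two hooks with parents that are never canceled and reports
// how many watcher goroutines are still alive shortly afterwards.
func leakedAfter(fixed bool) int64 {
	before := atomic.LoadInt64(&watchers)
	hooks := []func(context.Context) error{noop, noop}
	_ = runHooks(context.Background(), context.Background(), hooks, fixed)

	// Give canceled watchers a moment to exit before counting.
	deadline := time.Now().Add(200 * time.Millisecond)
	for atomic.LoadInt64(&watchers) > before && time.Now().Before(deadline) {
		time.Sleep(time.Millisecond)
	}
	return atomic.LoadInt64(&watchers) - before
}

func main() {
	fmt.Println("leaked without cancel:", leakedAfter(false))
	fmt.Println("leaked with deferred cancel:", leakedAfter(true))
}
```

In this sketch the unfixed path leaves one watcher goroutine behind per hook, because neither parent context is ever canceled, while the deferred cancel releases the single watcher as soon as the loop returns.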
tgross authored Dec 23, 2021
1 parent 000354a commit 631db25
Showing 3 changed files with 11 additions and 4 deletions.
3 changes: 3 additions & 0 deletions .changelog/11741.txt
@@ -0,0 +1,3 @@
```release-note:bug
client: Fixed a memory and goroutine leak for batch tasks and any task that exits without being shut down from the server
```
4 changes: 3 additions & 1 deletion client/allocrunner/interfaces/task_lifecycle.go
@@ -89,7 +89,9 @@ type TaskPrestartHook interface {
// Prestart is called before the task is started including after every
// restart. Prestart is not called if the allocation is terminal.
//
-// The context is cancelled if the task is killed or shutdown.
+// The context is cancelled if the task is killed or shutdown but
+// should not be stored any persistent goroutines this Prestart
+// creates.
Prestart(context.Context, *TaskPrestartRequest, *TaskPrestartResponse) error
}

8 changes: 5 additions & 3 deletions client/allocrunner/taskrunner/task_runner_hooks.go
@@ -190,6 +190,11 @@ func (tr *TaskRunner) prestart() error {
}()
}

+// use a join context to allow any blocking pre-start hooks
+// to be canceled by either killCtx or shutdownCtx
+joinedCtx, joinedCancel := joincontext.Join(tr.killCtx, tr.shutdownCtx)
+defer joinedCancel()

for _, hook := range tr.runnerHooks {
pre, ok := hook.(interfaces.TaskPrestartHook)
if !ok {
@@ -235,9 +240,6 @@ func (tr *TaskRunner) prestart() error {
}

// Run the prestart hook
-// use a joint context to allow any blocking pre-start hooks
-// to be canceled by either killCtx or shutdownCtx
-joinedCtx, _ := joincontext.Join(tr.killCtx, tr.shutdownCtx)
var resp interfaces.TaskPrestartResponse
if err := pre.Prestart(joinedCtx, &req, &resp); err != nil {
tr.emitHookError(err, name)