-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
client: improve group service stanza interpolation and check_restart support #6586
Conversation
defer close(waitCh) | ||
for tn, tr := range ar.tasks { | ||
wg.Add(1) | ||
go func(taskName string, r agentconsul.WorkloadRestarter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to reviews: This might be overkill but I wanted restarts to be as quick as possible especially when there are numerous tasks in the group. I'd like to hear your thoughts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this replace the RestartAll
method below?
Also, what happens when we're in the middle of this Restart
but we get a call to update or kill? The alloc runner hooks get their actions serialized by a lock but allocRunner
has a lot of "make sure you call this after/before this other method". (Although this is probably out of scope for this PR because that's the existing condition of things here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nickethier I've left a few comments/questions but overall this looks like a big improvement. 👍
UpdateTask(old, newTask *consul.TaskServices) error | ||
RegisterWorkload(*consul.WorkloadServices) error | ||
RemoveWorkload(*consul.WorkloadServices) | ||
UpdateWorkload(old, newTask *consul.WorkloadServices) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the reduction in information sharing we're doing here by passing in this consul.WorkloadServices
rather than the whole structs.Allocation
. 👍
defer close(waitCh) | ||
for tn, tr := range ar.tasks { | ||
wg.Add(1) | ||
go func(taskName string, r agentconsul.WorkloadRestarter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this replace the RestartAll
method below?
Also, what happens when we're in the middle of this Restart
but we get a call to update or kill? The alloc runner hooks get their actions serialized by a lock but allocRunner
has a lot of "make sure you call this after/before this other method". (Although this is probably out of scope for this PR because that's the existing condition of things here.)
…support Interpolation can now be done on group service stanzas. Note that some task runtime specific information that was previously available when the service was registered poststart of a task is no longer available. The check_restart stanza for checks defined on group services will now properly restart the allocation upon check failures if configured.
13d0704
to
02a9b54
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bindata_assetfs got removed.
This looks really great! I love the renaming/refactoring. I don't think anything I suggested would take a re-review as it's all trivial or a removal, so I'll mark as Approved.
if e != nil { | ||
errMutex.Lock() | ||
defer errMutex.Unlock() | ||
err = multierror.Append(err, fmt.Errorf("failed to restart task %s: %v", taskName, e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do this pattern elsewhere. Might be nice to create an errgroup like package on top of multierror that runs all goroutines and returns all errors, unlike errgroup which only returns the first error and cancels the other goroutines.
Not a blocker and would make a nice followup PR if you want to chase it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a PR to multierror for this pattern. Will open up a followup Nomad PR to make use of it once I get it merged. hashicorp/go-multierror#28
|
||
var networks structs.Networks | ||
if cfg.alloc.AllocatedResources != nil { | ||
networks = cfg.alloc.AllocatedResources.Shared.Networks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
networks = cfg.alloc.AllocatedResources.Shared.Networks | |
h.networks = cfg.alloc.AllocatedResources.Shared.Networks |
I don't think you need the local networks var since its zero value is what h.networks is initialized to anyway.
@@ -44,6 +44,7 @@ type RunnerUpdateHook interface { | |||
|
|||
type RunnerUpdateRequest struct { | |||
Alloc *structs.Allocation | |||
Node *structs.Node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above: I think this is static and therefore doesn't need to be included in Update calls.
client/taskenv/env.go
Outdated
@@ -575,6 +578,7 @@ func (b *Builder) setAlloc(alloc *structs.Allocation) *Builder { | |||
b.namespace = alloc.Namespace | |||
|
|||
// Set meta | |||
// TODO nickethier make sure this works |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😂 I eagerly await this comment being red too.
require.NotPanics(func() { | ||
taskEnv = NewBuilder(node, alloc, nil, "global").SetAllocDir("/tmp/alloc").Build() | ||
}) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to add some assertions here? NotPanics is a pretty low bar.
Restart(ctx context.Context, event *structs.TaskEvent, failure bool) error | ||
} | ||
|
||
func NoopRestarter() WorkloadRestarter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary anymore outside of tests? Maybe move it to a testing.go file if it's needed for multiple package's tests.
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
Interpolation can now be done on group service stanzas. Note that some task runtime specific information
that was previously available when the service was registered poststart of a task is no longer available.
The check_restart stanza for checks defined on group services will now properly restart the allocation upon
check failures if configured.
fixes #6272