-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Docker Compose container execution #687
Conversation
internal/compose/compose.go
Outdated
continue | ||
} | ||
|
||
if containerDescription.State.Status == "exited" && containerDescription.State.ExitCode > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same line have been added in 8563a1c#diff-8019656c843609e3cb15ca57ab96dc41fdadae8215d0ff22d9b989f795aef4a4 so this PR will need to be rebased on top of main
once #688 is merged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
`WaitForHealthy()` does not implement any logic to timeout if the healthy condition is never met. By adding a timeout we make sure this function ends if the expected condition cannot be met in a certain (known) time frame.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ticker needs to be stopped, for the rest it LGTM 👍
internal/compose/compose.go
Outdated
@@ -275,54 +275,58 @@ func (p *Project) WaitForHealthy(opts CommandOptions) error { | |||
return err | |||
} | |||
|
|||
timeout := time.After(10 * time.Minute) | |||
ticker := time.NewTicker(1 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ticker := time.NewTicker(1 * time.Second) | |
ticker := time.NewTicker(1 * time.Second) | |
defer ticker.Stop() |
internal/compose/compose.go
Outdated
@@ -275,54 +275,58 @@ func (p *Project) WaitForHealthy(opts CommandOptions) error { | |||
return err | |||
} | |||
|
|||
timeout := time.After(10 * time.Minute) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to suggest to use context.WithTimeout()
, so we can also interrupt calls to docker if they hang, but leveraging this context would require more changes in internal/docker
.
internal/compose/compose.go
Outdated
case <-ticker.C: | ||
healthy := true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. Do nothing and continue in this case
, to have the main part of the loop in the upper level.
case <-ticker.C: | |
healthy := true | |
case <-ticker.C: | |
} | |
healthy := true |
internal/compose/compose.go
Outdated
@@ -275,54 +275,58 @@ func (p *Project) WaitForHealthy(opts CommandOptions) error { | |||
return err | |||
} | |||
|
|||
timeout := time.After(10 * time.Minute) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this timeout to const at the top of the file and name it accordingly.
internal/compose/compose.go
Outdated
} | ||
select { | ||
case <-timeout: | ||
return errors.New("time out waiting for healthy container") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: timeout waiting...
internal/compose/compose.go
Outdated
if err != nil { | ||
return err | ||
} | ||
select { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me challenge this and consider another approach.
We already have "for" loop with a 1-sec delay per iteration. The function can store time.Now()
and just compare it if the startTime.Add(timeout)
is greater than time.Now
. We don't need to add channels/select/ticker. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent some time searching differences between time.Sleep
and time.Tick
, and the only thing I found is that a tick tries to guarantee the interval, while sleep is unaware of execution times. There seems to be no difference in performance for the 2 implementations.
In general I prefer the ticker approach as it's behaviour is explicit, but in this case I don't see particular benefits as guaranteeing the interval is not a concern.
I'll rework it with your approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I prefer the ticker and select approach too. It also works better than time arithmetic when you have multiple sources of events, as may happen if/when we have contexts here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we can wait to have contexts here, not a strong opinion 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused here as adding a for-select block will introduce another nesting level and make the function hard to read. TBH I would follow the KISS rule, so if it's possible to solve with a single if-clause, I would go for it.
In general, we can consider refactoring of this area, as there is a similar logic in the SIGINT function. It would be great, If we can refactor all parts (SIGINT, timeout, waiter logic, contexts). You have my sword :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By running a quick search I found there are 5 for {
loops in the codebase. I think it would make a lot of sense to generalize the approach, so interval/timeout/signals are handled consistently.
Are you ok to keep it for a future PR? I already have a ticker-based Waiter
struct implementation to use as a baseline, but I'm blocked on other work streams so I would prefer to push this refactor to the back burner for a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you ok to keep it for a future PR? I already have a ticker-based Waiter struct implementation to use as a baseline, but I'm blocked on other work streams so I would prefer to push this refactor to the back burner for a bit.
Sure, we can leave it in the backlog, that what I meant. Let's focus on making this PR healthy.
internal/compose/compose.go
Outdated
timeout := time.Duration(waitForHealthyTimeout) | ||
startTime := time.Now() | ||
timeoutTime := startTime.Add(timeout) | ||
interval := time.Duration(1 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think you can push this one to consts too.
internal/compose/compose.go
Outdated
@@ -275,20 +277,23 @@ func (p *Project) WaitForHealthy(opts CommandOptions) error { | |||
return err | |||
} | |||
|
|||
timeout := time.Duration(waitForHealthyTimeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you can push this one as var
to the top of the file and name accordingly.
internal/compose/compose.go
Outdated
} | ||
|
||
if !healthy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure about shuffling around with healthy
var? I think it broke down system tests. You can look at the original implementation and look for the bug or revert it, and leave the timeout improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the bug is obvious, I misinterpreted the logic. healthy
must be reinitialized at each loop, otherwise once it become false
the logic does not change it's value anymore. I tried to use it to detect the timeout, but I stumbled upon the bug. The current implementation is clearer and uses times instead of a side effect on the healthy
value.
internal/compose/compose.go
Outdated
healthy := true | ||
|
||
logger.Debugf("Wait for healthy containers: %s", strings.Join(containerIDs, ",")) | ||
descriptions, err := docker.InspectContainers(containerIDs...) | ||
if err != nil { | ||
return err | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: let's leave this empty line
internal/compose/compose.go
Outdated
// waitForHealthyTimeout is the maximum duration for WaitForHealthy(). | ||
var waitForHealthyTimeout = time.Duration(10 * time.Minute) | ||
|
||
// waitForHealthyInterval is the check interval for WaitForHealthy(). | ||
const waitForHealthyInterval = time.Duration(1 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think you can make those consistent:
const (
waitForHealthyTimeout = ...
waitForHealthyInterval = ...
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have anything against for {
s, I am ok with leaving them if they make sense 🙂
internal/compose/compose.go
Outdated
@@ -23,6 +23,12 @@ import ( | |||
"github.com/elastic/elastic-package/internal/signal" | |||
) | |||
|
|||
// waitForHealthyTimeout is the maximum duration for WaitForHealthy(). | |||
var waitForHealthyTimeout = time.Duration(10 * time.Minute) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. No need to convert to time.Duration
, a number multiplying a duration is already a duration.
internal/compose/compose.go
Outdated
containerIDs := strings.Split(strings.TrimSpace(b.String()), "\n") | ||
for { | ||
for time.Now().Before(timeout) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. Break the loop directly with a return so we don't have to remember to check for the timeout again after the loop? (And we are consistent on how we handle other end conditions as SIGINT
).
for time.Now().Before(timeout) { | |
for { | |
if time.Now().After(timeout) { | |
return errors.New("timeout waiting for healthy container") | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR has been extracted from #665.
When
elastic-package
runs docker containers it relies onWaitForHealthy()
to ensure expected containers are running.This functions has a missing feature: timing out if the condition is not met.
This PR adds this feature, adding a 10 minute timeout after which an error is returned.
To prevent waiting time waiting for a condition that can never be met a check is added for unrecoverable termination condition (container is reported as exited and with an exit code greater than 0) to immediately exit with an error, skipping waiting until timeout expiration.