Add support for startup healthchecks #13909
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mheon The full list of commands accepted by this bot can be found here. The pull request process is described here
69a3442 to 27b97a1
@mheon this isn't even going to get to tests until you add the new options to man pages (they can be stubs) and add tests or the magic no-new-tests flag
@edsantiago I'm still trying to get it fully working locally; will get to manpages once that's done
A friendly reminder that this PR had no activity for 30 days.
adf94ab to 3551aaa
Want to add a few more tests but otherwise fundamentally complete
@edsantiago Can you tell what the documentation errors are saying? I think I've added the relevant bits to the manpages
docs/source/markdown/podman-run.1.md
Outdated
@@ -451,6 +451,38 @@ The number of retries allowed before a healthcheck is considered to be unhealthy
The initialization time needed for a container to bootstrap. The value can be expressed in time format like
**2m3s**. The default value is **0s**.

#### **-health-startup-cmd**=*"command"* | *'["command", "arg1", ...]'*
Missing dash: should be --health-etc
docs/source/markdown/podman-run.1.md
Outdated
A value of **0** means that any success will begin the regular healthcheck.
The default is **0**.

#### **--health-startup-timeout=*timeout*
Missing asterisks: should be timeout**
(timeout, star, star, equals)
@@ -408,6 +408,38 @@ The number of retries allowed before a healthcheck is considered to be unhealthy
The initialization time needed for a container to bootstrap. The value can be expressed in time format like
`2m3s`. The default value is `0s`

#### **-health-startup-cmd**=*"command"* | *'["command", "arg1", ...]'*
same
A value of **0** means that any success will begin the regular healthcheck.
The default is **0**.

#### **--health-startup-timeout=*timeout*
same
d0649bc to b964eee
Tests finally green. @containers/podman-maintainers PTAL
LGTM
@vrothberg PTAL
I only find time reviewing small PRs at the moment as I am OOO. If you want my pair of eyes, I'd be happy to take a look on Tuesday.
Fine by me
Code LGTM
Would love to see a test or two in the system tests as they are executed in gating.
// consecutive successes.
func (c *Container) incrementStartupHCSuccessCounter(ctx context.Context) {
	if !c.batched {
		c.lock.Lock()
Question: Shouldn't we lock irrespective of whether it is batched or not? We are incrementing SuccessCounter below without any lock.
A batched command holds a single lock for multiple operations (for performance reasons... guarantees no changes to the container while we're doing a series of operations, so no need to hit the DB to check for updates). As such, locking during batching is guaranteed to deadlock as the lock is already being held.
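For readers unfamiliar with the convention, here is a minimal sketch of the "skip the lock when batched" pattern described in this reply. It is not libpod's code: the `container` and `state` types, the `batch` helper, and the use of `sync.Mutex` are all assumptions made for illustration (the real container lock is not necessarily a `sync.Mutex`). The point is only that a non-reentrant lock already held by the batch must not be taken a second time.

```go
package main

import (
	"fmt"
	"sync"
)

// state is a hypothetical stand-in for the shared container state.
type state struct {
	mu        sync.Mutex
	successes int
}

// container is a hypothetical handle onto shared state. A handle created
// by batch has batched set, meaning the lock is already held on its behalf.
type container struct {
	st      *state
	batched bool
}

// incrementSuccess takes the lock only when the handle is not batched.
// sync.Mutex is not reentrant, so locking again inside a batch would
// block forever.
func (c *container) incrementSuccess() {
	if !c.batched {
		c.st.mu.Lock()
		defer c.st.mu.Unlock()
	}
	c.st.successes++
}

// batch holds the lock once across every operation performed by fn and
// passes fn a handle that knows the lock is already held.
func (c *container) batch(fn func(*container)) {
	c.st.mu.Lock()
	defer c.st.mu.Unlock()
	fn(&container{st: c.st, batched: true})
}

func main() {
	c := &container{st: &state{}}
	c.incrementSuccess() // unbatched: locks internally
	c.batch(func(b *container) {
		b.incrementSuccess() // batched: lock already held
		b.incrementSuccess()
	})
	fmt.Println(c.st.successes)
}
```

In this sketch the increment inside a batch is still protected, because the batch itself holds the lock for its whole duration; only the second acquisition is skipped.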
@mheon Sadly this needs a rebase.
Startup healthchecks are similar to K8S startup probes, in that they are a separate check from the regular healthcheck that runs before it. If the startup healthcheck fails repeatedly, the associated container is restarted. Signed-off-by: Matthew Heon <[email protected]>
Rebased
I think this is ready. Once CI passes, at least.
/lgtm
Extra function arguments were added in containers#13909. [NO NEW TESTS NEEDED] Signed-off-by: Doug Rabson <[email protected]>
TODO: Needs to be wired into podman inspect, needs tests written, needs manpages.
Add support for startup healthchecks. A startup healthcheck is a healthcheck that runs before the main healthcheck, to accommodate slow-starting containers; inspiration is mostly from K8S startup probes, and we'll be using this as a backend for supporting those in podman play kube once this merges.
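The behaviour described in this PR can be sketched as a small state machine. This is an illustrative model only, not the code in the PR: the `tracker` and `startupConfig` types and their field names are hypothetical, and the failure handling (consecutive failures counted against a retry limit, success streak reset on failure) is an assumption. What comes from the discussion above is that the startup check runs before the regular healthcheck, that a success threshold of 0 means any success begins the regular healthcheck, and that repeated startup failures restart the container.

```go
package main

import "fmt"

type phase int

const (
	phaseStartup phase = iota // startup healthcheck is still running
	phaseRegular              // regular healthcheck has taken over
)

// startupConfig uses hypothetical field names.
type startupConfig struct {
	successes int // consecutive successes required; 0 means any success suffices
	retries   int // assumed: consecutive failures that trigger a restart
}

// tracker is a hypothetical model of one container's startup check state.
type tracker struct {
	cfg      startupConfig
	phase    phase
	passed   int // current streak of successes
	failed   int // current streak of failures
	restarts int // restarts this model would have requested
}

// record feeds one startup healthcheck result into the model.
func (tr *tracker) record(ok bool) {
	if tr.phase != phaseStartup {
		return // the startup check no longer runs once it has passed
	}
	if ok {
		tr.failed = 0
		tr.passed++
		if tr.passed >= tr.cfg.successes {
			tr.phase = phaseRegular
		}
		return
	}
	tr.passed = 0
	tr.failed++
	if tr.cfg.retries > 0 && tr.failed >= tr.cfg.retries {
		tr.restarts++ // a real implementation would restart the container here
		tr.failed = 0
	}
}

func main() {
	tr := &tracker{cfg: startupConfig{successes: 2, retries: 3}}
	for _, ok := range []bool{false, false, true, true} {
		tr.record(ok)
	}
	fmt.Println(tr.phase == phaseRegular, tr.restarts)
}
```

Keeping the startup counters separate from the regular healthcheck state mirrors the description of the startup check as a separate check that runs first.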