Document new check_restart stanza
schmichael committed Sep 11, 2017
1 parent 137d5c1 commit a6fbbdd
Showing 2 changed files with 94 additions and 1 deletion.
21 changes: 20 additions & 1 deletion website/source/api/json-jobs.html.md
@@ -66,7 +66,12 @@ Below is the JSON representation of the job outputed by `$ nomad init`:
"Interval": 10000000000,
"Timeout": 2000000000,
"InitialStatus": "",
"TLSSkipVerify": false
"TLSSkipVerify": false,
"CheckRestart": {
"Limit": 3,
"Grace": "30s",
"IgnoreWarnings": false
}
}]
}],
"Resources": {
@@ -377,6 +382,20 @@ The `Task` object supports the following keys:
- `TLSSkipVerify`: If true, Consul will not attempt to verify the
certificate when performing HTTPS checks. Requires Consul >= 0.7.2.

- `CheckRestart`: `CheckRestart` is an object which enables
restarting of tasks based upon Consul health checks.

- `Limit`: The number of consecutive unhealthy checks allowed before the
task is restarted. Defaults to `0`, which disables
health-based restarts.

- `Grace`: The duration to wait after a task starts or restarts
before unhealthy checks count against the limit.
Defaults to "1s".

- `IgnoreWarnings`: Treat checks in the warning state as passing.
Defaults to false, which means warnings are considered unhealthy.

- `ShutdownDelay` - Specifies the duration to wait when killing a task between
removing it from Consul and sending it a shutdown signal. Ideally services
would fail healthchecks once they receive a shutdown signal. Alternatively
74 changes: 74 additions & 0 deletions website/source/docs/job-specification/service.html.md
@@ -47,6 +47,12 @@ job "docs" {
args = ["--verbose"]
interval = "60s"
timeout = "5s"
check_restart {
limit = 3
grace = "90s"
ignore_warnings = false
}
}
}
}
@@ -162,6 +168,72 @@ scripts.
- `tls_skip_verify` `(bool: false)` - Skip verifying TLS certificates for HTTPS
checks. Requires Consul >= 0.7.2.

#### `check_restart` Stanza

As of Nomad 0.7, `check` stanzas may include a `check_restart` stanza to restart
tasks with unhealthy checks. Restarts use the parameters from the
[`restart`][restart_stanza] stanza, so if a task group has the default `15s`
delay, a task won't be restarted until 15 seconds after the
`check_restart` block considers it failed. `check_restart` stanzas have the
following parameters:

- `limit` `(int: 0)` - Restart task after `limit` failing health checks. For
  example, `1` causes a restart on the first failure. The default, `0`, disables
  health check based restarts. Failures must be consecutive. A single passing
  check will reset the count, so flapping services may not be restarted.

- `grace` `(string: "1s")` - Duration to wait after a task starts or restarts
  before checking its health. On restarts, the `delay` and max jitter are added
  to the grace period to prevent checking a task's health before it has
  restarted.

- `ignore_warnings` `(bool: false)` - By default checks with both `critical`
and `warning` statuses are considered unhealthy. Setting `ignore_warnings =
true` treats a `warning` status like `passing` and will not trigger a restart.

For example:

```hcl
restart {
  delay = "8s"
}
task "mysqld" {
  service {
    # ...
    check {
      type     = "script"
      name     = "check_table"
      command  = "/usr/local/bin/check_mysql_table_status"
      args     = ["--verbose"]
      interval = "20s"
      timeout  = "5s"
      check_restart {
        # Restart the task after 3 consecutive failed checks (60s)
        limit = 3
        # Ignore failed checks for 90s after a service starts or restarts
        grace = "90s"
        # Treat warnings as unhealthy (the default)
        ignore_warnings = false
      }
    }
  }
}
```

In this example the `mysqld` task has `90s` from startup to begin passing
health checks. After the grace period, if `mysqld` remains unhealthy for
`60s` (as determined by `limit * interval`), it is restarted after `8s`
(as determined by `restart.delay`). Nomad then waits `100s` (as
determined by `grace + delay + (delay * 0.25)`) before checking `mysqld`'s
health again.

~> `check_restart` stanzas may also be placed in `service` stanzas to apply the
same restart logic to multiple checks.
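
As a minimal sketch of the service-level placement described in the note above,
the fragment below puts one `check_restart` stanza directly in a `service`
stanza that contains two checks. The service name, port label, and check
settings are hypothetical values chosen for illustration only; they are not
taken from a real job file.

```hcl
task "mysqld" {
  service {
    # Hypothetical service name and port label, for illustration only
    name = "mysql"
    port = "db"

    # Placed in the service stanza, so the same restart logic
    # applies to both checks below
    check_restart {
      limit           = 3
      grace           = "90s"
      ignore_warnings = false
    }

    check {
      type     = "tcp"
      interval = "10s"
      timeout  = "2s"
    }

    check {
      type     = "script"
      name     = "check_table"
      command  = "/usr/local/bin/check_mysql_table_status"
      args     = ["--verbose"]
      interval = "20s"
      timeout  = "5s"
    }
  }
}
```

Because the two checks use different intervals, the time implied by
`limit * interval` differs per check: `30s` of consecutive failures for the
`tcp` check and `60s` for the script check.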

#### `header` Stanza

HTTP checks may include a `header` stanza to set HTTP headers. The `header`
@@ -170,6 +242,7 @@ the header to be set multiple times, once for each value.

```hcl
service {
# ...
check {
type = "http"
port = "lb"
@@ -319,3 +392,4 @@ system of a task for that driver.</small>
[interpolation]: /docs/runtime/interpolation.html "Nomad Runtime Interpolation"
[network]: /docs/job-specification/network.html "Nomad network Job Specification"
[qemu]: /docs/drivers/qemu.html "Nomad qemu Driver"
[restart_stanza]: /docs/job-specification/restart.html "restart stanza"
