-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add component healthcheck api design #34
base: main
Are you sure you want to change the base?
Changes from 7 commits
cd99db6
14e30a2
9ee3a92
22d9b7e
5a32ea2
82a5060
f92c475
ea97fc8
7a84066
fe5e04b
356976a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
# Components Healthcheck In Dapr | ||
|
||
* Author(s): Deepanshu Agarwal (@DeepanshuA) | ||
* State: Draft | ||
* Updated: 07/20/2023 | ||
|
||
## Overview | ||
As a user, I should be able to enquire the health of dapr system, which is possible currently - but this health check should also include health of connected dapr components as well. | ||
|
||
## Tenets | ||
1. It should be extensible i.e. tomorrow if any feature to check health of Actors etc. is required, it should not require a new endpoint. | ||
|
||
## Current Scenario | ||
There are many components in Dapr which don't yet implement Ping. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Comment from Alessandro (@ItalyPaleAle ): That said, we should review components that don't implement Ping, and see if adding it would be useful. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed |
||
Ping is not mandatory to be implemented by Components, which is the correct behavior, as it could lead to false positives. | ||
For this components health-check, components not implementing Ping will be omitted out. | ||
|
||
## API Design | ||
### Endpoint: | ||
Instead of an additional endpoint, the Approach underneath works with a query parameter, in addition to `healthz` endpoint. | ||
|
||
Following are the possible responses for `healthz` API: | ||
|
||
| HTTP | Response Codes | | ||
| -------- | -------- | | ||
| 204 | dapr is healthy | | ||
| 500 | dapr is not healthy | | ||
|
||
Hence, if dapr is healthy, then the response code changes, as per the enlisted cases below. | ||
If dapr is NOT healthy, then the compoenents Health check should anyways NOT be checked. | ||
|
||
http://localhost:3500/v1.0/healthz?include_components=true | ||
|
||
### Approach: | ||
- Maintain a cache with status of all components loaded successfully and keep updating this cache in a background go routine at a configurable `pingUpdateFrequency`. By default, `pingUpdateFrequency` to be 5 minutes. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Some thoughts:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If un-healthy we should use an exponential backoff (with a limit of |
||
|
||
- This cache will not start to be built, right at the boot of daprd sidecar. There will be flag (let's say `collectPings`), which will be `false` at the beginning of the daprd sidecar and which will be turned `true`, once all the components are ready. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How are these options passed? Are they in the Configuration CRD? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
Once, `collectPings` is `true`, the cache will start to be populated. | ||
|
||
- But, what happens if a component fails to initialize? | ||
If a mandatory component fails to initialize, then daprd will not come up healthy and thus, anyways health check will report unhealthy. | ||
If an optional component fails to initialize OR if it is not healthy afterwards as well, then it is governed by a query parameter `ignoreOptionalComponent`, which is `true` by default. So, as the name suggests, if this query param is not set to `false`, then optional components failure to initialize OR failure to report healthy will not result in a un-healthy status. Rather, the http status code 207 will be reported back. | ||
ItalyPaleAle marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
- For components which don't yet implement Ping, they will be ignored for their health check. | ||
|
||
- Response of healthz endpoint will be always only a http status code. | ||
The result i.e. `healthCheckStatus` and `errorCode`/`message`, if required will be provided as part of `metadata` API. This is done, as dapr healthz endpoint is publicly accessible, so for security constraints, it doesn't deem good to provide component names etc. as a response of `healthz` endpoint. | ||
|
||
- To implement only http endpoint, at least for the first version of this API. | ||
|
||
- This endpoint is not supposed to support query for a particular component Health check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment from Alessandro (@ItalyPaleAle ): I don't know if making Ping mandatory is needed. A lot of components are stateless (for example, they don't maintain persistent connections with a remove service). IMHO it's fine to include Ping in an optional interface.
Ref: https://hackmd.io/MaNpUYRyQqe-0eqCsSUiIg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it is Optional only right now. And, that is the correct state in my opinion too. The doc also doesn't recommend it to make mandatory.