Liveness probe support request #333

arturas-d · 2020-12-29T15:35:33Z

We use Gatekeeper to ensure that our Kubernetes cluster complies with company policies. One of the main policies requires a liveness probe, which helps the kubelet know when it should restart a container. Without a defined liveness probe, we simply can't deploy aws-node-termination-handler on our cluster, because it violates required policies.
Is there a recommended command for defining a liveness probe at the moment? Will it be supported in the future?

bwagner5 · 2020-12-29T18:25:36Z

I think the liveness probe is a reasonable thing to add to aws-node-termination-handler especially when operating in queue-processor mode (running as a K8s Deployment).

IMDS mode operates as a DaemonSet and by default runs w/ HostNetworking, meaning it shares the root network namespace of the node it's running on. Exposing a liveness endpoint in the root network namespace might be problematic because of port conflicts and security concerns.

I would propose implementing a configuration option called --enable-health=8080 or ENABLE_HEALTH=8080 which would activate a health check HTTP endpoint on the specified port which can be used as the liveness probe. The helm chart could set enableHealth default as 8080 in queue-processor mode and disable it in IMDS mode (the user could still enable it in IMDS mode if they wanted).

$ curl localhost:8080/health
{
   "health": "OK"
}

As for what the health endpoint should check... I'm not quite sure right now. It might be enough that the webserver is running, at least for a first draft. aws-node-termination-handler is pretty liberal about exiting when a fatal error or persistent errors occur, since it knows Kubernetes will restart it.
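
To make the proposal concrete, here is a minimal sketch of such an endpoint in Go. It is an illustration only, not the implementation that was eventually merged: the ENABLE_HEALTH variable and the /health path come from the proposal above, while the helper names healthAddr and healthHandler are hypothetical. The handler proves only that the webserver is answering, which is the "first draft" behavior described above.

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"os"
	"strconv"
)

// healthAddr turns the value of the proposed ENABLE_HEALTH setting into a
// listen address. An empty or invalid value leaves the endpoint disabled.
// (Hypothetical helper, not part of aws-node-termination-handler.)
func healthAddr(value string) (string, bool) {
	if value == "" {
		return "", false
	}
	port, err := strconv.Atoi(value)
	if err != nil || port < 1 || port > 65535 {
		return "", false
	}
	return ":" + strconv.Itoa(port), true
}

// healthHandler answers with the JSON body from the proposal. It shows only
// that the web server is running; it checks nothing else.
func healthHandler(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(map[string]string{"health": "OK"}); err != nil {
		log.Printf("writing health response: %v", err)
	}
}

func main() {
	addr, enabled := healthAddr(os.Getenv("ENABLE_HEALTH"))
	if !enabled {
		// Disabled unless a port is given, as suggested for IMDS mode.
		log.Println("health endpoint disabled (ENABLE_HEALTH not set or invalid)")
		return
	}
	mux := http.NewServeMux()
	mux.HandleFunc("/health", healthHandler)
	log.Fatal(http.ListenAndServe(addr, mux))
}
```

Running it with ENABLE_HEALTH=8080 and then calling curl localhost:8080/health should return the JSON body shown earlier. Leaving the variable unset keeps the port closed, which avoids the HostNetworking port conflicts mentioned for IMDS mode.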

How does this sound to you @arturas-d ? Anyone else with an opinion on this proposal, let me know!

bwagner5 · 2020-12-29T18:30:36Z

@arturas-d you also asked if there was something existing that could be used for the liveness probe. You could use the /metrics endpoint. This endpoint is disabled by default (for the same reasons I mentioned above for IMDS mode operating w/ HostNetworking).

You can configure it with the following parameters in the helm chart (https://github.com/aws/aws-node-termination-handler/tree/main/config/helm/aws-node-termination-handler):

enablePrometheusServer
prometheusServerPort

And then modify the PodSpec to something like this:

livenessProbe:
  httpGet:
    path: /metrics
    port: 9092
  initialDelaySeconds: 10
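
The reason /metrics works as a stand-in is that an httpGet probe only looks at the HTTP status code: the kubelet treats any code from 200 up to but not including 400 as success. The sketch below mimics a single probe attempt in Go. The stub /metrics handler and the function names are hypothetical and only stand in for the real Prometheus server, so this is an approximation of the check rather than the kubelet's actual code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// probeSucceeded applies the kubelet rule for httpGet probes: a status code
// of at least 200 and below 400 counts as success.
func probeSucceeded(statusCode int) bool {
	return statusCode >= 200 && statusCode < 400
}

// checkOnce performs a single GET, roughly as one probe attempt would.
// The one-second timeout matches the default timeoutSeconds of a probe.
func checkOnce(url string) (bool, error) {
	client := &http.Client{Timeout: time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return probeSucceeded(resp.StatusCode), nil
}

func main() {
	// Stand-in for the handler's Prometheus server; the real one listens on
	// prometheusServerPort (9092 in the snippet above).
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "# stub metrics output")
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	ok, err := checkOnce(srv.URL + "/metrics")
	fmt.Println("probe succeeded:", ok, "error:", err)
}
```

Because only the status code matters, this kind of probe confirms that the metrics server is responding, not that the handler is processing events correctly.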

tomelliff · 2021-03-25T18:31:50Z

@sliaptsou are you planning on raising that as a pull request back upstream?

sliaptsou · 2021-03-25T18:36:31Z

@tomelliff what do you mean?

sliaptsou · 2021-03-25T18:42:29Z

@bwagner5 Could you please review again?

bwagner5 · 2021-03-26T17:33:11Z

@sliaptsou Yes! Sorry for the delay!

bwagner5 · 2021-03-26T18:47:17Z

Merged and released!

tomelliff · 2021-03-29T12:45:06Z

> @tomelliff what do you mean?

Arf, apologies. I only saw your PR on your own fork that you had merged and didn't see the PR link back upstream.

bwagner5 added the Type: Enhancement label Dec 29, 2020

bwagner5 added the good first issue label Jan 6, 2021

sliaptsou mentioned this issue Feb 5, 2021

Implement Liveness probe sliaptsou/aws-node-termination-handler#1

Closed

jillmon added and removed the Status: Help Wanted label Feb 11, 2021

This was referenced Mar 12, 2021

Implement Liveness probe sliaptsou/aws-node-termination-handler#2

Merged

Liveness probe support #389

Merged

bwagner5 closed this as completed Mar 26, 2021

gabegorelick mentioned this issue Oct 27, 2021

Enable livenessProbe in queue processor mode by default #520

Closed
