Add liveness probe to Antrea Controller #1839

antoninbas
Contributor

This is motivated by #1837, in which the Antrea Controller got into a bad
state (presumably because of a bug in the apiserver library) and
did not recover. The endpoint was marked as non-Ready for the Antrea
Service, which broke the APIService, since the Antrea
Service has a single Endpoint (single Controller replica). With a
liveness probe, the Antrea Controller would have been restarted by
kubelet and the APIService would hopefully have recovered.

The /healthz endpoint has been deprecated since K8s v1.16, so we switch to
/readyz and /livez.
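
To illustrate why a liveness probe helps where a readiness probe alone does not, here is a minimal Python sketch of the two outcomes. It is a simplified model and not kubelet's actual implementation: the `ProbeState` type, the function names, and the returned action strings are all hypothetical. The only Kubernetes concepts it relies on are that a failing readiness probe removes the Pod from the Service endpoints, while a failing liveness probe gets the container restarted.

```python
from dataclasses import dataclass


@dataclass
class ProbeState:
    """Tracks consecutive probe failures for one container (simplified model)."""
    consecutive_failures: int = 0


def record_probe(state: ProbeState, success: bool) -> None:
    """Update the failure counter after one probe attempt."""
    state.consecutive_failures = 0 if success else state.consecutive_failures + 1


def readiness_action(state: ProbeState, failure_threshold: int) -> str:
    """A failing readiness probe only takes the Pod out of the Service endpoints."""
    if state.consecutive_failures >= failure_threshold:
        return "remove-from-endpoints"
    return "keep-in-endpoints"


def liveness_action(state: ProbeState, failure_threshold: int) -> str:
    """A failing liveness probe leads to a container restart."""
    if state.consecutive_failures >= failure_threshold:
        return "restart-container"
    return "keep-running"
```

With only a readiness probe, a Controller stuck in a bad state stays in the "remove-from-endpoints" branch indefinitely; with a single replica that leaves the Service without any endpoint. The liveness branch is what gives the process a chance to recover.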

@antoninbas antoninbas force-pushed the add-liveness-probe-for-antrea-controller branch from 7684406 to b535177 Compare February 8, 2021 22:44
@codecov-io

codecov-io commented Feb 8, 2021

Codecov Report

❗ No coverage uploaded for pull request base (main@abb6c33).
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1839   +/-   ##
=======================================
  Coverage        ?   61.80%           
=======================================
  Files           ?      196           
  Lines           ?    16655           
  Branches        ?        0           
=======================================
  Hits            ?    10294           
  Misses          ?     5270           
  Partials        ?     1091           
Flag             Coverage Δ
kind-e2e-tests   51.22% <0.00%> (?)
unit-tests       42.81% <0.00%> (?)

Flags with carried forward coverage won't be shown.

jianjuns
jianjuns previously approved these changes Feb 9, 2021
Contributor

@jianjuns jianjuns left a comment

I assume /livez and /readyz are also available in 1.15?

@antoninbas
Contributor Author

I assume /livez and /readyz are also available in 1.15?

They're not, but that reminds me I need to update generate-manifest.sh since we have the --k8s-1.15 option

jianjuns
jianjuns previously approved these changes Feb 9, 2021
@tnqn
Member

tnqn commented Feb 9, 2021

I assume /livez and /readyz are also available in 1.15?

They're not, but that reminds me I need to update generate-manifest.sh since we have the --k8s-1.15 option

We are relying on the apiserver library and not the K8s API of the cluster, so the paths should be there even in a K8s 1.15 cluster?

@antoninbas
Contributor Author

I assume /livez and /readyz are also available in 1.15?

They're not, but that reminds me I need to update generate-manifest.sh since we have the --k8s-1.15 option

We are relying on the apiserver library and not the K8s API of the cluster, so the paths should be there even in a K8s 1.15 cluster?

Good point, I can remove my commit :)
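
The point above can be sketched as follows: the health paths are served by the process's own embedded server, so a probe only ever talks to that process and never to the cluster's API server. This is a rough Python illustration under stated assumptions, not Antrea's code: the handler, `start_server`, and `probe` are hypothetical names, and the sketch uses plain HTTP on localhost, whereas a real apiserver-based component would typically serve over HTTPS.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption for this sketch: the embedded server registers these paths itself.
HEALTH_PATHS = {"/livez", "/readyz"}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer 200 on the registered health paths, 404 on anything else.
        status = 200 if self.path in HEALTH_PATHS else 404
        body = b"ok" if status == 200 else b"not found"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def start_server() -> HTTPServer:
    """Start the embedded server on an ephemeral localhost port."""
    server = HTTPServer(("127.0.0.1", 0), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe(port: int, path: str) -> int:
    """Issue one GET against the process's own server and return the status code."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"http://127.0.0.1:{port}{path}", timeout=2) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

Whether `/livez` answers depends only on what the process itself registers, which is why the version of the surrounding cluster does not matter for the probe paths.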

@antoninbas antoninbas force-pushed the add-liveness-probe-for-antrea-controller branch from e752d43 to b535177 Compare February 9, 2021 02:05
Member

@tnqn tnqn left a comment

LGTM

@antoninbas
Contributor Author

/test-all

@antoninbas
Contributor Author

/test-e2e

@antoninbas antoninbas added this to the Antrea v0.13.0 release milestone Feb 9, 2021
@antoninbas antoninbas merged commit 138f137 into antrea-io:main Feb 9, 2021
@antoninbas antoninbas deleted the add-liveness-probe-for-antrea-controller branch February 9, 2021 22:50