Add liveness probe to Antrea Controller #1839
This is motivated by antrea-io#1837, in which the Antrea Controller got into a bad state (presumably because of a bug in the apiserver library) and did not recover. The Endpoint was marked as non-Ready for the Antrea Service, which broke the APIService, since the Antrea Service has a single Endpoint (single Controller replica). With a liveness probe, kubelet would have restarted the Antrea Controller and the APIService would hopefully have recovered. The /healthz endpoint has been deprecated since K8s v1.16, so we switch to /readyz and /livez.
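For readers unfamiliar with the /livez and /readyz split, here is a minimal Go sketch of the idea. It is not how the Antrea Controller implements these endpoints (the Controller gets them from the apiserver library); the `health` type and the `newHealthMux` and `probe` helpers are hypothetical names used only for illustration. It shows a process that is alive but not ready: a liveness check on /livez succeeds, while a readiness check on /readyz fails.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// health is a hypothetical holder for two independent signals,
// mirroring the /livez versus /readyz split.
type health struct {
	alive atomic.Bool // false: the process is wedged and should be restarted
	ready atomic.Bool // false: the process should not receive traffic
}

// statusHandler answers 200 "ok" when the flag is set and 503 otherwise.
func statusHandler(flag *atomic.Bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if !flag.Load() {
			http.Error(w, "not ok", http.StatusServiceUnavailable)
			return
		}
		fmt.Fprint(w, "ok")
	}
}

// newHealthMux registers the two health paths on a fresh mux.
func newHealthMux(h *health) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/livez", statusHandler(&h.alive))
	mux.HandleFunc("/readyz", statusHandler(&h.ready))
	return mux
}

// probe issues an in-memory GET against the handler and returns the
// HTTP status code, roughly what an HTTP probe would observe.
func probe(handler http.Handler, path string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, path, nil)
	handler.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := &health{}
	h.alive.Store(true) // alive, but not yet ready
	mux := newHealthMux(h)
	fmt.Println(probe(mux, "/livez"), probe(mux, "/readyz")) // prints "200 503"
}
```

In this sketch a failing /readyz only takes the process out of rotation, whereas a failing /livez is the signal that would lead kubelet to restart the container, which is the recovery path this PR adds.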
Codecov Report
@@           Coverage Diff           @@
##             main    #1839   +/-   ##
=======================================
  Coverage        ?   61.80%
=======================================
  Files           ?      196
  Lines           ?    16655
  Branches        ?        0
=======================================
  Hits            ?    10294
  Misses          ?     5270
  Partials        ?     1091
I assume /livez and /readyz are also available in 1.15?
They're not, but that reminds me I need to update generate-manifest.sh since we have the
We are relying on the apiserver library, not the K8s API of the cluster, so the paths should be there even in a K8s 1.15 cluster?
Good point, I can remove my commit :)
LGTM
/test-all
/test-e2e