Add extra health check for k8s installation #266

Open
wants to merge 1 commit into
base: master


Member

@kahou82 kahou82 commented Oct 9, 2017

Currently, the contiv k8s installation only checks the status of contiv-related pods.
This is not sufficient, as there is a chance that we cannot
run any other pod.

This ticket enhances the k8s installation to ensure that we can
start a regular pod successfully.

Signed-off-by: Kahou Lei [email protected]

@unclejack
Contributor

@kahou82: I've looked into this. Unfortunately, we need to add this check after default-net is created; otherwise the bootstrap of the cluster never finishes. The default-net isn't created unless the setup in install.sh finishes without error. A good place for this test would be after this block: https://github.com/contiv/install/blob/master/scripts/kubeadm_test.sh#L57

Once the pod is fine, we should also clean it up.


for i in {0..15}; do
sleep 2
$kubectl get pods | grep test-pod | grep -v "Running" && continue
Contributor

you may want to check all namespaces

@neelimamukiri
Contributor

+1 to @unclejack that we need the default-net created before the pods can run. I don't think we want to check all namespaces, as the system namespace will have pods running with host=net.

If possible, can you add a message saying that the installation is complete and that you are starting some kind of self test? That way, if things do fail, people will know what to do next. Maybe point them to the contiv support FAQ page or something :)
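
To make the suggestions in this thread concrete, here is a rough sketch of how the check might look once it announces the self test, waits for the pod, and cleans it up afterwards. It is only an illustration, not the code in this PR: the function names `wait_for_pod_running` and `self_test`, the retry parameters, and the message wording are all assumptions. It assumes the pod (named `test-pod`, as in the diff) has already been created after default-net exists, and that `$kubectl` holds the kubectl command as in the snippet above.

```shell
#!/bin/sh
# Hypothetical sketch: names, messages, and defaults are assumptions,
# not part of the contiv install scripts.

# Use the kubectl command from $kubectl if the caller set it.
kubectl="${kubectl:-kubectl}"

# Poll the pod phase until it is Running or the attempts run out.
# Arguments: pod name, number of attempts, delay in seconds.
wait_for_pod_running() {
    pod="$1"
    attempts="${2:-15}"
    delay="${3:-2}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Query only the default namespace; the phase is read directly
        # instead of grepping the table output.
        phase=$($kubectl get pod "$pod" -o jsonpath='{.status.phase}' 2>/dev/null)
        if [ "$phase" = "Running" ]; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Announce the self test, wait for the pod, and delete it on success.
self_test() {
    pod="${1:-test-pod}"
    echo "Installation is complete. Starting self test with pod '$pod'."
    if wait_for_pod_running "$pod" "${2:-15}" "${3:-2}"; then
        echo "Self test passed: pod '$pod' is Running."
        # Clean up the test pod once it has proven that pods can start.
        $kubectl delete pod "$pod" >/dev/null
        return 0
    fi
    echo "Self test failed: pod '$pod' did not reach Running." >&2
    return 1
}
```

Calling `self_test test-pod` after the default-net block would print the start message, poll for up to roughly 30 seconds with the defaults, and return non-zero on failure so the caller can decide what guidance to show next.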
