What steps did you take and what happened:
In most attempts, only the HAProxy and control plane nodes are created; the worker nodes are not.
NAME AGE
haproxyloadbalancer.infrastructure.cluster.x-k8s.io/vsphere-quickstart 24m
NAME PROVIDERID PHASE
machine.cluster.x-k8s.io/vsphere-quickstart-gbr84 vsphere://42191e50-0a4f-1dd0-b03c-a23d14ef90f5 Provisioning
machine.cluster.x-k8s.io/vsphere-quickstart-md-0-76675f574-5n9k6 Pending
Tracing the failure shows that kubeadm init timed out (4m0s) waiting for the controlPlaneEndpoint (HAProxy) to become reachable. The controlPlaneEndpoint does become reachable later on, but because kubeadm init had already failed, coredns and kube-proxy are never installed.
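The difference between the two paths can be checked by hand on the control plane node while kubeadm init is stuck in the wait-control-plane phase (a hedged sketch using curl; 10.60.31.238 is the HAProxy controlPlaneEndpoint and localhost the node-local kube-apiserver, as seen in the logs below):

# Talks to the local kube-apiserver directly; answers once the static pod is up.
curl -k --connect-timeout 5 https://localhost:6443/healthz

# Goes through the HAProxy load balancer; in this failure window it times out or returns EOF.
curl -k --connect-timeout 5 https://10.60.31.238:6443/healthz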
This bug does not occur with CAPV v0.6.1; it only occurs with v0.6.2.
What did you expect to happen: kubeadm init should succeed.
Anything else you would like to add:
This is the control plane node log where kubeadm init failed.
Mar 25 07:56:03 vsphere-quickstart-gbr84 cloud-init: W0325 07:56:03.136945 1793 validation.go:28] Cannot validate kube-proxy config - no validator is available
Mar 25 07:56:03 vsphere-quickstart-gbr84 cloud-init: W0325 07:56:03.136956 1793 validation.go:28] Cannot validate kubelet config - no validator is available
Mar 25 07:56:03 vsphere-quickstart-gbr84 cloud-init: [init] Using Kubernetes version: v1.17.3
Mar 25 07:56:03 vsphere-quickstart-gbr84 cloud-init: [preflight] Running pre-flight checks
Mar 25 07:56:03 vsphere-quickstart-gbr84 cloud-init: [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
Mar 25 07:56:03 vsphere-quickstart-gbr84 cloud-init: [preflight] Pulling images required for setting up a Kubernetes cluster
Mar 25 07:56:03 vsphere-quickstart-gbr84 cloud-init: [preflight] This might take a minute or two, depending on the speed of your internet connection
Mar 25 07:56:03 vsphere-quickstart-gbr84 cloud-init: [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
Mar 25 07:56:04 vsphere-quickstart-gbr84 cloud-init: [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
Mar 25 07:56:04 vsphere-quickstart-gbr84 cloud-init: [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
Mar 25 07:56:04 vsphere-quickstart-gbr84 cloud-init: [kubelet-start] Starting the kubelet
Mar 25 07:56:04 vsphere-quickstart-gbr84 cloud-init: [certs] Using certificateDir folder "/etc/kubernetes/pki"
Mar 25 07:56:04 vsphere-quickstart-gbr84 cloud-init: [certs] Using existing ca certificate authority
Mar 25 07:56:05 vsphere-quickstart-gbr84 cloud-init: [certs] Generating "apiserver" certificate and key
Mar 25 07:56:05 vsphere-quickstart-gbr84 cloud-init: [certs] apiserver serving cert is signed for DNS names [vsphere-quickstart-gbr84 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.60.31.227 10.60.31.238]
Mar 25 07:56:05 vsphere-quickstart-gbr84 cloud-init: [certs] Generating "apiserver-kubelet-client" certificate and key
Mar 25 07:56:05 vsphere-quickstart-gbr84 cloud-init: [certs] Using existing front-proxy-ca certificate authority
Mar 25 07:56:05 vsphere-quickstart-gbr84 cloud-init: [certs] Generating "front-proxy-client" certificate and key
Mar 25 07:56:05 vsphere-quickstart-gbr84 cloud-init: [certs] Using existing etcd/ca certificate authority
Mar 25 07:53:07 vsphere-quickstart-gbr84 cloud-init: [certs] Generating "etcd/server" certificate and key
Mar 25 07:53:07 vsphere-quickstart-gbr84 cloud-init: [certs] etcd/server serving cert is signed for DNS names [vsphere-quickstart-gbr84 localhost] and IPs [10.60.31.227 127.0.0.1 ::1]
Mar 25 07:53:07 vsphere-quickstart-gbr84 cloud-init: [certs] Generating "etcd/peer" certificate and key
Mar 25 07:53:07 vsphere-quickstart-gbr84 cloud-init: [certs] etcd/peer serving cert is signed for DNS names [vsphere-quickstart-gbr84 localhost] and IPs [10.60.31.227 127.0.0.1 ::1]
Mar 25 07:53:07 vsphere-quickstart-gbr84 cloud-init: [certs] Generating "etcd/healthcheck-client" certificate and key
Mar 25 07:53:07 vsphere-quickstart-gbr84 cloud-init: [certs] Generating "apiserver-etcd-client" certificate and key
Mar 25 07:53:07 vsphere-quickstart-gbr84 cloud-init: [certs] Using the existing "sa" key
Mar 25 07:53:07 vsphere-quickstart-gbr84 cloud-init: [kubeconfig] Using kubeconfig folder "/etc/kubernetes"
Mar 25 07:53:08 vsphere-quickstart-gbr84 cloud-init: [kubeconfig] Writing "admin.conf" kubeconfig file
Mar 25 07:53:08 vsphere-quickstart-gbr84 cloud-init: [kubeconfig] Writing "kubelet.conf" kubeconfig file
Mar 25 07:53:09 vsphere-quickstart-gbr84 cloud-init: [kubeconfig] Writing "controller-manager.conf" kubeconfig file
Mar 25 07:53:09 vsphere-quickstart-gbr84 cloud-init: [kubeconfig] Writing "scheduler.conf" kubeconfig file
Mar 25 07:53:09 vsphere-quickstart-gbr84 cloud-init: [control-plane] Using manifest folder "/etc/kubernetes/manifests"
Mar 25 07:53:09 vsphere-quickstart-gbr84 cloud-init: [control-plane] Creating static Pod manifest for "kube-apiserver"
Mar 25 07:53:09 vsphere-quickstart-gbr84 cloud-init: [control-plane] Creating static Pod manifest for "kube-controller-manager"
Mar 25 07:53:09 vsphere-quickstart-gbr84 cloud-init: W0325 07:53:09.267906 1793 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
Mar 25 07:53:09 vsphere-quickstart-gbr84 cloud-init: [control-plane] Creating static Pod manifest for "kube-scheduler"
Mar 25 07:53:09 vsphere-quickstart-gbr84 cloud-init: W0325 07:53:09.269338 1793 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
Mar 25 07:53:09 vsphere-quickstart-gbr84 cloud-init: [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
Mar 25 07:53:09 vsphere-quickstart-gbr84 cloud-init: [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
Mar 25 07:53:49 vsphere-quickstart-gbr84 cloud-init: [kubelet-check] Initial timeout of 40s passed.
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: Unfortunately, an error has occurred:
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: timed out waiting for the condition
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: This error is likely caused by:
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: - The kubelet is not running
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: - 'systemctl status kubelet'
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: - 'journalctl -xeu kubelet'
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: Additionally, a control plane component may have crashed or exited when started by the container runtime.
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: Here is one example how you may list all Kubernetes containers running in docker:
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: - 'docker ps -a | grep kube | grep -v pause'
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: Once you have found the failing container, you can inspect its logs with:
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: - 'docker logs CONTAINERID'
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: To see the stack trace of this error execute with --v=5 or higher
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: 2020-03-25 07:57:09,287 - util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/runcmd [1]
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: 2020-03-25 07:57:09,289 - cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: 2020-03-25 07:57:09,290 - util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python2.7/site-packages/cloudinit/config/cc_scripts_user.pyc'>) failed
Mar 25 07:57:09 vsphere-quickstart-gbr84 cloud-init: Cloud-init v. 18.5 finished at Wed, 25 Mar 2020 07:57:09 +0000. Datasource DataSourceVMwareGuestInfo. Up 250.98 seconds
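For reference, the exact command that cloud-init ran (and its full output) can be read back from the standard cloud-init locations referenced in the warnings above; these paths are generic cloud-init behaviour, not CAPV-specific:

# The runcmd script that cloud-init reported as failed (see the util.py warning above).
cat /var/lib/cloud/instance/scripts/runcmd

# Full console output of the cloud-init run, including the kubeadm init output.
cat /var/log/cloud-init-output.log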
Communication failures from the kubelet to the controlPlaneEndpoint keep being logged even after kubeadm init has finished:
Mar 25 07:57:57 vsphere-quickstart-gbr84 kubelet[11900]: E0325 07:57:57.927013 11900 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/kubelet.go:458: Failed to list *v1.Node: Get https://10.60.31.238:6443/api/v1/nodes?fieldSelector=metadata.name%3Dvsphere-quickstart-gbr84&limit=500&resourceVersion=0: EOF
Mar 25 07:57:57 vsphere-quickstart-gbr84 kubelet[11900]: E0325 07:57:57.927888 11900 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/kubelet.go:449: Failed to list *v1.Service: Get https://10.60.31.238:6443/api/v1/services?limit=500&resourceVersion=0: EOF
Mar 25 07:57:57 vsphere-quickstart-gbr84 kubelet[11900]: E0325 07:57:57.929063 11900 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://10.60.31.238:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dvsphere-quickstart-gbr84&limit=500&resourceVersion=0: EOF
Mar 25 07:57:58 vsphere-quickstart-gbr84 kubelet[11900]: E0325 07:57:58.928522 11900 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/kubelet.go:458: Failed to list *v1.Node: Get https://10.60.31.238:6443/api/v1/nodes?fieldSelector=metadata.name%3Dvsphere-quickstart-gbr84&limit=500&resourceVersion=0: EOF
Mar 25 07:57:58 vsphere-quickstart-gbr84 kubelet[11900]: E0325 07:57:58.928814 11900 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/kubelet.go:449: Failed to list *v1.Service: Get https://10.60.31.238:6443/api/v1/services?limit=500&resourceVersion=0: EOF
Mar 25 07:57:58 vsphere-quickstart-gbr84 kubelet[11900]: E0325 07:57:58.930204 11900 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://10.60.31.238:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dvsphere-quickstart-gbr84&limit=500&resourceVersion=0: EOF
I have confirmed that communication to localhost:6443 (bypassing the controlPlaneEndpoint) works normally, and that after some time communication through the controlPlaneEndpoint also succeeds. However, because kube-proxy was never installed, the CNI plugin cannot be installed either, so worker nodes are never created.
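A possible manual workaround, sketched here only under the assumption that /etc/kubernetes/admin.conf was written before the timeout (the kubeconfig phase completes in the log above), is to re-run just the addon phases that the failed init skipped once the endpoint is reachable:

# Re-generate the kube-proxy and CoreDNS addons that the failed init never installed.
# Defaults are used here for brevity; in practice the same configuration passed to the
# original kubeadm init should be supplied via --config so the generated kube-proxy
# kubeconfig points at the controlPlaneEndpoint.
kubeadm init phase addon kube-proxy --kubeconfig /etc/kubernetes/admin.conf
kubeadm init phase addon coredns --kubeconfig /etc/kubernetes/admin.conf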
/kind bug
Environment:
- Kubernetes version (kubectl version): v1.17.3
- OS (/etc/os-release): CentOS Linux release 7.7.1908 (Core)