TASK [master : Initialize Master v1.14.1] Fails #57
Comments
I'm having the same issue, but for me it occasionally works (maybe 1 out of 10 tries).
To force the use of systemd instead of cgroupfs, edit the Docker daemon configuration.
I'm working on an Ansible task to do this. I've got the task to edit the line, and added a task to restart the Docker service, but it seems you need to reboot the whole Pi for it to take effect.
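For reference, here is a minimal sketch of what such tasks might look like, assuming the common `/etc/docker/daemon.json` approach to switching Docker's cgroup driver. The task names, file layout, and the reboot step are illustrative only and are not taken from rak8s:

```yaml
# Hypothetical tasks: write Docker's daemon.json to request the systemd
# cgroup driver, restart Docker, and reboot so kubelet picks it up.
- name: Set Docker cgroup driver to systemd
  become: true
  copy:
    dest: /etc/docker/daemon.json
    content: |
      {
        "exec-opts": ["native.cgroupdriver=systemd"]
      }

- name: Restart Docker
  become: true
  systemd:
    name: docker
    state: restarted

- name: Reboot the Pi so the new cgroup driver takes effect everywhere
  become: true
  reboot:
```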
Here is the output of the journalctl command:
I can reproduce the issue from the command line, so I think it might be a k8s issue, not rak8s. |
Did as you suggested manually. Now I got this from journalctl:
I wiped and completely reinstalled Raspbian Lite on my k8s master. When I ran the playbook I got:
It appears that rak8s is not installing dependencies (or removing them when running the cleanup playbook). So I installed kubeadm manually with apt-get, which installed all the dependencies. Rerunning the playbook got past the master install, but it hangs on joining the workers to the cluster.
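A rough Ansible equivalent of that manual step might look like the sketch below (illustrative only; it assumes the Kubernetes apt repository and signing key are already configured on the node):

```yaml
# Hypothetical task mirroring the manual `apt-get install kubeadm`;
# kubeadm pulls in kubelet, kubectl, and kubernetes-cni as dependencies.
- name: Install kubeadm and its dependencies
  become: true
  apt:
    name: kubeadm
    state: present
    update_cache: yes
```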
My guess is that it is a k8s issue because I get the same error running the command manually. At this point I'm going to take a break, then reinstall Raspbian on all my nodes and just follow the manual instructions here. Good luck!
Hit this issue this evening -- still debugging. In the meantime, I just submitted a PR that fixes the "No package matching 'kubelet' is available" issue for me. |
I merged your changes in, @PostlMC. Please test on clean installs if you can.
Any updates here? |
OS running on Ansible host:
macOS 10.14.4
Ansible Version (ansible --version):
ansible 2.7.10
config file = None
configured module search path = ['/Users/peiman/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/Cellar/ansible/2.7.10/libexec/lib/python3.7/site-packages/ansible
executable location = /usr/local/bin/ansible
python version = 3.7.3 (default, Mar 29 2019, 15:51:26) [Clang 10.0.1 (clang-1001.0.46.3)]
Uploaded logs showing errors (rak8s/.log/ansible.log):
2019-04-20 10:13:47,555 p=21017 u=peiman | TASK [master : Initialize Master v1.14.1] ****************************************************************************************************************************************************************************************************
2019-04-20 10:21:26,984 p=21017 u=peiman | fatal: [rak8s000]: FAILED! => {"changed": true, "cmd": "kubeadm init --apiserver-advertise-address=192.168.1.60 --token=udy29x.ugyyk3tumg27atmr --kubernetes-version=v1.14.1 --pod-network-cidr=10.244.0.0/16", "delta": "0:07:38.901933", "end": "2019-04-20 08:21:26.902845", "msg": "non-zero return code", "rc": 1, "start": "2019-04-20 08:13:48.000912", "stderr": "\t[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/\nerror execution phase wait-control-plane: couldn't initialize a Kubernetes cluster", "stderr_lines": ["\t[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/", "error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster"], "stdout": "[init] Using Kubernetes version: v1.14.1\n[preflight] Running pre-flight checks\n[preflight] Pulling images required for setting up a Kubernetes cluster\n[preflight] This might take a minute or two, depending on the speed of your internet connection\n[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'\n[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"\n[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"\n[kubelet-start] Activating the kubelet service\n[certs] Using certificateDir folder "/etc/kubernetes/pki"\n[certs] Generating "etcd/ca" certificate and key\n[certs] Generating "etcd/healthcheck-client" certificate and key\n[certs] Generating "apiserver-etcd-client" certificate and key\n[certs] Generating "etcd/server" certificate and key\n[certs] etcd/server serving cert is signed for DNS names [rak8s000 localhost] and IPs [192.168.1.60 127.0.0.1 ::1]\n[certs] Generating "etcd/peer" certificate and key\n[certs] etcd/peer serving cert is signed for DNS names [rak8s000 localhost] and IPs [192.168.1.60 127.0.0.1 ::1]\n[certs] Generating "ca" certificate and key\n[certs] Generating "apiserver" certificate and key\n[certs] apiserver serving cert is signed for DNS names [rak8s000 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.60]\n[certs] Generating "apiserver-kubelet-client" certificate and key\n[certs] Generating "front-proxy-ca" certificate and key\n[certs] Generating "front-proxy-client" certificate and key\n[certs] Generating "sa" key and public key\n[kubeconfig] Using kubeconfig folder "/etc/kubernetes"\n[kubeconfig] Writing "admin.conf" kubeconfig file\n[kubeconfig] Writing "kubelet.conf" kubeconfig file\n[kubeconfig] Writing "controller-manager.conf" kubeconfig file\n[kubeconfig] Writing "scheduler.conf" kubeconfig file\n[control-plane] Using manifest folder "/etc/kubernetes/manifests"\n[control-plane] Creating static Pod manifest for "kube-apiserver"\n[control-plane] Creating static Pod manifest for "kube-controller-manager"\n[control-plane] Creating static Pod manifest for "kube-scheduler"\n[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"\n[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". 
This can take up to 4m0s\n[kubelet-check] Initial timeout of 40s passed.\n[kubelet-check] It seems like the kubelet isn't running or healthy.\n[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.\n[kubelet-check] It seems like the kubelet isn't running or healthy.\n[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.\n[kubelet-check] It seems like the kubelet isn't running or healthy.\n[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.\n[kubelet-check] It seems like the kubelet isn't running or healthy.\n[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.\n[kubelet-check] It seems like the kubelet isn't running or healthy.\n[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.\n\nUnfortunately, an error has occurred:\n\ttimed out waiting for the condition\n\nThis error is likely caused by:\n\t- The kubelet is not running\n\t- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)\n\nIf you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:\n\t- 'systemctl status kubelet'\n\t- 'journalctl -xeu kubelet'\n\nAdditionally, a control plane component may have crashed or exited when started by the container runtime.\nTo troubleshoot, list all containers using your preferred container runtimes CLI, e.g. 
docker.\nHere is one example how you may list all Kubernetes containers running in docker:\n\t- 'docker ps -a | grep kube | grep -v pause'\n\tOnce you have found the failing container, you can inspect its logs with:\n\t- 'docker logs CONTAINERID'", "stdout_lines": ["[init] Using Kubernetes version: v1.14.1", "[preflight] Running pre-flight checks", "[preflight] Pulling images required for setting up a Kubernetes cluster", "[preflight] This might take a minute or two, depending on the speed of your internet connection", "[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'", "[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"", "[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"", "[kubelet-start] Activating the kubelet service", "[certs] Using certificateDir folder "/etc/kubernetes/pki"", "[certs] Generating "etcd/ca" certificate and key", "[certs] Generating "etcd/healthcheck-client" certificate and key", "[certs] Generating "apiserver-etcd-client" certificate and key", "[certs] Generating "etcd/server" certificate and key", "[certs] etcd/server serving cert is signed for DNS names [rak8s000 localhost] and IPs [192.168.1.60 127.0.0.1 ::1]", "[certs] Generating "etcd/peer" certificate and key", "[certs] etcd/peer serving cert is signed for DNS names [rak8s000 localhost] and IPs [192.168.1.60 127.0.0.1 ::1]", "[certs] Generating "ca" certificate and key", "[certs] Generating "apiserver" certificate and key", "[certs] apiserver serving cert is signed for DNS names [rak8s000 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.60]", "[certs] Generating "apiserver-kubelet-client" certificate and key", "[certs] Generating "front-proxy-ca" certificate and key", "[certs] Generating "front-proxy-client" certificate and key", "[certs] Generating "sa" key and public key", "[kubeconfig] Using kubeconfig folder "/etc/kubernetes"", "[kubeconfig] Writing "admin.conf" kubeconfig file", "[kubeconfig] Writing "kubelet.conf" kubeconfig file", "[kubeconfig] Writing "controller-manager.conf" kubeconfig file", "[kubeconfig] Writing "scheduler.conf" kubeconfig file", "[control-plane] Using manifest folder "/etc/kubernetes/manifests"", "[control-plane] Creating static Pod manifest for "kube-apiserver"", "[control-plane] Creating static Pod manifest for "kube-controller-manager"", "[control-plane] Creating static Pod manifest for "kube-scheduler"", "[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"", "[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". 
This can take up to 4m0s", "[kubelet-check] Initial timeout of 40s passed.", "[kubelet-check] It seems like the kubelet isn't running or healthy.", "[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.", "[kubelet-check] It seems like the kubelet isn't running or healthy.", "[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.", "[kubelet-check] It seems like the kubelet isn't running or healthy.", "[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.", "[kubelet-check] It seems like the kubelet isn't running or healthy.", "[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.", "[kubelet-check] It seems like the kubelet isn't running or healthy.", "[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.", "", "Unfortunately, an error has occurred:", "\ttimed out waiting for the condition", "", "This error is likely caused by:", "\t- The kubelet is not running", "\t- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)", "", "If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:", "\t- 'systemctl status kubelet'", "\t- 'journalctl -xeu kubelet'", "", "Additionally, a control plane component may have crashed or exited when started by the container runtime.", "To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.", "Here is one example how you may list all Kubernetes containers running in docker:", "\t- 'docker ps -a | grep kube | grep -v pause'", "\tOnce you have found the failing container, you can inspect its logs with:", "\t- 'docker logs CONTAINERID'"]}
Raspberry Pi Hardware Version:
5 x Raspberry Pi 3 Model B Rev 1.2
Raspberry Pi OS & Version (cat /etc/os-release):
PRETTY_NAME="Raspbian GNU/Linux 9 (stretch)"
NAME="Raspbian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"
Detailed description of the issue:
Here is my inventory file:
[dev]
[prod]
rak8s000 ansible_host=192.168.1.60
rak8s001 ansible_host=192.168.1.61
rak8s002 ansible_host=192.168.1.62
rak8s003 ansible_host=192.168.1.63
rak8s004 ansible_host=192.168.1.64
[master]
rak8s000
I ran cleanup.yml and then cluster.yml, and received the error that you can see above in the Ansible log.
rak8s git commit I used: d1b14ec