
1.6.0 kubelet fails with error "misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd" #43805

Closed
sbezverk opened this issue Mar 29, 2017 · 38 comments
Labels
area/kubeadm sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@sbezverk
Contributor

Kubernetes 1.6.0, all-in-one installation with kubeadm on CentOS 7.3.
When kubeadm init runs, the following error is reported and kubelet fails to start:
kubelet: error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
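A quick way to compare the two sides before changing anything (a rough sketch, assuming shell access on the node; the kubelet drop-in path is the one used throughout this thread):

docker info | grep -i cgroup                                   # what Docker is actually using
grep -r cgroup-driver /etc/systemd/system/kubelet.service.d/   # what kubelet is being told to use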

@RichWellum

Seeing the same.

@civik

civik commented Mar 29, 2017

Same. kubeadm init appears completely broken on CentOS with 1.6. When I run a deploy it freezes at:

[apiclient] Created API client, waiting for the control plane to become ready

I also see the referenced cgroupfs error.

@sbezverk
Contributor Author

@civik it has been resolved: you need to add --cgroup-driver=systemd to the kubelet startup parameters in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf.
But this does not resolve the kubeadm problem :(
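For reference, a minimal sketch of the kind of change being described (the exact Environment= lines in 10-kubeadm.conf vary by package version, so treat this as illustrative):

# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf (excerpt)
Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=systemd"

# then pick up the change
systemctl daemon-reload
systemctl restart kubelet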

@sbezverk
Contributor Author

@civik the kubeadm issue is solved too. You MUST make sure you run a non-alpha version of kubeadm. Here is what I have, and it started working properly:
kubeadm version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T16:24:30Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}

@kfox1111

This looks like a bug in the kubelet package. It should default to the way Docker is set up on the system.

@luhkevin

I'm having this issue too. I added --cgroup-driver=system and I'm getting the same thing.
How do I download a new version of kubeadm?

@jamstar

jamstar commented Mar 29, 2017

I'm getting this issue as well, but not with kubeadm; mine was built the hard way via Vagrant on CentOS 7.

@sbezverk
Contributor Author

for me, since I am on centos, running "yum install kubeadm-1.6.0-0.x86_64" does the trick.

@civik

civik commented Mar 29, 2017

I opened #43819

@grodrigues3 grodrigues3 added area/kubeadm sig/node Categorizes an issue or PR as relevant to SIG Node. labels Mar 29, 2017
@Cisneiros

I had the same problem when installing 1.6.0 on CentOS. Added --cgroup-driver=systemd (note the "d" at the end) and kubelet started, and I can now see the node using kubectl get node.

But the node never gets ready. When I describe the node, I get these events:

Starting                Starting kubelet.
ImageGCFailed           unable to find data for container /
KubeletSetupFailed      Failed to start ContainerManager systemd version does not support ability to start a slice as transient unit

@dsever

dsever commented Mar 30, 2017

Same here with Red Hat Enterprise Linux Server release 7.3 (Maipo)

Mar 30 14:58:17 master01a kubelet: error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

But it looks like the following resolved my issue:

KUBELET_ARGS="--kubeconfig=/etc/kubernetes/kubeadminconfig --require-kubeconfig=true --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd"

@4admin2root

You can try yum remove docker docker-common && yum install docker-engine-1.12.6
repo:
[dockerrepo]
name=Docker Repository
baseurl=https://mirrors.aliyun.com/docker-engine/yum/repo/main/centos/7/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/docker-engine/yum/gpg

@zetaab
Member

zetaab commented Mar 31, 2017

Does anyone have an idea what the real difference between the cgroupfs and systemd drivers is? Which one should I really use? Currently it looks like CentOS/RHEL ships a Docker that has systemd specified by default. Why does RHEL/CentOS not use Docker's default cgroup driver (which is cgroupfs)?

The weird thing here is that we have been using the Docker systemd cgroup driver for a long time, and we do not have anything specified in kubelet. After upgrading to 1.6.0 we need to specify startup options, so was automatic support for reading the cgroup driver from Docker removed, or is this a bug?

@gtirloni
Contributor

gtirloni commented Apr 1, 2017

Related discussion: coreos/bugs#1435

@mikedanese
Member

The fix for this is in the release repo, so let's fold this into kubernetes/release#306

@obnoxxx
Contributor

obnoxxx commented Apr 4, 2017

Not sure I got it right, but the rpms from http://yum.kubernetes.io/repos/kubernetes-el7-x86_64 still need the --cgroup-driver=systemd fix in 10-kubeadm.conf in order to get the kubelet service started.

@mikedanese
Member

mikedanese commented Apr 4, 2017

Agreed. This is a duplicate of kubernetes/release#306 but that issue is in the repository where the fix needs to be made so let's consolidate discussion there.

@obnoxxx
Contributor

obnoxxx commented Apr 4, 2017

@mikedanese - thanks, got it!

@punkdata

punkdata commented Apr 8, 2017

Here is a sed command that I run in a provisioning script before executing kubeadm init.
Tested on CentOS 7.3 only, so YMMV on other distros.

sed -i 's#Environment="KUBELET_KUBECONFIG_ARGS=-.*#Environment="KUBELET_KUBECONFIG_ARGS=--kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true --cgroup-driver=systemd"#g' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

@max-lobur

Last one solved the issue for me.
P.S. don't forget systemctl daemon-reload

@caoyang001

@Cisneiros - you can try "yum update systemd"
Tested on CentOS 7.2.

martinpitt added a commit to cockpit-project/cockpit that referenced this issue May 8, 2017
kubelet.service fails to start out of the box:

    kubelet[1650]: Error: failed to run Kubelet: failed to create
    kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is
    different from docker cgroup driver: "systemd"

due to kubernetes/kubernetes#43805. Add a hack
until the upstream fix (kubernetes/release#313)
trickles down into Fedora-26.
@yogeshnath

yogeshnath commented May 13, 2017

I'm getting the exact opposite error: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"

-- Fixed. Added "--exec-opt native.cgroupdriver=systemd" to the Docker options.
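Where that Docker option lives depends on how Docker is packaged; on the RHEL/CentOS-packaged Docker one common place is the OPTIONS line in /etc/sysconfig/docker. A rough sketch only, with the exact file and the existing options treated as assumptions:

# /etc/sysconfig/docker (excerpt, illustrative)
OPTIONS='--selinux-enabled --exec-opt native.cgroupdriver=systemd'

systemctl restart docker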

@MartinEmrich

I just had this issue, maybe this helps someone...:

If you use the docker packages supplied via EPEL (package "docker", version 1.12.6), it works OOTB with the "systemd" driver.
If you use newer packages from docker, you have to switch to cgroupfs, but then you are in for this message: "WARNING: docker version is greater than the most recently validated version. Docker version: 17.05.0-ce. Max validated version: 1.12".

@heartarea

kubelet's cgroup driver was not the same as Docker's cgroup driver, so I updated systemd -> cgroupfs:

vi /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
update KUBELET_CGROUP_ARGS=--cgroup-driver=systemd to KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs

restart kubelet:
service kubelet restart

everything is OK

@mageshmcc

I am facing the same issue on Ubuntu 16.04.2; please let me know if there are any workarounds.

cat /etc/issue

Ubuntu 16.04.2 LTS \n \l

kubeadm version

kubeadm version: &version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.2", GitCommit:"922a86cfcd65915a9b2f69f3f193b8907d741d9c", GitTreeState:"clean", BuildDate:"2017-07-21T08:08:00Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

docker version

Client:
Version: 1.12.6
API version: 1.24
Go version: go1.6.2
Git commit: 78d1802
Built: Tue Jan 31 23:35:14 2017
OS/Arch: linux/amd64

Server:
Version: 1.12.6
API version: 1.24
Go version: go1.6.2
Git commit: 78d1802
Built: Tue Jan 31 23:35:14 2017
OS/Arch: linux/amd64

kubeadm init

[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.7.2
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks
[preflight] Starting the kubelet service
[kubeadm] WARNING: starting in 1.8, tokens expire after 24 hours by default (if you require a non-expiring token use --token-ttl 0)
[certificates] Generated CA certificate and key.
[certificates] Generated API server certificate and key.
[certificates] API Server serving cert is signed for DNS names [kubemaster kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.100.52.186]
[certificates] Generated API server kubelet client certificate and key.
[certificates] Generated service account token signing key and public key.
[certificates] Generated front-proxy CA certificate and key.
[certificates] Generated front-proxy client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[apiclient] Created API client, waiting for the control plane to become ready

@phye

phye commented Aug 1, 2017

I am on Arch Linux. Got it to work by overriding ExecStart of the docker service:

$ cat /etc/systemd/system/docker.service.d/override.conf 
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// --exec-opt native.cgroupdriver=systemd

Based on this stack overflow answer:
https://stackoverflow.com/questions/43794169/docker-change-cgroup-driver-to-systemd
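If you go this route, the override only takes effect after a reload and a Docker restart, roughly like the following (and the kubelet side must still be set to match):

systemctl daemon-reload
systemctl restart docker
docker info | grep -i cgroup    # should now report: Cgroup Driver: systemd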

@jmarcos-cano

jmarcos-cano commented Aug 8, 2017

Just to add to @heartarea 's response

Verify which cgroup driver dockerd is using

docker info |grep -i cgroup

output

Cgroup Driver: cgroupfs

Verify kubeadm cgroup settings

cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

Change it to match Docker's

  • vi /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

  • update KUBELET_CGROUP_ARGS=--cgroup-driver=systemd to KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs

restart it

systemctl daemon-reload
service kubelet restart

Important NOTE

You'll need to change the cgroup driver on your nodes as well.

Versions

kubeadm: v1.7.3
docker: 17.06.0-ce

@sjtindell

Confirmed: @jmarcos-cano's steps at least cleared up kubeadm hanging on "waiting for control plane to become ready".

Virtualbox VM with NAT and Bridged Network interfaces.
CentOS 7.3 Kernel 3.10.0-514.26.2.el7.x86_64
Kubeadm version: &version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5"
Docker version 17.06.2-ce, build cec0b72

ERROR in /var/log/messages:
kubelet: error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"

ERROR in /var/log/messages:
reflector.go:190] k8s.io/kubernetes/pkg/kubelet/kubelet.go:408: Failed to list *v1.Node: Get https://10.118.35.17:6443/api/v1/nodes?fieldSelector=metadata.name%3Dlocalhost.localdomain&resourceVersion=0: dial tcp 10.118.35.17:6443: getsockopt: connection refused

Followed above steps, all control plane components are healthy.

@KeithTt

KeithTt commented Nov 2, 2017

good job!

@garyyang6

I changed it to match Docker's, but it does not work. My system is CentOS 7.

vi /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

update KUBELET_CGROUP_ARGS=--cgroup-driver=systemd to KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs

grep KUBELET_CGROUP_ARGS 10-kubeadm.conf
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"

ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CGROUP_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS

docker info |grep -i cgroup
Cgroup Driver: cgroupfs

systemctl daemon-reload
service kubelet restart

./openshift start
W1114 20:26:47.521319 17156 start_master.go:297] Warning: assetConfig.loggingPublicURL: Invalid value: "": required to view aggregated container logs in the console, master start will continue.
W1114 20:26:47.521412 17156 start_master.go:297] Warning: assetConfig.metricsPublicURL: Invalid value: "": required to view cluster metrics in the console, master start will continue.
W1114 20:26:47.521425 17156 start_master.go:297] Warning: auditConfig.auditFilePath: Required value: audit can not be logged to a separate file, master start will continue.
I1114 20:26:47.544184 17156 plugins.go:101] No cloud provider specified.
2017-11-14 20:26:47.595757 I | etcdserver/api/v3rpc: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp 10.104.6.127:4001: getsockopt: connection refused"; Reconnecting to {10.104.6.127:4001 }
E1114 20:26:47.595859 17156 controllermanager.go:337] Server isn't healthy yet. Waiting a little while.
2017-11-14 20:26:47.595907 I | etcdserver/api/v3rpc: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp 10.104.6.127:4001: getsockopt: connection refused"; Reconnecting to {10.104.6.127:4001 }
2017-11-14 20:26:47.595957 I | etcdserver/api/v3rpc: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp 10.104.6.127:4001: getsockopt: connection refused"; Reconnecting to {10.104.6.127:4001 }
2017-11-14 20:26:47.596004 I | etcdserver/api/v3rpc: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp 10.104.6.127:4001: getsockopt: connection refused"; Reconnecting to {10.104.6.127:4001 }
2017-11-14 20:26:47.617441 I | etcdserver/api/v3rpc: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp 10.104.6.127:4001: getsockopt: connection refused"; Reconnecting to {10.104.6.127:4001 }
I1114 20:26:47.861440 17156 start_master.go:529] Starting master on 0.0.0.0:8443 (v3.6.1+008f2d5)
I1114 20:26:47.861464 17156 start_master.go:530] Public master address is https://10.104.6.127:8443
I1114 20:26:47.861484 17156 start_master.go:534] Using images from "openshift/origin-:v3.6.1"
2017-11-14 20:26:47.861582 I | embed: peerTLS: cert = openshift.local.config/master/etcd.server.crt, key = openshift.local.config/master/etcd.server.key, ca = openshift.local.config/master/ca.crt, trusted-ca = , client-cert-auth = true
2017-11-14 20:26:47.862490 I | embed: listening for peers on https://0.0.0.0:7001
2017-11-14 20:26:47.862548 I | embed: listening for client requests on 0.0.0.0:4001
2017-11-14 20:26:47.865511 I | etcdserver: name = openshift.local
2017-11-14 20:26:47.865525 I | etcdserver: data dir = openshift.local.etcd
2017-11-14 20:26:47.865533 I | etcdserver: member dir = openshift.local.etcd/member
2017-11-14 20:26:47.865540 I | etcdserver: heartbeat = 100ms
2017-11-14 20:26:47.865546 I | etcdserver: election = 1000ms
2017-11-14 20:26:47.865552 I | etcdserver: snapshot count = 100000
2017-11-14 20:26:47.865567 I | etcdserver: advertise client URLs = https://10.104.6.127:4001
2017-11-14 20:26:47.895923 I | etcdserver: restarting member a7340362e2996c30 in cluster cf86d7c1b2833ba9 at commit index 1187
2017-11-14 20:26:47.896041 I | raft: a7340362e2996c30 became follower at term 21
2017-11-14 20:26:47.896063 I | raft: newRaft a7340362e2996c30 [peers: [], term: 21, commit: 1187, applied: 0, lastindex: 1187, lastterm: 21]
2017-11-14 20:26:47.919587 W | auth: simple token is not cryptographically signed
2017-11-14 20:26:47.922793 I | etcdserver: starting server... [version: 3.2.1, cluster version: to_be_decided]
2017-11-14 20:26:47.922828 I | embed: ClientTLS: cert = openshift.local.config/master/etcd.server.crt, key = openshift.local.config/master/etcd.server.key, ca = openshift.local.config/master/ca.crt, trusted-ca = , client-cert-auth = true
2017-11-14 20:26:47.925319 I | etcdserver/api/v3rpc: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp 10.104.6.127:4001: getsockopt: connection refused"; Reconnecting to {10.104.6.127:4001 }
2017-11-14 20:26:47.925562 I | etcdserver/membership: added member a7340362e2996c30 [https://10.104.6.127:7001] to cluster cf86d7c1b2833ba9
2017-11-14 20:26:47.925674 N | etcdserver/membership: set the initial cluster version to 3.2
2017-11-14 20:26:47.925719 I | etcdserver/api: enabled capabilities for version 3.2
2017-11-14 20:26:48.096468 I | raft: a7340362e2996c30 is starting a new election at term 21
2017-11-14 20:26:48.096563 I | raft: a7340362e2996c30 became candidate at term 22
2017-11-14 20:26:48.096589 I | raft: a7340362e2996c30 received MsgVoteResp from a7340362e2996c30 at term 22
2017-11-14 20:26:48.096609 I | raft: a7340362e2996c30 became leader at term 22
2017-11-14 20:26:48.096622 I | raft: raft.node: a7340362e2996c30 elected leader a7340362e2996c30 at term 22
2017-11-14 20:26:48.097433 I | etcdserver: published {Name:openshift.local ClientURLs:[https://10.104.6.127:4001]} to cluster cf86d7c1b2833ba9
I1114 20:26:48.097468 17156 run.go:85] Started etcd at 10.104.6.127:4001
2017-11-14 20:26:48.098248 I | embed: ready to serve client requests
2017-11-14 20:26:48.098751 I | embed: serving client requests on [::]:4001
2017-11-14 20:26:48.298679 I | etcdserver/api/v3rpc: Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2017-11-14 20:26:48.299820 I | etcdserver/api/v3rpc: Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2017-11-14 20:26:48.305434 I | etcdserver/api/v3rpc: Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2017-11-14 20:26:48.307315 I | etcdserver/api/v3rpc: Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2017-11-14 20:26:48.307368 I | etcdserver/api/v3rpc: Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2017-11-14 20:26:48.307407 I | etcdserver/api/v3rpc: Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2017-11-14 20:26:48.307444 I | etcdserver/api/v3rpc: Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2017-11-14 20:26:48.307633 I | etcdserver/api/v3rpc: Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
I1114 20:26:48.308900 17156 run_components.go:91] Using default project node label selector:
I1114 20:26:48.317753 17156 clusterquotamapping.go:160] Starting ClusterQuotaMappingController controller
I1114 20:26:48.318126 17156 master.go:182] Starting OAuth2 API at /oauth
I1114 20:26:48.318141 17156 master.go:190] Starting Web Console /console/
E1114 20:26:48.333055 17156 reflector.go:201] github.com/openshift/origin/pkg/authorization/generated/informers/internalversion/factory.go:45: Failed to list *authorization.ClusterPolicyBinding: Get https://10.104.6.127:8443/apis/authorization.openshift.io/v1/clusterpolicybindings?resourceVersion=0: dial tcp 10.104.6.127:8443: getsockopt: connection refused
E1114 20:26:48.333798 17156 reflector.go:201] github.com/openshift/origin/pkg/authorization/generated/informers/internalversion/factory.go:45: Failed to list *authorization.ClusterPolicy: Get https://10.104.6.127:8443/apis/authorization.openshift.io/v1/clusterpolicies?resourceVersion=0: dial tcp 10.104.6.127:8443: getsockopt: connection refused
E1114 20:26:48.333870 17156 reflector.go:201] github.com/openshift/origin/pkg/authorization/generated/informers/internalversion/factory.go:45: Failed to list *authorization.PolicyBinding: Get https://10.104.6.127:8443/apis/authorization.openshift.io/v1/policybindings?resourceVersion=0: dial tcp 10.104.6.127:8443: getsockopt: connection refused
E1114 20:26:48.333937 17156 reflector.go:201] github.com/openshift/origin/pkg/authorization/generated/informers/internalversion/factory.go:45: Failed to list *authorization.Policy: Get https://10.104.6.127:8443/apis/authorization.openshift.io/v1/policies?resourceVersion=0: dial tcp 10.104.6.127:8443: getsockopt: connection refused
E1114 20:26:48.334012 17156 reflector.go:201] github.com/openshift/origin/pkg/quota/generated/informers/internalversion/factory.go:45: Failed to list *quota.ClusterResourceQuota: Get https://10.104.6.127:8443/apis/quota.openshift.io/v1/clusterresourcequotas?resourceVersion=0: dial tcp 10.104.6.127:8443: getsockopt: connection refused
E1114 20:26:48.642161 17156 controllermanager.go:337] Server isn't healthy yet. Waiting a little while.
W1114 20:26:48.934271 17156 genericapiserver.go:295] Skipping API autoscaling/v2alpha1 because it has no resources.
W1114 20:26:49.108531 17156 genericapiserver.go:295] Skipping API rbac.authorization.k8s.io/v1alpha1 because it has no resources.
I1114 20:26:49.272056 17156 openshift_apiserver.go:237] Starting Origin API at /apis/user.openshift.io/v1
I1114 20:26:49.274906 17156 openshift_apiserver.go:237] Starting Origin API at /apis/image.openshift.io/v1
I1114 20:26:49.276019 17156 openshift_apiserver.go:237] Starting Origin API at /apis/template.openshift.io/v1
I1114 20:26:49.277340 17156 openshift_apiserver.go:237] Starting Origin API at /apis/security.openshift.io/v1
I1114 20:26:49.278069 17156 openshift_apiserver.go:237] Starting Origin API at /apis/project.openshift.io/v1
I1114 20:26:49.280928 17156 openshift_apiserver.go:237] Starting Origin API at /apis/build.openshift.io/v1
I1114 20:26:49.282884 17156 openshift_apiserver.go:237] Starting Origin API at /apis/apps.openshift.io/v1
I1114 20:26:49.480116 17156 openshift_apiserver.go:237] Starting Origin API at /apis/authorization.openshift.io/v1
I1114 20:26:49.482871 17156 openshift_apiserver.go:237] Starting Origin API at /apis/oauth.openshift.io/v1
I1114 20:26:49.484164 17156 openshift_apiserver.go:237] Starting Origin API at /apis/quota.openshift.io/v1
I1114 20:26:49.486888 17156 openshift_apiserver.go:237] Starting Origin API at /apis/network.openshift.io/v1
I1114 20:26:49.488031 17156 openshift_apiserver.go:237] Starting Origin API at /apis/route.openshift.io/v1
I1114 20:26:49.919536 17156 openshift_apiserver.go:243] Started Origin API at /oapi/v1
E1114 20:26:49.947160 17156 controllermanager.go:337] Server isn't healthy yet. Waiting a little while.
E1114 20:26:49.947241 17156 reflector.go:201] github.com/openshift/origin/pkg/quota/generated/informers/internalversion/factory.go:45: Failed to list *quota.ClusterResourceQuota: Get https://10.104.6.127:8443/apis/quota.openshift.io/v1/clusterresourcequotas?resourceVersion=0: dial tcp 10.104.6.127:8443: getsockopt: connection refused
E1114 20:26:49.947342 17156 reflector.go:201] github.com/openshift/origin/pkg/authorization/generated/informers/internalversion/factory.go:45: Failed to list *authorization.Policy: Get https://10.104.6.127:8443/apis/authorization.openshift.io/v1/policies?resourceVersion=0: dial tcp 10.104.6.127:8443: getsockopt: connection refused
E1114 20:26:49.947449 17156 reflector.go:201] github.com/openshift/origin/pkg/authorization/generated/informers/internalversion/factory.go:45: Failed to list *authorization.PolicyBinding: Get https://10.104.6.127:8443/apis/authorization.openshift.io/v1/policybindings?resourceVersion=0: dial tcp 10.104.6.127:8443: getsockopt: connection refused
E1114 20:26:49.947513 17156 reflector.go:201] github.com/openshift/origin/pkg/authorization/generated/informers/internalversion/factory.go:45: Failed to list *authorization.ClusterPolicy: Get https://10.104.6.127:8443/apis/authorization.openshift.io/v1/clusterpolicies?resourceVersion=0: dial tcp 10.104.6.127:8443: getsockopt: connection refused
E1114 20:26:49.947572 17156 reflector.go:201] github.com/openshift/origin/pkg/authorization/generated/informers/internalversion/factory.go:45: Failed to list *authorization.ClusterPolicyBinding: Get https://10.104.6.127:8443/apis/authorization.openshift.io/v1/clusterpolicybindings?resourceVersion=0: dial tcp 10.104.6.127:8443: getsockopt: connection refused
[restful] 2017/11/14 20:26:50 log.go:30: [restful/swagger] listing is available at https://10.104.6.127:8443/swaggerapi
[restful] 2017/11/14 20:26:50 log.go:30: [restful/swagger] https://10.104.6.127:8443/swaggerui/ is mapped to folder /swagger-ui/
I1114 20:26:50.459540 17156 serve.go:86] Serving securely on 0.0.0.0:8443
W1114 20:26:50.496834 17156 lease_endpoint_reconciler.go:176] Resetting endpoints for master service "kubernetes" to [10.104.6.127]
W1114 20:26:50.570565 17156 run_components.go:60] Binding DNS on port 8053 instead of 53, which may not be resolvable from all clients
I1114 20:26:50.570884 17156 logs.go:41] skydns: ready for queries on cluster.local. for tcp4://0.0.0.0:8053 [rcache 0]
I1114 20:26:50.570898 17156 logs.go:41] skydns: ready for queries on cluster.local. for udp4://0.0.0.0:8053 [rcache 0]
E1114 20:26:50.597709 17156 controllermanager.go:337] Server isn't healthy yet. Waiting a little while.
I1114 20:26:50.671922 17156 run_components.go:86] DNS listening at 0.0.0.0:8053
I1114 20:26:51.448452 17156 docker.go:364] Connecting to docker on unix:///var/run/docker.sock
I1114 20:26:51.448480 17156 docker.go:384] Start docker client with request timeout=2m0s
W1114 20:26:51.452960 17156 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
I1114 20:26:51.490948 17156 node_config.go:367] DNS Bind to 10.104.6.127:53
I1114 20:26:51.490974 17156 start_node.go:345] Starting node sf-docker01.corp.wagerworks.com (v3.6.1+008f2d5)
I1114 20:26:51.492790 17156 start_node.go:354] Connecting to API server https://10.104.6.127:8443
I1114 20:26:51.493026 17156 docker.go:364] Connecting to docker on unix:///var/run/docker.sock
I1114 20:26:51.493042 17156 docker.go:384] Start docker client with request timeout=2m0s
I1114 20:26:51.494487 17156 node.go:134] Connecting to Docker at unix:///var/run/docker.sock
I1114 20:26:51.520004 17156 feature_gate.go:144] feature gates: map[]
I1114 20:26:51.521375 17156 manager.go:143] cAdvisor running in container: "/user.slice"
I1114 20:26:51.524423 17156 node.go:348] Using iptables Proxier.
W1114 20:26:51.531871 17156 node.go:488] Failed to retrieve node info: nodes "sf-docker01.corp.wagerworks.com" not found
W1114 20:26:51.531973 17156 proxier.go:309] invalid nodeIP, initializing kube-proxy with 127.0.0.1 as nodeIP
W1114 20:26:51.531984 17156 proxier.go:314] clusterCIDR not specified, unable to distinguish between internal and external traffic
I1114 20:26:51.532007 17156 node.go:380] Tearing down userspace rules.
I1114 20:26:51.692198 17156 node.go:480] Started Kubernetes Proxy on 0.0.0.0
I1114 20:26:51.693463 17156 node.go:316] Starting DNS on 10.104.6.127:53
I1114 20:26:51.693599 17156 logs.go:41] skydns: ready for queries on cluster.local. for tcp://10.104.6.127:53 [rcache 0]
I1114 20:26:51.693617 17156 logs.go:41] skydns: ready for queries on cluster.local. for udp://10.104.6.127:53 [rcache 0]
W1114 20:26:51.708162 17156 manager.go:151] unable to connect to Rkt api service: rkt: cannot tcp Dial rkt api service: dial tcp [::1]:15441: getsockopt: connection refused
I1114 20:26:51.744723 17156 fs.go:117] Filesystem partitions: map[/dev/mapper/centos-root:{mountpoint:/var/lib/docker/devicemapper major:253 minor:0 fsType:xfs blockSize:0} /dev/sda1:{mountpoint:/boot major:8 minor:1 fsType:xfs blockSize:0}]
I1114 20:26:51.747229 17156 manager.go:198] Machine: {NumCores:1 CpuFrequency:2533423 MemoryCapacity:16658931712 MachineID:197dcaf983ac43d49bc33a715706d364 SystemUUID:423C41D5-41C2-B4C5-FB04-E3E643ABDDC6 BootID:a883e9e3-59c1-46c4-a086-1aa638600889 Filesystems:[{Device:/dev/mapper/centos-root DeviceMajor:253 DeviceMinor:0 Capacity:47724642304 Type:vfs Inodes:28275344 HasInodes:true} {Device:/dev/sda1 DeviceMajor:8 DeviceMinor:1 Capacity:520794112 Type:vfs Inodes:512000 HasInodes:true}] DiskMap:map[2:0:{Name:fd0 Major:2 Minor:0 Size:4096 Scheduler:deadline} 8:0:{Name:sda Major:8 Minor:0 Size:53687091200 Scheduler:deadline} 253:0:{Name:dm-0 Major:253 Minor:0 Size:47747956736 Scheduler:none} 253:1:{Name:dm-1 Major:253 Minor:1 Size:5368709120 Scheduler:none} 253:2:{Name:dm-2 Major:253 Minor:2 Size:107374182400 Scheduler:none}] NetworkDevices:[{Name:ens160 MacAddress:00:50:56:bc:74:5e Speed:10000 Mtu:1500} {Name:virbr0 MacAddress:52:54:00:4c:ea:cb Speed:0 Mtu:1500} {Name:virbr0-nic MacAddress:52:54:00:4c:ea:cb Speed:0 Mtu:1500} {Name:virbr1 MacAddress:52:54:00:fa:ae:1e Speed:0 Mtu:1500} {Name:virbr1-nic MacAddress:52:54:00:fa:ae:1e Speed:0 Mtu:1500}] Topology:[{Id:0 Memory:17179402240 Cores:[{Id:0 Threads:[0] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:262144 Type:Unified Level:2}]}] Caches:[{Size:12582912 Type:Unified Level:3}]}] CloudProvider:Unknown InstanceType:Unknown InstanceID:None}
I1114 20:26:51.756883 17156 start_master.go:715] Started serviceaccount-token controller
I1114 20:26:51.764977 17156 manager.go:204] Version: {KernelVersion:3.10.0-693.5.2.el7.x86_64 ContainerOsVersion:CentOS Linux 7 (Core) DockerVersion:17.11.0-ce-rc3 DockerAPIVersion:1.34 CadvisorVersion: CadvisorRevision:}
I1114 20:26:51.765707 17156 server.go:509] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
W1114 20:26:51.769951 17156 container_manager_linux.go:217] Running with swap on is not supported, please disable swap! This will be a fatal error by default starting in K8s v1.6! In the meantime, you can opt-in to making this a fatal error by enabling --experimental-fail-swap-on.
I1114 20:26:51.770304 17156 container_manager_linux.go:244] container manager verified user specified cgroup-root exists: /
I1114 20:26:51.770323 17156 container_manager_linux.go:249] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd ProtectKernelDefaults:false EnableCRI:true NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:}]} ExperimentalQOSReserved:map[]}
I1114 20:26:51.770527 17156 kubelet.go:265] Watching apiserver
W1114 20:26:51.886300 17156 kubelet_network.go:70] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
I1114 20:26:51.886342 17156 kubelet.go:494] Hairpin mode set to "hairpin-veth"
W1114 20:26:51.902336 17156 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
I1114 20:26:52.030094 17156 start_master.go:783] Started "openshift.io/serviceaccount-pull-secrets"
I1114 20:26:52.038075 17156 docker_service.go:184] Docker cri networking managed by kubernetes.io/no-op
F1114 20:26:52.065947 17156 node.go:281] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"

./openshift version
openshift v3.6.1+008f2d5
kubernetes v1.6.1+5115d708d7
etcd 3.2.1

docker version
Client:
Version: 17.11.0-ce-rc3
API version: 1.34
Go version: go1.8.3
Git commit: 5b4af4f
Built: Wed Nov 8 03:04:32 2017
OS/Arch: linux/amd64

Server:
Version: 17.11.0-ce-rc3
API version: 1.34 (minimum version 1.12)
Go version: go1.8.3
Git commit: 5b4af4f
Built: Wed Nov 8 03:07:05 2017
OS/Arch: linux/amd64
Experimental: false

oc version
oc v3.6.1+008f2d5
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO

kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.3", GitCommit:"f0efb3cb883751c5ffdbe6d515f3cb4fbe7b7acd", GitTreeState:"clean", BuildDate:"2017-11-08T18:27:48Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

@lxy16611

lxy16611 commented Jul 5, 2018

skipping pod synchronization - [Failed to start ContainerManager systemd version does not support ability to start a slice as transient unit]
This is my problem. It happens because the node is using "systemd" while the master is using "cgroupfs". I fixed it by using "cgroupfs" on both.

@fOO223Fr

fOO223Fr commented Jan 8, 2019

@lxy16611 could you please explain more? When you say the node is using it, do you mean the OS of the node? Otherwise, I suppose the master is a node itself.

@Maftouhou

Maftouhou commented Feb 12, 2019

I came across a solution: completely remove the version of Docker from the CentOS base repository and install Docker from the official repository, as explained here: https://docs.docker.com/install/linux/docker-ce/centos/#install-using-the-repository

In fact, the CentOS-packaged Docker comes with "systemd" as the cgroup driver (--cgroup-driver=systemd),
while the kubelet service uses "cgroupfs" as the cgroup driver (--cgroup-driver=cgroupfs) in the /etc/systemd/system/kubelet.service.d/10-kubeadm.conf file.

So, instead of switching the cgroup driver between "cgroupfs" and "systemd", consider installing Docker from the official repository by following this link: https://docs.docker.com/install/linux/docker-ce/centos/#install-using-the-repository
That worked for me: kubelet started correctly and I was able to work with minikube.
Hope that helps.

@jmbeach

jmbeach commented Oct 26, 2020

I got this error because I hadn't done the post-installation steps for installing docker (so you don't have to run docker as root).

After doing that, I got a new error:

W1025 19:40:00.166230    2251 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.19.3
[preflight] Running pre-flight checks
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
	[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

Have more work to do I guess
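For what it's worth, those two remaining preflight failures are unrelated to the cgroup driver. A rough sketch of the usual ways around them (give the VM at least 2 CPUs, disable swap, or explicitly skip the named checks if you accept the consequences):

# turn swap off now and keep it off across reboots
swapoff -a
sed -i '/ swap / s/^/#/' /etc/fstab

# or skip the named preflight checks
kubeadm init --ignore-preflight-errors=NumCPU,Swap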

@shuaisai

Alter Docker's configuration file, /etc/docker/daemon.json.
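A minimal sketch of what that file might contain if you want Docker to use the systemd driver (exec-opts is Docker's documented setting for this; restart Docker afterwards):

/etc/docker/daemon.json:

{
  "exec-opts": ["native.cgroupdriver=systemd"]
}

then restart Docker:

systemctl restart docker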

@sknot-rh

With minikube 1.22.0 I had to change cgroupDriver in /var/lib/kubelet/config.yaml

@whatwewant

whatwewant commented Aug 29, 2021

With minikube 1.22.0 I had to change cgroupDriver in /var/lib/kubelet/config.yaml

Same with kubeadm v1.22.1; this is caused by Docker's default cgroup driver (check it with: docker info | grep "Cgroup Driver")

change (file: /var/lib/kubelet/config.yaml)

cgroupDriver: systemd

to

cgroupDriver: cgroupfs
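Put together, a sketch of the change (the /var/lib/kubelet/config.yaml written by kubeadm is a KubeletConfiguration; restart kubelet afterwards):

# /var/lib/kubelet/config.yaml (excerpt)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: cgroupfs

systemctl restart kubelet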

@gainskills

gainskills commented Sep 28, 2021

For cluster installation with kubeadm, I referred to https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/#configuring-the-kubelet-cgroup-driver
I created a YAML file and updated the content:
cgroupDriver: systemd
to
cgroupDriver: cgroupfs

then run sudo kubeadm init --config kubeadm-config.yaml
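A minimal sketch of what such a kubeadm-config.yaml might look like, following the linked docs (API versions and additional fields vary by kubeadm release, so treat this as illustrative):

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: cgroupfs

sudo kubeadm init --config kubeadm-config.yaml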
