
unexpected providerID format #126

Closed
ty2 opened this issue Sep 13, 2018 · 9 comments

@ty2

ty2 commented Sep 13, 2018

I am following the CCM deployment instructions in https://github.com/digitalocean/digitalocean-cloud-controller-manager/blob/master/docs/getting-started.md.

After deployment, load balancing and node addressing work great, but node labelling is not working.

kubectl get no

root@kube-1-master-2:~# kubectl get no kube-1-worker-1 -o yaml
apiVersion: v1
kind: Node
metadata:
  annotations:
    alpha.kubernetes.io/provided-node-ip: 10.130.13.103
    csi.volume.kubernetes.io/nodeid: '{"com.digitalocean.csi.dobs":"109470133"}'
    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
    node.alpha.kubernetes.io/ttl: "0"
    projectcalico.org/IPv4Address: 10.130.13.103/16
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  creationTimestamp: 2018-09-11T10:23:33Z
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/hostname: kube-1-worker-1
  name: kube-1-worker-1
  resourceVersion: "350167"
  selfLink: /api/v1/nodes/kube-1-worker-1
  uid: bc946296-b5ac-11e8-964e-daca43c1fec8
spec:
  podCIDR: 192.168.3.0/24
status:
  addresses:
  - address: kube-1-worker-1
    type: Hostname
  - address: 10.130.13.103
    type: InternalIP
  - address: <HIDDEN>
    type: ExternalIP
  allocatable:
    cpu: "1"
    ephemeral-storage: "23249247399"
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 913620Ki
    pods: "110"
  capacity:
    cpu: "1"
    ephemeral-storage: 25227048Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 1016020Ki
    pods: "110"
  conditions:
  - lastHeartbeatTime: 2018-09-13T15:46:05Z
    lastTransitionTime: 2018-09-11T10:23:33Z
    message: kubelet has sufficient disk space available
    reason: KubeletHasSufficientDisk
    status: "False"
    type: OutOfDisk
  - lastHeartbeatTime: 2018-09-13T15:46:05Z
    lastTransitionTime: 2018-09-11T10:23:33Z
    message: kubelet has sufficient memory available
    reason: KubeletHasSufficientMemory
    status: "False"
    type: MemoryPressure
  - lastHeartbeatTime: 2018-09-13T15:46:05Z
    lastTransitionTime: 2018-09-11T10:23:33Z
    message: kubelet has no disk pressure
    reason: KubeletHasNoDiskPressure
    status: "False"
    type: DiskPressure
  - lastHeartbeatTime: 2018-09-13T15:46:05Z
    lastTransitionTime: 2018-09-11T10:23:33Z
    message: kubelet has sufficient PID available
    reason: KubeletHasSufficientPID
    status: "False"
    type: PIDPressure
  - lastHeartbeatTime: 2018-09-13T15:46:05Z
    lastTransitionTime: 2018-09-13T13:55:04Z
    message: kubelet is posting ready status. AppArmor enabled
    reason: KubeletReady
    status: "True"
    type: Ready
  daemonEndpoints:
    kubeletEndpoint:
      Port: 10250
  images:
  - names:
    - quay.io/kubernetes-ingress-controller/nginx-ingress-controller@sha256:d4d0f5416c26444fb318c1bf7e149b70c7d0e5089e129827b7dccfad458701ca
    - quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.19.0
    sizeBytes: 414090450
  - names:
    - quay.io/calico/node@sha256:a35541153f7695b38afada46843c64a2c546548cd8c171f402621736c6cf3f0b
    - quay.io/calico/node:v3.1.3
    sizeBytes: 248202699
  - names:
    - k8s.gcr.io/kube-proxy-amd64@sha256:6a8d6e8d1674cb26167d85bebbb953e93993b81bbbf7e00c2985e61e0c7c2062
    - k8s.gcr.io/kube-proxy-amd64:v1.11.2
    sizeBytes: 97772380
  - names:
    - quay.io/calico/cni@sha256:ed172c28bc193bb09bce6be6ed7dc6bfc85118d55e61d263cee8bbb0fd464a9d
    - quay.io/calico/cni:v3.1.3
    sizeBytes: 68849270
  - names:
    - digitalocean/digitalocean-cloud-controller-manager@sha256:c59c83fb1a5ef73b255de12245b17debe181a66c31fc828ea1b722a162ef7966
    - digitalocean/digitalocean-cloud-controller-manager:v0.1.7
    sizeBytes: 68295557
  - names:
    - huseyinbabal/node-example@sha256:caa0bb831c88be08d342c05fe8fa223516dbef33ebadf8cae9e7c27d55370d9d
    - huseyinbabal/node-example:latest
    sizeBytes: 66276303
  - names:
    - quay.io/coreos/flannel@sha256:60d77552f4ebb6ed4f0562876c6e2e0b0e0ab873cb01808f23f55c8adabd1f59
    - quay.io/coreos/flannel:v0.9.1
    sizeBytes: 51338831
  - names:
    - quay.io/k8scsi/csi-attacher@sha256:44b7d518e00d437fed9bdd6e37d3a9dc5c88ca7fc096ed2ab3af9d3600e4c790
    - quay.io/k8scsi/csi-attacher:v0.3.0
    sizeBytes: 46929442
  - names:
    - quay.io/k8scsi/csi-provisioner@sha256:d45e03c39c1308067fd46d69d8e01475cc0c9944c897f6eded4df07e75e5d3fb
    - quay.io/k8scsi/csi-provisioner:v0.3.0
    sizeBytes: 46848737
  - names:
    - quay.io/k8scsi/driver-registrar@sha256:b9b8b0d2e7e3bcf1fda1776c4bee216f70a51345c3b62af7248c10054143755d
    - quay.io/k8scsi/driver-registrar:v0.3.0
    sizeBytes: 44650528
  - names:
    - quay.io/coreos/flannel@sha256:88f2b4d96fae34bfff3d46293f7f18d1f9f3ca026b4a4d288f28347fcb6580ac
    - quay.io/coreos/flannel:v0.10.0-amd64
    sizeBytes: 44598861
  - names:
    - digitalocean/do-csi-plugin@sha256:ccda85cecb6a0fccd8492acff11f4d3071036ff97f1f3226b9dc3995d9f372da
    - digitalocean/do-csi-plugin:v0.2.0
    sizeBytes: 19856073
  - names:
    - gokul93/hello-world@sha256:4cf553f69fbb1c331a1ac8f3b6dc3a2d92276e27e55b79c049aec6b841f904ac
    - gokul93/hello-world:latest
    sizeBytes: 10319652
  - names:
    - gcr.io/google-samples/hello-app@sha256:c62ead5b8c15c231f9e786250b07909daf6c266d0fcddd93fea882eb722c3be4
    - gcr.io/google-samples/hello-app:1.0
    sizeBytes: 9860419
  - names:
    - gcr.io/google_containers/defaultbackend@sha256:865b0c35e6da393b8e80b7e3799f777572399a4cff047eb02a81fa6e7a48ed4b
    - gcr.io/google_containers/defaultbackend:1.4
    sizeBytes: 4844064
  - names:
    - busybox@sha256:cb63aa0641a885f54de20f61d152187419e8f6b159ed11a251a09d115fdff9bd
    - busybox:latest
    sizeBytes: 1162769
  - names:
    - k8s.gcr.io/pause@sha256:f78411e19d84a252e53bff71a4407a5686c46983a2c2eeed83929b888179acea
    - k8s.gcr.io/pause:3.1
    sizeBytes: 742472
  nodeInfo:
    architecture: amd64
    bootID: 0b982393-75c4-4a64-a14b-29978a591d9d
    containerRuntimeVersion: docker://17.3.3
    kernelVersion: 4.4.0-131-generic
    kubeProxyVersion: v1.11.2
    kubeletVersion: v1.11.2
    machineID: a38c62498ffb47cab90b37d7b4f0b586
    operatingSystem: linux
    osImage: Ubuntu 16.04.5 LTS
    systemUUID: A38C6249-8FFB-47CA-B90B-37D7B4F0B586

kubectl logs

root@kube-1-master-2:~# kubectl logs digitalocean-cloud-controller-manager-79cff6f759-99kj9 -n kube-system
W0913 14:17:38.001579       1 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
W0913 14:17:38.042439       1 controllermanager.go:108] detected a cluster without a ClusterID.  A ClusterID will be required in the future.  Please tag your cluster to avoid any future issues
W0913 14:17:38.043324       1 authentication.go:55] Authentication is disabled
I0913 14:17:38.043391       1 insecure_serving.go:49] Serving insecurely on [::]:10253
I0913 14:17:38.045003       1 node_controller.go:89] Sending events to api server.
I0913 14:17:38.047221       1 controllermanager.go:264] Will not configure cloud provider routes for allocate-node-cidrs: false, configure-cloud-routes: true.
I0913 14:17:38.048342       1 pvlcontroller.go:107] Starting PersistentVolumeLabelController
I0913 14:17:38.048421       1 controller_utils.go:1025] Waiting for caches to sync for persistent volume label controller
I0913 14:17:38.048509       1 service_controller.go:183] Starting service controller
I0913 14:17:38.048533       1 controller_utils.go:1025] Waiting for caches to sync for service controller
I0913 14:17:38.174388       1 controller_utils.go:1032] Caches are synced for service controller
I0913 14:17:38.174832       1 service_controller.go:636] Detected change in list of current cluster nodes. New node set: map[kube-1-worker-1:{}]
I0913 14:17:38.174908       1 service_controller.go:644] Successfully updated 0 out of 0 load balancers to direct traffic to the updated set of nodes
I0913 14:17:38.195621       1 controller_utils.go:1032] Caches are synced for persistent volume label controller
E0913 14:17:38.766837       1 node_controller.go:161] unexpected providerID format: 109037323, format should be: digitalocean://12345
E0913 14:17:39.911414       1 node_controller.go:161] unexpected providerID format: 109155407, format should be: digitalocean://12345
E0913 14:17:40.954761       1 node_controller.go:161] unexpected providerID format: 109172105, format should be: digitalocean://12345
E0913 14:17:42.048235       1 node_controller.go:161] unexpected providerID format: 109470133, format should be: digitalocean://12345
E0913 14:22:44.119502       1 node_controller.go:161] unexpected providerID format: 109037323, format should be: digitalocean://12345
E0913 14:22:45.211726       1 node_controller.go:161] unexpected providerID format: 109155407, format should be: digitalocean://12345
E0913 14:22:46.313092       1 node_controller.go:161] unexpected providerID format: 109172105, format should be: digitalocean://12345
E0913 14:22:47.447328       1 node_controller.go:161] unexpected providerID format: 109470133, format should be: digitalocean://12345
E0913 14:27:49.184234       1 node_controller.go:161] unexpected providerID format: 109037323, format should be: digitalocean://12345
E0913 14:27:50.954856       1 node_controller.go:161] unexpected providerID format: 109155407, format should be: digitalocean://12345
E0913 14:27:52.838915       1 node_controller.go:161] unexpected providerID format: 109172105, format should be: digitalocean://12345
E0913 14:27:53.946188       1 node_controller.go:161] unexpected providerID format: 109470133, format should be: digitalocean://12345
E0913 14:32:55.680519       1 node_controller.go:161] unexpected providerID format: 109037323, format should be: digitalocean://12345
E0913 14:32:57.436294       1 node_controller.go:161] unexpected providerID format: 109155407, format should be: digitalocean://12345
E0913 14:32:58.453036       1 node_controller.go:161] unexpected providerID format: 109172105, format should be: digitalocean://12345
E0913 14:32:59.578324       1 node_controller.go:161] unexpected providerID format: 109470133, format should be: digitalocean://12345
E0913 14:38:02.240903       1 node_controller.go:161] unexpected providerID format: 109037323, format should be: digitalocean://12345
E0913 14:38:04.694336       1 node_controller.go:161] unexpected providerID format: 109155407, format should be: digitalocean://12345
E0913 14:38:06.557360       1 node_controller.go:161] unexpected providerID format: 109172105, format should be: digitalocean://12345
E0913 14:38:07.886014       1 node_controller.go:161] unexpected providerID format: 109470133, format should be: digitalocean://12345
E0913 14:43:08.918326       1 node_controller.go:161] unexpected providerID format: 109037323, format should be: digitalocean://12345
E0913 14:43:09.997752       1 node_controller.go:161] unexpected providerID format: 109155407, format should be: digitalocean://12345
E0913 14:43:11.095733       1 node_controller.go:161] unexpected providerID format: 109172105, format should be: digitalocean://12345
E0913 14:43:12.158490       1 node_controller.go:161] unexpected providerID format: 109470133, format should be: digitalocean://12345
@andrewsykim
Contributor

@ty2 this usually happens if there's a mismatch between the droplet name and the Kubernetes node name. See https://github.com/digitalocean/digitalocean-cloud-controller-manager/blob/master/docs/getting-started.md#kubernetes-node-names-must-match-the-droplet-name-private-ipv4-ip-or-public-ipv4-ip for more details.

If the names match, then this may be a bug related to setting --node-ip on the kubelet, which I'm assuming is set based on the annotation alpha.kubernetes.io/provided-node-ip.
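
A quick way to cross-check both things (just a sketch; assumes doctl is installed and authenticated) is to look at what providerID the node actually registered with and compare node names against droplet names:

# The CCM expects spec.providerID in the form digitalocean://<droplet-id>,
# e.g. digitalocean://109470133, not the bare droplet ID
kubectl get node kube-1-worker-1 -o jsonpath='{.spec.providerID}{"\n"}'

# Compare Kubernetes node names against droplet names
kubectl get nodes -o name
doctl compute droplet list --format ID,Name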

@ty2
Author

ty2 commented Sep 14, 2018

@andrewsykim Your assumption is correct. Before I found this project, the node's internal IP was the droplet's public IP by default, so I added --node-ip on the kubelet to set the node's internal IP to the droplet's private IP.
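
Roughly what I did (a sketch; the file location is the usual one for kubeadm installs, and the IP is just my worker's private IP as an example):

# /etc/default/kubelet (illustrative; the exact file can differ per setup)
KUBELET_EXTRA_ARGS="--node-ip=10.130.13.103 --cloud-provider=external"

# then on the node
systemctl daemon-reload
systemctl restart kubelet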

version:
kubernetes: 1.11.2
digitalocean-cloud-controller-manager version: 0.1.7

Finally, I have resolved the labelling problem, but I am not sure if it is related to a --node-ip bug.
My solution was to delete each node and join it back to the existing cluster, one by one (sketched below).
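
Per node, that boils down to roughly this (values are placeholders; use the real join command from kubeadm token create --print-join-command on a master):

# on a master: remove the node from the cluster
kubectl delete no kube-1-worker-1

# on the node itself: wipe the kubeadm state
kubeadm reset

# re-join the node (placeholders)
kubeadm join <master-endpoint>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>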

Here is my journal showing how to reproduce this bug (node labelling is not functional) and how I resolved it.

Test 1:

  1. Add --node-ip= to every kubelet
  2. Reload the systemd manager configuration on every node
  3. Create 3 master nodes and 1 worker node with kubeadm
  4. Install the pod network add-on
  5. Install the DO CSI
  6. Install the DO CCM
  7. Add --cloud-provider=external to every kubelet
  8. Restart all kubelets
  9. No node has the labels beta.kubernetes.io/instance-type and failure-domain.beta.kubernetes.io
  10. Remove the annotation alpha.kubernetes.io/provided-node-ip from every node with kubectl edit no <node name>
  11. Remove the kubelet flag --node-ip on every node
  12. Restart kubelet and docker
  13. Node labelling is still not working
  14. Delete the worker node with kubectl delete no <worker-name> and kubeadm reset
  15. Join the worker to the cluster with kubeadm join
  16. The worker node's labels now include beta.kubernetes.io/instance-type and failure-domain.beta.kubernetes.io; the master nodes' labels still lack them
  17. Delete the worker node with kubectl delete no <worker-name> and kubeadm reset
  18. Add the --node-ip flag back to the worker node's kubelet
  19. Join the worker to the cluster with kubeadm join
  20. The worker node's labels again include beta.kubernetes.io/instance-type and failure-domain.beta.kubernetes.io; the master nodes' labels still lack beta.kubernetes.io/instance-type
  21. Keep the --node-ip flag on the kubelets and delete and re-join the nodes one by one; node labelling then works on every node

Test 2:

  1. Only add --node-ip= to the kubelet on master node 1
  2. Reload the systemd manager configuration on master node 1
  3. Create 3 master nodes and 1 worker node with kubeadm
  4. Install the DO CCM
  5. Add --cloud-provider=external to every kubelet
  6. Reload the systemd manager configuration on every node and restart all kubelets
  7. No beta.kubernetes.io/instance-type or failure-domain.beta.kubernetes.io label is added to any node
  8. Delete each node and join it back to the cluster, one by one
  9. The beta.kubernetes.io/instance-type and failure-domain.beta.kubernetes.io labels appear on every node (checked with the commands shown below)
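
This is roughly how I check whether the labels are there:

# all labels on all nodes
kubectl get nodes --show-labels

# or only the nodes that already carry the instance-type label
kubectl get nodes -l beta.kubernetes.io/instance-type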

Thanks.

@andrewsykim
Contributor

@ty2 thank you very much for the detailed steps to reproduce. Let me take a look at this and report back to you here :)

@andrewsykim
Contributor

Did you happen to set the --provider-id flag on the kubelet?
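
For context, the error message above suggests that if that flag is used it would need the prefixed form, something like:

--provider-id=digitalocean://109470133   # the form the CCM expects
--provider-id=109470133                  # bare droplet ID, rejected as "unexpected providerID format"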

@ty2
Author

ty2 commented Sep 14, 2018

I didn't try setting --provider-id.

Btw, after scaling up my worker droplet in the DigitalOcean dashboard, the labels beta.kubernetes.io/instance-type and failure-domain.beta.kubernetes.io didn't update with the latest droplet info. Is that normal?

I manually added the taint to force an update:
kubectl taint nodes kube-1-worker-1 node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule

@andrewsykim
Contributor

What do you mean by scale up? Did you create a new droplet with the same name?

@ty2
Author

ty2 commented Sep 15, 2018

I mean resizing the droplet's RAM and CPU in the DO dashboard.
I expected the label beta.kubernetes.io/instance-type to update to the droplet's new size, but it doesn't.

For example, if I switch off my worker droplet, resize it from the 1GB RAM plan to the 2GB RAM plan, and switch it back on in the DO dashboard, beta.kubernetes.io/instance-type=s-1vcpu-1gb won't update to beta.kubernetes.io/instance-type=s-1vcpu-2gb.

All the droplets have a unique hostname.

@andrewsykim
Contributor

Ahh I see! So I believe that is expected, as those labels are only set during registration (i.e. when the taint is set, see here).

Kubernetes doesn't really handle the case where a node is rebooted with different properties, and it's usually recommended that if you change the underlying machine, it be registered as a new node.

@andrewsykim
Contributor

Closing as this is expected behaviour upstream
