Hello,

When upgrading to v2.23.0 with Calico in etcd mode, every calico-node pod has its configuration set to the same node name (the first controlplane). The result is IP allocation mayhem: every new pod gets an IP from the first controlplane's IP block, breaking the network for any new pod (existing pods are fine).
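A quick way to confirm the symptom, as a sketch assuming the default Calico CNI config path on each host:

```sh
# Run on any node other than the first controlplane. During the bug this
# prints the first controlplane's name instead of the node's own name:
grep nodename /etc/cni/net.d/10-calico.conflist
```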
This seems to be due to #10177, which makes the install-cni init container of calico-node pull its configuration from a single configmap that has the first controlplane's name set in stone (4f85b75#diff-91635da451087a93ab261ec90f794c825a5d584d12562fc94d183c50f63d81c3R43) instead of parametrizing it by node name (which is the case with kdd mode: 4f85b75#diff-91635da451087a93ab261ec90f794c825a5d584d12562fc94d183c50f63d81c3R38, and was the case in etcd mode before this PR, when the config was pulled from a config file on each host).
One workaround is to edit the calico-config configmap (namespace kube-system), replacing the hard-coded nodename with "nodename": "__KUBERNETES_NODE_NAME__", and then adding the corresponding KUBERNETES_NODE_NAME environment variable to the env of the install-cni init container of the calico-node daemonset (just like it's done for kdd mode, see kubespray/roles/network_plugin/calico/templates/calico-node.yml.j2, lines 98 to 104 at 0f243d7).
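A minimal sketch of the two steps. The env var name and fieldRef mirror the kdd-mode block referenced above; the init container index 0 is an assumption, so adjust the patch path to wherever install-cni sits in your daemonset (and if it has no env list yet, add the whole list instead of appending):

```sh
# 1. Replace the hard-coded node name in the CNI config template:
kubectl -n kube-system edit configmap calico-config
#    in cni_network_config, set:  "nodename": "__KUBERNETES_NODE_NAME__"

# 2. Append the env var install-cni uses to substitute that placeholder:
kubectl -n kube-system patch daemonset calico-node --type=json -p '[
  {"op": "add",
   "path": "/spec/template/spec/initContainers/0/env/-",
   "value": {"name": "KUBERNETES_NODE_NAME",
             "valueFrom": {"fieldRef": {"fieldPath": "spec.nodeName"}}}}
]'
```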
The calico-node pods will then restart, the install-cni init container will write the right nodename to the config file on each node, and voilà. New pods will be fine, but any pod created while the bug was active will have to be deleted so that it gets a new, correct, IP.
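To spot the pods that need deleting, a rough sketch (the grep prefix is an example only; substitute the first controlplane's actual IP block):

```sh
# List pods whose IP falls in the first controlplane's block
# (10.233.64. is only an example prefix; use your own):
kubectl get pods -A -o wide | grep ' 10.233.64.'
# Delete each one; its controller recreates it with a correct IP:
kubectl -n <namespace> delete pod <pod-name>
```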
We will submit a PR to fix this, but if you encounter this issue please try this workaround; it worked for us 🥳
Hi - will this fix also be added to a 2.23.2 release? The issue is preventing us from upgrading our cluster, and the kubespray documentation specifies not to skip releases when upgrading (i.e., we shouldn't go from 2.22 directly to 2.24), so we need a working 2.23 to give us an upgrade path.