-
Notifications
You must be signed in to change notification settings - Fork 208
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent unmanaged pods in GKE's containerd flavors (cilium/cilium#18486) #710
Conversation
323da3a
to
309a50d
Compare
309a50d
to
8257e22
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the CNI is installed, we expect any taints to be removed from nodes correct? I do not see any updates where we update the removal of the taint to now remove "NoExecute" taint.
Am I missing something?
@ldelossa I am not sure if I got the question right, but Cilium does remove the taint regardless of its effect. |
It's also probably pointing out that this is just a port of a piece of code that has been reviewed upstream. |
@bmcustodio gotchya, I was assuming the taint being removed was hard coded, and we needed to update it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the second commit msg referencing by saying?
[ upstream commit b049574 ]
@christarazi whenever I (back-)port something, I usually always include a reference to the original commit. In this case, it's just a reference to the original |
@bmcustodio That makes sense to me, but it's not clear where that commit SHA is coming from, so I'd suggest either linking it as a GH link or mentioning it's a cilium/cilium commit. |
The changes that we have been doing to /etc/defaults/kubelet are reset on node reboots, as is apparently the whole /etc directory --- which also means that /etc/cni/net.d/05-cilium.conf is removed. This would not be a problem if the assumption we made that the node taint we recommend placing on the nodes would come back upon reboots held true, but in practice it doesn't. Besides this, it seems that containerd will re-instante its CNI configuration file, and it will do so way before Cilium has had the chance to re-run on the node and re-create its CNI configuration, causing pods to be assigned IPs by the default CNI rather than by Cilium in the meantime. This commit attempts at preventing that from happening by observing that /home/kubernetes/bin/kubelet (i.e. the actual kubelet binary) is kept between reboots and executed concurrently with containerd by systemd. We leverage on this empirical observation to replace this file kubelet with a wrapper script that, under the required conditions, disables containerd, patches its configuration, removes undesired CNI configuration files, re-enables containerd and becomes the kubelet. [ cilium/cilium commit 36585e4 ] Signed-off-by: Bruno Miguel Custódio <[email protected]> Co-authored-by: Alexandre Perrin <[email protected]> Co-authored-by: Chris Tarazi <[email protected]>
To prevent situations in which the GKE node is forcibly stopped and re-created from causing unmanaged pods, and building on the observation that the node comes back with the same name and pods are already scheduled there, we change the recommended taint effect from NoSchedule to NoExecute, to cause any previously scheduled pods to be evicted, preventing them from getting IPs assigned by the default CNI. This should not impact other environments due to the nature of 'NoExecute', so we recommend it everywhere. [ cilium/cilium commit b049574 ] Signed-off-by: Bruno Miguel Custódio <[email protected]> Co-authored-by: Tam Mach <[email protected]>
8257e22
to
7c599ac
Compare
@christarazi sure, makes sense! I've rebased the PR and replaced the "upstream" mentions with "cilium/cilium" ones. |
Porting cilium/cilium#18486 to the CLI.