Control Plane NoSchedule Taint Missing After Upgrade #10217
Comments
I have the same issue upgrading from 1.24.6 -> 1.25.6
Thanks @tman5 for the bug report. Could you please provide a PR to fix it? :-)
More info: on new clusters on 1.25 the taints are applied appropriately. I don't know yet whether this issue also exists going from 1.25 -> 1.26 or only on clusters going from 1.24 -> 1.25.
FYI, I just did an upgrade from kubespray 2.21 -> 2.22 -> 2.23 -> 2.24, upgrading the k8s version along the way, and the control plane taint was not re-applied.
I don't really understand how exactly #10464 can fix this, but I patched it into Kubespray 2.21 and then the issue no longer happened when upgrading.
Hi @rptaylor! Did you patch #10464 or #10532 into Kubespray 2.21? The missing taint after the upgrade stems from this commit, which was introduced in Kubernetes v1.25, the default Kubernetes version of Kubespray 2.21. Due to this commit, kubeadm removes the legacy taint. This fix was only backported to Kubespray 2.23, but checking it right now, it makes sense to backport it to 2.22 and 2.21 too, since these versions can be used to upgrade Kubernetes to
@unai-ttxu thanks for the extra details! It seems crazy that k8s 1.25 would automatically remove the old taint without also automatically adding the new taint! In my environment, when I hit this bug upgrading to kubespray 2.21, the /etc/kubernetes/kubeadm-config.yaml file for the master nodes looked correct:
This should have caused 'control-plane' to be added whether or not 'master' was ignored. Very odd. But I patched https://github.com/kubernetes-sigs/kubespray/pull/10464/files#diff-2510b9cc3e44d8d6e2cc83bd5b60ba888f278a70f1a87ba4df53a2d6f881fcae into my branch, which removes "master" from the kubeadm config, and that fixed it.
Environment:
Cloud provider or hardware configuration: on-prem
OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"): Rocky 8.7
Version of Ansible (ansible --version): 2.12
Version of Python (python --version): 3.11.3
Kubespray version (commit) (git rev-parse --short HEAD): 0955df2ec
Network plugin used: calico
Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):
Anything else do we need to know: After upgrading from 1.24.7 -> 1.25.10, the taint is not applied to the control_plane nodes. It appears in the kubeadm config file, but it is not configured on the nodes themselves. This taint is missing:
node-role.kubernetes.io/control-plane:NoSchedule
I should also note that running the cluster.yaml playbook afterward does not fix this issue, nor does running the upgrade.yaml playbook. The only "fix" is to manually apply the taint afterward.
See this issue: #9578
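For anyone who needs the manual workaround on several nodes, here is a minimal sketch (not part of Kubespray) that finds control-plane nodes missing the taint and prints the `kubectl taint` command for each. It assumes the nodes still carry the `node-role.kubernetes.io/control-plane` label and that the input is the parsed output of `kubectl get nodes -o json`. The function names are hypothetical.

```python
import json

# Key shared by the control-plane node label and the taint reported missing.
CONTROL_PLANE_KEY = "node-role.kubernetes.io/control-plane"


def nodes_missing_taint(node_list):
    """Return names of control-plane nodes that lack the NoSchedule taint.

    `node_list` is the parsed JSON from `kubectl get nodes -o json`.
    Assumption: control-plane nodes are identified by their label.
    """
    missing = []
    for node in node_list.get("items", []):
        labels = node.get("metadata", {}).get("labels", {})
        if CONTROL_PLANE_KEY not in labels:
            continue  # worker node, nothing to check
        # spec.taints is absent when a node has no taints at all
        taints = node.get("spec", {}).get("taints") or []
        has_taint = any(
            t.get("key") == CONTROL_PLANE_KEY and t.get("effect") == "NoSchedule"
            for t in taints
        )
        if not has_taint:
            missing.append(node["metadata"]["name"])
    return missing


def taint_commands(node_names):
    """Build the manual fix command for each affected node."""
    return [
        f"kubectl taint nodes {name} {CONTROL_PLANE_KEY}:NoSchedule"
        for name in node_names
    ]


# Example use, with nodes.json saved from `kubectl get nodes -o json`:
#   with open("nodes.json") as f:
#       for cmd in taint_commands(nodes_missing_taint(json.load(f))):
#           print(cmd)
```

The script only prints commands rather than running them, so the list can be reviewed before anything is changed on the cluster.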