Tolerate failed coredns svc errors on kubeadm init/upgrade #6244
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mattymo The full list of commands accepted by this bot can be found here. The pull request process is described here
@@ -38,6 +38,9 @@
when:
- inventory_hostname != groups['kube-master']|first
- not kubeadm_already_run.stat.exists
- kubeadm_init.rc != 0
CI fails with the following error:
fatal: [instance-2]: FAILED! => {"msg": "The conditional check 'kubeadm_init.rc != 0' failed. The error was: error while evaluating conditional (kubeadm_init.rc != 0): 'dict object' has no attribute 'rc'\n\nThe error appears to be in '/builds/kargo-ci/kubernetes-sigs-kubespray/roles/kubernetes/master/tasks/kubeadm-secondary-legacy.yml': line 33, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: kubeadm | Init other uninitialized masters\n ^ here\n"}
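The traceback says `kubeadm_init` exists on this host but has no `rc` key. One common way that happens in Ansible is when the task that registers the variable was skipped on the host, since a skipped task still registers a result, just without `rc`. Assuming that is what happened here (the thread does not confirm it), a minimal sketch of a guarded condition could look like the following. The `command` line is a placeholder, not the PR's actual task body, and treating a missing `rc` as `0` is an assumption about the intended behavior.

```yaml
# Sketch only: the command below is a placeholder for the real kubeadm call.
- name: kubeadm | Init other uninitialized masters
  command: /bin/true
  when:
    - inventory_hostname != groups['kube-master']|first
    - not kubeadm_already_run.stat.exists
    # A skipped task registers a result without `rc`.
    # default(0) makes the comparison safe; a missing rc is treated as "no failure".
    - (kubeadm_init.rc | default(0)) != 0
```

If a missing `rc` should instead count as "init did not succeed", the default would need to be a non-zero value; which one is right depends on how `kubeadm_init` is registered upstream.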
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
kubeadm-secondary-legacy.yml is now gone, so this needs a rebase.
/remove-lifecycle stale
Also deletes kube-dns svc in kube-system namespace which is unused
Change-Id: I61798a258efbc3f6ee72cd57f68a776db063d417
6fdc0ed to 8f0daa9
failed_when:
- kubeadm_join_control_plane.rc != 0
- '"field is immutable" not in kubeadm_join_control_plane.stderr'
- '"unable to create/update the DNS service" not in kubeadm_join_control_plane.stdaerr'
typo stdaerr
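For reference, a sketch of the same hunk with the attribute spelled `stderr`. The task name and `command` line are illustrative placeholders, not taken from the PR; only the three `failed_when` entries come from the diff above.

```yaml
# Sketch only: task name and command are placeholders.
- name: kubeadm | Join control plane (illustrative)
  command: /bin/true
  register: kubeadm_join_control_plane
  # Entries in a failed_when list are ANDed: the task fails only if rc is
  # non-zero and neither of the tolerated messages appears in stderr.
  failed_when:
    - kubeadm_join_control_plane.rc != 0
    - '"field is immutable" not in kubeadm_join_control_plane.stderr'
    - '"unable to create/update the DNS service" not in kubeadm_join_control_plane.stderr'
```

With the misspelled attribute, the third condition would reference an undefined key, so the check would likely error out in exactly the case it is meant to tolerate.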
@mattymo ping
@mattymo should we merge this?
Is this still relevant?
@mattymo: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
We're currently experiencing this issue with Kubespray 2.15, for unknown reasons.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
Coming in from the discussion at #7083. This is still relevant. However, the patch here seems to simply ignore the kubeadm return code when the error message relates to this issue. Is this actually safe? I mean, kubeadm dies and stops mid-upgrade. I'm not familiar with all of the kubeadm phases, but can we be sure this is not skipping some important phase that kubeadm should be executing after that point?
/lifecycle rotten
/close
@k8s-triage-robot: Closed this PR.
I have the same problem when upgrading from 1.22.3 to 1.23.3.
We have the same issue, upgrading from kubespray 1.18.2 to 1.19.1. |