Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tolerate failed coredns svc errors on kubeadm init/upgrade #6244

Closed
wants to merge 1 commit into from

Conversation

mattymo
Copy link
Contributor

@mattymo mattymo commented Jun 8, 2020

Also deletes kube-dns svc in kube-system namespace which is unused

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 8, 2020
@k8s-ci-robot k8s-ci-robot requested a review from bozzo June 8, 2020 15:30
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mattymo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 8, 2020
@@ -38,6 +38,9 @@
when:
- inventory_hostname != groups['kube-master']|first
- not kubeadm_already_run.stat.exists
- kubeadm_init.rc != 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI fails with the following error:

 fatal: [instance-2]: FAILED! => {"msg": "The conditional check 'kubeadm_init.rc != 0' failed. The error was: error while evaluating conditional (kubeadm_init.rc != 0): 'dict object' has no attribute 'rc'\n\nThe error appears to be in '/builds/kargo-ci/kubernetes-sigs-kubespray/roles/kubernetes/master/tasks/kubeadm-secondary-legacy.yml': line 33, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: kubeadm | Init other uninitialized masters\n  ^ here\n"}

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Sep 14, 2020
@EppO
Copy link
Contributor

EppO commented Sep 14, 2020

kubeadm-secondary-legacy.yml is now gone, need to rebase this

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 14, 2020
Also deletes kube-dns svc in kube-system namespace which is unused

Change-Id: I61798a258efbc3f6ee72cd57f68a776db063d417
@mattymo mattymo force-pushed the kubeadm_coredns_svc branch from 6fdc0ed to 8f0daa9 Compare November 4, 2020 08:04
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 4, 2020
failed_when:
- kubeadm_join_control_plane.rc != 0
- '"field is immutable" not in kubeadm_join_control_plane.stderr'
- '"unable to create/update the DNS service" not in kubeadm_join_control_plane.stdaerr'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo stdaerr

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattymo ping

@woopstar
Copy link
Member

woopstar commented Jan 8, 2021

@mattymo should we merge this?

@champtar
Copy link
Contributor

Is this still relevant ?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 24, 2021
@k8s-ci-robot
Copy link
Contributor

@mattymo: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 3, 2021
@klippo
Copy link

klippo commented Jun 21, 2021

Is this still relevant ?

We're currently experiencing this issue with Kubespray 2.15 for unknown reasons

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 9, 2021
@juliohm1978
Copy link
Contributor

Coming in from the discussion at #7083. This is still relevant.

However, the patch here seems to be "simply ignore" the kubeadm return code when the error message relates to the issue. Is this actully safe? I mean... kubeadm dies and stops mid-upgrade. I'm not familiar with all of kubeadm phases, but can we be sure this is not skipping some important phase that kubeadm should be executing after that?

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 16, 2021
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Copy link
Contributor

@k8s-triage-robot: Closed this PR.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mvilche
Copy link

mvilche commented Feb 24, 2022

I have same problem when upgrade 1.22.3 to 1.23.3
the fix exist in master branch?

@Monpoke
Copy link

Monpoke commented Oct 7, 2022

We have the same issue, upgrading from kubespray 1.18.2 to 1.19.1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.