Describe the bug
Upgrade fails when the cluster contains at least one unmanaged pod, i.e. a pod not controlled by a ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet
How to reproduce
Steps to reproduce the behavior:
1. Install a Kubernetes cluster with epicli
2. Create a single unmanaged pod (see the example command below)
3. Upgrade the cluster with epicli
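For step 2, a minimal sketch of creating such an unmanaged pod (assuming kubectl is configured against the cluster; `--restart=Never` makes kubectl create a bare Pod with no owning controller, and the name `azure` only mirrors the pods visible in the log below):

```shell
# Create a bare pod that is not managed by any controller
kubectl run azure --image=nginx --restart=Never --namespace default
```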
The upgrade then fails at the drain step:
13:32:37 ERROR cli.engine.ansible.AnsibleCommand - FAILED - RETRYING: k8s/utils | Drain master or node in preparation for maintenance (1 retries left).
13:32:44 ERROR cli.engine.ansible.AnsibleCommand - fatal: [atsikham-k8s-kubernetes-node-vm-0 -> atsikham-k8s-kubernetes-master-vm-0]: FAILED! => {"attempts": 20, "changed": true, "cmd": ["kubectl", "drain", "atsikham-k8s-kubernetes-node-vm-0", "--ignore-daemonsets", "--delete-local-data"], "delta": "0:00:00.086227", "end": "2021-10-01 13:32:44.175848", "msg": "non-zero return code", "rc": 1, "start": "2021-10-01 13:32:44.089621", "stderr": "error: unable to drain node \"atsikham-k8s-kubernetes-node-vm-0\", aborting command...\n\nThere are pending nodes to be drained:\n atsikham-k8s-kubernetes-node-vm-0\nerror: cannot delete Pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet (use --force to override): default/azure, default/azure1, default/azure2", "stderr_lines": ["error: unable to drain node \"atsikham-k8s-kubernetes-node-vm-0\", aborting command...", "", "There are pending nodes to be drained:", " atsikham-k8s-kubernetes-node-vm-0", "error: cannot delete Pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet (use --force to override): default/azure, default/azure1, default/azure2"], "stdout": "node/atsikham-k8s-kubernetes-node-vm-0 already cordoned", "stdout_lines": ["node/atsikham-k8s-kubernetes-node-vm-0 already cordoned"]}
Expected behavior
A preflight check informs the user about such pods. After the user has prepared and stopped their pods, the upgrade succeeds.
The proposed solution is to add a preflight check that queries Kubernetes for problematic pods before starting the upgrade. If any are found, the user should get an error together with a suggested command to execute manually before rerunning the upgrade.
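A hedged sketch of what such a check could query (not the epicli implementation; it lists pods whose owner is none of the controller kinds that `kubectl drain` accepts, mirroring the error above; pods owned by a Node, i.e. mirror/static pods, are filtered out since drain skips them anyway):

```shell
# List pods that are not owned by a controller kubectl drain can handle.
# Output format: <namespace>/<pod> <owner kinds>
kubectl get pods --all-namespaces \
  -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{" "}{.metadata.ownerReferences[*].kind}{"\n"}{end}' \
  | grep -Ev ' (ReplicationController|ReplicaSet|Job|DaemonSet|StatefulSet|Node)$' || true
# "|| true" keeps the pipeline exit code 0 when every pod is managed (grep finds nothing)
```

If the check reports anything, the suggested manual remediation could be deleting the listed pods (e.g. `kubectl delete pod <name> -n <namespace>`) or, if losing them is acceptable, draining with `--force` as the error message itself hints, before rerunning the upgrade.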
Config files
A pod example can be taken from here
Environment
epicli version: 1.3 develop
Additional context
DoD checklist