Migrating from docker to containerd results in dependency errors -- docker-ce : Depends: containerd.io (>= 1.4.1) but it is not going to be installed #8431

Closed
juliohm1978 opened this issue Jan 14, 2022 · 17 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@juliohm1978
Contributor

juliohm1978 commented Jan 14, 2022

Environment:

  • Cloud provider or hardware configuration: Baremetal

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):

Linux 4.15.0-166-generic x86_64
NAME="Ubuntu"
VERSION="18.04.6 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.6 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Kubespray version (commit) (git rev-parse --short HEAD):

Using ansible/python/kubespray provided by the official docker image: quay.io/kubespray/kubespray:v2.18.0

Network plugin used: Calico

Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):

inventory.zip

Command used to invoke ansible:

ansible-playbook -i /inventory/inventory.ini --become \
  /kubespray/cluster.yml \
  -u infrasw \
  --limit=k8s-master01-lab-20190604.dis.tjpr.jus.br

Output of ansible run:

output.zip

Anything else we need to know:

We deployed and have been upgrading our k8s clusters with Kubespray for a few years now. With more recent upgrades, we decided to migrate the container engine from docker to containerd, in preparation for the definitive deprecation of docker.

Running the adjusted cluster.yml playbook with version 2.18.0 results in the following error:

TASK [container-engine/containerd : containerd | Remove any package manager controlled containerd package] ***
fatal: [k8s-master01-lab-20190604.dis.tjpr.jus.br]: FAILED! => {"changed": false, "msg": "'apt-get remove 'containerd.io'' failed: E: Error, pkgProblemResolver::Resolve generated breaks, this may be caused by held packages.\n", "rc": 100, "stderr": "E: Error, pkgProblemResolver::Resolve generated breaks, this may be caused by held packages.\n", "stderr_lines": ["E: Error, pkgProblemResolver::Resolve generated breaks, this may be caused by held packages."], "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\nSome packages could not be installed. This may mean that you have\nrequested an impossible situation or if you are using the unstable\ndistribution that some required packages have not yet been created\nor been moved out of Incoming.\nThe following information may help to resolve the situation:\n\nThe following packages have unmet dependencies:\n docker-ce : Depends: containerd.io (>= 1.4.1) but it is not going to be installed\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information...", "Some packages could not be installed. This may mean that you have", "requested an impossible situation or if you are using the unstable", "distribution that some required packages have not yet been created", "or been moved out of Incoming.", "The following information may help to resolve the situation:", "", "The following packages have unmet dependencies:", " docker-ce : Depends: containerd.io (>= 1.4.1) but it is not going to be installed"]}

As far as I can tell, the container-engine role is trying to uninstall a previous container engine dependency:

# apt-get remove 'containerd.io'
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 docker-ce : Depends: containerd.io (>= 1.4.1) but it is not going to be installed
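
For what it's worth, the resolver message points at held packages; on an apt-based node you can inspect that state before forcing anything. This is only a rough sketch: the package names below are the usual docker-ce ones and may differ on your hosts.

# List the packages apt currently has on hold; a held docker-ce can make apt
# refuse to remove containerd.io because of the docker-ce -> containerd.io dependency.
apt-mark showhold

# Dry-run the removal to see what apt would actually do, without changing anything.
apt-get remove --simulate containerd.io docker-ce docker-ce-cli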

All dependencies were installed by kubespray itself, so I guess I expected the playbook to be able to handle this type of migration.

Any notes on how we are supposed to proceed? I guess I could always isolate nodes and remove all dependencies by force. But it would be nice to know if I'm missing something or doing something wrong here.

Regards.

juliohm1978 added the kind/bug label (Categorizes issue or PR as related to a bug.) on Jan 14, 2022
@juliohm1978
Contributor Author

If anyone else hits this bump, I was able to work around it by manually draining and isolating the node, then uninstalling docker and containerd from it before running cluster.yml again.

sudo apt remove docker-ce docker-ce-cli containerd.io

@cristicalin
Contributor

Note that we don't quite support transitioning from one container engine to another. If you are upgrading from pre-2.18 to 2.18+, you need to set your container_manager inventory variable to keep compatibility with your old version; in this case, set container_manager: docker.
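
For reference, a minimal sketch of that pin, assuming the usual inventory layout (the group_vars path below is an example and may differ in your setup):

# inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml  (example path)
container_manager: docker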

Changing the container_manager of a cluster is extremely disruptive and, at the moment, involves redeploying the cluster, which is your best course of action here. In this case, just run the reset.yml playbook to clean the cluster and then deploy the new version with cluster.yml.
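
A rough sketch of that reset-and-redeploy path, reusing the invocation style from the issue description (inventory path and remote user are placeholders):

# WARNING: reset.yml wipes the whole cluster; only run it if you accept the downtime.
ansible-playbook -i /inventory/inventory.ini --become -u infrasw /kubespray/reset.yml
ansible-playbook -i /inventory/inventory.ini --become -u infrasw /kubespray/cluster.yml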

@juliohm1978
Contributor Author

Thank you for the feedback, @cristicalin !

Migrating the container engine does not seem any more disruptive than any other Kubespray upgrade. For years, I have been upgrading our clusters with Kubespray, and that usually means restarting the docker daemon, calico and other core components. It has always been necessary to cordon and drain nodes before proceeding.

I can see how Kubespray does not yet support this officially. Correct me if I'm wrong, but as far as I understand, changing the container engine involves a few simple steps:

  • Cordon and drain the node
  • Uninstall docker-ce and docker-ce-cli
  • Install containerd (if that wasn't already there because of docker)
  • Change the container engine references in the kubelet configuration and a couple of other places
  • Reboot the node to get a fresh clean set of containers in the new engine

Natively, k8s supports having nodes with different engines. I was able to run the above steps and, apparently, it works fine as long as you don't need to run Kubespray playbooks to reconfigure something else in the cluster during the process.
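
A quick way to confirm that mixed state during a node-by-node migration (just a sketch; kubectl output columns vary slightly across versions):

# The CONTAINER-RUNTIME column shows docker://... or containerd://... per node,
# so a partially migrated cluster is easy to spot.
kubectl get nodes -o wide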

@cristicalin
Contributor

cristicalin commented Jan 15, 2022

In theory yes, those are the steps, but there are a few more changes that kubespray does not handle and which would need to be done manually.

Node cleanup is not something we do in the upgrade procedure, and kubespray itself is unaware of what the old container manager was, since we don't have any detection in place to clean up the old one.

You would have a much easier time if you could just remove and reprovision the nodes, though we have never tested the cluster expansion procedure with a change of container manager. We would be happy to get feedback on how well this works in practice and fix issues you may encounter, or to accept your code contributions for any fixes.

Note that containerd, for kubespray, means the one from the upstream GitHub release page; we no longer support the one built by docker, outside of using it with the docker engine itself.

@juliohm1978
Contributor Author

Thank you.

I'll be glad to get back with feedback on our cluster after this.

@dtodor

dtodor commented Jan 18, 2022

We came across the same issue. I think the very least that has to be tested before a release is:

  • create a clean installation with the current tag (e.g. v2.17.1), no configuration changes whatsoever
  • perform an upgrade to the next tag (e.g. v2.18.0)

This test would have failed.

@cristicalin
Contributor

Such tests are performed as part of the CI; it is just that we don't test changing engines. The CI was explicitly hardcoded to test upgrades with the new default configuration.

There are some discussions on how to properly support container manager changes, as in the linked PR, but explicitly removing the docker engine in the containerd role is not a maintainable long-term solution. Once we have a clean solution, we will consider backporting it to 2.18.x.

For now, it is highly recommended to test the upgrade with your existing configuration, read the release notes for changes to the public configuration variables, and update and pin your inventories before performing production upgrades.

@juliohm1978
Contributor Author

I'll have to agree with @cristicalin. The scenario raised by this issue is not currently supported, so discussing any related errors seems like a moot point at this moment.

I'll be on the lookout for PRs and new improvements in that direction.

If there are dependency errors during a normal upgrade between k8s versions, that deserves a new issue of its own.

Closing this for now.

Thank you!

@juliohm1978
Contributor Author

Hi guys. Just popping back to provide some feedback on this adventure.

We managed to migrate the container engine from docker to containerd in our Kubespray installation, with the help of a few manual steps. Docker has been fully uninstalled from all nodes and, as far as we can tell, everything is working as expected.

The following steps were performed in our particular environment (specs below) and, as mentioned before, there is no guarantee that nothing else needs to be adjusted or cleaned up in the cluster. Any other ideas and missing steps are appreciated as further feedback. Hopefully, this can provide some insight into how this procedure could be officially integrated into the playbooks.

Environment

Nodes: Ubuntu 18.04 LTS
Cloud Provider: None (baremetal or VMs)
Kubernetes version: 1.21.5
Kubespray version: 2.18.0

Important considerations

If you require minimal downtime, nodes need to be cordoned and drained before being processed, one by one. If you wish to run cluster.yml only once and get it all done in one swoop, downtime will be significantly higher, since docker will need to be manually removed from all nodes before the playbook runs. For minimal downtime, the following steps will be executed multiple times, once for each node.

Processing nodes one by one also means you will not be able to update any other cluster configuration with Kubespray until this procedure is finished and the cluster is fully migrated.

Everything done here requires full root access to every node.

Steps

1) Pick one or more nodes for processing.

I am not sure how the order might affect this procedure. So, to be safe, I decided to start with all master and etcd nodes together, followed by each worker node individually.

2) Cordon and drain the node

... because, downtime.
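
For reference, the cordon/drain step boils down to something like this sketch (NODENAME is a placeholder; tune the drain flags to your workloads):

kubectl cordon NODENAME
# --ignore-daemonsets is needed because DaemonSet pods are not evicted;
# --delete-emptydir-data discards emptyDir volumes living on the node.
kubectl drain NODENAME --ignore-daemonsets --delete-emptydir-data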

3) Adjust k8s-cluster.yml in your inventory.

resolvconf_mode: host_resolvconf
container_manager: containerd

4) Stop docker and kubelet daemons

service kubelet stop
service docker stop

5) Uninstall docker + dependencies

apt-get remove -y --allow-change-held-packages containerd.io docker-ce docker-ce-cli docker-ce-rootless-extras

6) Run cluster.yml playbook with --limit

cluster.yml --limit=NODENAME
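
Spelled out, that is roughly the invocation from the issue description with a --limit added (inventory path and remote user are placeholders):

ansible-playbook -i /inventory/inventory.ini --become -u infrasw \
  /kubespray/cluster.yml \
  --limit=NODENAME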

This effectively reinstalls containerd and seems to put all config files in the right place. When this completes, kubelet will immediately pick up the new container engine and start spinning up DaemonSets and kube-system Pods.

Optionally, if you feel confident, you can remove /var/lib/docker anytime after this step.

rm -fr /var/lib/docker

You can watch new containers using crictl.

crictl ps -a

7) Replace the cri-socket node annotation to point to the new container engine

Node annotations need to be adjusted. Kubespray will not do this, but a simple kubectl command is enough.

kubectl annotate node NODENAME --overwrite kubeadm.alpha.kubernetes.io/cri-socket=/var/run/containerd/containerd.sock

As far as I can tell, the annotation is only required by kubeadm to follow through future cluster upgrades.
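
To double-check that the annotation landed, a simple grep over the node description is enough:

# Should print the containerd socket path set above.
kubectl describe node NODENAME | grep cri-socket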

8) Reboot the node

Reboot, just to make sure everything restarts fresh before the node is uncordoned.
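
Once the node comes back and reports Ready on the new runtime, uncordon it as usual:

kubectl get node NODENAME        # wait for STATUS to be Ready
kubectl uncordon NODENAME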

Afterthoughts

If your cluster runs a log aggregator, like fluentd+Graylog, you will likely need to adjust collection filters and parsers. While docker generates JSON logs, containerd has its own space-delimited format. For example:

2020-01-10T18:10:40.01576219Z stdout F application log message...

In our case, we just had to switch the fluentd parser to fluent-plugin-parser-cri.
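
As a rough sketch only (the tail source, path and tag below are assumptions; adapt them to your own pipeline), the switch with fluent-plugin-parser-cri looks roughly like this:

<source>
  @type tail
  path /var/log/containers/*.log
  tag kubernetes.*
  <parse>
    # parser type provided by the fluent-plugin-parser-cri plugin
    @type cri
  </parse>
</source>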

@juliohm1978
Contributor Author

PS: we also tested the next Kubernetes upgrade from 1.21 to 1.22.

Works like a charm :D

@cristicalin
Contributor

This procedure would actually make a good addition to our docs; would you mind submitting a documentation PR and updating https://github.com/kubernetes-sigs/kubespray/blob/master/docs/upgrades.md ?

@juliohm1978
Contributor Author

Sure. It's a lot of new information.

Should I add a new *.md or just append to upgrades.md?

@cristicalin
Contributor

You can create a new folder called docs/upgrades, add the procedure there, and link it from upgrades.md; this would open up the docs for future, more complex upgrade procedures.

juliohm1978 added a commit to jhmorimoto/kubespray that referenced this issue Jan 25, 2022
…-containerd), with special emphasis on the fact that the procedure is still not officially supported.

Follow up from kubernetes-sigs#8431.

Signed-off-by: Julio Morimoto <[email protected]>
@juliohm1978
Contributor Author

On the way
#8471

k8s-ci-robot pushed a commit that referenced this issue Jan 27, 2022
…-containerd), with special emphasis on the fact that the procedure is still not officially supported. (#8471)

Follow up from #8431.

Signed-off-by: Julio Morimoto <[email protected]>
@Xartos
Contributor

Xartos commented Feb 10, 2022

@juliohm1978 Did you never encounter any issues with etcd? I'm trying to do a migration from kubespray 2.17.1 -> 2.18.0, following your guide to migrate to containerd.

Because when I now run ansible-playbook cluster.yml ... --limit=NODENAME on one of the masters, I get an error.

When I look in the cluster, I see that etcd is still running, since the static manifest (from kubespray 2.17) is still there and it's using the old version (etcd v3.4.13).
I tried to remove that manifest and run it again, but then I got stuck on some other task.

EDIT: Just remembered that I have etcd_kubeadm_enabled set. Maybe that could be causing the issue?

@Xartos
Contributor

Xartos commented Feb 10, 2022

Can confirm that there seems to be a missing step if you have etcd_kubeadm_enabled set.
PTAL #8528

@juliohm1978
Contributor Author

I am using the Kubespray default for etcd_kubeadm_enabled. Our etcd was installed as a docker container. It was replaced by a host-native binary.
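
A quick hedged check for that on a migrated control-plane node ("etcd" is assumed to be the systemd unit name kubespray uses for host-deployed etcd):

# etcd should now run as a systemd service on the host, not as a container.
systemctl status etcd
crictl ps | grep etcd    # should return nothing in host-deployment mode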
