
[EKS] [CRI]: Support for Containerd CRI #313

Closed
paavan98pm opened this issue Jun 5, 2019 · 42 comments
Labels
EKS Amazon Elastic Kubernetes Service

Comments

@paavan98pm

Tell us about your request
What do you want us to build?
Support for Containerd CRI

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Currently, EKS nodes run dockerd. containerd is a popular container runtime (with CRI support) that is more efficient.

Are you currently working around this issue?
How are you currently solving this problem?
AL2 nodes fail when containerd is installed.

Additional context
Anything else we should know?
This will enable customers to customise and configure kubelet parameters to select their preferred container runtime.

Attachments
If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)

@paavan98pm paavan98pm added the Proposed Community submitted issue label Jun 5, 2019
@tabern tabern added the EKS Amazon Elastic Kubernetes Service label Jul 2, 2019
@rverma-nikiai

No updates in the past 2 months, not even an acknowledgement. This looks bad.

@lgg42

lgg42 commented Aug 9, 2019

Yep, time to think about it, right?

@jtoberon

We're excited to support containerd, too. We are making sure that we have the right test, security, and release tools in place before we officially recommend it to our customers.

It would be useful to know if folks have specific thoughts about how the runtime should be configured:

  • Should we install both docker-ce and containerd in the AMI? Should both start automatically upon instance launch?
  • Do customers want to use a runtimeClassName config to pick a runtime dynamically?
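For concreteness, the second option might look something like the sketch below (names are illustrative only; the handler must match whatever runtime handler the node's CRI is configured with):

cat <<'EOF' | kubectl apply -f -
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: containerd-runc
handler: runc            # must match a runtime handler configured in the node's CRI
---
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  runtimeClassName: containerd-runc
  containers:
    - name: demo
      image: nginx
EOF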

@whereisaaron

Great @jtoberon. If there is just one AMI, then both runtimes with runtimeClassName sounds like a good transition plan. But for us the end goal is for the cluster/AMIs to be docker-free; the past grief caused by the unstable development practices, the kitchen-sinking of swarm, the ‘moby’ mess and other junk changes make us wary of that upstream project, we are ready to leave it behind.

@owenthereal

👍 to being able to run both runtimes with runtimeClassName. Is there an ETA for the support?

@edify42

edify42 commented Sep 13, 2019

Is it as simple as making changes to the worker node AMI https://github.com/awslabs/amazon-eks-ami to run containerd? Or do the masters/control-plane also need changes?

@jtoberon

jtoberon commented Sep 16, 2019

Yes, we intend to change that AMI build to install the containerd software.

To support that change, we need to do a bunch of other things. Here are a few examples:

  • Build the containerd binary into the Amazon Linux yum repositories.
  • Establish a process for learning about and responding to embargoed CVEs in any new software that we build.
  • Do performance testing to make sure we understand the performance implications of this runtime change for our customers.
  • Set up automated tests to ensure that all of the software that we package for our customers continues to work together over time.

@whereisaaron

Thanks @jtoberon, yes I imagine it is not trivial. And I expect it will need a healthy period of 'developer preview' too.

One other possible chore for your list is ensuring that container logging and log rotation work well, and that you have a solid fluentd/cloudwatch configuration, since logging works quite differently for containerd compared to dockerd.
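To illustrate the difference (paths and samples below are a sketch from memory, not EKS-specific):

# dockerd with the json-file log driver (/var/lib/docker/containers/<id>/<id>-json.log):
{"log":"hello\n","stream":"stdout","time":"2019-09-24T10:00:00.000000000Z"}

# containerd via the CRI (files under /var/log/pods): timestamp, stream, partial/full tag, message
2019-09-24T10:00:00.000000000Z stdout F hello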

@kvidhya

kvidhya commented Sep 24, 2019

Hello, I am trying to get runsc running on EKS workers. Is there a way to do it today? I'd appreciate any pointers.
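For context, my rough plan (untested) was along these lines, assuming a node already running containerd >= 1.3 with gVisor's runsc and containerd-shim-runsc-v1 binaries installed:

# Register runsc as an additional containerd runtime handler.
cat <<'EOF' >> /etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runsc]
  runtime_type = "io.containerd.runsc.v1"
EOF
systemctl restart containerd
# Pods could then opt in via a RuntimeClass whose handler is "runsc".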

@kvidhya

kvidhya commented Sep 24, 2019

Also, when will the containerd support be released?

@Thutm

Thutm commented Nov 13, 2019

Now that Docker Enterprise has been acquired, could this also be scoped out for the ECS platform? I'm not really sure what goes into that, other than maybe updating the ECS optimized AMI and the ecs-agent itself. What are everyone's thoughts on that?

@kr3cj

kr3cj commented Jan 9, 2020

We would be interested in switching out docker for containerd in our EKS nodes as well, in hopes that it might help with things like awslabs/amazon-eks-ami#195

@inductor

inductor commented Feb 4, 2020

By design, Fargate nodes seem to use containerd as their container runtime.

@mvisonneau

With a custom AMI setup, at first sight it seems to work correctly 👍

eks.1 - 1.16

# Install containerd
~# wget -P /tmp https://storage.googleapis.com/cri-containerd-release/cri-containerd-1.3.4.linux-amd64.tar.gz
~# tar --no-overwrite-dir -C / -xzf /tmp/cri-containerd-1.3.4.linux-amd64.tar.gz
~# mkdir -p /etc/containerd
~# containerd config default > /etc/containerd/config.toml
~# systemctl start containerd

# Added a couple of necessary flags on the kubelet
--container-runtime-endpoint=unix:///run/containerd/containerd.sock
--container-runtime=remote
# Node info
System Info:
  Machine ID:                 ec24d43f57c1054dbf44887269f36c5a
  System UUID:                ec24d43f-57c1-054d-bf44-887269f36c5a
  Boot ID:                    976f4d4e-07df-4d7a-94a4-9ff7a661ed70
  Kernel Version:             5.4.0-1009-aws
  OS Image:                   Ubuntu 20.04 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.3.4
  Kubelet Version:            v1.16.8
  Kube-Proxy Version:         v1.16.8

# Ready event
  Normal   NodeReady                9m1s                   kubelet, ip-10-0-0-1.eu-west-1.compute.internal  Node ip-10-0-0-1.eu-west-1.compute.internal status is now: NodeReady

@kferrone

EKS with some node groups on Docker and some on containerd? Is this possible with K8s? Would be nice . . .
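As far as I can tell the runtime is a per-node kubelet setting, so mixed node groups should be possible; each node advertises what it runs:

kubectl get nodes -o custom-columns=NAME:.metadata.name,RUNTIME:.status.nodeInfo.containerRuntimeVersion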

@hendrikhalkow

Hi EKS team, it's 2020 and containerd is a CNCF graduated project. It shouldn't just be installed on the EKS AMIs; it should be the default runtime.

When will we see containerd support on EKS?

@midN

midN commented May 28, 2020

Any updates on that?

@mikestef9
Contributor

EKS/Fargate uses the containerd runtime, so that is a production ready option today.

Our plan for containerd on worker nodes is to add official EKS support for Bottlerocket, once the project graduates from a public preview.

Bottlerocket is a Linux based operating system purpose-built to run containers. We encourage you to try out the public preview and leave any feedback on the GitHub project. You can learn more about Bottlerocket here.

@mikestef9 mikestef9 removed the Proposed Community submitted issue label Jul 15, 2020
@ecrousseau

Hi @mikestef9 and/or EKS team - we are trying to work out which 'sandboxing' features (e.g. gVisor, Firecracker) are available in EKS - as far as I can tell, none are currently supported (but please correct me if that is wrong).

Bottlerocket 1.0.0 was released ~15 hours ago - what does the timeline look like for it being available in EKS? Will it be supported when using managed node groups? And - the big question for me - given that it uses containerd, does that mean we'll be able to plug in Firecracker/similar?

@mreferre

mreferre commented Sep 1, 2020

@ecrousseau can you clarify what you mean by "sandboxing"? I'd argue EKS/Fargate does provide "sandboxing", in that each Kubernetes pod runs in its own dedicated OS/kernel (or VM/instance if you will). When the user deploys a pod, we source a dedicated VM/instance on the fly from the Fargate pool and use that dedicated VM/instance to run that specific pod. Rinse and repeat. This VM/instance could be an EC2 instance that is part of the Fargate pool, or it could be a micro-VM running on Firecracker. This is an implementation detail, not something a user should need to be aware of. Both implement the same deployment pattern (1 pod per instance/VM).

Is this the "sandboxing" you are alluding to?

@ecrousseau

Thanks @mreferre - yes, that kind of separation is what I was talking about. I will have a look at Fargate.

@matthewhembree

EKS/Fargate isn't a perfect solution for sandboxing in my environment. We use the Datadog agent running as a daemonset to collect logs and metrics (and maybe soon APM) for our workloads.

I haven't tested this yet: To accomplish those collections with EKS/Fargate, we would need to run the Datadog agent as a sidecar container in every EKS/Fargate pod. This is based on the assumption that the Datadog agent could still do the collections as a non-privileged container (e.g. access /var/logs/pods possibly as a readOnly mount).

Assuming that we could access the logs and metrics, we'd still have to increase our resource utilization (i.e. costs) substantially to run the sidecars versus the daemonset.

@mreferre

mreferre commented Nov 5, 2020

@matthewhembree Datadog describes how this is implemented in detail in this documentation. You are correct that, with the current model, you'd need an agent sidecar in each pod. Unless you are consuming nearly all the resources you size your pods for, I would speculate that resource consumption isn't as relevant as having to change the operational model to inject these sidecars to make logging work on EKS/Fargate. We want to solve this by way of this feature that we are working on. The idea would be to have a router embedded into the Fargate service that you transparently use to ship logs (and more) to an external endpoint with a centralized configuration. Not only would you not have to inject a sidecar into every pod, but you would not have to deal with DaemonSets either (given that with Fargate there are no nodes, by definition). If that would be of interest to you, please subscribe to that issue to get updates as we progress.

@phenri00

phenri00 commented Dec 2, 2020

Docker is now deprecated (v1.20).

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#changes-by-kind-1

@Saziba

Saziba commented Dec 2, 2020

Calm down... it's not like Docker will be broken:

kubernetes/kubernetes#94624 (comment)

@TBBle

TBBle commented Dec 3, 2020

As I understand it, both Fargate and Bottlerocket are generally available and use the containerd CRI, and are supported by tools like eksctl; cf. Bottlerocket and Fargate.
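For example, a self-managed Bottlerocket node group via eksctl might look something like this sketch (names, region, and sizes are placeholders):

cat <<'EOF' > cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo
  region: eu-west-1
nodeGroups:
  - name: bottlerocket-ng
    instanceType: m5.large
    desiredCapacity: 2
    amiFamily: Bottlerocket
EOF
eksctl create cluster -f cluster.yaml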

Managed Nodegroups don't specifically support Bottlerocket yet. That's a different issue, #950. There is support for custom AMIs through a Launch Template, which could certainly be a Bottlerocket AMI, if you want this combination working now.

Is there anything left that is going to happen in the future for this ticket? Some things trawled from the comments on this ticket:

If those aren't part of the current plan, then the plan at #313 (comment) has been achieved, so maybe resolve this ticket and track separate requests like the above two separately.

@phenri00

phenri00 commented Dec 3, 2020

Calm down... it's not like Docker will be broken:

kubernetes/kubernetes#94624 (comment)

What if you are running dind in your CI? This will not work anymore, right?

(yes, I know you should use something like kaniko for building images)

@TBBle

TBBle commented Dec 3, 2020

What if you are running dind in your CI? This will not work anymore, right?

You can still run Docker on the node for dind/DooD use; it's just that you'll also need containerd (or cri-o, or another CRI implementation) on the node for k8s to use. It's like a GPU then: it's up to the cluster administrator to make Docker available as a system resource if necessary.

For a dind use-case, you should be able to run dockerd as a privileged non-sandboxed daemonset image, so you don't have to install it manually on your nodes, like we do with kube-proxy and many CSI/CNI plugins for example. (It's possible that even a fully-privileged pod doesn't have the access it needs for this, I acknowledge).

It's also possible that by the time this happens, Docker's bundled containerd (on Linux) will have CRI available, so you could still install Docker on the node, and then point k8s at the Docker-bundled containerd's CRI. I'm not certain that will work, but if you have no choice but dind DooD, then it'll be worth exploring sooner rather than later.

Of course, if your workflow was assuming that the node would have access to dind DooD-built images without pushing them to a registry, that will break irretrievably. It's already a pretty-risky approach now though, so hopefully no one's still relying on that a year from now.

Edit: Actually, I think we're talking about DooD, "Docker outside of Docker", where a container has access to the host's /var/run/docker.sock. That's the flow that breaks when users are relying on the k8s install automatically including a Docker daemon. That flow also breaks now if you use Bottlerocket today, which doesn't use Docker, or Fargate, where you can't run privileged containers at all (and is also not using Docker). So this was already a bad idea on k8s clusters.

Docker-in-Docker is where you have a Docker daemon running in a privileged container with one of the docker-dind container images. I believe that will still work, as it just needs to have access to the host, and doesn't rely on the external container runtime also being Docker, so I expect that will work on Bottlerocket today.
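A minimal sketch of that pattern (image tag and settings are illustrative, not EKS-specific):

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: dind
spec:
  containers:
    - name: dind
      image: docker:19.03-dind
      securityContext:
        privileged: true      # the nested dockerd needs this
      env:
        - name: DOCKER_TLS_CERTDIR
          value: ""           # disable TLS for brevity; configure it properly in real use
EOF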

So in short: Move to Bottlerocket and test your stuff. If anything breaks around Docker, move back and you have 3 or so k8s releases to fix that before the whole ecosystem becomes like that. And you'll have a less-brittle pipeline as a bonus.

@Saziba

Saziba commented Dec 3, 2020

Calm down... it's not like Docker will be broken:
kubernetes/kubernetes#94624 (comment)

What if you are running dind in your CI? This will not work anymore, right?

(yes, I know you should use something like kaniko for building images)

https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/

@phenri00

phenri00 commented Dec 3, 2020

Calm down... it's not like Docker will be broken:
kubernetes/kubernetes#94624 (comment)

What if you are running dind in your CI? This will not work anymore, right?
(yes, I know you should use something like kaniko for building images)

https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/

Thanks. Great stuff. So it seems like dind will be broken then (if you are mounting the socket):

One thing to note: If you are relying on the underlying docker socket (/var/run/docker.sock) as part of a workflow within your cluster today, moving to a different runtime will break your ability to use it. This pattern is often called Docker in Docker. There are lots of options out there for this specific use case including things like kaniko, img, and buildah

Kaniko looks great, so we will probably move to that.

@MarcusNoble

I think you can still make use of the Docker socket by having the docker runtime on your host OS but using a different runtime with Kubernetes. I'd only go for that if updating your applications to something like Kaniko isn't possible right now though.

@mikestef9
Contributor

mikestef9 commented May 4, 2021

Hey all, some updates here.

We will be adding containerd as a container runtime to the EKS optimized Amazon Linux 2 AMI.

The current rollout plan is as follows:

  1. Add a bootstrap flag to the EKS AMI allowing users to toggle between containerd and Docker as the container runtime. You can see a draft PR here. Note that Docker will remain the default runtime.
  2. In a future EKS AMI Kubernetes minor version (currently targeting v1.22), change the default to containerd. You will no longer be able to use the Docker runtime.

In the next month or two, we will publish a blog with more info and recommendations on how to prepare for the removal of Docker as a supported container runtime.

@mo-saeed

Hi @mikestef9, what about the Amazon EKS optimized Ubuntu Linux AMIs?

Thanks

@TBBle

TBBle commented May 10, 2021

@mo-saeed: According to that link, that's a question for Canonical, as they provide those images. I don't see anything about a switch to containerd in their changelog, so you might want to open a feature request at https://bugs.launchpad.net/cloud-images for this.

@sean-keane25

@mikestef9 when and where will this be officially announced/confirmed:

"In a future EKS AMI Kubernetes minor version (currently targeting v1.21), change the default to containerd. You can still manually switch back to Docker with the bootstrap flag."

@robert-heinzmann-logmein

It seems that containerd is now mentioned in the changelog, as of AMI release v20210519:

https://github.com/awslabs/amazon-eks-ami/blob/master/CHANGELOG.md

Does this mean containerd is now supported?

@stevehipwell

@robert-heinzmann-logmein it's been mentioned for longer than that, but that's because it's a dependency of Docker.

@ulm0

ulm0 commented Jul 6, 2021

Hi, is there any time estimate for this to go live?

@dany74q

dany74q commented Jul 19, 2021

I'd just add that it would be great if changing the CRI were possible from the API for managed node groups, alongside the availability in eksctl, which might change it in the ASG launch template 🙏

@mikestef9
Contributor

mikestef9 commented Jul 20, 2021

The EKS optimized AMI now contains a bootstrap flag that lets you optionally enable containerd as the runtime. See the v1.21 release blog for details, as well as the EKS documentation.

Note that starting with EKS support for Kubernetes v1.22, containerd will become the default, and only available, container runtime in the EKS optimized Amazon Linux 2 AMI.

Also, keep in mind that Bottlerocket is a production ready AMI optimized for EKS that already ships with containerd by default (and will soon be added as a native option to EKS managed node groups, #950)
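For self-managed nodes using the EKS optimized Amazon Linux 2 AMI, opting in looks like this in the instance user data (the cluster name is a placeholder; see the EKS documentation for full details):

#!/bin/bash
# Opt this node into containerd instead of Docker.
/etc/eks/bootstrap.sh my-cluster --container-runtime containerd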

@dany74q

dany74q commented Jul 20, 2021

That's terrific! @mikestef9 - quick q - I think a common use case would be provisioning node groups via Terraform/Pulumi or the like, where one can't override the launch template / bootstrap args directly (only by supplying a name + version).

Would it be possible to either have a dedicated launch template version for the containerd runtime, or somehow integrate this into the creation API?

@artificial-aidan

Will containerd be updated to 1.5.0? This is a crucial feature for a lot of workloads, and the last AMI I pulled was on 1.4.6.
