File descriptor limit change in AMI release v20231220
#1551
Comments
We have hit this issue too; we have roughly 1,700 pods crashlooping in each cluster. I wonder if the CI doesn't test with a large enough workload? |
We have already reverted the change that caused this issue (#1535). EDIT: We're not rolling back |
It helped us restore our pods on new nodes; we're using Karpenter:

```yaml
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
...
spec:
  ...
  userData: |
    MIME-Version: 1.0
    Content-Type: multipart/mixed; boundary="BOUNDARY"

    --BOUNDARY
    Content-Type: text/x-shellscript; charset="us-ascii"

    #!/bin/bash
    rm -rf /etc/systemd/system/containerd.service.d/20-limitnofile.conf

    --BOUNDARY--
```

and drain all new nodes from the cluster. |
@mmerkes Can you please update us when the AMI is ready to use? |
☝️ Adding to that, an ETA would be much appreciated as well. Is it on the order of hours or days? |
I'm using this setup in Karpenter userData for now, bumping the soft limit from 1024 to 102400. We're adding this to our bootstrap for now to 10x the soft limit.
|
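The snippet itself isn't quoted above; a minimal sketch of what such a bootstrap override might look like follows, assuming containerd is the unit hitting the limit. The drop-in filename and the 524288 hard limit are assumptions; 102400 is the 10x soft-limit bump mentioned in the comment.

```bash
#!/bin/bash
# Sketch: raise containerd's soft NOFILE limit via an extra systemd drop-in
# instead of deleting the one shipped in the AMI. Filename and hard limit
# (524288) are assumptions; 102400 is the 10x soft-limit bump mentioned above.
mkdir -p /etc/systemd/system/containerd.service.d
cat <<'EOF' >/etc/systemd/system/containerd.service.d/99-limitnofile.conf
[Service]
LimitNOFILE=102400:524288
EOF
systemctl daemon-reload
systemctl restart containerd
```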
If anyone needs it, we fixed it in Karpenter by hardcoding the older AMI in the node template.
|
A Go runtime change in 1.19 automatically maxes out the process's NOFILE soft limit, so I would expect to see this problem only with Go binaries built on earlier versions: golang/go#46279. Has anyone run into this problem with a workload that isn't a Go program? |
We are working on releasing a new set of AMIs ASAP. I will post another update in 3-5 hours on the status. We should have a better idea then. |
People have mentioned running into this problem with Envoy proxy, which is a C++ program. |
Yes, I've been looking into that. Envoy doesn't seem to bump its own soft limit, and it also seems to crash hard when the limit is hit (on purpose): aws/aws-app-mesh-roadmap#181. Other things I've noticed:
|
The EKS-provided SSM parameter that references the current EKS AMI has been reverted to point to the last good AMI in all regions globally. This will automatically resolve the issue for Karpenter and managed node group users, and for any other systems that determine the latest EKS AMI from the SSM parameter. We will provide another update by December 29 at 5:00 PM with a deployment timeline for new AMIs. |
It'd be ideal to identify what software is not compatible and actually get that addressed, but I understand the need to revert for the time being. So long as you avoid `LimitNOFILE=infinity`.
If you need to set an explicit limit (presumably because the defaults are not sufficient), that still won't be sufficient for some software, as mentioned, but that is software that should know better and handle its resource needs properly. Exhausting the FD limit is per-process, so it's not necessarily an OOM event; the system-wide FD limit is much higher (based on memory, IIRC). A quick check of that distinction is sketched below.
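A minimal sketch of the per-process vs. system-wide distinction, using standard procfs paths (PID 1 is just an arbitrary example):

```bash
# Per-process limit: each process carries its own soft/hard NOFILE values
grep 'Max open files' /proc/1/limits

# Allocated file handles vs. the system-wide maximum, which the kernel
# sizes based on available memory
cat /proc/sys/fs/file-nr
cat /proc/sys/fs/file-max
```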
Envoy requires a large number of FDs; they have expressed that they're not interested in raising the soft limit internally and that admins should instead set a high enough soft limit. I've since opened a feature request to justify why Envoy should raise the soft limit rather than defer that to being externally set high, where it can negatively impact other software. References:
For Java, related to the systemd v240 release notes, there was a GitHub comment at the time about Java's memory allocation. While you cite 20 years, note that the hard limit has been incremented over time.
This was all (excluding Envoy) part of my original research into this change. Systemd has it right, AFAIK: sane soft and hard limits. For AWS deployments some may need a higher hard limit, but it's a worry when software like Envoy doesn't document anything about that requirement and instead advises raising the soft limit externally. |
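For anyone who wants to see which soft/hard limits a node's services actually end up with, a small sketch using standard tools (containerd here is just an example unit name):

```bash
# Defaults for the current shell: typically 1024 soft / 524288 hard on a
# distro using recent systemd defaults
ulimit -Sn
ulimit -Hn

# Limits systemd will apply to a given unit, e.g. containerd
systemctl show containerd --property=LimitNOFILE --property=LimitNOFILESoft
```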
We were using Karpenter, which again is an AWS-backed tool, and it started picking up the new AMI dynamically, which is when we started facing issues. The root cause: #1535 |
v20231220
As an update to the previous announcement, we are on track for a new release by January 4th. |
@ndbaker1 Is this file descriptor limit change expected to be reintroduced in that release, or will it still be excluded? Just wondering if we need to pin our AMI version until we implement our own fix for Istio/Envoy workloads, or until something is implemented in Envoy itself to handle that change better. |
@Collin3 that change has been reverted and will not be in the next AMI release 👍 |
This is resolved in the latest release: https://github.com/awslabs/amazon-eks-ami/releases/tag/v20231230 |
LimitNOFILE was either 1048576 or infinity since 2017 (containerd@b009642). This means the soft limit was at a minimum 1048576 since then. Since systemd 240, infinity is 1073741816, which causes issues, so we must for sure lower the hard limit. Removing LimitNOFILE is equivalent to 1024:524288, which is the standard on the host but has not been the containerd default since 2017, so when AWS recently tried it they had to revert: awslabs/amazon-eks-ami#1551. 1048576:1048576 has been good since 2017; use that. Signed-off-by: Etienne Champetier <[email protected]>
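A node-level sketch of applying that recommendation, assuming a systemd-managed containerd; the drop-in name mirrors the one mentioned earlier in the thread, and 1048576:1048576 is the value the commit message recommends.

```bash
# Sketch: pin containerd's NOFILE limits to 1048576:1048576 rather than
# infinity (too high since systemd 240) or the 1024:524288 host default.
mkdir -p /etc/systemd/system/containerd.service.d
cat <<'EOF' >/etc/systemd/system/containerd.service.d/20-limitnofile.conf
[Service]
LimitNOFILE=1048576:1048576
EOF
systemctl daemon-reload
systemctl restart containerd
```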
What happened:
Customers are reporting hitting ulimits as a result of this PR: #1535
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
- EKS Platform version (`aws eks describe-cluster --name <name> --query cluster.platformVersion`):
- Kubernetes version (`aws eks describe-cluster --name <name> --query cluster.version`):
- Kernel (`uname -a`):
- Release information (`cat /etc/eks/release` on a node):