
Error from server: error dialing backend: dial tcp 10.10.71.57:10250: i/o timeout #1181

Closed
ShibraAmin18 opened this issue Feb 13, 2023 · 7 comments


ShibraAmin18 commented Feb 13, 2023

What happened:

For a CIS AMI, I changed /tmp to /home/ec2-user and --bin-dir /bin/ to /usr/local/bin.

amazon-eks-ami/scripts/install-worker.sh, line 138 (commit 343e830):

sudo "${AWSCLI_DIR}/aws/install" --bin-dir /bin/
When using the image created this way with EKS node groups, I cannot exec into pods or view pod logs.

Error from server: error dialing backend: dial tcp 10.10.71.57:10250: i/o timeout

What you expected to happen:
Successfully run the created AMI in EKS and be able to view logs and exec into pods.

Environment:

  • AWS Region: us-east-2
  • Instance Type(s): t3a.medium
  • EKS Platform version (use aws eks describe-cluster --name <name> --query cluster.platformVersion): "eks.6"
  • Kubernetes version (use aws eks describe-cluster --name <name> --query cluster.version): "1.23"
  • AMI Version: CIS Amazon Linux 2 Benchmark v2.0.0.16 - Level 2-c41d38c4-3f6a-4434-9a86-06dd331d3f9c
@cartermckinnon (Member)

We need more information to understand the issue you're reporting. Does the node become Ready? Do the kubelet logs indicate any issues?

@cartermckinnon (Member)

I'm closing this issue because we don't have enough information to reach a conclusion.

@cartermckinnon closed this as not planned on May 25, 2023
@raackley

Can this be reopened? I believe I'm seeing the same problem, when building on top of the CIS Level 1 Benchmark AL2 AMI.

The AMI seems to build fine, but when deployed I see issues similar to what was described here ("i/o timeout", etc.). It only seems to happen for some pods, and only sometimes, when they attempt to connect to the Kubernetes API service endpoint IP that is local to the cluster.

I suspect the problem comes from some sysctl parameter set in the CIS base image, but I can't find which one exactly.

@cartermckinnon (Member)

@raackley we don't currently ship a CIS variant of the AMI, and this template doesn't explicitly support building on a CIS base. If there's a reasonable change we can make in the template to unblock that use case, we're not opposed; but we don't track CIS-specific issues here.

@raackley

@cartermckinnon Sure, but is there a specific reason why it is known not to work? It seems like it should be fine. People clearly want this, so can this be a formal request for a CIS hardened base EKS image, or support to select a CIS hardened AMI to build your own?

@cartermckinnon (Member)

Sure, but is there a specific reason why it is known not to work?

The issue you're describing:

This only seems to happen for some pods, sometimes, when attempting to connect to the kubernetes API service endpoint IP that is local to the cluster.

Sounds different from the one reported here. The OP was describing communication failures between the API server and kubelet.

can this be a formal request for a CIS hardened base EKS image

This is a longstanding request (#99); it's absolutely on our radar.

support to select a CIS hardened AMI to build your own?

If the issue lies in the CIS base AMI as you suspect, there's probably not much we can do in this template to ensure compatibility. If some logic here is causing the breakage, we're totally open to PRs.


Th0masL commented Jul 13, 2023

I'm also building CIS-hardened AMIs on Amazon Linux, and we were experiencing similar issues when upgrading from EKS 1.23 (on Docker) to EKS 1.25 (on Containerd).

On our side, the problem was that the CIS-hardened AMI disables IP Forwarding (most likely directly in /etc/sysctl.conf), but Docker was able to re-enable it on the fly when the EC2 instance started.

When using containerd, IP Forwarding is not re-enabled, so the containers don't have network access.

By looking at the sysctl config, we can see that the IP Forwarding parameters configured in /etc/sysctl.d/99-kubernetes-cri.conf are overridden by the parameters from /etc/sysctl.conf (which are disabled by the CIS-hardening script).

See this ticket, which confirms the behavior of loading /etc/sysctl.conf after all the other files in /etc/sysctl.d/*.conf.
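
To see the override in practice (a hedged check; exact file contents vary by AMI and hardening profile), you can compare the two files and then look at the value the kernel actually ended up with:

# /etc/sysctl.conf is applied after the 99-kubernetes-cri.conf drop-in (on Amazon Linux 2,
# /etc/sysctl.d/99-sysctl.conf is typically a symlink to /etc/sysctl.conf), so its values win
grep -H -E 'ip_forward|forwarding' /etc/sysctl.d/99-kubernetes-cri.conf /etc/sysctl.conf
sysctl net.ipv4.ip_forward
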

In order to fix the network connectivity issue for the containers, we wrote a small bash script that re-enables IP Forwarding in /etc/sysctl.conf in our custom AMI after applying the CIS-hardening script.
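
The script itself isn't shown in this comment; a minimal sketch of that kind of fix, assuming the hardening script writes net.ipv4.ip_forward = 0 into /etc/sysctl.conf, could look like this:

#!/usr/bin/env bash
# Hypothetical sketch: re-enable IP forwarding in /etc/sysctl.conf after CIS hardening,
# so it no longer overrides /etc/sysctl.d/99-kubernetes-cri.conf. Which keys the hardening
# script actually disables is an assumption; extend the list if needed
# (e.g. net.ipv4.conf.all.forwarding).
set -euo pipefail

for key in net.ipv4.ip_forward; do
  if grep -q "^${key}" /etc/sysctl.conf; then
    sudo sed -i "s|^${key}.*|${key} = 1|" /etc/sysctl.conf    # flip an explicit 0 back to 1
  else
    echo "${key} = 1" | sudo tee -a /etc/sysctl.conf > /dev/null    # append if missing
  fi
done

# Apply immediately, without waiting for a reboot
sudo sysctl --system
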

So if you see container connectivity problems, you can run the following command to confirm that IP Forwarding is enabled:

$ sysctl -a 2>/dev/null | grep -E "ip_forward |\.forwarding "

Example output:

net.ipv4.conf.all.forwarding = 1
net.ipv4.conf.default.forwarding = 1
net.ipv4.conf.eni0f6f0f8b0af.forwarding = 1
net.ipv4.conf.eni101ddb65507.forwarding = 1
net.ipv4.conf.eni123f74d1678.forwarding = 1
net.ipv4.conf.eni25a0644a127.forwarding = 1
net.ipv4.conf.eni3c9bdd19a55.forwarding = 1
net.ipv4.conf.eni4cb3ea8d9b7.forwarding = 1
net.ipv4.conf.eni4f5f128fbd7.forwarding = 1
net.ipv4.conf.eni533143c3bfa.forwarding = 1
net.ipv4.conf.eni5a9462606df.forwarding = 1
net.ipv4.conf.eni7d229876a48.forwarding = 1
net.ipv4.conf.enic2b1da1faef.forwarding = 1
net.ipv4.conf.enic611f40a78a.forwarding = 1
net.ipv4.conf.enic68b338523f.forwarding = 1
net.ipv4.conf.enid9a8087c67d.forwarding = 1
net.ipv4.conf.eth0.forwarding = 1
net.ipv4.conf.eth1.forwarding = 1
net.ipv4.conf.lo.forwarding = 1
net.ipv4.ip_forward = 1
