Traffic does not go through routes on Fedora 33 #1614
Comments
Hi @rma945, can we capture a tcpdump on any of the nodes to verify whether there is an issue with the iptables rules or routing rules? Say, when you ping from Pod-a to Pod-b, install tcpdump on the node and capture the traffic; you can attach the pcap to the issue. Also, the logs attached above don't seem to download, so can you reattach them?
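A capture along these lines should be enough (the interface name and pod IP below are placeholders, adjust them to your node and the pod you are testing from):

# capture all traffic to/from the pod IP on the node's primary interface
# eth0 and 172.24.67.25 are example values
sudo tcpdump -i eth0 -nn host 172.24.67.25 -w /tmp/pod-a-node.pcap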
Here is the PCAP file, and I have also re-uploaded the debug logs.
Can you confirm the source and destination Pod IPs, and also whether you used ping or curl?
Yes, any address except 127.0.0.1 is unreachable. I have tried ping and curl against different IP addresses (pods, the internal kube API, external services).
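A minimal way to reproduce these checks from inside an affected pod (the pod name and target IPs below are placeholders) would be:

# run connectivity checks from inside the pod
kubectl exec -it test-pod -- ping -c 3 172.24.67.26        # another pod's IP
kubectl exec -it test-pod -- curl -m 5 -k https://172.20.0.1:443   # in-cluster kube API service IP
kubectl exec -it test-pod -- curl -m 5 https://checkip.amazonaws.com   # external service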
Can you please check if you are hitting this issue - #1600 (comment) |
Yes, I have already tried disabling the NetworkManager routing rules, as suggested in that issue, but it did not help. Also, in my case it looks like the routes are added properly, but the routing is blocked for some reason:
[root@ip-172-24-67-130 ~]# ip rule list
0: from all lookup local
512: from all to 172.24.67.25 lookup main <- pod cni ip
1024: from all fwmark 0x80/0x80 lookup main
32766: from all lookup main
32767: from all lookup default
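A couple of follow-up commands can show what those rules actually resolve to (the pod IP is the one from the 512 rule above; the contents of the main table will depend on your ENI layout):

ip route show table main    # the table that rules 512 and 1024 point at
ip route get 172.24.67.25   # confirms which rule and table traffic to the pod IP uses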
Thanks for checking, @RomanCherednikovAZ. Sorry, I meant: in the attached pcap file, can you please confirm the source and destination IPs? If you have some bandwidth, can you also capture a tcpdump on the destination side? If we have the tcpdump from both the source and destination nodes, we can correlate where the drop is happening. Also, I see the CNI version in your logs is 1.7.10, so can you please confirm the CNI version?
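Once both captures exist, something along these lines (filenames and the IP are placeholders for whatever you used when capturing) lets us line them up and see where the packets stop:

# read each capture and filter down to the pod-to-pod conversation
tcpdump -nn -r source-node.pcap 'host 172.24.67.25'
tcpdump -nn -r destination-node.pcap 'host 172.24.67.25'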
I do see this log in kubelet -
@RomanCherednikovAZ Any update w.r.t. the above logs? It appears that https://github.com/aws/amazon-vpc-cni-k8s/blob/master/scripts/entrypoint.sh#L149
Sorry for the confusion, but rma945 and RomanCherednikovAZ are the same person. Yeah, I have checked that the CNI binaries were successfully added on the node, and the network state for the node changed to Ready. So the CNI itself works fine, but it looks like there was some issue with routing on the node. Anyway, at this point we have migrated our worker nodes back to the AWS EKS AMI.
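For anyone else landing here, a quick way to verify those same two points on the node (the paths below are the defaults used by the aws-node entrypoint; adjust if your install differs):

ls -l /opt/cni/bin/aws-cni /etc/cni/net.d/10-aws.conflist   # CNI binary and config copied by entrypoint.sh
kubectl get nodes -o wide                                   # the node should report Ready once the CNI config is present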
What happened:
I have built a new AMI based on the community Fedora 33 1.2 AMI with the bootstrap scripts from this repository - https://github.com/awslabs/amazon-eks-ami - and everything works fine except the aws-vpc-cni. I have checked the nodes and found that pods are successfully created, the elastic network interfaces are allocated, and the routing tables are created successfully, but the pods can't ping or connect over TCP to any external or local IP. However, when I change the CNI plugin to Calico, pods are able to reach any IP. The second worker node, based on the AWS EKS AMI, works fine; the problem is only with the Fedora-based AMI worker. I also tried switching the container runtime from docker to pure containerd, but that did not help; the aws-vpc-cni still does not work.
What you expected to happen:
Pods should be able to connect to local and internal services
How to reproduce it (as minimally and precisely as possible):
Take the Fedora 33 1.2 AMI, join it to the EKS cluster, and add the aws-vpc-cni addon (see the example below).
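For instance, via the managed addon (the cluster name and region are placeholders; the addon can also be applied from the manifest in this repository):

aws eks create-addon --cluster-name my-cluster --addon-name vpc-cni --region us-east-1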
Attached logs:
eks_i-0cca70aab4bdd2bc1_2021-09-13_0719-UTC_0.6.2.tar.gz
Anything else we need to know?:
Environment: