[eks] [request]: Insufficient vpc.amazonaws.com/PrivateIPv4Address #240
Comments
@JasonChinsen Thanks a lot for reporting this issue. Could you please describe the actions performed to end up in this situation? Could you also provide logs from the
@JasonChinsen Have you tried restarting the
@vsiddharth This is a snippet of the logs before I restarted the
After restart
Moreover, the pods stuck in the pending state started to run. Steps I took to reproduce the issue:
@JasonChinsen Thanks a lot for getting back with the relevant information.
@vsiddharth there is nothing new when I describe the node ... but the pods did start to run.
@vsiddharth this may be out of context for this issue ... but ...
@vsiddharth I have spent a little more time on this issue, and it seems like the node drops the network every so often (sorry I can't be more specific than that), but restarting the pods gets them running again. Do you want me to close this ticket for now?
I am also experiencing issues with

Tell us about your request
Which service(s) is this request for?
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
I've noticed that when restarting a pod running on the Windows EC2 instance that has an assigned IP, the recreated pod cannot get a new IP. I've seen the reported solution to restart the
Are you currently working around this issue?
Additional context
Thanks a lot for reporting this. There are two different issues here. We have provided a workaround at #236 for one of the issues.
Hey Aaron, for the issue where the pod couldn't communicate with ElastiCache, can you share a few more details? Such as, when the pod couldn't communicate with ElastiCache,
@somujay Re-created the issue this morning; the log output is below.
No, output of requests below.
Preview is showing some logs cut off, so I've created a gist that has all the logs as well for your reference: https://gist.github.com/aarongooch/c7e484bc41936bb81050d9fa07909294
So a bit more on this issue:
The Windows EKS host has the following in the vpc-shared-eni log.
Exec-ing into the pod:
Describing the pod, it definitely has the IPv4 labels:
Ensured that:
@somujay this is a drilldown on what @aarongooch is experiencing in our org. Thanks!
This is interesting. I assume this happens randomly? Did you check that kube-proxy is running on the host machine? "Get-Service kube-proxy" is the command to check its status. If it's running, run "Get-HNSPolicyList | ConvertTo-Json -Depth 3" to see whether it has the policy for the DNS cluster IP. If you still have the problem, try restarting kube-proxy on the host machine (Restart-Service kube-proxy). If the problem still persists, then mail us; we will need to set up a call to look into this issue.
In my case, adding a SecurityGroupEgress rule allowing access from the control plane to the worker nodes' security group on port 443 solved the issue. Edit: but now I face the same issue that JasonChinsen is facing. From running N nodes, I only get a single node with one private address available for use.
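For anyone applying the same fix in CloudFormation, a minimal sketch of such an egress rule follows; the logical resource names and the two security-group references are placeholders, not values from this thread:

```yaml
# Hypothetical CloudFormation snippet: let the EKS control plane's security
# group reach the worker nodes' security group on port 443.
Resources:
  ControlPlaneToNodeEgress:
    Type: AWS::EC2::SecurityGroupEgress
    Properties:
      GroupId: !Ref ControlPlaneSecurityGroup            # placeholder: control plane SG
      DestinationSecurityGroupId: !Ref NodeSecurityGroup # placeholder: worker node SG
      IpProtocol: tcp
      FromPort: 443
      ToPort: 443
```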
Not sure if this is still somehow expected to happen, but I'm having this same error after the official Windows support release. I have just created a new cluster with
Pod stays pending forever:
Describe says:
Any ideas on what's happening?
Your pod is not being assigned an IP address. This could be due to the kube configuration, errors in the vpc-controller pod, or a lack of IPs in the subnet the EC2 node is assigned to. You're going to have to dig into your cluster and find the issue; there are too many variables in play.
@aarongooch Thanks for the reply! As my VPC/subnets have plenty of free addresses, I checked your other suggestion, and this is what I find in the
I checked my subnet in the AWS web console and the route table is there, so it must be some access control issue. Anyone using
Any help is appreciated. :)
I experienced the same issue when trying to deploy a Windows workload. The issue in my case turned out to be the node selector. This works:

```yaml
nodeSelector:
  beta.kubernetes.io/arch: amd64
  beta.kubernetes.io/os: windows
```

But this does not:

```yaml
nodeSelector:
  kubernetes.io/arch: amd64
  kubernetes.io/os: windows
```
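In context, a minimal sketch of a full Windows pod spec using the working (beta-prefixed) selector; the pod name and image are illustrative placeholders, not taken from this thread:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: windows-sample   # placeholder name
spec:
  nodeSelector:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: windows
  containers:
    - name: app
      # placeholder Windows image
      image: mcr.microsoft.com/windows/servercore:ltsc2019
```

Presumably the nodes at that cluster version only carried the beta-prefixed labels, so the unprefixed selector matched no Windows node; see #542 referenced below.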
@niranjan94 tracked as #542.
@jglick thanks for the update. Had not seen that. 😄
This is an old issue, but is anyone facing this with present-day EKS clusters running Windows? The VPC Resource Controller now runs in the control plane and is managed for customers.
Will be closing this issue as it's been stale for a while, and it doesn't seem like it should be recurring. Please reopen if you continue to face this issue.
Tell us about your request
I am running a mixed k8s cluster based on eks-windows-preview
I am running into an issue after turning off the Windows node's local firewall and restarting the Windows node:
netsh -r computername advfirewall set privateprofile state off
Which service(s) is this request for?
EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
This is a continuation of my efforts from issue #236, where I am trying to get my Windows application to talk to MongoDB running on the Linux side of the cluster.
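For context, cross-OS traffic like this would normally go through a ClusterIP Service in front of MongoDB; a minimal sketch, assuming the Linux-side MongoDB pods carry an app: mongodb label (the name, label, and port here are illustrative, not from this issue):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mongodb          # placeholder name the Windows app would resolve
spec:
  selector:
    app: mongodb         # assumed label on the Linux MongoDB pods
  ports:
    - port: 27017        # MongoDB's default port
      targetPort: 27017
```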
Are you currently working around this issue?
N/A
Additional context
I have noticed that pods that were already allocated to a Windows node before restarting would start up again once the node came online; however, any new pods deployed after the reboot would not start, show pending, and have this message:
Warning FailedScheduling 2m33s (x49 over 42m) default-scheduler 0/6 nodes are available: 3 node(s) didn't match node selector, 6 Insufficient vpc.amazonaws.com/PrivateIPv4Address.
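For background, on the Windows preview each Windows pod consumes one unit of the vpc.amazonaws.com/PrivateIPv4Address extended resource, which the vpc-admission-webhook injects into the pod spec, roughly as sketched below (an illustration, not the webhook's exact output):

```yaml
# Sketch: the extended-resource limit that makes the scheduler count
# private IPv4 addresses on Windows nodes.
resources:
  limits:
    vpc.amazonaws.com/PrivateIPv4Address: 1
```

The "Insufficient vpc.amazonaws.com/PrivateIPv4Address" event then means no node is advertising free units of this resource, which would match the nodes losing that capacity after the reboot.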
I have restarted both the vpc-admission-webhook-deployment and aws-node pods.
I have also checked each Windows node's status (could there be a missing annotation after the reboot?).
Both Windows and Linux nodes are running version v1.11.5.
Attachments
If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)