EKS Windows Nodes #69
Comments
Hi, is there any ETA on this issue? |
We are targeting a public beta in early 2019. Windows Server Containers is a beta feature in Kubernetes and we intend to support Windows following the same guidelines. Please +1 this issue and tell us what you'd like to see (your preferred Windows Server version, Kubernetes version, features...) so we can plan accordingly! :) |
Would like to see support for Windows Server 2019, which brings Windows containers much closer to feature parity with Linux containers. See http://stefanscherer.github.io/docker-on-windows-server-2019/. |
Also AWS announced recently that WS2019 is supported: |
Windows Server 2019 containers on a "current" (1.12/1.13) version of kubernetes would be great. Looks like 1.14 may have WS2019 as a minimum according to Kubernetes Sig Windows notes: |
I agree, we are looking to use Windows containers largely for a CI/CD workload on Kubernetes and would love to see this in place for EKS rather than having to manage our own K8s cluster. |
Agree. |
This is strongly desired. |
windows server 2019 with kubernetes 1.13.+ |
Do you mean, you got it working? |
It's a request. |
We'd love to see this feature. |
Yes. AFAIK, Azure Container Services does not support hybrid containers either. It would be very interesting to see AWS supporting this before Azure actually. |
Hi all, Learn more and get started here: https://github.com/aws/containers-roadmap/tree/master/preview-programs/eks-windows-preview Please leave feedback and comments on the preview using this ticket. |
Found an issue trying to launch the 'amazon-eks-cfn-quickstart-windows.yaml' template. There are three nested stacks in this template: 'EKSVPCStack', 'EKSLinuxWorkerStack', 'EKSWindowsWorkerStack'. The first two of these provide a TemplateURL property pointing to an S3 URL, but the third nested stack (EKSWindowsWorkerStack) points to a GitHub URL. Here is the resource:
The documentation states that these URLs can only be S3 URLs. This causes the stack to fail and roll back with the following error:
I have verified that the stack can create successfully if that template is put in an S3 bucket and the TemplateURL property is replaced with this S3 URL. Here is the quick create link I used to launch the stack so that this issue can be reproduced (just need to replace '<keyname>' with a valid keypair name):
|
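The check described above (every nested stack's TemplateURL must be an S3 URL) can be sketched in Python. This is only an illustration: the stack names come from the comment, but both URLs in the sample template are hypothetical placeholders, and the hostname test is a rough heuristic, not CloudFormation's own validation.

```python
from urllib.parse import urlparse


def looks_like_s3_url(url: str) -> bool:
    """Heuristic: HTTPS URL whose host is an amazonaws.com S3 endpoint."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    labels = host.split(".")
    has_s3_label = any(l == "s3" or l.startswith("s3-") for l in labels)
    return parsed.scheme == "https" and host.endswith(".amazonaws.com") and has_s3_label


def non_s3_template_urls(template: dict) -> dict:
    """Return {logical_id: TemplateURL} for nested stacks not hosted on S3."""
    offenders = {}
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::CloudFormation::Stack":
            continue
        url = resource.get("Properties", {}).get("TemplateURL")
        # Skip intrinsic functions (e.g. Fn::Sub); only plain strings are checked.
        if isinstance(url, str) and not looks_like_s3_url(url):
            offenders[logical_id] = url
    return offenders


# Sample template: the URLs below are made up for illustration.
template = {
    "Resources": {
        "EKSVPCStack": {
            "Type": "AWS::CloudFormation::Stack",
            "Properties": {"TemplateURL": "https://s3.amazonaws.com/example-bucket/vpc.yaml"},
        },
        "EKSLinuxWorkerStack": {
            "Type": "AWS::CloudFormation::Stack",
            "Properties": {"TemplateURL": "https://example-bucket.s3.us-west-2.amazonaws.com/linux.yaml"},
        },
        "EKSWindowsWorkerStack": {
            "Type": "AWS::CloudFormation::Stack",
            "Properties": {"TemplateURL": "https://raw.githubusercontent.com/example/repo/master/windows.yaml"},
        },
    }
}

print(non_s3_template_urls(template))
```

Running a check like this before creating the stack would flag the third nested stack without waiting for a rollback.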
@mike-mosher good call out - this was not right. I just updated the readme instructions to document how to download the YAML and upload to S3 so this works. We’re in the process of getting this into our service S3 buckets to simplify the setup as well. |
Opened #227, as I couldn’t get the Windows example to run and a few kube-system DaemonSets won’t run. |
We’ve added the windows-nodegroup and QuickStart YAML files to our production S3 buckets and updated the readme for provisioning the Windows worker nodes. This simplifies the setup process. |
Are any 1.12 AMIs available for the README? |
@cdenneen we're working on making Windows AMIs for v1.12 available |
I get frequent network-related errors when deploying to EKS Windows nodes. The typical manifestation is that pods cannot access ClusterIP addresses within the cluster, although they can access pod IP addresses. To check, I run nslookup on the Windows pod; when it is faulty, this times out attempting to connect to the core-dns ClusterIP. The workaround is to restart the Windows nodes and the vpc-resource-controller. This may resolve the issue temporarily (a few hours at best). More recently I am finding the resolution lasts only a few minutes. Is anybody else having this problem? |
To expand upon the previous poster, this is my symptom. It is an issue I have seen and cannot overcome; any suggestions would be greatly appreciated. Once a Windows machine has been deployed, let us say I have 5 slots for pods, each with their own IP. After I have reached that deployment limit, I find that I lose the ability to connect to those ClusterIPs, and networking fails for internal communication. I still have external connectivity. This can be demonstrated by deleting pods and waiting for them to be recreated: if I then try to connect, it fails, and no port is available to connect to. The only solution is to redeploy the EC2 instance. Is this a known issue, and is there a better workaround? |
@cmboughey - This does sound similar to the issues I have been facing. What exactly do you mean when you say "Once a Windows machine has been deployed, let us say I have 5 slots for pods, each with their own IP"? I have a script that reboots the EC2 instances, which does make it a bit easier. Still a pain. |
@nigeldecosta - Basically, depending on the size of your compute instance, there is a limit on the number of IP addresses that can be used by the instance. As part of the CNI configuration, AWS uses elastic network interfaces (ENIs) and attaches them to the machine. |
@cmboughey Is this related to the primary + secondary private IPs on the Windows EC2 instances? I am currently using EC2 type m4.16xlarge. Could I expect the limit to be higher on other types? I couldn't see where such limits are listed. If you have a link that would help. Thanks. |
@nigeldecosta
|
Hi all. I've had the same issue that @anjanitsip commented on. I've added the required label to the Windows IIS sample YAML with a random IP from the subnet where the nodes are. I have also restarted the Windows instance, the vpc-resource-controller, and the aws-node DaemonSet. The vpc-* pods and aws-node are up, running, and healthy. All logs look OK, so I don't know where the label or the IP should come from. vpc-resource-* container log:
aws-node container log
vpc-admission* container log
I am more than happy to provide more logs for investigation; just tell me what you need. Update: it does not solve the issue. The windows-server-iis container is in CrashLoopBackOff... Update: the container goes to Update: it looks like the manually attached label for IPv4Address does not affect the container networking. I am investigating as much as I can but am slowly running out of ideas. See error below.
A few logs from the Windows node:
Update: the Flanneld service is missing from the node. The question now is why, and which step failed to install it. Final update: solved. It was mostly user error. I had provisioned the instances in an environment with strict network policies, and a few ports were blocked. |
The reason I ask is that I've got two nodes in my cluster, one on Linux and one on Windows, however the Windows node only seems to be able to run 5 pods at a time (both instances are t3.medium) whereas the Linux node can handle 17. I can see that the Linux node has 3 ENIs and 18 total private IPs, however the Windows node seems to only have a single ENI and a single private IP. Update: @vsiddharth kindly responded via email and confirmed that ENIs are in fact not dynamically allocated for the Windows nodes at the moment. |
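The 17 versus 5 pod counts reported above line up with simple ENI arithmetic. The sketch below assumes the commonly cited Linux formula (ENIs x (IPs per ENI - 1) + 2) and assumes a Windows node uses only the secondary IPs of its single ENI; the t3.medium figures (3 ENIs, 6 IPs per ENI, i.e. 18 IPs total) are taken from the comment, and the function names are made up for illustration.

```python
def max_pods_linux(enis: int, ips_per_eni: int) -> int:
    """Assumed formula: one IP per ENI is reserved for the ENI itself;
    the +2 accounts for host-networking pods such as aws-node and kube-proxy."""
    return enis * (ips_per_eni - 1) + 2


def max_pods_windows(ips_per_eni: int) -> int:
    """Assumption: a single ENI, whose primary IP belongs to the node itself."""
    return ips_per_eni - 1


# t3.medium as described in the thread: 3 ENIs with 6 private IPs each.
print(max_pods_linux(enis=3, ips_per_eni=6))   # 17
print(max_pods_windows(ips_per_eni=6))         # 5
```

Under these assumptions, a larger instance type raises the Windows limit only through its per-ENI IP count, not through its ENI count.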
Is there an ETA on windows nodes for 1.14? |
Submitted pull request #453 - which adds versions of the quickstart shell scripts written in PowerShell - just in case anyone wanted to get going with the preview but didn't have access to a bash shell. |
How did you solve it in the end? Please share. |
IIRC, port 443 was blocked, which caused the malfunction on the Windows worker/pod side. |
Is the EKS Windows preview still actually running/being developed? There doesn't seem to be a huge amount of activity here (other than people raising issues) and today I got an email from AWS saying that 1.11 is going to be deprecated in early November, making it impossible (presumably) to actually run a 1.11 cluster with Windows nodegroups, so not sure what the plan is? |
I thought the public release of Windows EKS was imminent. Not pleased that 1.11 will be deprecated before we get a supported version of Windows EKS. |
How can I get Istio installed? I am getting the error "failed to parse Kubernetes args: pod does not have label vpc.amazonaws.com/PrivateIPv4Address". |
EKS Windows will reach general availability (GA) before 1.11 is no longer supported. This issue will be updated as soon as we make the GA announcement. |
@mikestef9 thanks for that! Where can we find the most up-to-date Windows node AMIs? Should we still use the 1.11 ones? |
Amazon EKS now fully supports adding Windows nodes as worker nodes and scheduling Windows containers. Windows workloads are supported with Amazon EKS clusters running Kubernetes version 1.14 or later. Please note that as of today, at least one Linux worker node is required in the cluster to support Windows node and container networking. We recommend two for high availability. Learn how to configure a cluster for Windows support and add Windows worker nodes in the EKS documentation. |
Interested to hear any feedback so far on EKS Windows support. Feel free to leave comments on this issue. Thanks! |
@mikestef9 my path trying to use Windows worker nodes has been pretty rocky so far... First, I bumped into an issue where I could not get IPs assigned to my pods: eksctl-io/eksctl#1512 After manually creating another routing table and finally overcoming this first issue, I more or less had Windows nodes working, but then Linux pods started being scheduled onto Windows nodes (yes, this is documented, but it's very hard to set node selectors on all my deployments, including "internal" ones such as ingress and other things), so I decided to add a taint and bumped into another (it seems) Manually tainting the node (and also tolerating the taints in Windows pods) seems to be working, but I cannot roll this out to production until everything is fully automated, so I guess I have to wait a little bit until someone at |
@mikestef9 we're using EKS mixed OS clusters in production right now. Aside from some hiccups with the vpc-resource-controller evicting itself after writing 500GB of logs (might not be Windows related, but I am working on pulling logs/opening an AWS support case), my BIGGEST gripe hands down is that none of the tooling that AWS provides surrounding EKS supports mixed OS clusters right now. New node termination handler: aws/aws-node-termination-handler#8 Container Insights: #503 This is certainly more widespread than just AWS, and I understand that some of this can be provided by the OSS community, but I'd really like to see more support surrounding mixed OS EKS, not just spitting out AMIs that interact with the EKS masters. |
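The node selector and taint/toleration approach mentioned above can be sketched as a small Python helper that patches a Deployment manifest held as a dict. This is an illustration only: the `kubernetes.io/os` label key is assumed (older clusters may only carry `beta.kubernetes.io/os`), the taint `os=windows:NoSchedule` is a hypothetical choice, and `pin_to_os` is a made-up name.

```python
import copy
from typing import Optional


def pin_to_os(deployment: dict, os_name: str,
              label_key: str = "kubernetes.io/os",
              taint_key: Optional[str] = None) -> dict:
    """Return a copy of the Deployment with a nodeSelector for the given OS
    and, if taint_key is set, a matching NoSchedule toleration."""
    patched = copy.deepcopy(deployment)
    pod_spec = patched["spec"]["template"]["spec"]
    pod_spec.setdefault("nodeSelector", {})[label_key] = os_name
    if taint_key is not None:
        pod_spec.setdefault("tolerations", []).append({
            "key": taint_key,
            "operator": "Equal",
            "value": os_name,
            "effect": "NoSchedule",
        })
    return patched


# Minimal Deployment skeleton; names and image are placeholders.
iis = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "windows-server-iis"},
    "spec": {"template": {"spec": {"containers": [{"name": "iis", "image": "example/iis"}]}}},
}

# Hypothetical taint applied to Windows nodes: os=windows:NoSchedule
windows_iis = pin_to_os(iis, "windows", taint_key="os")
print(windows_iis["spec"]["template"]["spec"]["nodeSelector"])
```

With the Windows nodes tainted, Linux workloads without the toleration stay off them, so only the Windows deployments need patching.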
@brunojcm I escalated the Windows node group taints to eksctl team and they have created a Pull Request that will land in the next eksctl release. @rparsonsbb thanks for opening those issues. We will continue to work on improving tooling in the Kubernetes Windows ecosystem. |
@mikestef9 thanks for that! I got the notification there, I was wondering if you had helped it to happen anyhow :) |
EKS Windows worker nodes to run Windows containers.
Update – 10/8/2019
Amazon EKS now fully supports Windows containers and Windows worker nodes. #69 (comment)
Get started by looking at the EKS documentation: https://docs.aws.amazon.com/eks/latest/userguide/windows-support.html