[EKS] [IPv6 on instance TGs]: Add IPv6 support to instance-type target groups for EKS support #1653
See also case #9628572941.
My understanding was that IP targeting mode required that you knew the IPs you would be targeting. Since the cluster is autoscaled and my service's target is running on all nodes as a daemonset, couldn't that list change?
That's the job of the AWS LB Controller. It watches Service endpoints in the cluster and automatically updates ALB/NLB target groups with the latest list of pod IP addresses.
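For illustration only, here is a minimal Python (boto3) sketch of the kind of reconciliation the controller automates. The target group ARN, pod IPs, and port below are hypothetical placeholders, and the real controller does considerably more (watching endpoints continuously, readiness gates, draining, and so on).

    # Illustrative sketch only: the AWS Load Balancer Controller performs this kind of
    # reconciliation automatically by watching Service endpoints in the cluster.
    # TARGET_GROUP_ARN and the pod endpoints are placeholders, not values from this thread.
    import boto3

    elbv2 = boto3.client("elbv2", region_name="us-east-1")

    TARGET_GROUP_ARN = "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/example/abc123"
    pod_endpoints = [("2600:1f18::1a", 5000), ("2600:1f18::1c", 5000)]  # hypothetical pod IPs/ports

    def reconcile_targets(target_group_arn, endpoints):
        """Register the current set of pod IPs and remove stale ones."""
        current = elbv2.describe_target_health(TargetGroupArn=target_group_arn)
        registered = {(t["Target"]["Id"], t["Target"]["Port"])
                      for t in current["TargetHealthDescriptions"]}
        desired = set(endpoints)

        to_add = [{"Id": ip, "Port": port} for ip, port in desired - registered]
        to_remove = [{"Id": ip, "Port": port} for ip, port in registered - desired]

        if to_add:
            elbv2.register_targets(TargetGroupArn=target_group_arn, Targets=to_add)
        if to_remove:
            elbv2.deregister_targets(TargetGroupArn=target_group_arn, Targets=to_remove)

    reconcile_targets(TARGET_GROUP_ARN, pod_endpoints)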
OK, I think I have this design working, at least on my existing IPv4 setup. I'll need to rebuild to try it on an IPv6-only cluster. One question, though, since this process involved quite a bit more than my original design: what kinds of benefits does it provide over a simpler one that just uses an instance-based TG (with IPv6 support) to connect an NLB to the cluster's autoscaling group? Thanks
An instance-mode load balancer can potentially go through an additional instance hop before the traffic gets to the pod, which adds latency compared to the case where the load balancer can send traffic directly to the pods. Direct delivery is possible because the VPC CNI assigns pods VPC IP addresses, so an ALB/NLB can send traffic straight to the pods, skipping NodePorts and kube-proxy.
That makes sense. This is sounding more like an ELB feature request. Should I switch it over to their queue instead?
@mikestef9 don't forget that with instance mode you need the extra complexity of external traffic policy set to Local if you have compliance policies for public traffic. @eshicks4 assuming you're using the AWS Load Balancer Controller (which you should be, as the in-tree controller is deprecated), either an ALB IP-backed ingress or an NLB IP-backed ingress controller Service is the simplest solution.
@stevehipwell while that may be the simplest solution that currently works for IPv6, I'm not sure I'd call it the simplest overall. Envoy runs as a daemonset in Project Contour's design, so it's going to route to all available nodes anyway. An NLB configured to route to a static NodePort and auto-updated by the autoscaler doesn't require an ELB controller deployment or any of the IAM role setup that goes along with it. It works perfectly with IPv4, so once IPv6 support is added to instance-based target groups, the only real benefit the controller has for us is the direct IP routing that bypasses kube-proxy.
@eshicks4 I'd suggest switching Contour to use Deployments for Envoy and the nlb-ip Service annotations; this will allow you to use IPv6 and have HA ingress (see pod readiness gates). I'm sure there are some limited cases where the daemonset and instance mode are better, but I can't think of many where the pros outweigh the cons. Obviously you might have some of these, so this is just a friendly suggestion.
@stevehipwell Just the reduction in complexity, really (fewer moving parts to break, etc.). There may be other reasons but, in our case, Kubernetes is still pretty new and we just have more people familiar with AWS. That said, I have it all working and documented with the in-cluster ELB controller, and there's no real reason to switch back since there are benefits to using it. That's why I'm thinking it's best to move this over to the ELB team's feature request bucket instead.
Hello, we're more or less in the same situation as @eshicks4. We'd like to attach our ingress nodes' autoscaling group to our load balancer target group. The reason we'd like to do so is the same: reduction of complexity (no need to deploy the load balancer controller, one less piece that could break, etc.). Another reason is that, currently, the load balancer controller doesn't handle the case where two clusters are behind the same target group, which is how we do some blue/green upgrades. Also, to avoid the additional hop when using instance-type target groups and node ports, we deploy our ingress controllers using host ports.
I've been struggling to get the suggested alternative (IP-based) solution working with IPv6. Are there guidelines anywhere for troubleshooting this? Logs from the service description:
Name: <name>
Namespace: <namespace>
Labels: app.kubernetes.io/instance=<namespace>
Annotations: service.beta.kubernetes.io/aws-load-balancer-ip-address-type: dualstack
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-subnets: subnet-<id>, subnet-<id>
Selector: <selector>
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv6
IP: <address>:e8fb::5a07
IPs: <address>:e8fb::5a07
LoadBalancer Ingress: <redacted>.elb.us-east-1.amazonaws.com
Port: <unset> 80/TCP
TargetPort: 5000/TCP
NodePort: <unset> 30538/TCP
Endpoints: [<address>:bf0::1a]:5000,[<address>:bf0::1c]:5000
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfullyReconciled 26m (x3 over 72m) service Successfully reconciled

Edit: Turns out this was an issue with my app and what it was listening on. Traffic routed by kube-proxy and port-forward works fine when bound to localhost; when it comes directly to the pod using the IP method it does not, which should probably be obvious. I'll leave this here in case anyone else has the same issue and has their server binding to
I ran into this a few times too. The pods run IPv6-only, so there is no 127.0.0.1 or 0.0.0.0 to bind to. Sometimes localhost works (it depends on the container's /etc/hosts file), but it's generally been safer, or even necessary, to override app defaults and force the app to bind its listeners to [::1] or [::] instead.
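To illustrate the binding difference, here is a minimal Python sketch; port 5000 and the handler are arbitrary placeholders, not taken from any app in this thread. A listener bound only to loopback is not reachable on the pod's own IPv6 address, which is where an IP-mode target group delivers traffic; binding to the IPv6 wildcard [::] makes it reachable.

    # Minimal illustration of the binding advice above (port 5000 is an arbitrary example).
    # A listener bound only to loopback is unreachable on the pod's own IPv6 address,
    # which is where an IP-mode target group sends traffic; binding to the IPv6
    # wildcard "::" accepts connections on every address, including the pod IP.
    from http.server import HTTPServer, SimpleHTTPRequestHandler
    import socket

    class HTTPServerV6(HTTPServer):
        address_family = socket.AF_INET6  # listen on an IPv6 socket

    # Loopback-only: unreachable from an NLB/ALB IP target.
    # server = HTTPServerV6(("::1", 5000), SimpleHTTPRequestHandler)

    # Wildcard IPv6: reachable on the pod's own IPv6 address.
    server = HTTPServerV6(("::", 5000), SimpleHTTPRequestHandler)
    server.serve_forever()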
I really need IPv6 ALBs with IPv6 instance target groups for IPv6-native subnets. The feature being discussed here requires that to be implemented first. Is there a better place to express interest in this feature outside the 'container' roadmap?
Any update on this feature? :)
o/ Waving from the void on this :)
This is still dependent on ALB/NLB first adding support for instance IPv6 target groups (which is coming later this year). When that happens, we can add support in the controller.
This feature has been implemented and is now available on AWS.
My ASG refuses to add worker nodes to the target group. Did you get this working? Please provide a link to the merged PR that backs your claim; otherwise you are being misleading.
With the recent launch of support in ELB to register instances using IPv6 addresses ([1]), you can use the AWS Load Balancer (LB) Controller to create ALBs/NLBs with instance target type for IPv6. We recommend using AWS LB Controller v2.5.1+ to get started.
That doesn't really meet my use case. We use a single ingress controller of type NodePort, and one internal and one external NLB. I just wrote a Lambda to enable the primary IPv6 IP as each instance comes up. Ideally the node group would have that as a feature that can be turned on, as it isn't something that can be specified in a custom launch template due to not having the CIDR yet.
@sjastis Otherwise the instances are never added to the target group due to not having a primary IPv6 IP.
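For anyone wondering what such a Lambda can look like, below is a rough Python (boto3) sketch, not the poster's actual code. It assumes the function is triggered by an EventBridge rule on EC2 instance state-change events, and that the boto3 version in use exposes the EnablePrimaryIpv6 parameter of ModifyNetworkInterfaceAttribute.

    # Rough sketch (not the poster's actual Lambda): enable a primary IPv6 address on the
    # primary ENI of a newly launched instance. Assumes the event carries the instance ID
    # (e.g. an EventBridge "EC2 Instance State-change Notification" for state "running")
    # and a boto3 version recent enough to support the EnablePrimaryIpv6 parameter.
    import boto3

    ec2 = boto3.client("ec2")

    def handler(event, context):
        instance_id = event["detail"]["instance-id"]

        reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
        instance = reservations[0]["Instances"][0]

        # Find the primary network interface (device index 0).
        primary_eni = next(
            eni for eni in instance["NetworkInterfaces"]
            if eni["Attachment"]["DeviceIndex"] == 0
        )

        # Mark one of the ENI's already-assigned IPv6 addresses as primary so the
        # instance can be registered into an IPv6 instance-type target group.
        ec2.modify_network_interface_attribute(
            NetworkInterfaceId=primary_eni["NetworkInterfaceId"],
            EnablePrimaryIpv6=True,
        )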
I confirm that there's a gap here: EKS nodes cannot be registered to these new IPv6 instance-type target groups due to the missing primary IPv6 address. I tried setting
Can't really set it on the launch template because it would require that the node IPv6 CIDR is already known, as it uses the first one in the block.
I'm not sure I understand: in the launch template I was able to set
The CIDR ranges are automatically assigned to your nodes; they are not known in advance. When I tried to go in and manually create a launch template with the primary IPv6 IP option set to true, it wouldn't let me unless I specified the IPv6 CIDR in advance.
@matthenry87 I tried to modify the launch template generated by EKS to set
EDIT: I also had to set
@yann-soubeyrand, thanks for confirming and glad it works for you.
@oliviassss don't get me wrong, what I did was a hacky test; I think we must never touch the launch template generated by EKS. However, the path to a fully working solution doesn't seem so hard at first sight: either EKS should set
@yann-soubeyrand @oliviassss What you're proposing is not possible to do in an automated way when new nodes are brought up in the context of an autoscaling group. The IPv6 CIDRs are assigned automatically/dynamically, so it's not possible to flip enablePrimaryIpV6 to true in the launch template. We want the autoscaling group to behave the same way it does with IPv4. That means that when the ASG lifecycle hook notifies the EKS service that a new node has launched, the EKS service should grab the instanceId, grab its networkAdapterId, and then update the network adapter to have a primary IPv6 address selected from its already-assigned IPv6 CIDR range. Why make users manually edit the network adapter of every new node after it's been automatically assigned its CIDR range?
@yann-soubeyrand Er, I now understand you were able to get it working, but I don't want to use a custom launch template.
@yann-soubeyrand, just for my own understanding, by "launch template" do you mean the EKS template or the CFN template? I think the
@oliviassss we don’t use CloudFormation, we use Terraform. Here is what I did:
From an external point of view, it seems that the best solution would be for EKS to set
In the meantime, I didn't find a solution to make everything work using Terraform, and I consider that I must not edit the EKS-generated launch template.
@yann-soubeyrand, thanks for the details. There's an open issue in the Terraform AWS provider for this missing field: hashicorp/terraform-provider-aws#33733
If you want to use a custom launch template for a managed node group and you want to use Terraform to create said launch template, you'll need the feature requested by that open issue. I agree with @yann-soubeyrand that Managed Node Groups' creation of default launch templates for IPv6 clusters needs improvement. I have not yet looked into Karpenter's support for primary IPv6 addresses.
@oliviassss in addition to what @johngmyers said, even with the above Terraform issue fixed, there's still the issue that EKS doesn't take the
Agreed, @johngmyers. Thanks for the feedback, @yann-soubeyrand. We are tracking this as an enhancement for Managed Node Group and Karpenter.
Hi @sjastis, do you have any public pointer (like a GitHub issue) where we can track the progress?
Hi @sjastis, do you have news and/or pointers we can follow?
@oliviassss We faced the same issue, and currently we have to perform post-creation actions manually on the launch templates for EKS Auto Scaling groups so that the primary IPv6 flag is set for instances.
Hi, could you please share details about the difficulty of fixing this issue? It could help us be more understanding, because, from a user's point of view, it seems to be "just" a matter of taking into account the value of
Hello, I see that this issue has been moved to shipped in the roadmap; what does that mean? I tried again setting
Checking with our managed node groups team again on this one. Do you know if the same issue exists in Karpenter?
I've not tested Karpenter, unfortunately.
We've merged the change to keep PrimaryIpv6 in the launch template used by MNG. This should be rolled out globally within two weeks.
I confirm that attaching an IPv6 instance-type target group to the autoscaling group of an EKS managed node group now works as expected. As we say in French, « Mieux vaut tard que jamais ! » ("Better late than never!"). Thanks!
Good to hear. We will leave this issue open as we consider whether to make that the default behavior, so that setting it in the launch template is not necessary.
Community Note
Tell us about your request
Please add IPv6 support to instance-type target groups so that we can use EKS cluster autoscaling groups with ALBs/NLBs.
Which service(s) is this request for?
EKS, ELBs
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
EKS creates an autoscaling group to which we can attach target groups; however, the new IPv6-based clusters don't bind NodePorts to the EC2 nodes' IPv4 addresses. We have dual-stack ELBs and IPv6-enabled EKS clusters, but we seem to be missing that connecting piece in between.
Are you currently working around this issue?
We aren't. We're currently stuck with IPv4 clusters.
Additional context
An alternative could be to make EKS clusters dual-stack so we can use the ipFamilies & ipFamilyPolicy features. IPv6-only would be the default to avoid IP exhaustion but we could selectively bind IPv4 IPs as-needed.
Attachments
N/A