Optimized security group rules #2118

Closed
kishorj opened this issue Jul 7, 2021 · 12 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@kishorj
Collaborator

kishorj commented Jul 7, 2021

AWS Load balancer controller - optimized SG rules

Overview

Security group handling currently has two limitations:

  • If custom security groups are specified for the ALB, i.e. the controller doesn’t auto-create one for the ingress group, users must manually configure their ENI/node security groups to permit ingress traffic from the load balancer.
  • With auto-generated SGs, the controller creates a unique security group for each LB. The node group SG rules therefore grow linearly with the number of load balancers, so the SG rule quota effectively limits the total number of LBs.

The goal of this feature is to let users specify shared security groups for load balancers and have the controller automatically add rules to the ENI/node group security groups to allow traffic from the load balancer. We will provide a controller flag to specify a default SG to use. With the shared SG approach, the number of SG rules for LB ingress traffic stays constant, so the SG rule quota will no longer be a bottleneck for the number of LBs.

Security groups

We will classify security groups into two categories

  1. Frontend SG
  2. Backend SG

Frontend SG

Frontend security groups control the clients that can access the load balancer. These SGs contain rules from the inbound-cidrs to the listen-ports. The frontend security groups can be configured via the exclusive annotation on the ingress resource -

alb.ingress.kubernetes.io/security-groups

If the annotation is not specified, controller will automatically create one security group per load balancer to allow the traffic from inbound-cidrs to listen-ports.

Backend SG

Backend security groups control the traffic between the load balancer and the EC2 instances or the ENIs. These SGs are attached to the load balancer to tag the LB traffic and are used as the traffic source in the ENI/instance SG rules. Because the backend security groups are shared between multiple load balancers, the controller will ensure that the shared rules do not get deleted prematurely. Backend SGs are configured via the following mechanisms -

  • Default security groups for cluster
    • Specified via controller command line flag or auto-generated based on the flag
  • Fallback to the auto-created frontend SG
    • In case the default backend SG is not specified
    • Auto-generated frontend SG gets used as backend SG
    • This configuration provides backwards compatibility with prior releases

Management of Backend SG rules

When the controller auto-creates the frontend SG for a load balancer, it automatically adds the security group rules to allow egress traffic from the load balancer to the EC2/Fargate instances. If the security group is specified via annotation, the SG rules do not get added by default. The automatic management of instance/ENI security group rules can be controlled via an additional annotation on the ingress resource -

    alb.ingress.kubernetes.io/manage-backend-security-group-rules: true/false

When this annotation is specified, SG rules are automatically managed if the value is true, and not managed if the value is false. This annotation gets ignored in case of auto-generated security groups.

If SG rule management is enabled for an ingress and the frontend security groups are not auto-generated, a default backend security group configuration is required. If no default backend security group configuration is available, the controller will fail to reconcile the ingress/ingress group.
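
The decision described above can be sketched roughly as follows. This is only an illustration of the proposal's rules, not the controller's actual code; the function names and parameters (`should_manage_backend_rules`, `check_reconcilable`) are hypothetical.

```python
from typing import Optional


def should_manage_backend_rules(
    frontend_sg_auto_generated: bool,
    manage_annotation: Optional[bool],
) -> bool:
    """Return True if instance/ENI SG rules would be managed automatically.

    manage_annotation is the parsed value of
    alb.ingress.kubernetes.io/manage-backend-security-group-rules,
    or None when the annotation is absent.
    """
    if frontend_sg_auto_generated:
        # The annotation is ignored for auto-generated security groups.
        return True
    # With user-specified SGs, rules are not managed unless requested.
    return manage_annotation is True


def check_reconcilable(
    frontend_sg_auto_generated: bool,
    manage_annotation: Optional[bool],
    default_backend_sg_available: bool,
) -> None:
    """Raise ValueError when rule management needs a backend SG that is missing."""
    manage = should_manage_backend_rules(frontend_sg_auto_generated, manage_annotation)
    if manage and not frontend_sg_auto_generated and not default_backend_sg_available:
        raise ValueError(
            "SG rule management requires a default backend security group"
        )
```

In this sketch, an ingress with custom frontend SGs and the annotation set to true fails reconciliation only when no default backend SG is configured.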

Port range restriction for Backend SG

The default behavior is to add backend SG based rules for ALL traffic, since the frontend SGs already restrict the clients and the listen ports. The traffic from the load balancer will go to the target ports or health check ports. Customers have expressed concerns about allowing ALL traffic since their security scanning tools flag the “wide open” rule [2]. To further restrict the SG rules, we will provide an additional option to configure the port ranges instead of allowing everything. For example, if port range 30000 - 32767 is configured, the SG rules for allowing TCP/UDP traffic are as follows -

from backendSG, TCP, port range 30000 - 32767
from backendSG, UDP, port range 30000 - 32767

The UDP rule is added if required for NLB. In case of ALB, SG rules are restricted to TCP.
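
A minimal sketch of how such rules could be derived, assuming a hypothetical `Rule` tuple and a `backend_rules` helper (neither is part of the controller). The unrestricted default is modelled here as a single "ALL" rule over ports 0 - 65535, which is an assumption about representation only.

```python
from typing import List, NamedTuple, Optional, Tuple


class Rule(NamedTuple):
    source: str      # security group used as the traffic source
    protocol: str    # "TCP", "UDP" or "ALL"
    from_port: int
    to_port: int


def backend_rules(
    backend_sg: str,
    port_range: Optional[Tuple[int, int]],
    needs_udp: bool,
) -> List[Rule]:
    """Build ingress rules that allow traffic tagged with the backend SG."""
    if port_range is None:
        # Default behavior: allow ALL traffic from the backend SG.
        return [Rule(backend_sg, "ALL", 0, 65535)]
    low, high = port_range
    rules = [Rule(backend_sg, "TCP", low, high)]
    if needs_udp:
        # Only required for NLB; ALB rules stay restricted to TCP.
        rules.append(Rule(backend_sg, "UDP", low, high))
    return rules
```

Calling `backend_rules("backendSG", (30000, 32767), needs_udp=True)` yields the two rules listed above; with `needs_udp=False` only the TCP rule is produced.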

Controller flags

Default security group

We will provide additional controller flags to configure the default backend security groups for the cluster. These security groups will be used for ingresses/ingress groups configured for management of SG rules.

  • --enable-backend-security-group
    • type boolean, default true
    • if true, enable default security group to use as backend SG
  • --backend-security-group
    • type string, default empty
    • if empty, auto-generate a security group with the following name and tags -
      • name: k8s-<cluster_name>-traffic-<hash of vpc, cluster name>
      • tags: elbv2.k8s.aws/cluster: <cluster_name>, elbv2.k8s.aws/type: backend
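
The flag semantics above could be resolved along these lines. This is a sketch under stated assumptions: the proposal does not specify the hash function or its length, so the SHA-256 prefix used here is purely hypothetical, as are the function name and the `("existing" | "auto", value)` return shape.

```python
import hashlib
from typing import Optional, Tuple


def resolve_backend_sg(
    enable_backend_sg: bool,
    backend_sg: str,
    cluster_name: str,
    vpc_id: str,
) -> Optional[Tuple[str, str]]:
    """Resolve the two flags into a backend SG choice.

    Returns None when the feature is disabled, ("existing", <sg>) when
    --backend-security-group is set, or ("auto", <generated name>) otherwise.
    """
    if not enable_backend_sg:
        return None
    if backend_sg:
        return ("existing", backend_sg)
    # Hypothetical hash scheme: the real hash of (vpc, cluster name) is not
    # defined in the proposal.
    digest = hashlib.sha256(f"{vpc_id}/{cluster_name}".encode()).hexdigest()[:10]
    return ("auto", f"k8s-{cluster_name}-traffic-{digest}")
```

The generated name is deterministic for a given VPC and cluster name, so repeated reconciliations would refer to the same security group.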

Alternative flags for default security group

Use a single command line flag to configure the default backend security groups -
--default-backend-security-groups

  • If empty, auto-generate the backend security group
  • If the value is disabled, do not configure a backend SG
  • Otherwise, use the list of security groups specified in the flag

A single command line flag, but hacky.

Port range restriction

The targetgroup binding model for ingress doesn’t specify a port restriction. As a result the backend SG rules have the port range 0 - 65535. This apparently causes security scanners to flag the rule as insecure [2]. We want to restrict the backend SGs to specific port ranges as discussed in section Port range restriction for Backend SG.

Dynamic port range

Dynamically adjust the port range based on the minimum and maximum values of the target and health check ports seen by the controller. This makes the feature work without any input from the end user. The cons are a more complex implementation and no control for end users.

TGB model changes
The target group model builder generates networking rules with specific ports instead of the current nil ports. If the health check ports are different from the traffic ports, additional rules get added to the TGB spec. For IP targets the targetPort is used; for instance targets the node port is used. For example, if there are two TGBs targeting node ports 31223 and 32331 and the backend SG is sg-backend, the consolidated networking rules are as follows

from sg-backend, TCP, port 31223
from sg-backend, TCP, port 32331

Networking manager
The networking manager will take the consolidated rules as input and calculate the optimized list of rules using port ranges. The rules are grouped by protocol and source. Then, for each matching source and protocol, it calculates the min and max ports and uses a single port range based rule. In the above example, the optimized rule is as follows

from sg-backend, TCP, port range 31223 - 32331

In case a TGB rule doesn’t have the port specified, use 0 as the min and 65535 as the max for the affected (source, protocol).
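
The consolidation step described above can be sketched as follows. The types `PortRule` and `RangeRule` and the function `optimize_rules` are illustrative names, not the controller's API.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Tuple


class PortRule(NamedTuple):
    source: str
    protocol: str
    port: Optional[int]  # None means the TGB rule has no port specified


class RangeRule(NamedTuple):
    source: str
    protocol: str
    from_port: int
    to_port: int


def optimize_rules(rules: List[PortRule]) -> List[RangeRule]:
    """Collapse per-port rules into one port-range rule per (source, protocol)."""
    grouped: Dict[Tuple[str, str], List[Optional[int]]] = defaultdict(list)
    for rule in rules:
        grouped[(rule.source, rule.protocol)].append(rule.port)

    optimized = []
    for (source, protocol), ports in sorted(grouped.items()):
        if any(port is None for port in ports):
            # A rule without a port widens the whole group to all ports.
            low, high = 0, 65535
        else:
            low, high = min(ports), max(ports)
        optimized.append(RangeRule(source, protocol, low, high))
    return optimized
```

For the two TGB rules in the example, `optimize_rules` returns a single `RangeRule("sg-backend", "TCP", 31223, 32331)`. Note that the resulting range also admits ports between the min and max that no target uses, which is the trade-off this approach accepts to keep the rule count low.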

Command line flags
Provide a command line flag --disable-restricted-sg-rules. If it is set to true, the controller reverts to the existing behavior of using unrestricted SG rules. The default value for this flag is false.

Alternative solution: Command line flag

Allow customer to restrict the port ranges for traffic tagged with the backend SGs. The port ranges can be specified via the command line flag --backend-traffic-port-ranges. The flag is of type string list and the default value is empty. Each port range can either be an individual port value or port range start - end. The values must be in the range [0 - 65535].

Pros: simpler implementation
Cons: customers either need to allow a wider range of ports, or readjust the port ranges as required.
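
As an illustration of the value format proposed for --backend-traffic-port-ranges, the parsing and validation could look roughly like this. The function name `parse_port_ranges` is hypothetical, and the exact syntax the controller would accept (for example, whether spaces around the dash are allowed) is an assumption.

```python
from typing import List, Tuple


def parse_port_ranges(values: List[str]) -> List[Tuple[int, int]]:
    """Parse entries such as "80" or "30000-32767" into (start, end) pairs."""
    ranges = []
    for value in values:
        start_text, separator, end_text = value.partition("-")
        start = int(start_text.strip())
        # A single port is treated as a range of length one.
        end = int(end_text.strip()) if separator else start
        if not (0 <= start <= end <= 65535):
            raise ValueError(f"invalid port range: {value!r}")
        ranges.append((start, end))
    return ranges
```

Non-numeric input also raises `ValueError` (from `int`), so callers can handle both failure modes the same way.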

Backwards compatibility

The default configuration will behave differently than the prior versions in terms of the security group rules. The existing ENI/Instance security group will get modified to add the new backend SG rules, and the existing SG rules referring to the frontend SG will get deleted.

If customers need backward compatibility with the existing release, we will provide a controller flag to disable the default security group feature.

Constant number of SG rules

With the default configuration, we allow traffic from the configured load balancers using a single additional rule per SG. Even with multiple backend security groups configured, the number of SG rules will still be independent of the total number of load balancers configured on the k8s cluster.
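
The difference in rule growth can be made concrete with a small calculation. The helper names and the quota value used in the usage note are hypothetical; the actual SG rule quota depends on the AWS account settings.

```python
def rules_with_per_lb_sgs(num_load_balancers: int, rules_per_lb: int = 1) -> int:
    """Node SG rules when every load balancer has its own source SG."""
    return num_load_balancers * rules_per_lb


def rules_with_shared_backend_sgs(num_backend_sgs: int, rules_per_sg: int = 1) -> int:
    """Node SG rules when load balancers share backend SGs.

    The result does not depend on the number of load balancers.
    """
    return num_backend_sgs * rules_per_sg


def max_load_balancers_per_lb_sgs(rule_quota: int, rules_per_lb: int = 1) -> int:
    """Upper bound on LBs imposed by the rule quota in the per-LB SG model."""
    return rule_quota // rules_per_lb
```

For instance, with an assumed quota of 50 rules and one rule per LB, the per-LB model caps out at 50 load balancers, while a single shared backend SG needs one rule whether the cluster has 5 or 500 load balancers.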

Extension to NLB

When the security group feature becomes available for NLBs, we will extend the same behavior to the NLBs provisioned by the controller. The following annotations will be used on the service resource to configure the security groups -

  • Frontend SG

    • Auto generate if not specified
    • service.beta.kubernetes.io/aws-load-balancer-security-groups
  • Backend SG

    • service.beta.kubernetes.io/aws-load-balancer-manage-backend-security-group-rules

References

[1] #1791
[2] #1993

@kishorj kishorj added kind/feature Categorizes issue or PR as related to a new feature. kind/design Categorizes issue or PR as related to design. labels Jul 7, 2021
@kishorj kishorj self-assigned this Jul 7, 2021
@rifelpet
Contributor

With the new tagging support for Security Group Rules it would be nice if managed backend security group rules were tagged with their k8s service

@M00nF1sh
Collaborator

M00nF1sh commented Jul 15, 2021

@rifelpet
The current design above proposes to use a single dedicated managed backend security group for all NLBs/ALBs in the cluster, so that we can work around the rule limit on the worker nodes. With this approach, this single backend SG is responsible for multiple k8s services. Do you think tagging the k8s service name is still needed?

BTW, the new tagging & dedicated SG rule management API will indeed help us simplify our implementation, especially when garbage-collecting rules created by the controller, since we can now query rules by tags (e.g. the cluster name tag).

@rifelpet
Contributor

Ah, I was under the impression that the backend security group would have a rule per k8s Service but the proposal above suggests a single rule allowing a port range of traffic, so in this case a service tag on the SG rules wouldn't be applicable.

@MadhavJivrajani

/remove-kind design
/kind feature
kind/design is migrated to kind/feature, see kubernetes/community#6144 for more details

@k8s-ci-robot k8s-ci-robot removed the kind/design Categorizes issue or PR as related to design. label Oct 11, 2021
@johngmyers
Contributor

Is there an issue tracking the extension to NLB?

@kishorj
Collaborator Author

kishorj commented Oct 20, 2021

Is there an issue tracking the extension to NLB?

NLB currently doesn't support security groups. If NLB supports SGs in the future, we will extend the same behavior.

@nicolasappdirect

How do you manage security groups on TCP LBs managed by kube services if NLB doesn't support security groups? Is there a plan for NLB to support security groups? I find it very surprising and odd that ELB was deprecated when there is no alternative for managing security groups on NLB.

@kishorj
Collaborator Author

kishorj commented Oct 26, 2021

@nicolasappdirect, the NLB support for security groups is outside the scope of the controller - controller will support NLB with SG once AWS NLB supports it. You can contact AWS support to raise your concerns.

As for the TCP LB management, our current approach is to add the relevant IP CIDRs and destination port based rule to the worker node security groups.

@kishorj
Collaborator Author

kishorj commented Oct 26, 2021

Closing this issue, since we released v2.3.0 with the support for optimized security groups.

@Lujade

Lujade commented Feb 27, 2023

When using the annotation:
alb.ingress.kubernetes.io/inbound-cidrs: x.x.x.x/x

The Load Balancer Controller will always create 2 security groups. 1 for the above annotation, and 1 for the shared backend.

Setting alb.ingress.kubernetes.io/manage-backend-security-group-rules: "false" does not remove the shared backend SG, and using alb.ingress.kubernetes.io/security-groups: xxx overwrites the inbound-cidrs annotation.

My point is that when using the inbound-cidrs annotation, there is no way to prevent the controller from creating the shared backend SG.

Can this be changed? Or am I missing something?

@TheBaus

TheBaus commented Jun 19, 2024

@kishorj Now that AWS have enabled SG's for NLB's, is there a roadmap item to address the NLB's?

@TheBaus

TheBaus commented Nov 15, 2024

@kishorj Now that AWS have enabled SG's for NLB's, is there a roadmap item to address the NLB's?

@M00nF1sh can you perhaps provide some guidance here?
