Add a configuration knob to allow Pods to use different VPC subnets #119

Closed
liwenwu-amazon opened this issue Jun 27, 2018 · 4 comments
@liwenwu-amazon (Contributor)
Today, ipamD (design) uses the primary ENI's subnet and security groups when allocating new ENIs. This means Pods running on a node use the same subnet and security groups as the node's primary ENI.

Here are a few use cases that require Pods to use different VPC subnets than the subnet used by the node's primary ENI:

  • There are limited IP addresses available in the subnet used by the node's primary ENI. This limits the number of Pods that can be created in the cluster.

xdrus commented Jun 28, 2018

Our use case is a multi-"zone" network where we run different types of workloads in specific subnets. Simplified example: publicly facing pods shouldn't be in a network with access to an internal DB. Right now we have to run an ASG in every one of our zones. Having the ability to attach ENIs from different subnets and use annotations to map pods to specific subnets could drastically increase density and node utilization.
Moreover, from a security perspective, it would be great to separate Pod subnets from the instance's network (i.e., not use the primary ENI for pods).

@liwenwu-amazon (Contributor, Author)

@xdrus Thank you for sharing your use case. I have a few questions:

  • When you mention drastically increasing density and node utilization, do you envision Pods of different "zones" running on a single node?
    • If this is the case, do you worry about security isolation between Pods from different "zones"?
  • How do you make kube-scheduler schedule Pods onto nodes where there are enough IP addresses for the Pod's subnet?


xdrus commented Jun 29, 2018

@liwenwu-amazon

  1. Yes, our security team is OK with running pods from different zones on one instance. Container isolation + host hardening + security enforced at both the host and Kubernetes level makes them happy :) We actually use this configuration in on-prem clusters with Calico.
  2. With Calico we use IP pools for different zones and annotations to assign pods to a specific pool. Then we use Calico's BGP to propagate our cluster's IP pools to the corporate network, and we manage inter-zone access on the corporate firewall plus Calico policies at the cluster level. (A minimal sketch of this setup follows after the config example below.)
    One possible approach for the AWS VPC CNI plugin might be to use native tags to select subnets and a pod-level annotation to map pods. The Lyft plugin uses tags to choose subnets, but it doesn't allow choosing a subnet per pod.
    So cni.conf could look like this (for our use case):
{
    "cniVersion": "0.3.1",
    "name": "amazon-vpc-cni-k8s",
    "plugins": [
        {
            "cniVersion": "0.3.1",
            "type": "amazon-vpc-cni-k8s-ipam",
            "interfaceIndex": 1,
            "subnetTags": {
                "zone": "external",
                "kubernetes.io/cluster/ClusterName": ""
            },
            "secGroupIds": [
                "sg-1234"
            ]
        },
        {
            "cniVersion": "0.3.1",
            "type": "amazon-vpc-cni-k8s-ipam",
            "interfaceIndex": 2,
            "subnetTags": {
                "zone": "internal",
                "kubernetes.io/cluster/ClusterName": ""
            },
            "secGroupIds": [
                "sg-5678"
            ]
        }
        ...
    ]
}

and pod spec:

annotations:
    "vpc-cni-k8s.amazonaws.com/subnet_selector": "zone in (internal)"

(e.g. using the Kubernetes label selector syntax).
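For concreteness, here is a minimal sketch of the Calico setup described in point 2 above. The pool name, CIDR, and pod are made up for illustration; the IPPool resource and the cni.projectcalico.org/ipv4pools annotation are standard Calico constructs:

# Calico IPPool for the "internal" zone (name and CIDR are illustrative)
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: zone-internal
spec:
  cidr: 10.1.0.0/16
---
# Pod pinned to that pool via the Calico annotation
apiVersion: v1
kind: Pod
metadata:
  name: internal-app
  annotations:
    cni.projectcalico.org/ipv4pools: '["zone-internal"]'
spec:
  containers:
    - name: app
      image: nginx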

That said, I don't have enough knowledge of how to solve the scheduling issue. It is not a problem with Calico, as it is not constrained by the number of IPs per node. I wish AWS supported up to 100 IPs per ENI; then it wouldn't be an issue (as the max number of pods per node is 100 anyway). Of course, in that case we would need a way to control how many secondary IPs per interface are pre-allocated.
Probably extended node resources and resource requests instead of the annotations above could do the trick, but I haven't played with them yet.
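To illustrate the idea: a rough sketch, assuming the CNI plugin (or a small node agent) advertised the number of free IPs per subnet as an extended resource in each node's status.capacity. The resource name vpc.amazonaws.com/subnet-internal-ip is purely hypothetical:

# Hypothetical: each node advertises e.g. "vpc.amazonaws.com/subnet-internal-ip: 10"
# in status.capacity, so the scheduler only places this pod on nodes that still
# have free IPs in the "internal" subnet.
apiVersion: v1
kind: Pod
metadata:
  name: internal-app
spec:
  containers:
    - name: app
      image: nginx
      resources:
        limits:
          vpc.amazonaws.com/subnet-internal-ip: 1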

@liwenwu-amazon (Contributor, Author)

This is a duplicate of #131.
