
VPC: Default AZs to 3 #996

Closed
rix0rrr opened this issue Oct 23, 2018 · 7 comments · Fixed by #1543
Labels
@aws-cdk/aws-ec2 (Related to Amazon Elastic Compute Cloud) · effort/small (Small work item – less than a day of effort) · feature-request (A feature should be added or improved.)

Comments

rix0rrr (Contributor) commented Oct 23, 2018

Right now, the default for the number of AZs is "as many as you can get", which leads to consistent problems in us-east-1 (because it has so many AZs): I can't start a default VPC because I run out of EIPs.

What's worse, code that runs fine in one region fails to deploy in us-east-1 because of this. You think you've tested something, but then it doesn't work.

Since there are diminishing returns from additional AZs after the third, why not default to 3? (But let users raise the number, obviously, if they really want to.)

Also, we should probably make it possible to define an offset as well, so that you can have two VPCs spread over all 6 AZs (one a-b-c, the other d-e-f).
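
For illustration, here is roughly what that default would look like from the user's side. This is a minimal sketch against the present-day aws-cdk-lib ec2.Vpc API; the construct and property went by slightly different names in the pre-1.0 CDK, and the account/region values are placeholders, so treat the identifiers as assumptions rather than the API under discussion here.

```ts
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'VpcStack', {
  env: { account: '111111111111', region: 'us-east-1' }, // placeholder values
});

// Proposed default: cap the VPC at 3 AZs even in regions that offer 6,
// so NAT gateways don't exhaust the per-account EIP limit.
new ec2.Vpc(stack, 'DefaultVpc');

// Users who really want the extra AZs raise the cap explicitly.
new ec2.Vpc(stack, 'WideVpc', { maxAzs: 6 });
```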

eladb (Contributor) commented Oct 23, 2018

@mindstorms6 @PaulMaddox what do you guys think?

PaulMaddox (Contributor) commented

I agree that there are diminishing returns in spreading architectures across more than 3 AZs, but I don't think your suggestion is the right way of achieving this.

If we split the VPC CIDR into three AZs worth of subnets, what happens later if they want to add more AZs? The IP space will be all allocated, and their options would be super limited and likely result in a very disruptive redeployment of their whole application.

I think it's better to carve the VPC subnets into all available AZs, but maybe have a method that gets public/private subnets for a limited number of AZs, which can be used for deploying applications.

The downside is that NAT gateways would be deployed to all SubnetType.Private subnets (incurring a cost), and I'm open to suggestions on ways to improve that - I just can't think of a nice API experience right now.

Maybe a boolean flag on a subnet configuration for “enabled”? That way subnets could be in a non-active state, reserved for future use.
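
As a rough sketch of how that "enabled" idea could surface in the API: the modern ec2.Vpc exposes a reserved boolean on SubnetConfiguration that behaves roughly this way (address space is set aside, but no subnet or NAT gateway is created). The names below are from today's aws-cdk-lib, not the API being discussed in 2018, so treat them as illustrative.

```ts
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';

const stack = new cdk.Stack(new cdk.App(), 'ReservedDemo');

new ec2.Vpc(stack, 'Vpc', {
  maxAzs: 3,
  subnetConfiguration: [
    { name: 'public', subnetType: ec2.SubnetType.PUBLIC, cidrMask: 24 },
    { name: 'app', subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS, cidrMask: 24 },
    // Space is carved out of the VPC CIDR but nothing is deployed (and no
    // NAT gateway cost is incurred) until `reserved` is later set to false.
    { name: 'future', subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS, cidrMask: 24, reserved: true },
  ],
});
```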

yandy-r commented Oct 23, 2018

Why does all the IP space need to be deployed from the get-go? Why not calculate based on all available AZs, but deploy only to the default of 3? So if us-east-1 has 6 AZs, calculate subnets based on those, then take a subset and deploy only to the 3 default AZs (chosen at random?).

This may require keeping state (not ideal) or checking available space for future deployments, though it may help if, say, someone wants to add another subnet or a set of them later. It would allow the app to scale; right now it's all or nothing.

Right now the default is to create public and private subnets if not specified; correct? What if there's a need to add isolated subnets in the future?

What happens when you deploy to, say, us-west-2 and only six /19 subnets get generated? What happens to the rest of the unused space? Should/does the CDK check for available space? Maybe this can be tracked as a parameter or even a tag.
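
For a sense of scale, here is a back-of-the-envelope sketch of that "six /19s" scenario. The /16 VPC CIDR and the six-subnet split (e.g. 3 AZs x public/private) are assumptions for illustration; the arithmetic, not any API, is the point.

```ts
// A /16 holds 2^(32-16) = 65,536 addresses; each /19 holds 2^(32-19) = 8,192.
const vpcPrefix = 16;
const subnetPrefix = 19;
const subnetCount = 6; // e.g. 3 AZs x (public + private)

const vpcAddresses = 2 ** (32 - vpcPrefix);       // 65,536
const subnetAddresses = 2 ** (32 - subnetPrefix); // 8,192
const unused = vpcAddresses - subnetCount * subnetAddresses;

console.log(`allocated: ${subnetCount * subnetAddresses}, unused: ${unused}`);
// allocated: 49152, unused: 16384 — room for two more /19s, or smaller carve-ups later.
```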

moofish32 (Contributor) commented Oct 23, 2018

To keep it brief: I think defaulting to 3 is a good idea. Now the "how" is interesting.

If you try to, say, spread the subnets across all AZs and pick every nth one, the logic is much more complicated, and the user has to understand more options, because no matter what you do this subnet-allocation algorithm is inherently public.

I personally think that for advanced additions, adding additional VPC associations should be considered. As an enterprise customer we intentionally allocate medium-to-small blocks for IPAM reasons (i.e. you are not going to see a /8 or /16 in any of our accounts unless it's experimentation). The second part is that we (CDK) do not prevent teams from adding subnets to VPCs alongside the current L2 construct, do we? So as long as you don't allocate all of the IP space, you can manually carve up the extra space as you see fit. I think it's fair to expect a user who has saturated their IP space to be knowledgeable enough to implement this, but that's my opinion with zero data to support it.

So all this leads me to ask: what if we kept what is there today, but defaulted maxAZs to 3 unless you specify more? How bad is our L2 subnet-addition experience? I think this is mostly what @rix0rrr was proposing.
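
As a sketch of that "add subnets later by hand" path, this uses the lower-level ec2.Subnet construct from today's aws-cdk-lib; the account, AZ, and CIDR block are assumptions, and you would have to pick a range the VPC construct has not already allocated.

```ts
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';

const stack = new cdk.Stack(new cdk.App(), 'GrowLaterStack', {
  env: { account: '111111111111', region: 'us-east-1' }, // placeholder values
});

// Default VPC CIDR is 10.0.0.0/16; with the default even split across
// maxAzs: 3, part of that space typically remains unallocated.
const vpc = new ec2.Vpc(stack, 'Vpc', { maxAzs: 3 });

// Later, carve an extra subnet out of the untouched space by hand.
new ec2.Subnet(stack, 'ExtraSubnet', {
  vpcId: vpc.vpcId,
  availabilityZone: 'us-east-1d',
  cidrBlock: '10.0.200.0/24', // assumed to still be unallocated in this VPC
});
```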

> The downside is that NAT gateways would be deployed to all SubnetType.Private subnets (incurring a cost), and I'm open to suggestions on ways to improve that - I just can't think of a nice API experience right now.

@PaulMaddox what do you think of the current natGateways: number, natGatewayPlacement: VpcPlacementStrategy? The issue is that if you want to control exactly which subnets use specific gateways, it's not 100% in your control. Again, if you have that kind of concern, I tend to expect you to be a power user who can build a proper L2 from L1s to meet an extremely high-performance use case.

> Also, we should probably make it possible to define an offset as well, so that you can have two VPCs spread over all 6 AZs (one a-b-c, the other d-e-f).

@rix0rrr - is it an offset, or should we just let the user pick a list of AZs instead of auto-selecting? The irony, of course, is that this is only accurate for placement relative to one account. AZ A in account A and AZ A in account B might actually be physically different, so there is no value in cross-account placement here.
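
For reference, a sketch of the two levers being discussed, written against today's aws-cdk-lib names (where NAT gateway placement is controlled via natGatewaySubnets, and an explicit availabilityZones list is the "pick your own AZs" alternative to an offset). Treat it as illustrative, not as the 2018 API; the AZ names and account are placeholders.

```ts
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';

const stack = new cdk.Stack(new cdk.App(), 'NatStack', {
  env: { account: '111111111111', region: 'us-east-1' }, // placeholder values
});

new ec2.Vpc(stack, 'Vpc', {
  // Explicit AZ list instead of an offset: pin the VPC to specific zones.
  availabilityZones: ['us-east-1d', 'us-east-1e', 'us-east-1f'],
  // Cap NAT gateway cost: one gateway shared by every private subnet.
  natGateways: 1,
  natGatewaySubnets: { subnetType: ec2.SubnetType.PUBLIC },
});
```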

rix0rrr (Contributor, Author) commented Oct 24, 2018

Actually, I don't know if offsetting adds any value; it only comes into play if you have 2 VPCs in 1 account, and customers would only care about failure domains, not sharding for capacity.

Although I did once get an EKS error saying that one AZ had no more EKS capacity; I would want recourse when that happens.

moofish32 (Contributor) commented

@rix0rrr - is it worth separating this issue?

rix0rrr (Contributor, Author) commented Oct 26, 2018

Probably

@rix0rrr rix0rrr added feature-request A feature should be added or improved. @aws-cdk/aws-ec2 Related to Amazon Elastic Compute Cloud labels Jan 2, 2019
@rix0rrr rix0rrr added the effort/small Small work item – less than a day of effort label Jan 8, 2019
rix0rrr added a commit that referenced this issue Jan 14, 2019
For most users, default deployments of a VPC in `us-east-1` would fail
because that region has more AZs (6) than the allowed EIP allocations in
an account (5).

This is a very poor experience. Plus, it is wasteful since there is
not much additional value in using more than 3 AZs for a VPC.

Fixes #996.
rix0rrr added a commit that referenced this issue Jan 15, 2019