Zone redundant NAT Gateways #331
Conversation
/assign
/test
The validation part is now also done.
After an intensive pair review of this on the functional and code level, I have only minor comments left. After checking and addressing them where necessary, I think we are finally good with this PR.
func ValidateInfrastructureConfigAgainstCloudProfile(oldInfra, infra *apisazure.InfrastructureConfig, shootRegion string, cloudProfile *gardencorev1beta1.CloudProfile, fld *field.Path) field.ErrorList {
	allErrs := field.ErrorList{}

	if len(infra.Networks.Zones) == 0 {
minor hint: You could use your helper IsUsingSingleSubnetLayout() here ;)
	}

	allErrs = append(allErrs, validateVnetConfig(&config, infra.ResourceGroup, workerCIDR, nodes, pods, services, zonesPath, vNetPath)...)
Can you move this before the if else block? Then we could eliminate the if else by returning after the helper.IsUsingSingleSubnetLayout if block is completed.
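For illustration, a minimal sketch of the restructuring suggested here, assuming the surrounding variables (config, workerCIDR, nodes, pods, services, zonesPath, vNetPath) from the quoted diff and that helper.IsUsingSingleSubnetLayout reports whether no zones are configured; the zone-specific validation that would follow is only indicated by a comment:

	// Sketch of the suggested restructuring, not the final code in this PR:
	// run the vnet validation first, then return early for the single-subnet layout.
	allErrs = append(allErrs, validateVnetConfig(&config, infra.ResourceGroup, workerCIDR, nodes, pods, services, zonesPath, vNetPath)...)

	if helper.IsUsingSingleSubnetLayout(infra) {
		// Nothing zone-specific is left to validate, so the else branch disappears.
		return allErrs
	}

	// ... zone-specific validation for the multi-subnet layout continues here ...
	return allErrs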
@kon-angelo You need to rebase this pull request onto the latest master branch. Please check.
…der-azure into multi-zone-nat6
…der-azure into multi-zone-nat6
/test
Testrun: e2e-rz54w
+---------------------+---------------------+--------+----------+
|        NAME         |        STEP         | PHASE  | DURATION |
+---------------------+---------------------+--------+----------+
| infrastructure-test | infrastructure-test | Failed |  22m25s  |
+---------------------+---------------------+--------+----------+
…der-azure into multi-zone-nat6
/test
Testrun: e2e-jzbw7
+---------------------+---------------------+-----------+----------+
|        NAME         |        STEP         |   PHASE   | DURATION |
+---------------------+---------------------+-----------+----------+
| infrastructure-test | infrastructure-test | Succeeded |  33m17s  |
+---------------------+---------------------+-----------+----------+
/lgtm
How to categorize this PR?
/area control-plane
/kind enhancement
/platform azure
What this PR does / why we need it:
This PR allows the deployment of shoots with a new network setup for Azure. The new setup is akin to other extensions where it is possible to have finer-grained control over the networking of each availability zone. The main motivation is to have zone-redundant NAT Gateways. Prior to this PR, we used a single subnet and thus a single NAT Gateway instance. With the new setup, each subnet can potentially have its own NAT Gateway instance and separate public IPs.
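For illustration, here is a minimal sketch of how such a per-zone configuration could be constructed programmatically. The type and field names (NetworkConfig, VNet, Zone, ZonedNatGatewayConfig and their fields), the import path, and the example CIDRs are assumptions based on the description above and the apisazure package referenced in the review, not a verbatim copy of the API introduced by this PR:

package main

import (
	"fmt"

	// Import path assumed; the review snippet only shows the apisazure alias.
	apisazure "github.com/gardener/gardener-extension-provider-azure/pkg/apis/azure"
)

func main() {
	vnetCIDR := "10.250.0.0/16"

	// Hypothetical sketch: every zone gets its own subnet and, optionally,
	// its own NAT Gateway instance with separate public IPs.
	infra := apisazure.InfrastructureConfig{
		Zoned: true,
		Networks: apisazure.NetworkConfig{
			VNet: apisazure.VNet{CIDR: &vnetCIDR},
			Zones: []apisazure.Zone{
				{
					// The first zone reuses the CIDR of the former single "workers"
					// subnet so existing clusters can migrate without recreating it.
					Name:       1,
					CIDR:       "10.250.0.0/19",
					NatGateway: &apisazure.ZonedNatGatewayConfig{Enabled: true},
				},
				{
					Name:       2,
					CIDR:       "10.250.32.0/19",
					NatGateway: &apisazure.ZonedNatGatewayConfig{Enabled: true},
				},
			},
		},
	}

	fmt.Printf("%+v\n", infra)
}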
In addition, there is a migration path from our current zoned setup to this "enhanced" zone setup. For this to be possible, we impose the constraint that the first zone specified in the infrastructure must match the CIDR range of the existing workers subnet. The main reason is to preserve the subnet of an existing cluster and hence avoid having to tear down all the existing VMs to recreate the network setup. Instead, by preserving the original subnet we can have a controlled node migration (via MCM) to the new machineClasses and ensure minimal downtime during the migration.
Which issue(s) this PR fixes:
Fixes https://github.com/orgs/gardener/projects/7#card-49971123
Special notes for your reviewer:
Currently missing:
- A fix for the bug in the azurerm provider.
Unfortunately there is a major bug with the TF provider for Azure that we encountered (link): when creating multiple "association" resources, the terraform provider gets permanently stuck due to locking issues. With the current (at the time of writing) version we use, it happens consistently. The "custom" version of the terraformer used in this PR is built on a newer provider version (2.60), and while the occurrence rate has fallen, it is still around 50% according to my tests, so I consider it a showstopper until fixed. The custom terraform image is there to facilitate reviewers and testers.
We have a PR with a potential fix waiting for review (hashicorp/terraform-provider-azurerm#12267).
Release note: