We created an AKS instance with a node count of 1, then scaled up to a node count of 2. We had deployed a service containing a single pod with a replica count of 3, set to auto-scale up to 10 pods. We recently attempted to implement some network security devices, including an Azure Application Firewall and a Palo Alto firewall appliance.
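For reference, the setup described above can be sketched with the Azure CLI and kubectl. The resource and deployment names (`myRG`, `myAKS`, `my-service`) are placeholders, not the names from our environment, and the CPU threshold is an assumption:

```shell
# Create the cluster with a single node (hypothetical names).
az aks create --resource-group myRG --name myAKS --node-count 1

# Deploy the service with 3 replicas, auto-scaling up to 10 pods.
kubectl scale deployment my-service --replicas=3
kubectl autoscale deployment my-service --min=3 --max=10 --cpu-percent=80
```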
Because we did not have the ability to modify the virtual network, we added several subnets to the virtual network that was provisioned in the MC_{RESOURCE_GROUP}_{CLUSTER_NAME}_{REGION} resource group created alongside the AKS object.
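Adding a subnet to the AKS-managed virtual network looks roughly like the following. The MC_ resource group, vnet name, subnet name, and address prefix here are all illustrative; AKS generates the actual names when the cluster is created:

```shell
# Add a subnet for the firewall appliances to the AKS-provisioned vnet.
# All names and the address range below are assumptions.
az network vnet subnet create \
  --resource-group MC_myRG_myAKS_centralus \
  --vnet-name aks-vnet \
  --name firewall-subnet \
  --address-prefixes 10.240.100.0/24
```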
We reconfigured our service to use an internal load balancer, then added several routing rules to ensure that any traffic to the service was routed through both firewall devices before reaching the internal load balancer. This was accomplished by defining several routes in the AKS-agentpool route table.
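The two pieces of that configuration can be sketched as below. The internal load balancer is requested through the standard Azure service annotation; the service name, route-table name, address prefix, and firewall IP are placeholders for illustration only:

```shell
# Switch the service to an internal Azure load balancer
# (hypothetical service name).
kubectl annotate service my-service \
  service.beta.kubernetes.io/azure-load-balancer-internal="true"

# Route traffic for the service subnet through the firewall appliance.
# Route-table name, prefix, and next-hop IP are assumptions.
az network route-table route create \
  --resource-group MC_myRG_myAKS_centralus \
  --route-table-name aks-agentpool-routetable \
  --name to-firewall \
  --address-prefix 10.240.0.0/24 \
  --next-hop-type VirtualAppliance \
  --next-hop-ip-address 10.240.100.4
```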
With this configuration in place, we noticed that attempts to access our service failed roughly one-third of the time. I suspected this was an issue with our routing, and rather than attempting to solve that issue, we decided to scale our node pool down from 2 nodes to 1 by running `az aks scale` with the appropriate flags. This completed successfully, and we resumed troubleshooting our deployment.
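The scale operations in question, with placeholder resource-group and cluster names:

```shell
# Scale the node pool down to 1 node (succeeds).
az aks scale --resource-group myRG --name myAKS --node-count 1

# Scaling back up to 2 nodes is the operation that fails below.
az aks scale --resource-group myRG --name myAKS --node-count 2
```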
Once we had confirmed that our deployment was functioning and that all traffic was being routed through our two firewalls, we attempted to scale back up to 2 nodes. This resulted in the following error:
Deployment failed. Correlation ID: [correlationId]. Subnet [subnet] is in use by [firewall network interface] and cannot be deleted.
This leaves the AKS resource reporting a node count of 2 and in a failed state. We can then scale back down to a single node, which completes successfully and clears the failed-state message.
I can reproduce this in both CentralUS and CanadaEast.
Is there any way to scale the node count up without removing the vnet?
Today, no. The way we scale requires us to supply a list of all subnets, and we only generate the subnets we created at the start, so this is unsupported today. We might be able to make a change to support it; I'm talking with the team about the desire for this.
This looks to be resolved now, given we have moved to aks-engine and many changes have followed, so I'm closing as a result. @CMHR-MichaelLanglois please open a new ticket if you still see scale operations try to delete networking objects.