Skip to content
This repository was archived by the owner on Jan 11, 2023. It is now read-only.

Add maxPods to Kubernetes configuration #1103

Merged
merged 3 commits into from
Aug 21, 2017
Merged

Conversation

ofiliz
Copy link
Contributor

@ofiliz ofiliz commented Jul 25, 2017

What this PR does / why we need it:

This PR adds the feature described in issue #1102. It adds a "maxPods" field to KubernetesConfig to customize the maximum number of pods that can be scheduled on a node.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #1102

This PR adds the optional "maxPods" field to KubernetesConfig. It defaults to 110 (the current Kubernetes default) when using spoofed IP addresses, and 30 when using actual IP addresses with CNI plugin enabled.
Customers can set custom maxPods values appropriate for their scenarios.

This is general goodness whether CNI plugin is used or not. However when CNI plugin is enabled, maxPods gives finer control over the number of IP addresses allocated per node, allowing larger cluster sizes with fewer addresses per node (e.g. 4096/110=37 vs. 4096/64=64 vs. 4096/30=136)

Special notes for your reviewer:

Release note:

fixes #1102


This change is Reviewable

@acs-bot
Copy link

acs-bot commented Jul 25, 2017

Can one of the admins verify this patch? Say "@acs-bot test this please" to start tests.

Copy link

@dmitsh dmitsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add an api-model example to examples/vnet/ ?

Copy link
Contributor

@seanknox seanknox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good, but we need unit tests for any new functionality.

@dmitsh
Copy link

dmitsh commented Aug 1, 2017

@acs-bot test this please

@ofiliz
Copy link
Contributor Author

ofiliz commented Aug 11, 2017

@seanknox Any more feedback?

@ofiliz ofiliz force-pushed the k8s-maxpods branch 2 times, most recently from 40729d9 to eed23f1 Compare August 11, 2017 22:33
@@ -0,0 +1,9 @@
# Microsoft Azure Container Service Engine - Kubernetes Features
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would personally change this path to kubernetes-custom-networking. Features is too generic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But maxPods is not a networking setting. It is only indirectly related because it affects number of IP addresses.

@anhowe @dmitsh Can you please advise on how this document tree should be structured?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the path to "kubernetes-config" because all of these settings are under orchestratorProfile.kubernetesConfig in the API model. I thought that was a good trade-off between being too generic (as in "features") vs. being too specific (as in "networking"). Please let me know if you disagree/have a proposal.

Copy link
Member

@lachie83 lachie83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a fair few snippets of information in the code that I feel should be verbalized in docs. I'm concerned the knowledge will be lost in the code. For example your arithmetic an the minmax pod setting. We should call out everything you need to consider when using Azure CNI

@ofiliz
Copy link
Contributor Author

ofiliz commented Aug 14, 2017

@seanknox @lachie83 Can you please sign-off unless you have other comments? I'd love to have this merged soon so we can end-to-end test CNI plugin in large clusters.

@@ -35,6 +35,7 @@ Here are the valid values for the orchestrator types:
|clusterSubnet|no|The IP subnet used for allocating IP addresses for pod network interfaces. The subnet must be in the VNET address space. Default value is 10.244.0.0/16.|
|dockerBridgeSubnet|no|The specific IP and subnet used for allocating IP addresses for the docker bridge network created on the kubernetes master and agents. Default value is 172.17.0.1/16. This value is used to configure the docker daemon using the [--bip flag](https://docs.docker.com/engine/userguide/networking/default_network/custom-docker0).|
|enableRbac|no|Enable [Kubernetes RBAC](https://kubernetes.io/docs/admin/authorization/rbac/) (boolean - default == false) |
|maxPods|no|The maximum number of pods per node. The minimum valid value, necessary for running kube-system pods, is 5. Default value is 110.|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also looks like the default is 30 if networkPolicy == azure. Please document.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@ofiliz ofiliz force-pushed the k8s-maxpods branch 2 times, most recently from 4aed84e to da4c75c Compare August 15, 2017 23:33
@ofiliz
Copy link
Contributor Author

ofiliz commented Aug 16, 2017

@dmitsh @lachie83 Can we please merge this?

@lachie83
Copy link
Member

LGTM. Going to run e2e

@lachie83
Copy link
Member

@ofiliz can you please rebase so we can get this merged.

@ofiliz ofiliz force-pushed the k8s-maxpods branch 2 times, most recently from b4f97d3 to 11b187a Compare August 18, 2017 21:50
@ofiliz
Copy link
Contributor Author

ofiliz commented Aug 18, 2017

@lachie83 Rebased, again. Can we please get this merged today? These commits (as part of earlier PRs) are more than 3 months old now, and still waiting to get merged.

// when VNET integration is enabled. It can be overridden per pool by setting the pool's IPAdddressCount property.
DefaultAgentMultiIPAddressCount = 128
// DefaultKubernetesMaxPods is the maximum number of pods to run on a node.
DefaultKubernetesMaxPods = 110
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this value consisent across all acs-engine supported k8s versions? (1.5, 1.6, and 1.7)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

@@ -14,7 +14,7 @@
},
"agentPoolProfiles": [
{
"name": "agent",
"name": "agentpool1",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing this filename breaks our E2E tests, which consume them. Is there a good semantic reason to make a change? If so, we'll have to update our E2E test configurations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll revert them to "agent". Do you need all examples in kubernetes-config directory to use "agent" as pool name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually the filename change that is breaking E2E configuration.

examples/custom-pod-cidr/kubernetes.json --> examples/kubernetes-config/kubernetes-clustersubnet.json

Do we mean to move the examples/custom-pod-cidr/kubernetes.json file around?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a previous CR feedback by @dmitsh. Instead of creating a new directory for every little KubernetesConfig knob we introduce, we are grouping them under one directory.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

O.K., @dmitsh's recommendations makes sense. We'll have to fix the E2E configs to point to the new files then. You can do that easily thusly:

In the following three files:

  • test/acse-conf/acse-feature-validation.json
  • test/acse-conf/acse-regression.json
  • test/e2e/kubernetes-deployments.json

Replace all "cluster_definition" values of "custom-pod-cidr/kubernetes.json" with the new relative filepath "kubernetes-config/kubernetes-clustersubnet.json"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a new commit that updates the above files.

@@ -14,7 +14,7 @@
},
"agentPoolProfiles": [
{
"name": "agent",
"name": "agentpool1",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto comment above

@@ -260,7 +260,7 @@
"autoUpgradeMinorVersion": true,
"settings": {},
"protectedSettings": {
"commandToExecute": "[concat('/usr/bin/nohup /bin/bash -c \"/bin/bash /opt/azure/containers/provision.sh ',variables('tenantID'),' ',variables('subscriptionId'),' ',variables('resourceGroup'),' ',variables('location'),' ',variables('subnetName'),' ',variables('nsgName'),' ',variables('virtualNetworkName'),' ',variables('routeTableName'),' ',variables('primaryAvailabilitySetName'),' ',variables('servicePrincipalClientId'),' ',variables('servicePrincipalClientSecret'),' ',variables('clientPrivateKey'),' ',variables('targetEnvironment'),' ',variables('networkPolicy'),' ',variables('cloudProviderBackoff'),' ',variables('cloudProviderBackoffRetries'),' ',variables('cloudProviderBackoffExponent'),' ',variables('cloudProviderBackoffDuration'),' ',variables('cloudProviderBackoffJitter'),' ',variables('cloudProviderRatelimit'),' ',variables('cloudProviderRatelimitQPS'),' ',variables('cloudProviderRatelimitBucket'),' ', variables('useManagedIdentityExtension'),' ',variables('useInstanceMetadata'),' >> /var/log/azure/cluster-provision.log 2>&1 &\" &')]"
"commandToExecute": "[concat('/usr/bin/nohup /bin/bash -c \"/bin/bash /opt/azure/containers/provision.sh ',variables('tenantID'),' ',variables('subscriptionId'),' ',variables('resourceGroup'),' ',variables('location'),' ',variables('subnetName'),' ',variables('nsgName'),' ',variables('virtualNetworkName'),' ',variables('routeTableName'),' ',variables('primaryAvailablitySetName'),' ',variables('servicePrincipalClientId'),' ',variables('servicePrincipalClientSecret'),' ',variables('clientPrivateKey'),' ',variables('targetEnvironment'),' ',variables('networkPolicy'),' ',variables('maxPods'),' ',variables('cloudProviderBackoff'),' ',variables('cloudProviderBackoffRetries'),' ',variables('cloudProviderBackoffExponent'),' ',variables('cloudProviderBackoffDuration'),' ',variables('cloudProviderBackoffJitter'),' ',variables('cloudProviderRatelimit'),' ',variables('cloudProviderRatelimitQPS'),' ',variables('cloudProviderRatelimitBucket'),' ', variables('useManagedIdentityExtension'),' ',variables('useInstanceMetadata'),' >> /var/log/azure/cluster-provision.log 2>&1 &\" &')]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We lost the change from this recent commit to master:

f477966

s/primaryAvailablitySetName/primaryAvailabilitySetName/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this. I missed it during the merge. Fixed.

@@ -529,7 +529,7 @@
"autoUpgradeMinorVersion": true,
"settings": {},
"protectedSettings": {
"commandToExecute": "[concat('/usr/bin/nohup /bin/bash -c \"/bin/bash /opt/azure/containers/provision.sh ',variables('tenantID'),' ',variables('subscriptionId'),' ',variables('resourceGroup'),' ',variables('location'),' ',variables('subnetName'),' ',variables('nsgName'),' ',variables('virtualNetworkName'),' ',variables('routeTableName'),' ',variables('primaryAvailabilitySetName'),' ',variables('servicePrincipalClientId'),' ',variables('servicePrincipalClientSecret'),' ',variables('clientPrivateKey'),' ',variables('targetEnvironment'),' ',variables('networkPolicy'),' ',variables('cloudProviderBackoff'),' ',variables('cloudProviderBackoffRetries'),' ',variables('cloudProviderBackoffExponent'),' ',variables('cloudProviderBackoffDuration'),' ',variables('cloudProviderBackoffJitter'),' ',variables('cloudProviderRatelimit'),' ',variables('cloudProviderRatelimitQPS'),' ',variables('cloudProviderRatelimitBucket'),' ',variables('useManagedIdentityExtension'),' ',variables('useInstanceMetadata'),' ',variables('apiServerPrivateKey'),' ',variables('caCertificate'),' ',variables('caPrivateKey'),' ',variables('masterFqdnPrefix'),' ',variables('kubeConfigCertificate'),' ',variables('kubeConfigPrivateKey'),' ',variables('username'),' >> /var/log/azure/cluster-provision.log 2>&1\"')]"
"commandToExecute": "[concat('/usr/bin/nohup /bin/bash -c \"/bin/bash /opt/azure/containers/provision.sh ',variables('tenantID'),' ',variables('subscriptionId'),' ',variables('resourceGroup'),' ',variables('location'),' ',variables('subnetName'),' ',variables('nsgName'),' ',variables('virtualNetworkName'),' ',variables('routeTableName'),' ',variables('primaryAvailablitySetName'),' ',variables('servicePrincipalClientId'),' ',variables('servicePrincipalClientSecret'),' ',variables('clientPrivateKey'),' ',variables('targetEnvironment'),' ',variables('networkPolicy'),' ',variables('maxPods'),' ',variables('cloudProviderBackoff'),' ',variables('cloudProviderBackoffRetries'),' ',variables('cloudProviderBackoffExponent'),' ',variables('cloudProviderBackoffDuration'),' ',variables('cloudProviderBackoffJitter'),' ',variables('cloudProviderRatelimit'),' ',variables('cloudProviderRatelimitQPS'),' ',variables('cloudProviderRatelimitBucket'),' ',variables('useManagedIdentityExtension'),' ',variables('useInstanceMetadata'),' ',variables('apiServerPrivateKey'),' ',variables('caCertificate'),' ',variables('caPrivateKey'),' ',variables('masterFqdnPrefix'),' ',variables('kubeConfigCertificate'),' ',variables('kubeConfigPrivateKey'),' ',variables('username'),' >> /var/log/azure/cluster-provision.log 2>&1\"')]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/primaryAvailablitySetName/primaryAvailabilitySetName/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Please double check.

@jackfrancis
Copy link
Member

@ofiliz comments above are blocking progress of this PR, I'm ready to merge once they're addressed. Let me know if you have any questions!

@jackfrancis jackfrancis merged commit 53c42de into Azure:master Aug 21, 2017
@ofiliz ofiliz deleted the k8s-maxpods branch August 28, 2017 18:32
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add maxPods to Kubernetes configuration
7 participants