-
Notifications
You must be signed in to change notification settings - Fork 558
VirtualMachineScaleSets support for Kubernetes #2620
Conversation
👍 VirtualMachineScaleSets is only supported with kubernetes v1.10 and above versions. So I think we don't need to check backward compatibility. |
19a7d05
to
61cc30b
Compare
I don't really understand why is this documented as supported if there is an open PR. Is that a bug? https://github.com/Azure/acs-engine/blob/master/docs/clusterdefinition.md#agentpoolprofiles |
@todoesverso VMSS is supported for Swarm and DCOS but not for Kubernetes yet (support is added with Kubernetes v1.10) |
@sozercan hey! we're trying to test this PR with kubernetes 1.10.0 and cluster autoscaler. We've got a problem with unregistered nodes.
Digging through other issues, I found this Is useInstanceMetadata supposed to be false with VMSS and k8s 1.10.0? |
Yep, this is supported in v1.10. But there are a bug for vmss with autoscaler 1.2.0, the fix is here kubernetes/autoscaler#758. It should be included in v1.2.1. |
@feiskyer oh thx! We'll try it with 1.2.1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few questions/changes but i think it is a great start
] | ||
} | ||
}, | ||
"servicePrincipalProfile": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let us default our examples to MSI. MSI (and later EMSi when possible)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, updated
parts/k8s/kubernetesmastervars.t
Outdated
@@ -119,7 +119,11 @@ | |||
"cloudProviderRatelimitQPS": "[parameters('cloudProviderRatelimitQPS')]", | |||
"cloudProviderRatelimitBucket": "[parameters('cloudProviderRatelimitBucket')]", | |||
"useManagedIdentityExtension": "{{ UseManagedIdentity }}", | |||
{{if .HasVirtualMachineScaleSets}} | |||
"useInstanceMetadata": "false", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the reason for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1.10 supports setting useInstanceMetadata to both true and false, so we should keep same logic: "useInstanceMetadata": "{{ UseInstanceMetadata }}"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
due to kubernetes/kubernetes#62611, if instanceMetadata
is enabled and vmType
is set to vmss
, master kubelet won't start. If it's disabled, it works fine. One possible (and messy) workaround is to duplicate cloud config on the masters, so master has kubelet vmType
as standard and controller-manager has vmss. Better option is to wait until 1.10.2 with above pr merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is experimental (we'll add docs later) let's change the above to "useInstanceMetadata": "{{ UseInstanceMetadata }}"
. Experimenters pre-1.10.2 should specify "useInstanceMetadata": "false"
in their api models until then.
This way we don't have to submit a follow-up PR after 1.10.2.
parts/k8s/kubernetesbase.t
Outdated
@@ -55,6 +55,8 @@ | |||
{{if $index}}, {{end}} | |||
{{if .IsWindows}} | |||
{{template "k8s/kuberneteswinagentresourcesvmas.t" .}} | |||
{{else if .IsVirtualMachineScaleSets}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a reason not support windows as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
windows agents should be working now
for _, agentProfile := range cs.Properties.AgentPoolProfiles { | ||
if agentProfile.IsAvailabilitySets() { | ||
return true | ||
} | ||
} | ||
return false | ||
}, | ||
"AnyAgentUsesVirtualMachineScaleSets": func() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we shouldn't allow bootstrapping a mixed mode cluster. Either availability sets
or scale sets
but not both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added validation check so mixed mode clusters are not allowed
"secrets": "[variables('linuxProfileSecrets')]" | ||
{{end}} | ||
}, | ||
"storageProfile": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a PR recently enabling users to provision using their own images. is this included? Used currently by openshift on acs-engine @jim-minter FYI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, seems to be included, thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
custom images should be included now
"osDisk": { | ||
"createOption": "FromImage", | ||
"caching": "ReadWrite" | ||
{{if .IsStorageAccount}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jackfrancis I think we should validate against storage accounts for new clusters while allowing them for scale ops. Is this possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@khenidak I'm not sure what you mean.
@JackQuincy can you confirm that there's no special validation/template processing context for deployment vs scale?
@kkmsft for autoscaler + VMSS work |
90346d0
to
092547e
Compare
@sozercan Thanks for awesome work. LGTM @JiangtianLi Could you also have a look at this? |
"subnetName": "$global:SubnetName", | ||
"securityGroupName": "$global:SecurityGroupName", | ||
"vnetName": "$global:VNetName", | ||
"routeTableName": "$global:RouteTableName", | ||
"primaryAvailabilitySetName": "$global:PrimaryAvailabilitySetName", | ||
"primaryScaleSetName": "$global:PrimaryScaleSetName", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is primaryScaleSetName used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both vmType
and primaryScaleSetName
are part of the cloud provider config for Azure for v1.10.
@@ -188,11 +186,13 @@ Write-AzureConfig() | |||
"aadClientSecret": "$AADClientSecret", | |||
"resourceGroup": "$global:ResourceGroup", | |||
"location": "$Location", | |||
"vmType": "$global:VmType", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is vmType used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
@@ -346,7 +345,7 @@ $KubeletCommandLine | |||
"@ | |||
} else { | |||
$kubeStartStr += @" | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for cleaning up. :)
@@ -0,0 +1,193 @@ | |||
{{if .IsStorageAccount}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the diff of this file with kubernetesagentresourcesvmss.t?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
74c74
< "resourceNameSuffix" : "[variables('winResourceNamePrefix')]",
---
> "resourceNameSuffix" : "[variables('nameSuffix')]",
131,134c131,148
< "computerNamePrefix": "[concat(substring(variables('nameSuffix'), 0, 5), 'acs')]",
< {{GetKubernetesWindowsAgentCustomData .}}
< "adminUsername": "[variables('windowsAdminUsername')]",
< "adminPassword": "[variables('windowsAdminPassword')]"
---
> "adminUsername": "[variables('username')]",
> "computerNamePrefix": "[variables('{{.Name}}VMNamePrefix')]",
> {{GetKubernetesAgentCustomData .}}
> "linuxConfiguration": {
> "disablePasswordAuthentication": "true",
> "ssh": {
> "publicKeys": [
> {
> "keyData": "[parameters('sshRSAPublicKey')]",
> "path": "[variables('sshKeyPath')]"
> }
> ]
> }
> }
> {{if HasLinuxSecrets}}
> ,
> "secrets": "[variables('linuxProfileSecrets')]"
> {{end}}
137c151,153
< {{GetDataDisks .}}
---
> {{if not (UseAgentCustomImage .)}}
> {{GetDataDisks .}}
> {{end}}
139,142c155,162
< "offer": "[variables('agentWindowsOffer')]",
< "publisher": "[variables('agentWindowsPublisher')]",
< "sku": "[variables('agentWindowsSku')]",
< "version": "[variables('agentWindowsVersion')]"
---
> {{if UseAgentCustomImage .}}
> "id": "[resourceId(variables('{{.Name}}osImageResourceGroup'), 'Microsoft.Compute/images', variables('{{.Name}}osImageName'))]"
> {{else}}
> "offer": "[variables('{{.Name}}osImageOffer')]",
> "publisher": "[variables('{{.Name}}osImagePublisher')]",
> "sku": "[variables('{{.Name}}osImageSKU')]",
> "version": "[variables('{{.Name}}osImageVersion')]"
> {{end}}
163,165c183,185
< "publisher": "Microsoft.Compute",
< "type": "CustomScriptExtension",
< "typeHandlerVersion": "1.8",
---
> "publisher": "Microsoft.Azure.Extensions",
> "type": "CustomScript",
> "typeHandlerVersion": "2.0",
169c189
< "commandToExecute": "[concat('powershell.exe -ExecutionPolicy Unrestricted -command \"', '$arguments = ', variables('singleQuote'),'-MasterIP ',variables('kubernetesAPIServerIP'),' -KubeDnsServiceIp ',variables('kubeDnsServiceIp'),' -MasterFQDNPrefix ',variables('masterFqdnPrefix'),' -Location ',variables('location'),' -AgentKey ',variables('clientPrivateKey'),' -AADClientId ',variables('servicePrincipalClientId'),' -AADClientSecret ',variables('servicePrincipalClientSecret'),variables('singleQuote'), ' ; ', variables('windowsCustomScriptSuffix'), '\" > %SYSTEMDRIVE%\\AzureData\\CustomDataSetupScript.log 2>&1')]"
---
> "commandToExecute": "[concat(variables('provisionScriptParametersCommon'),' /usr/bin/nohup /bin/bash -c \"/bin/bash /opt/azure/containers/provision.sh >> /var/log/azure/cluster-provision.log 2>&1\"')]"
178c198
< "type": "ManagedIdentityExtensionForWindows",
---
> "type": "ManagedIdentityExtensionForLinux",
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this. Generally lgtm. left a few comments.
Just one more request @sozercan: could you kindly add a validation that requires instanceMetadata to be false for k8s < 1.10.2? (I know that's basically everything now but that will become useful over time.) Thanks so much for this, looks great! |
@jackfrancis that is a bit of an overkill? we can run imds on 1.8 and 1.9 |
Is the description for
|
@jackfrancis added validation when using VMSS with k8s < v1.10.2 @CecileRobertMichon thanks, updated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
What this PR does / why we need it:
Which issue this PR fixes:
fixes #2403
If applicable:
ie. deploy with previous version, upgrade with this branch)Release note:
cc @feiskyer @JiangtianLi