Recently we discovered a few ACS Kubernetes clusters that stopped responding after their master VMs were rebooted. All ACS RPv1 clusters are affected, as are ACS RPv2 clusters deployed after Fri Oct 20 06:49:20 PDT 2017.

Upon investigation, we found that this was due to etcd not restarting.

As a fix, we set the etcd2 restart policy to 'always' so that etcd comes back up after a master VM reboot. This issue has been fixed, and the fix is being rolled out to RPv2 regions.

To fix this issue manually, please run the following commands on all the master nodes in your cluster:

- sudo /bin/sed -i s/Restart=on-abnormal/Restart=always/g /lib/systemd/system/etcd.service

- sudo systemctl daemon-reload
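
To confirm that the edit took effect before reloading systemd, the substitution and a check can be combined in a small helper. This is only a sketch: `set_restart_always` is a hypothetical name used for illustration, it assumes GNU `sed` (for `-i`), and it assumes the unit file contains a plain `Restart=on-abnormal` line as in the commands above.

```shell
#!/bin/sh
# Sketch: apply the Restart= change to a unit file and confirm it took effect.
# set_restart_always is a hypothetical helper name, not part of ACS.
set_restart_always() {
    unit_file="$1"
    # Same substitution as the manual fix above (GNU sed in-place edit).
    sed -i 's/Restart=on-abnormal/Restart=always/g' "$unit_file" || return 1
    # Succeeds only if the unit file now contains the new restart policy.
    grep -q '^Restart=always$' "$unit_file"
}

# On a master node, as root, one might then run:
#   set_restart_always /lib/systemd/system/etcd.service && systemctl daemon-reload
```

The helper returns a non-zero status if the expected `Restart=always` line is not present afterwards, so the `daemon-reload` in the commented usage line runs only when the edit succeeded.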

List of all ACS RPv1 regions:

australiasoutheast
northeurope
brazilsouth
australiaeast
japaneast
northcentralus
westus
eastasia
eastus2
southcentralus
southeastasia
eastus
westeurope
centralus

List of RPv2 regions:

UK West
UK South
West Central US
West US 2
Canada East
Canada Central
West India
South India
Central India
Japan West
