acs-engine scaling failed with error [segmentation violation code=0x1 addr=0x8 pc=0x11aa9be] #3337
Comments
Hi @saromba, could you share your cluster configuration json (after removing secrets)? Did you manually change anything in the generated azuredeploy.json before deploying?
Here's a similar issue, #2649, that was fixed a while ago, in case it helps.
Hi @CecileRobertMichon: Which file do you need? There are many files produced during provisioning.
@saromba
Hi @CecileRobertMichon: No, I didn't change any file manually. Here is the cluster config: {
@CecileRobertMichon Regards,
@saromba No update yet; I wasn't able to repro and then went on vacation. I'll try to give it another look this week. Did you by any chance scale or upgrade your cluster previously, or was this cluster in its original state when you attempted to scale?
@CecileRobertMichon We fixed that ourselves; scaling is now possible again.
Regards,
Hi @saromba, I'm glad to hear that you are unblocked. Was the root cause that the VM tags were missing? How can we fix this so that others don't run into the same issue?
We have run into this on v0.20.6 and v0.21.1. We did not manually change anything in our generated azuredeploy.json before deploying.
@CecileRobertMichon Based on @saromba's comment, I checked our VM tags and found both poolName and resourceNameSuffix present.
@ryanlovett Can you please share your apimodel and the exact steps you took so I can try to repro?
We had manually added a non-cluster VM to the resource group, and this VM did not have those tags. I just manually added them, and acs-engine hasn't crashed yet. I think scale.go should check whether the poolName and resourceNameSuffix tags exist before trying to reference them. I know nothing about Go, otherwise I'd create a PR.
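For illustration, a guard along these lines would avoid the crash. This is only a sketch with a hypothetical getTag helper, assuming the VM tags are surfaced as the Azure SDK's map[string]*string; it is not the actual scale.go code:

```go
package main

import "fmt"

// getTag is a hypothetical helper: it returns a tag value only when the key
// exists and its value pointer is non-nil, so callers never dereference nil.
func getTag(tags map[string]*string, key string) (string, bool) {
	if tags == nil {
		return "", false
	}
	v, ok := tags[key]
	if !ok || v == nil {
		return "", false
	}
	return *v, true
}

func main() {
	// Simulate a VM that was added to the resource group by hand and
	// therefore carries none of the acs-engine tags.
	var tags map[string]*string

	poolName, ok := getTag(tags, "poolName")
	if !ok {
		fmt.Println("skipping VM without a poolName tag")
		return
	}
	fmt.Println("pool:", poolName)
}
```

With a check like this, untagged VMs in the resource group would simply be skipped instead of crashing the scale operation.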
@ryanlovett agreed, we have an issue tracking this problem at #3663. I will close this one.
Great, thanks! |
Is this a request for help?:
Yes
Is this an ISSUE or FEATURE REQUEST? (choose one): ISSUE
What version of acs-engine?: 0.18.9
Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)
Kubernetes (1.9.8)
What happened:
INFO[0000] validating...
INFO[0001] Name suffix: %s 15892004
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x11aa9be]
goroutine 1 [running]:
github.com/Azure/acs-engine/cmd.(*scaleCmd).run(0xc420386000, 0xc42037c6c0, 0xc42035f320, 0x0, 0x12, 0x0, 0x0)
/Users/jackfrancis/work/src/github.com/Azure/acs-engine/cmd/scale.go:224 +0x38e
github.com/Azure/acs-engine/cmd.newScaleCmd.func1(0xc42037c6c0, 0xc42035f320, 0x0, 0x12, 0x0, 0x0)
/Users/jackfrancis/work/src/github.com/Azure/acs-engine/cmd/scale.go:69 +0x52
github.com/Azure/acs-engine/vendor/github.com/spf13/cobra.(*Command).execute(0xc42037c6c0, 0xc42035f200, 0x12, 0x12, 0xc42037c6c0, 0xc42035f200)
/Users/jackfrancis/work/src/github.com/Azure/acs-engine/vendor/github.com/spf13/cobra/command.go:647 +0x3f1
github.com/Azure/acs-engine/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc42032f8c0, 0xc42037c6c0, 0xc42037c480, 0xc42037c240)
/Users/jackfrancis/work/src/github.com/Azure/acs-engine/vendor/github.com/spf13/cobra/command.go:726 +0x2fe
github.com/Azure/acs-engine/vendor/github.com/spf13/cobra.(*Command).Execute(0xc42032f8c0, 0xc42000c018, 0x0)
/Users/jackfrancis/work/src/github.com/Azure/acs-engine/vendor/github.com/spf13/cobra/command.go:685 +0x2b
main.main()
/Users/jackfrancis/work/src/github.com/Azure/acs-engine/main.go:12 +0x74
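For context, the panic at scale.go:224 has the signature of dereferencing a nil *string pulled out of a VM tags map. A minimal reproduction of that failure mode (an assumed illustration, not the actual acs-engine code) is:

```go
package main

import "fmt"

func main() {
	// A VM created outside acs-engine carries no poolName or
	// resourceNameSuffix tags.
	tags := map[string]*string{}

	// Indexing a missing key yields a nil *string; dereferencing it panics
	// with "invalid memory address or nil pointer dereference" (SIGSEGV),
	// matching the stack trace above.
	poolName := tags["poolName"]
	fmt.Println(*poolName)
}
```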
What you expected to happen:
Correct scaling of the cluster
How to reproduce it (as minimally and precisely as possible):
/root/deploy/acs-engine/acs-engine scale --auth-method client_secret --client-id $AZURE_CLIENT_ID --client-secret $AZURE_CLIENT_SECRET --subscription-id $AZURE_SUBSCRIPTION_ID --resource-group $AZURE_RESOURCEGROUP --location westeurope --deployment-dir tmpDir --new-node-count $NODE_COUNT --master-FQDN $AZURE_RESOURCEGROUP.westeurope.cloudapp.azure.com
Anything else we need to know: