Not able to deploy K8s Cluster with ACS-Engine 14.6 version on azure #2591
Comments
Hi @rakeshkulkarni6 , what does your apimodel look like? |
Hi here is the api model for your reference { |
Hi, I tried deploying a Kubernetes cluster on Azure with the newer acs-engine 0.15.0, and I am getting the error below:
New-AzureRmResourceGroupDeployment : 5:54:07 PM - VM has reported a failure when processing extension 'cse0'. Error message:
|
@rakeshkulkarni6 please remove all secrets/keys from the apimodel you shared |
@rakeshkulkarni6 can you please try deploying with k8s 1.9.6? There are known upstream bugs in 1.9.0 |
Hi @CecileRobertMichon, it is showing the same error again. Please find the error details below.
New-AzureRmResourceGroupDeployment : 1:24:27 PM - Resource Microsoft.Compute/virtualMachines/extensions
New-AzureRmResourceGroupDeployment : 1:24:27 PM - VM has reported a failure when processing extension 'cse0'. Error message:
New-AzureRmResourceGroupDeployment : 1:24:27 PM - Template output evaluation skipped: at least one resource deployment operation
|
I have used vlabs as the apiVersion and kept the DNS prefix empty in the agentPoolProfile. I am still getting the above error while deploying. Can you please help? I need to set up a production cluster using acs-engine. |
I got a similar error when deploying a cluster using acs-engine version 14.6. I deleted the deployment and redeployed the cluster into a new resource group with Kubernetes version 1.9.5; it deployed successfully and is working fine without any errors. |
@rakeshkulkarni6 @rakidu I am trying to repro this error. It looks like 14.6 might have introduced a regression causing transient deployment errors (possibly a race condition). In the meantime, if you retry you might get lucky and get a working cluster @rakeshkulkarni6. If you still have a cluster that failed with this error can you please share the content of |
I am seeing a similar issue with version 15.1 of acs-engine and version 1.8.9 of Kubernetes. I am trying to deploy a cluster into an existing VNet. The VNet has multiple subnets, but I get the same error whether I deploy the master and agents to the same subnet or split them into different subnets. |
Same here with 15.2, kubernetes 1.9.6 and 3 masters, 3 agents. The last lines from @CecileRobertMichon I tried now many times hoping to get the cluster deployed once, but had no luck... |
I also got this, acs-engine 0.15.2, kubernetes 1.9.6 and 1 master, 4 agents. cluster-provision.log
Interestingly, if I run Edit: in my case, it was resolved by deleting the cluster and re-creating. |
Err, I'll take that (edit) back - it deployed on second time but without heapster, dashboard, kubedns and tiller.
|
Is there any configuration with which it is currently possible to generate a Kubernetes cluster? I have tried quite a few orchestratorRelease and orchestratorVersion combinations and get the same error with every one of them: "VM has reported a failure when processing extension 'cse0'." Here is my latest trial: |
@lehtiton this isn't a bug with a specific configuration but rather transient vm provisioning errors which result in one or more of the nodes not being ready in a certain amount of time. The improvements I mentioned above aim to catch those errors and add retries and timeouts to better handle infrastructure flakiness. If you are seeing 100% failures, please send me the content of /var/log/azure/cluster-provision.log and the output of |
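To illustrate the "retries and timeouts" idea mentioned in the comment above, here is a minimal Python sketch of polling a readiness check until a deadline. It is not the acs-engine implementation; the function name `wait_until_ready`, its parameters, and the idea of passing in a `check` callable are all assumptions made for illustration.

```python
import time


def wait_until_ready(check, timeout_s=60.0, interval_s=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or the timeout would be exceeded.

    Hypothetical helper for illustration only. Returns the number of
    attempts made; raises TimeoutError if the next wait would pass the
    deadline.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        if check():
            return attempts
        # Give up instead of sleeping past the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(
                "not ready after %d attempt(s) in %.0fs" % (attempts, timeout_s)
            )
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated flaky check (an assumption): ready on the third poll.
    state = {"calls": 0}

    def flaky_check():
        state["calls"] += 1
        return state["calls"] >= 3

    print(wait_until_ready(flaky_check, timeout_s=10, interval_s=0.1))
```

The point of the pattern is that a node that is briefly slow to come up gets another chance, while a node that never becomes ready produces an explicit timeout error instead of an indefinite hang.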
@CecileRobertMichon thanks for your help. I finally managed to get rid of this issue by changing something in the configs (I guess). I have tried so many times with so many different configurations that I can no longer keep track of them. However, I still ran into a couple of issues, which I described in a comment on another issue, #2476, in case you have any hints on how to work on those. I also shared my configurations there. |
@CecileRobertMichon I am trying to create a cluster using acs-engine v0.16.0 in China East2, which is a new region, but there are a lot of errors while running the custom execution script on the master VM. I have changed a lot of the configuration, including the registry URL, and the master VM is able to fetch the images, but provisioning still fails with exit code 30.
Please help |
@ankitsingh11 please refer to https://github.com/Azure/acs-engine/blob/master/docs/kubernetes/troubleshooting.md#vmextensionprovisioningerror-or-vmextensionprovisioningtimeout for a guide on troubleshooting these issues and open a new issue if you still need help. I will close this one for now as it is outdated. Any reason why you are using acs-engine v0.16.0? The latest versions contain a lot of improvements for vm extensions. |
Is this a request for help?:
Is this an ISSUE or FEATURE REQUEST? (choose one):
ISSUE
What version of acs-engine?:
acs-engine v0.14.6
Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)
kubernetes 1.9.0
What happened:
I deployed a K8s cluster using acs-engine v0.14.6. After the cluster deploys, I am not able to see any nodes listed. I checked the Docker images and Docker containers, and no containers have been created.
Docker Images:
root@k8s-master-63864159-0:/var/lib/waagent/Microsoft.Azure.Extensions.CustomScript-2.0.6/status# docker images
REPOSITORY                                TAG      IMAGE ID       CREATED        SIZE
k8s-gcrio.azureedge.net/hyperkube-amd64   v1.9.0   0e4e0ed658bb   3 months ago   618 MB
Docker PS:
root@k8s-master-63864159-0:/var/lib/waagent/Microsoft.Azure.Extensions.CustomScript-2.0.6/status# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
root@k8s-master-63864159-0:/var/lib/waagent/Microsoft.Azure.Extensions.CustomScript-2.0.6/status#
What you expected to happen:
The Kubernetes cluster should be deployed with Kubernetes v1.9.0 using acs-engine v0.14.6.
How to reproduce it (as minimally and precisely as possible):
acs-engine generate
Anything else we need to know:
Extension Status:
root@k8s-master-63864159-0:/var/lib/waagent/Microsoft.Azure.Extensions.CustomScript-2.0.6/status# cat 0.status
[
  {
    "version": 1,
    "timestampUTC": "2018-04-04T15:15:45Z",
    "status": {
      "operation": "Enable",
      "status": "error",
      "formattedMessage": {
        "lang": "en",
        "message": "Enable failed: failed to execute command: command terminated with exit status=3\n[stdout]\n\n[stderr]\n"
      }
    }
  }
]
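When comparing several failed nodes, it can help to pull the operation, status, and exit code out of the extension status file programmatically. The following is a minimal Python sketch, assuming the file is a complete JSON array shaped like the `0.status` output above (closing bracket included) and that the exit code appears in the message as `exit status=N`. The function name `summarize_extension_status` and the `SAMPLE` constant are hypothetical, and the sketch does not interpret what a given exit code means.

```python
import json
import re

# Sample copied from the 0.status output above; a raw string keeps the
# JSON "\n" escapes intact so that json.loads can decode them.
SAMPLE = r"""
[
  {
    "version": 1,
    "timestampUTC": "2018-04-04T15:15:45Z",
    "status": {
      "operation": "Enable",
      "status": "error",
      "formattedMessage": {
        "lang": "en",
        "message": "Enable failed: failed to execute command: command terminated with exit status=3\n[stdout]\n\n[stderr]\n"
      }
    }
  }
]
"""


def summarize_extension_status(status_text):
    """Return a list of (operation, status, exit_code) tuples.

    exit_code is None when the message has no "exit status=N" fragment.
    """
    summary = []
    for record in json.loads(status_text):
        state = record.get("status", {})
        message = state.get("formattedMessage", {}).get("message", "")
        match = re.search(r"exit status=(\d+)", message)
        exit_code = int(match.group(1)) if match else None
        summary.append((state.get("operation"), state.get("status"), exit_code))
    return summary


if __name__ == "__main__":
    for operation, status, exit_code in summarize_extension_status(SAMPLE):
        print(operation, status, exit_code)
```

On a real node, the text would be read from the `0.status` file in the extension's `status` directory rather than from the embedded sample.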
I have tried many versions from acs-engine v0.12.0 to v0.14.6, and none of them deploys a Kubernetes cluster.