Sovereign cloud support for AzureGermanCloud and AzureUSGovCloud. #499
Can one of the admins verify this patch? |
@colemickens Could you please review this PR? Thanks! |
I'm back in the office only on Tuesday. At the moment I have poor internet, so on Tuesday morning at the latest (Monday evening for you) I'll regen and deploy. |
I might be wrong, but I guess you are making that pull request from a private repo, so I can't get the branch... and I can't merge into a new branch here since I don't have access (obviously). So I'm stuck: I can't get the changes locally.
another question: do you use the location to determine the right FQDN or have another attribute? this is what i set:
|
The repo is open, something like this should work:
|
worked! didn't have to change vm image name and now works like a charm... i still ssh-ed on the master (11PM... will try remote tomorrow)
great - tnx guys! you can merge from my point of view ;) PS: didn't do any regression for non-Azure.de |
@raulfiru I am so happy to hear that! Thanks! 👍 |
I had to change the osImageVersion as
After correcting the image version the master vm extension deployment fails as the custom script exits with Checking
Any hints where I should look to troubleshoot further? |
It seems that the apiserver is not started correctly. Please check if hyperkube docker image is pulled, could you please attach the docker ps output? |
I suspect I mangled the location configuration. Which points are relevant for setting the target environment? |
So, now I manually adjusted the How did you plan on setting the location during the template generation? |
@benkoller it worked like a charm for me... did you get the pull request as @colemickens wrote above (I had to change some paths), or did you use the master branch? You seem to know what you are doing (much more than me, to be honest), but what I did was get the pull locally (overwriting master), then follow the install guide, where I replaced:
with the git folder that has the pull then followed all the other steps and worked... |
I am on the I unfortunately can't deploy the first workloads today but I'll deploy some stuff tomorrow. Until now all seems fine though. |
that is strange indeed... i tried on 2 azure germany subscriptions. on one worked every time (tried 3 times) - cluster-provision.log:
on another..also tried 3 times... side by side with different dns names and then i reused the exact same file and deleted the one in the other subscription (i thought it could be the names/certs)
failed every time... same as @benkoller |
It's almost surely because of ServicePrincipal problems. I'm guessing the SP that you are using only has permissions on one of the subscriptions. |
I'm using a different SP for each, and everything else gets provisioned except k8s-master-12051246-0/cse0, which looks like a script that runs on the VM:
They run fine on the agents but not on the master. It hangs there until the loop (600 sec) times out. |
@raulfiru Yes, it will fail if the SP is invalid. Please follow the troubleshooting steps here to rule out the SP credentials being invalid or having wrong permissions: https://github.com/Azure/acs-engine/blob/master/docs/kubernetes.md#misconfigured-service-principal |
you were right
i'll check the SP tomorrow and retry. tnx and good night for now |
works! It was my mistake as i was using in the other case the SP name and not the ID. now, both subscriptions work |
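The name-versus-ID mix-up above can be caught before deploying with a quick format check. This is a minimal sketch, assuming the service principal ID is the GUID-shaped application (client) ID; `looks_like_guid` is a hypothetical helper and not part of acs-engine. It only checks the shape of the value, not whether the SP has the right permissions.

```shell
#!/bin/sh
# looks_like_guid: hypothetical helper (not part of acs-engine).
# Succeeds only if the argument has the 8-4-4-4-12 hex layout of a GUID,
# which is the assumed shape of a service principal (client) ID.
looks_like_guid() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Example use (the value is a placeholder, not a real ID):
#   looks_like_guid "$SP_CLIENT_ID" || echo "value looks like an SP name, not an ID" >&2
```

A failed check suggests the SP name was pasted where the ID belongs; a passing check still leaves the troubleshooting steps linked above as the way to confirm the credentials.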
@raulfiru and to confirm, you didn't have to edit anything for FQDN or ubuntu image? I'd like to understand why this isn't working for @benkoller, if that is indeed still the case. |
Regarding the Azure VM image: while the default image in azuredeploy.json is 16.04.201703070, the one in azuredeploy.parameters.json is 16.04.201701130, so it worked and I didn't change anything. |
@colemickens I have to power down the current deployment and will be out of the office until Monday so I can't check again. Until then I'd like to rule out I missed something during template generation, there are no additional steps to perform apart from |
There should be a location field set to 'germanycentral' or 'germanynortheast' |
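To illustrate how a location value could select the target cloud, here is a rough sketch. It is not the PR's actual implementation: `cloud_for_location` is a hypothetical helper, the two German location names come from the comment above, the cloud names come from the PR title, and the `usgov*` pattern and the `AzurePublicCloud` default are assumptions.

```shell
#!/bin/sh
# cloud_for_location: hypothetical helper, not taken from the PR.
# Maps an Azure location name to a cloud environment name.
cloud_for_location() {
    case "$1" in
        # Locations named in the thread for Azure Germany.
        germanycentral|germanynortheast) echo "AzureGermanCloud" ;;
        # Assumption: US Government locations share the "usgov" prefix.
        usgov*) echo "AzureUSGovCloud" ;;
        # Assumption: anything else is treated as the public cloud.
        *) echo "AzurePublicCloud" ;;
    esac
}

# Example use:
#   cloud_for_location germanycentral
```

Leaving the location unset, or set to a public-cloud region, falls through to the default branch, which matches the symptom reported earlier in the thread where the deployment targeted the wrong environment.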
Can you share / add an updated |
{ Hope it helps!😊 |
Thx @wangtt03, deployment went without a hitch with the location field. I had used |
@wangtt03 could you rebase this on master, or give me admin perms on ChinaCloudGroup:sovereign_cloud and I'd be happy to do it myself. Thanks either way! |
@jackfrancis, I will rebase this PR, and I will also give you the admin permission. |
change default kubernetes image base to crproxy for China and delete template.go
@wangtt03 I checked out the |
@wangtt03 false alarm, I hadn't accepted the collaboration invite, all works now! thanks! |
one change remaining, otherwise looks good.
parts/configure-swarm-cluster.sh
fi
sleep 10
done
apt-get update
all "apt" commands should have retries around them. Please see other areas of code where we do this.
@@ -159,18 +161,18 @@ echo "$HOSTADDR $VMNAME" | sudo tee -a /etc/hosts

echo "Installing and configuring docker"

# simple general command retry function
retrycmd_if_failure() { for i in 1 2 3 4 5; do $@; [ $? -eq 0 ] && break || sleep 5; done ; }
@anhowe See this function and the below invocations to address your retry apt commands suggestion.
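For readers following the retry discussion, here is a minimal sketch of the same idea written out in long form. `retry_cmd` and the `RETRY_SLEEP` variable are hypothetical names, not the PR's code. It differs from the one-liner in the diff in two ways: it quotes `"$@"` so arguments containing spaces survive, and it returns a non-zero status when every attempt fails.

```shell
#!/bin/sh
# retry_cmd: hypothetical helper modeled on retrycmd_if_failure from the diff.
# Runs the given command up to 5 times, sleeping RETRY_SLEEP seconds
# (default 5) between attempts. Returns 0 on the first success and
# 1 if all 5 attempts fail.
retry_cmd() {
    attempt=1
    while :; do
        # Quoted "$@" keeps each argument intact, including ones with spaces.
        "$@" && return 0
        # Give up after the fifth failed attempt, without a trailing sleep.
        [ "$attempt" -ge 5 ] && return 1
        attempt=$((attempt + 1))
        sleep "${RETRY_SLEEP:-5}"
    done
}

# Example use, wrapping the apt commands as the review asks:
#   retry_cmd apt-get update
#   retry_cmd apt-get install -y <package>
```

Returning a failure status lets the calling script decide whether to abort provisioning, instead of silently continuing after five failed attempts.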
🚀 |
checking it today/tomorrow awesome! |
Changes:
defaults.go
azureconst.go
Fixes #1066