This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

Calico (2.6.3 -> 3.1.1) upgrade fails between acs-engine 0.16.0 and 0.18.1 #3191

Closed
oivindoh opened this issue Jun 6, 2018 · 3 comments · Fixed by #3208

@oivindoh
Contributor

oivindoh commented Jun 6, 2018

Is this a request for help?:
No

Is this an ISSUE or FEATURE REQUEST? (choose one):
ISSUE

What version of acs-engine?:
0.16.0 and 0.18.1

Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)
kubernetes 1.9.7 and 1.10.3
acs-engine 0.16.0 and 0.18.1

What happened:

  • I deployed a cluster with 3 masters and 3 nodes on kubernetes 1.9.7 via acs-engine 0.16.0 to simulate an upgrade scenario on a production cluster we run.

  • I then used acs-engine 0.18.1 to perform an upgrade operation on it, targeting kubernetes 1.10.3. During this process I also expected calico to be upgraded from 2.6.3 to 3.1.1 because of #2521 (Update Calico to 3.1), which is one major reason for performing this upgrade.

  • After an hour of waiting for VMs to delete/create, the cluster was successfully up, but still running calico 2.6.3.

  • The old daemonset created by the 0.16.0 deployment has an update strategy of OnDelete (the new one uses RollingUpdate), so its pods would have needed to be killed to pick up a new spec. The daemonset actually running in the cluster hadn't been updated at all, though, so I deleted the daemonset itself and watched the new one come up.

  • These new calico 3.1.1 nodes came up with the following issue:

2018-06-06 11:49:31.577 [INFO][9] startup.go 99: Datastore is ready
2018-06-06 11:49:31.586 [INFO][9] customresource.go 217: Error getting resource Key=ClusterInformation(default) Name="default" Resource="ClusterInformations" Revision="" error=clusterinformations.crd.projectcalico.org "default" not found
2018-06-06 11:49:31.592 [INFO][9] migrate.go 811: cannot migrate from version v2.6.3: migration to v3 requires a tagged release of Calico v2.6.5+
2018-06-06 11:49:31.592 [ERROR][9] migrate.go 720: Unable to migrate data from version 'v2.6.3': migration to v3 requires a tagged release of Calico v2.6.5+
2018-06-06 11:49:31.592 [ERROR][9] startup.go 106: Unable to ensure datastore is migrated. error=unable to migrate data from version 'v2.6.3': migration to v3 requires a tagged release of Calico v2.6.5+
2018-06-06 11:49:31.593 [WARNING][9] startup.go 1058: Terminating
Calico node failed to start
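
The rule stated in the log (migration to v3 requires a tagged release of v2.6.5 or later) can be checked before attempting the upgrade. The following is only a minimal sketch of that check, not Calico's actual migrate.go logic; the names `MIN_MIGRATABLE`, `parse_version` and `can_migrate_to_v3` are hypothetical and introduced here for illustration.

```python
import re

# Minimum release that the log message says can be migrated to v3.
MIN_MIGRATABLE = (2, 6, 5)


def parse_version(tag):
    """Turn a tag such as 'v2.6.3' into a tuple of ints, e.g. (2, 6, 3)."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", tag.strip())
    if match is None:
        raise ValueError(f"not a tagged release: {tag!r}")
    return tuple(int(part) for part in match.groups())


def can_migrate_to_v3(current_tag):
    """Return True if the running version is at least v2.6.5."""
    return parse_version(current_tag) >= MIN_MIGRATABLE


print(can_migrate_to_v3("v2.6.3"))   # the version from this issue
print(can_migrate_to_v3("v2.6.10"))  # the intermediate version used as a workaround
```

Comparing integer tuples rather than strings matters here: as strings, "2.6.10" would sort before "2.6.5".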

I then reapplied the /etc/kubernetes/addons/calico-node.yaml manifest after changing the node tag from v3.1.1 to v2.6.10 and the cni tag from v3.1.1 to v2.0.6, and calico came up nicely. I then applied it again with node tag v3.1.1 and cni tag v3.1.1, watched it migrate successfully, and now have a functional 1.10.3 cluster with calico 3.1.1.
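
For anyone repeating this, the tag edits for the two steps can be scripted. This is a rough sketch rather than anything acs-engine provides: the helper `retag` is hypothetical, and it assumes the manifest references the images as `.../calico/node:<tag>` and `.../calico/cni:<tag>` (the `quay.io` prefix in the sample is an assumption too).

```python
import re


def retag(manifest_text, node_tag, cni_tag):
    """Return manifest text with the calico/node and calico/cni image tags replaced."""
    text = re.sub(r"(calico/node:)[\w.\-]+",
                  lambda m: m.group(1) + node_tag, manifest_text)
    text = re.sub(r"(calico/cni:)[\w.\-]+",
                  lambda m: m.group(1) + cni_tag, text)
    return text


# Made-up excerpt standing in for /etc/kubernetes/addons/calico-node.yaml.
sample = """\
        - name: calico-node
          image: quay.io/calico/node:v3.1.1
        - name: install-cni
          image: quay.io/calico/cni:v3.1.1
"""

# Step 1: intermediate 2.6.x release that v3 is able to migrate from.
step1 = retag(sample, "v2.6.10", "v2.0.6")
# Step 2: back to the tags that acs-engine 0.18.1 ships.
step2 = retag(step1, "v3.1.1", "v3.1.1")

print(step1)
print(step2)
```

Each step's output would then be applied to the cluster (for instance with `kubectl apply -f`), waiting for the calico-node pods to become ready before moving on to the next step.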

What you expected to happen:
Calico version 3.1.1 up and running upon cluster upgrade completion.

How to reproduce it (as minimally and precisely as possible):

  • Deploy cluster with acs-engine 0.16.0 and calico enabled
  • Upgrade cluster with acs-engine 0.18.1
  • Observe the calico-node version (still 2.6.3). Once 3.1.1 is actually running, observe calico-node failing because it attempts to migrate from a version below 2.6.5 to 3.1.1.
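
To make the "observe" step concrete, the update strategy and image tags can be read from the daemonset's JSON. This is a sketch under assumptions: the helper `summarize_daemonset` is hypothetical, the daemonset is assumed to be named calico-node in kube-system (so the input would come from something like `kubectl get daemonset calico-node -n kube-system -o json`), and the sample dict is made up to mirror the state described above.

```python
def summarize_daemonset(ds):
    """Return (update strategy type, list of container images) from a DaemonSet dict."""
    spec = ds.get("spec", {})
    strategy = spec.get("updateStrategy", {}).get("type", "unknown")
    containers = spec.get("template", {}).get("spec", {}).get("containers", [])
    images = [container.get("image", "") for container in containers]
    return strategy, images


# Made-up, heavily trimmed DaemonSet object resembling the pre-upgrade cluster.
sample = {
    "spec": {
        "updateStrategy": {"type": "OnDelete"},
        "template": {"spec": {"containers": [
            {"name": "calico-node", "image": "quay.io/calico/node:v2.6.3"},
        ]}},
    }
}

strategy, images = summarize_daemonset(sample)
print(strategy)
print(images)
```

An OnDelete strategy together with a 2.6.3 node image would match the situation reported in this issue.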

Anything else we need to know:
I was a bit unsure about submitting this issue - is this whole process something we expect acs-engine to handle, or are the manual steps I had to go through expected? Could acs-engine somehow handle a multi-step process of first migrating to the most recent 2.6.x before moving to 3.1.x?

@CecileRobertMichon
Contributor

@dtzar do you know if this is something acs-engine could handle automatically? If not we should at least document it.

@dtzar
Contributor

dtzar commented Jun 6, 2018

@oivindoh thanks for submitting this issue - it is good for awareness. Although it is technically possible to have acs-engine support major version upgrades of Calico, I believe this would take a significant amount of effort to create and maintain. Minor version upgrades of Calico should work without issue. The 2.6.5 to 3.x upgrade information can be found in the release notes here.

@CecileRobertMichon - perhaps we just add some documentation that this upgrade is not supported via acs-engine at least for the time being and point them to the manual upgrade guide?

@CecileRobertMichon
Contributor

CecileRobertMichon commented Jun 6, 2018

Thanks @dtzar! That sounds good to me. @oivindoh would you like to start a PR to add the steps you followed to the upgrade documentation so others can see them?
