kops upgrade 1.7 -> 1.8 release notes - downtime required with canal? #3911
See #3905. And we just dropped a PR.
@chrislovecnm I read all of that before posting this issue :) This line worries me!
Thanks for the clarification :)
/assign @KashifSaadat These are the instructions that @KashifSaadat worked out. @caseydavenport, who is a good canal guru? @pierreozoux I will let the experts comment, but yes, downtime with Kubernetes is not the best.
I ran into the following issues when trying to do a gradual rolling-update with no downtime:
The linked procedure is designed to have no downtime, but it needs to be performed in a particular order due to the removal of TPR support from the k8s API. Essentially, these things need to happen in order:
The key point is that any canal v2.4 pod will stop working once k8s is updated to v1.8, so the update of canal needs to happen first.
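The ordering described above can be sketched roughly as follows. This is a hedged outline, not the exact procedure: the manifest filename is a placeholder, and the authoritative migration steps are in the Calico v2.5 upgrade guide linked later in this thread.

```shell
# 1. While still on Kubernetes v1.7, migrate Calico data from TPRs to CRDs
#    (follow the Calico v2.5 upgrade guide for the actual migration).
# 2. Upgrade canal to a v2.5+/v2.6 manifest so it reads from CRDs:
kubectl apply -f canal-v2.6.yaml   # placeholder filename, use the real manifest
# 3. Only then upgrade the cluster itself with kops:
kops upgrade cluster --yes
kops update cluster --yes
kops rolling-update cluster --yes
```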
@caseydavenport do you have a DaemonSet manifest for the Canal upgrade to v2.6, or can I just edit my current one? Does the kops 1.6/1.7 version of canal support the CRD storage engine? Can I do that data migration and then leave it before I move on?
@jurgenweber the manifests here will work for v2.6: https://github.com/projectcalico/canal/tree/ff2a346124ac0a2203237c3f76e1a5428c8369ab/k8s-install/1.7 You'll need the new CRDs if you're upgrading from pre-v2.5.
You should be able to do the data migration, then upgrade kops/canal. You need at least Kubernetes v1.7 in order to use Canal v2.5+, because CRDs do not exist in earlier versions of Kubernetes.
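For context, the "new CRDs" are CustomResourceDefinition objects like the one below. This is a single illustrative example matching the shape used in that era's manifests; the linked repo is the authoritative source for the full set.

```yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: globalfelixconfigs.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: GlobalFelixConfig
    plural: globalfelixconfigs
    singular: globalfelixconfig
```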
I am currently on k8s 1.7.10. So, looking at the currently deployed DaemonSet, I have:
Calico 2.5.1 supports both TPRs and CRDs? And so does k8s 1.7. I already have the configuration in CRDs from my first botched attempt to go to k8s 1.8:
Because I still have the TPRs as well:
Should I blow these away and do the data migration again? How do I know which datastore Calico is currently using? I see this in the DaemonSet:
but that does not clarify it. So by the looks of things I need to discern what datastore is in use and whether I need to do the data migration again. Sorry for all the questions; my first attempt was a bit of a disaster. I had some pods with no internet access and unable to function. I was successful in rolling back after noticing the issue, but by that time I had already upgraded the masters and one of my instance groups, and it took a bit of hacking in etcd-server to get cronjobs working again. :) The good news is that I had no production downtime. Thanks!
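One way to see which resources actually exist in the cluster is to list both stores directly (the `thirdpartyresources` resource type is only present on k8s v1.7 and earlier; the Calico resource name is an example from the CRD group used in this era):

```shell
# Old-style TPR-backed resources (k8s <= 1.7 only):
kubectl get thirdpartyresources
# New-style CRDs, plus the migrated Calico data itself:
kubectl get customresourcedefinitions
kubectl get globalfelixconfigs.crd.projectcalico.org
```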
@jurgenweber Calico v2.4 and earlier uses TPRs to store data; Calico v2.5+ uses CRDs only. As for Kubernetes, k8s v1.7 supports both CRDs and TPRs, but k8s v1.8 supports only CRDs. So you need to migrate the data from TPRs to CRDs on k8s v1.7 before upgrading to Calico v2.5. All of this applies ONLY when using the DATASTORE_TYPE=kubernetes option, which kops does not use for Calico, but DOES use for canal.
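Since the deciding factor is the `DATASTORE_TYPE` environment variable on the calico-node container, one way to check is to inspect the DaemonSet for it. Below is a hypothetical helper sketching that check against a parsed manifest (the `datastore_type` function and the sample DaemonSet are illustrative, not part of any Calico tooling; the `etcdv2` default reflects calico/node's historical behavior when the variable is unset, to the best of my knowledge):

```python
def datastore_type(daemonset: dict) -> str:
    """Return the DATASTORE_TYPE env value from a DaemonSet manifest,
    falling back to 'etcdv2' (calico/node's historical default) if unset."""
    containers = daemonset["spec"]["template"]["spec"]["containers"]
    for container in containers:
        for env in container.get("env", []):
            if env.get("name") == "DATASTORE_TYPE":
                return env["value"]
    return "etcdv2"

# Minimal fake DaemonSet resembling a canal deployment of this era:
ds = {
    "spec": {"template": {"spec": {"containers": [
        {"name": "calico-node",
         "image": "quay.io/calico/node:v2.5.1",
         "env": [{"name": "DATASTORE_TYPE", "value": "kubernetes"}]},
    ]}}}
}
print(datastore_type(ds))  # -> kubernetes
```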
Sorry, I have not gotten back to this; I've been really busy, but over Christmas when things are slow I hope to do the upgrade. So, going by my Canal deployment image tags, I am on CRDs... assuming "Calico v2.5+" == "image: quay.io/calico/node:v2.5.1"? Or is the CNI version the one I should be concerned with? What image/part of Calico is in question here? OK. How can I tell which one is in use?
@jurgenweber yep,
I'll be honest, I do not know how that happened, or maybe it was always 2.5.1... I dunno. Anyway, it sounds like I am on CRDs. Thank you for all of your patience with my questions.
Just dropping a note to say I managed the upgrade to 1.8.6 this morning. Thank you for all your advice and help. |
Good to hear it's working, thanks for letting us know. Going to close this issue, feel free to update / reopen if you think there's anything outstanding here. /close |
I'm preparing our migration to 1.8, and reading the release notes I'm a bit worried. They say the upgrade "will involve downtime". Do you mean that if we are using canal, we have to suffer downtime to upgrade? Reading this document, it doesn't seem like it needs downtime at all:
https://github.com/projectcalico/calico/blob/master/upgrade/v2.5/README.md
I'm just wondering:
If there is a path without downtime, I'd be happy to find it with you and document it!