From 751b3fc09632cf184bf947be8f74d771554a23d0 Mon Sep 17 00:00:00 2001
From: Sofer Athlan-Guyot
Date: Fri, 11 Oct 2019 16:10:00 +0200
Subject: [PATCH] Workaround ovn cluster failure during update when schema change.

During an update the ovndb server can have a schema change. The
problem is that an updated slave ovndb won't connect to a master
which still has the old db schema. At some point (200000ms)
pacemaker puts the resource in error Time Out. Then it waits for
the operator to clean up the resource.

This means that the update can go like this:
 - Original state: (Master, Slave, Failed): nothing updated
   - ctl0-M-old
   - ctl1-S-old
   - ctl2-S-old
 - First state: after update of ctl0
   - ctl0-F-new
   - ctl1-M-old
   - ctl2-S-old
 - Second state: after update of ctl1
   - ctl0-F-new
   - ctl1-F-new
   - ctl2-M-old
 - Third and final state: after update of ctl2
   - ctl0-F-new
   - ctl1-F-new
   - ctl2-M-new

During the third state we have a cut in the control plane as ctl2 is
the master and there is no slave to fall back to. Then we end up
losing HA as only one node is active.

The error persists after reboot. Only a pcs resource cleanup will
bring the cluster online.

The real solution will come from ovndb and the associated ocf agent,
but in the meantime, we work around it by:
 - cleanup
 - ban the resource;
in step 1 and:
 - cleanup
 - unban the resource
in step 5.

This has the net effect of preventing the cut in the control plane
for the last node, as we move the master to an updated controller,
which will form a cluster of one master and one slave (as two are
updated). The last one will happily join once it has been updated.

That means:
 - we always have either 1 or 2 nodes working;
 - we end the update with the cluster converged back to a stable
   state.

The problems are:
 - we could hide a real ovndb cluster issue;
 - if the update breaks in-between we could have a leftover ban on
   one of the nodes.

But, all things considered, this looks like the best compromise for
the time being.

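As an illustration only, the per-node sequence described above could be
sketched as the dry-run helper below. The function names short_hostname
and workaround_commands are hypothetical and not part of this patch, and
the helper only prints the pcs commands instead of running them.

```shell
#!/bin/sh
# Dry-run sketch (assumption: not part of the patch). It prints the pcs
# commands that the update tasks run for a given step, without touching
# the cluster.
RESOURCE="ovn-dbs-bundle"

# Hypothetical helper: strip the domain part of a host name, mirroring
# the `hostname | cut -d. -f1` expression used in the ban task.
short_hostname() {
    printf '%s\n' "$1" | cut -d. -f1
}

# Hypothetical helper: print the workaround commands for one step.
#   $1: update step (1 or 5), $2: host name of the node being updated
workaround_commands() {
    step="$1"
    node="$(short_hostname "$2")"
    case "$step" in
        1)
            # Clear the Time Out error first, then ban the resource here.
            echo "pcs resource cleanup $RESOURCE"
            echo "pcs resource ban $RESOURCE $node"
            ;;
        5)
            # Clear any leftover error, then remove the ban.
            echo "pcs resource cleanup $RESOURCE"
            echo "pcs resource clear $RESOURCE"
            ;;
    esac
}
```

Calling `workaround_commands 1 "$(hostname)"` on a controller lists what
step 1 would run there; any step other than 1 or 5 prints nothing.
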
Change-Id: I8f71bf83ddafca167deae1a38ca819f7d930fb80
Closes-Bug: #1847780
---
 deployment/ovn/ovn-dbs-pacemaker-puppet.yaml | 26 ++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/deployment/ovn/ovn-dbs-pacemaker-puppet.yaml b/deployment/ovn/ovn-dbs-pacemaker-puppet.yaml
index dae6ac9ae0..4dda3b43ec 100644
--- a/deployment/ovn/ovn-dbs-pacemaker-puppet.yaml
+++ b/deployment/ovn/ovn-dbs-pacemaker-puppet.yaml
@@ -257,6 +257,23 @@ outputs:
             container_image: {get_param: ContainerOvnDbsImage}
             container_image_latest: *ovn_dbs_image_pcmklatest
       update_tasks:
+        # When a schema change happens, the newer slaves don't connect
+        # back to the older master and end up timing out. So we clean
+        # up the error here until we get a fix for
+        # https://bugzilla.redhat.com/show_bug.cgi?id=1759974
+        - name: Clear ovndb cluster pacemaker error
+          shell: "pcs resource cleanup ovn-dbs-bundle"
+          when:
+            - step|int == 1
+        # Then we ban the resource for this node. It has no effect on
+        # the first two controllers, but when we reach the last one,
+        # it avoids a cut in the control plane as the master gets chosen
+        # in one of the updated Stopped ovn. They are in error, that's
+        # why we need the cleanup just before.
+        - name: Ban ovndb resource on the current node.
+          shell: "pcs resource ban ovn-dbs-bundle $(hostname | cut -d. -f1)"
+          when:
+            - step|int == 1
         - name: Get docker ovn-dbs image
           set_fact:
             ovn_dbs_docker_image: {get_param: ContainerOvnDbsImage}
@@ -292,6 +309,15 @@ outputs:
                 container_image_latest: "{{ovn_dbs_docker_image_latest}}"
         # Got to check that pacemaker_is_active is working fine with bundle.
         # TODO: pacemaker_is_active resource doesn't support bundle.
+        # We remove any leftover error and remove the ban.
+        - name: Ensure the cluster converge back even in case of schema change
+          shell: "pcs resource cleanup ovn-dbs-bundle"
+          when:
+            - step|int == 5
+        - name: Remove the ban
+          shell: "pcs resource clear ovn-dbs-bundle"
+          when:
+            - step|int == 5
         # When ovn-dbs-bundle support was added, we didn't tag the ovn-dbs image
         # with pcmklatest. So, when update is run for the first time we need to
         # update the ovn-dbs-bundle resource to use the 'pcmklatest' tagged image.