Documentation: downgrade clarification #10461
Comments
More information on this after a bit of research. The member will hit the check below and exit with status 1 if the agreed-upon cluster version is greater than the etcd binary version of the member itself. So, as I noted above, in a 3.3 cluster a 3.2 member will fail. etcd/etcdserver/membership/cluster.go Line 510 in e06761e
In 3.x the underlying data format has not changed, so in the same manner that we reset the member bucket in a restore operation, we can also reset the cluster version. The code is intended to prevent etcd members that cannot meet the minimum API version from joining the cluster. I am going to take on documenting this process, as well as looking into implementing/collaborating with the Google team on @wenjiaswe's graceful downgrade RFC. ref: #7308
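The check described above can be sketched roughly as follows. This is a simplified, stdlib-only reconstruction for illustration, not etcd's actual code: the real check in `etcdserver/membership/cluster.go` uses `coreos/go-semver` version types and logs a fatal error (which is what makes the process exit 1). The function names here are invented for the sketch.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minorOf parses a version string like "3.3.0" into (major, minor).
// Error handling is elided for brevity in this sketch.
func minorOf(v string) (int, int) {
	parts := strings.SplitN(v, ".", 3)
	major, _ := strconv.Atoi(parts[0])
	minor, _ := strconv.Atoi(parts[1])
	return major, minor
}

// isDowngrade reports whether the local binary is older than the
// agreed-upon cluster version. In the real server, detecting this
// condition is fatal: the member refuses to start and exits with 1.
func isDowngrade(clusterVersion, localVersion string) bool {
	cMaj, cMin := minorOf(clusterVersion)
	lMaj, lMin := minorOf(localVersion)
	return cMaj > lMaj || (cMaj == lMaj && cMin > lMin)
}

func main() {
	// A 3.2 binary in a cluster that has already agreed on 3.3: refused.
	fmt.Println(isDowngrade("3.3.0", "3.2.0"))
	// A 3.3 binary in a 3.3 cluster: fine.
	fmt.Println(isDowngrade("3.3.0", "3.3.0"))
}
```

This is why the restore-based workaround works: resetting the persisted cluster version (like resetting the member bucket) means the new, lower cluster version is re-negotiated and the check never fires.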
Sam,
By all means work closely with Wenjia. I know she is close to kicking off development based on the current design, so any design feedback you have would be very welcome. And implementation contributions would also be very welcome.
-Joe
@hexfusion thanks for digging into this, and sorry for the delay. Yes, you are right: once the cluster version is already at the higher minor version, adding a lower-version member is not acceptable, and that is exactly what we could solve with downgrade support. The first step we want to take is to temporarily whitelist the one-minor-lower version during downgrade: #9306 (comment). As we discussed offline, thank you so much for offering to start the POC; let's work together to solve this :)
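The "whitelist one minor version during downgrade" idea above can be sketched as follows. This is a hypothetical illustration of the RFC direction, not etcd's actual implementation; the function `allowJoin` and the `downgradeEnabled` flag are invented for this sketch.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minorOf parses "3.3.0" into (major, minor); error handling elided.
func minorOf(v string) (int, int) {
	parts := strings.SplitN(v, ".", 3)
	major, _ := strconv.Atoi(parts[0])
	minor, _ := strconv.Atoi(parts[1])
	return major, minor
}

// allowJoin reports whether a member running localVersion may participate
// in a cluster whose agreed version is clusterVersion. When
// downgradeEnabled is true, a binary exactly one minor version below the
// cluster version is temporarily tolerated, so members can be swapped to
// the older binary one at a time.
func allowJoin(clusterVersion, localVersion string, downgradeEnabled bool) bool {
	cMaj, cMin := minorOf(clusterVersion)
	lMaj, lMin := minorOf(localVersion)
	if lMaj > cMaj || (lMaj == cMaj && lMin >= cMin) {
		return true // binary at or above the cluster version: always fine
	}
	if downgradeEnabled && lMaj == cMaj && lMin == cMin-1 {
		return true // exactly one minor lower: tolerated during downgrade
	}
	return false
}

func main() {
	fmt.Println(allowJoin("3.3.0", "3.2.0", false)) // rejected today
	fmt.Println(allowJoin("3.3.0", "3.2.0", true))  // tolerated during downgrade
	fmt.Println(allowJoin("3.3.0", "3.1.0", true))  // two minors lower: still rejected
}
```

Limiting the tolerance to exactly one minor version keeps the downgrade window small, mirroring how etcd upgrades are only supported one minor version at a time.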
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.
If a cluster is upgraded from 3.2 to 3.3, the documentation states that downgrade is not possible. [1] This holds true with regard to the binary version of etcd members. For example, if a cluster is upgraded to 3.3 and all members agree on the 3.3 API, that version is then appended to the state file. Downgrading an etcd member to a minor version lower than 3.3 (i.e., 3.2) will then result in a panic.
But in the case of restoring a snapshot from a 3.3 cluster on 3.2, no panic occurs.
Questions:
1.) Is restoring a 3.3 snapshot onto 3.2 a valid downgrade path?
2.) If not, then although no panic occurs, it seems like different versions of boltdb, etc. could cause unknown problems. Should this restore cause a panic in that case?
[1] Documentation/upgrades/upgrade_3_3.md#downgrade
/cc @gyuho @xiang90 @jpbetz @wenjiaswe