Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: configure Raft Pre-Vote to reduce disruptive rejoining servers #9352

Merged
merged 4 commits into from
Mar 7, 2018

Conversation

gyuho
Copy link
Contributor

@gyuho gyuho commented Feb 23, 2018

Address #9333 and #8499.

/cc @wojtek-t @jpbetz

In simplest form, Raft leader steps down to follower when it receives a message with higher term. For instance, a flaky(or rejoining) member drops in and out, and starts campaign. This member will end up with a higher term, and ignore all incoming messages with lower term. In this case, a new leader eventually need to get elected, thus disruptive to cluster availability. Same happens with isolated member or slow incoming network. In etcd, isolated member with higher term would send pb.MsgAppResp, in response to pb.MsgHeartbeat from leader, and pb.MsgAppResp to leader forces leader to step down, and cluster becomes unavailable.

Raft implements Pre-Vote phase to prevent this kind of disruptions. If enabled, Raft runs an additional phase of election to check if pre-candidate can get enough votes to win an election. If not, it would remain as follower, and most likely, receive leader heartbeats to reset its elapsed election ticks.

Pre-Vote feature is recommended in deployments that would require higher availability, at the cost of more expensive election protocols. Thus, it's added as an experimental feature. Will be enabled by default in 3.5.


Added tests in Raft package, but not in etcd server layer, since CheckQuorum is true by default for etcd, hard to test this Raft logic in server-side.

// TestDisruptiveRejoinPreVote simulates isolated follower starts campaign
// to disrupt current leader, which can be prevented by Raft Pre-Vote.
func TestDisruptiveRejoinPreVote(t *testing.T) {
	defer testutil.AfterTest(t)

	clus := NewClusterV3(t, &ClusterConfig{
		Size:    3,
		PreVote: true,
	})
	defer clus.Terminate(t)

	lead1 := clus.WaitLeader(t)

	// simluate m[lead1+2] sending pre-vote requests to leader
	// current leader never steps down from timeouts
	clus.Members[lead1].s.Cfg.ElectionTicks *= 1000
	// m[lead1+1] never starts a campaign
	clus.Members[(lead1+1)%3].s.Cfg.ElectionTicks *= 1000

	// manually disable check quorum in m[lead1]

	// drop incoming messages to m[lead1+2]
	// m[lead1+2] will pre-vote request to m[lead1]
	clus.Members[lead1].s.PauseSending()
	clus.Members[(lead1+1)%3].s.PauseSending()

	// wait until m[lead1+2] becomes pre-candidate
	timeout := clus.Members[(lead1+2)%3].ElectionTimeout()
	time.Sleep(timeout + timeout/2)

	clus.Members[lead1].s.ResumeSending()
	clus.Members[(lead1+1)%3].s.ResumeSending()

	// If pre-vote were not enabled,
	// force leader m[lead1] to step down by sending
	//   1. MsgAppResp with higher term
	//   2. MsgVote with higher term

	lead2 := clus.WaitLeader(t)
	if lead1 != lead2 {
		t.Fatal("with pre-vote enabled, expect no leader disruption")
	}
}

We can add more tests around this once we refactor integration package with embedded etcd and proxy layer.

/cc @bdarnell @siddontang for Raft tests.

@gyuho gyuho changed the title *: configure Rat Pre-Vote to reduce disruptive rejoining servers *: configure Raft Pre-Vote to reduce disruptive rejoining servers Feb 23, 2018
@gyuho gyuho force-pushed the raft-pre-vote branch 2 times, most recently from 2ae5b2a to 08f3072 Compare February 25, 2018 19:10
@xiang90
Copy link
Contributor

xiang90 commented Feb 26, 2018

this wont fix #9333.

as @wojtek-t pointed out later in the thread, the restarted node starts the ticking too fast.

@gyuho
Copy link
Contributor Author

gyuho commented Feb 26, 2018

@xiang90 Will investigate more based on the server logs.

@codecov-io
Copy link

Codecov Report

Merging #9352 into master will decrease coverage by 0.09%.
The diff coverage is 70%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #9352     +/-   ##
=========================================
- Coverage   72.54%   72.44%   -0.1%     
=========================================
  Files         362      362             
  Lines       30783    30793     +10     
=========================================
- Hits        22330    22309     -21     
- Misses       6826     6855     +29     
- Partials     1627     1629      +2
Impacted Files Coverage Δ
raft/raft.go 91.68% <ø> (ø) ⬆️
etcdserver/config.go 79.8% <ø> (ø) ⬆️
etcdmain/config.go 81.21% <100%> (+0.1%) ⬆️
embed/etcd.go 67.42% <100%> (+0.1%) ⬆️
etcdserver/raft.go 89.36% <100%> (+0.1%) ⬆️
embed/config.go 60.85% <100%> (+0.13%) ⬆️
etcdserver/server.go 78.44% <25%> (-0.59%) ⬇️
auth/simple_token.go 87.03% <0%> (-6.49%) ⬇️
clientv3/namespace/watch.go 72.72% <0%> (-6.07%) ⬇️
pkg/adt/interval_tree.go 84.98% <0%> (-4.81%) ⬇️
... and 18 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 211523f...b48d3eb. Read the comment docs.

@gyuho
Copy link
Contributor Author

gyuho commented Mar 7, 2018

Tested pre-vote feature in functional-testing with no failure. Will keep testing. Pre-vote is disable by default, so should be safe to merge.

#9333 will be addressed in #9364.

@gyuho gyuho merged commit 89292af into etcd-io:master Mar 7, 2018
@gyuho gyuho deleted the raft-pre-vote branch March 7, 2018 18:15
july2993 added a commit to july2993/raft-rs that referenced this pull request Mar 12, 2018
BusyJay pushed a commit to tikv/raft-rs that referenced this pull request Mar 23, 2018
shbieng pushed a commit to shbieng/raft-rs that referenced this pull request Sep 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants