Potential fixes for flapping leadership #6951

Closed
wants to merge 2 commits into from
50 changes: 48 additions & 2 deletions nomad/leader.go
@@ -57,11 +57,26 @@ var defaultSchedulerConfig = &structs.SchedulerConfiguration{
// as the leader in the Raft cluster. There is some work the leader is
// expected to do, so we must react to changes
func (s *Server) monitorLeadership() {
// We use the notify channel we configured Raft with, NOT Raft's
// leaderCh, which is only notified best-effort. Doing this ensures
// that we get all notifications in order, which is required for
// cleanup and to ensure we never run multiple leader loops.
leaderCh := s.leaderCh

var weAreLeaderCh chan struct{}
var leaderLoop sync.WaitGroup
for {
select {
case isLeader := <-s.leaderCh:
case isLeader := <-leaderCh:
// suppress leadership flapping
isLeader, suppressed := suppressLeadershipFlaps(isLeader, leaderCh)

// if we gained and then lost leadership immediately, move on without emitting an error
if suppressed && !isLeader && weAreLeaderCh == nil {
s.logger.Info("cluster leadership acquired but lost immediately")
Member:
Does flapping indicate something an operator should be aware of, such as excessive CPU load or network issues? If so, we might want to document that in at least a code comment, as I'm worried this log message may flummox the first users who hit it. If it does hint at some degenerate state, users who see it might already be stressed and confused, so if there are any more hints or context we can offer, we should.

Member:

To add to this, I've often found as an operator that the info log level is indecisive. If it's something I need to pay attention to and take action on, it should probably be at warn. Otherwise it's telling me "hey, this is something you want to know about" without telling me what to do with that information.

(My rule of thumb has often been that info is for information at startup like "I'm listening on port 1234 and running with plugins X and Y", and then not used after that, but I don't think we have firm guidelines anywhere on Nomad.)
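
For illustration only, a minimal sketch of what that suggestion could look like with the hclog-style logger the server already uses; the warn level and the "hint" context are assumptions for discussion, not an agreed-upon message:

// Sketch: log at Warn and attach context an operator can act on.
s.logger.Warn("cluster leadership acquired but lost immediately",
	"hint", "leadership flapping can indicate excessive CPU load, slow disks, or an unstable network")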

Contributor:

Hm, does this eliminate the leaderCh overflow? I don't think so. For example, it doesn't remove the situation where leadership is lost, and you just increase the leaderCh buffer size, which IMHO isn't enough in all cases.
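
To make the overflow concern concrete, here is a self-contained sketch. It is illustrative only and assumes a producer that does not block when the buffer is full; it is not a claim about hashicorp/raft's exact delivery behavior. A bounded buffer absorbs only a fixed number of transitions, so a larger buffer delays the problem rather than removing it:

package main

import "fmt"

func main() {
	// Toy model of the notify channel: a bounded buffer of leadership transitions.
	leaderCh := make(chan bool, 10)

	dropped := 0
	for i := 0; i < 25; i++ {
		select {
		case leaderCh <- i%2 == 0:
			// transition buffered
		default:
			dropped++ // buffer full: this transition is lost
		}
	}
	fmt.Printf("buffered=%d dropped=%d\n", len(leaderCh), dropped) // buffered=10 dropped=15
}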

continue
}

switch {
case isLeader:
if weAreLeaderCh != nil {
@@ -96,6 +111,35 @@ func (s *Server) monitorLeadership() {
}
}

// suppressLeadershipFlaps suppresses cases where we gained and then immediately lost leadership.
// It protects against the case where leadership transitions multiple times while the server is
// still updating its internal leadership-related structures.
//
// This takes a conservative approach: it mainly avoids establishing leadership if the server has
// already lost it again.
//
// Params:
//   isLeader: the last value dequeued from the channel
//   ch: the leadership notification channel
// Returns:
//   leader: the last buffered leadership state
//   suppressed: true if the function dequeued elements from the channel
func suppressLeadershipFlaps(isLeader bool, ch <-chan bool) (leader, suppressed bool) {
Member:
Effectively what we're doing here is that if we get a notification that we're now the leader, we drain the notification channel to make sure that there hasn't been more than one notification since we handled the last one (in the outer loop of monitorLeadership).

While this solves for a very tight loop of leadership flapping, I think we can still get into a flapping state if the leadership transitions are happening at a rate which is slower than the outer loop of monitorLeadership -- in that case there will only ever be one notification in the queue at a time but the leadership loop will still be continuously flapping.

(Moving the WAIT: label in leaderLoop on the other hand I suspect will be really valuable for more realistic flapping cases.)
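
A sketch of that slower-flapping case (illustrative only, not part of the diff; the test name is hypothetical and it assumes the same testify require import as the existing tests in leader_test.go). Because only one notification is ever pending, there is nothing to drain, so the leader loop still starts and stops on every transition:

func TestSuppressLeadershipFlaps_slowFlap(t *testing.T) {
	ch := make(chan bool, 10)
	ch <- true // the matching "false" has not been delivered yet

	isLeader := <-ch
	leader, suppressed := suppressLeadershipFlaps(isLeader, ch)
	require.True(t, leader)      // leadership is established...
	require.False(t, suppressed) // ...because there was nothing to drain

	// A moment later the "false" arrives; the next iteration of monitorLeadership
	// tears the leader loop down again, so this kind of flap is not suppressed.
	ch <- false
}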

if !isLeader {
return isLeader, false
}

leader = isLeader
for {
select {
case v := <-ch:
leader = v
suppressed = true
default:
return leader, suppressed
}
}
}

// leaderLoop runs as long as we are the leader to run various
// maintenance activities
func (s *Server) leaderLoop(stopCh chan struct{}) {
@@ -148,16 +192,18 @@ RECONCILE:
// updates
reconcileCh = s.reconcileCh

WAIT:
// Poll the stop channel to give it priority so we don't waste time
// trying to perform the other operations if we have been asked to shut
// down.
select {
case <-stopCh:
return
case <-s.shutdownCh:
return
default:
}

WAIT:
// Wait until leadership is lost
for {
select {
90 changes: 90 additions & 0 deletions nomad/leader_test.go
@@ -1234,3 +1234,93 @@ func waitForStableLeadership(t *testing.T, servers []*Server) *Server {

return leader
}

func TestSuppressLeadershipFlaps(t *testing.T) {
t.Run("steps down don't get surpressed", func(t *testing.T) {
init := false
ch := make(chan bool, 5)
ch <- true

leader, suppressed := suppressLeadershipFlaps(init, ch)
require.False(t, leader)
require.False(t, suppressed)

select {
case v := <-ch:
require.True(t, v)
default:
require.Fail(t, "ch should be ready")
}
})
t.Run("single steps up don't get surpressed", func(t *testing.T) {
init := true
ch := make(chan bool, 5)

leader, suppressed := suppressLeadershipFlaps(init, ch)
require.True(t, leader)
require.False(t, suppressed)

select {
case v := <-ch:
require.Failf(t, "channel unexpectedly has a ready element", "element: %v", v)
default:
// channel isn't ready, yay
}
})
t.Run("single flap gets suppressed", func(t *testing.T) {
init := true
ch := make(chan bool, 5)
ch <- false

leader, suppressed := suppressLeadershipFlaps(init, ch)
require.False(t, leader)
require.True(t, suppressed)

select {
case v := <-ch:
require.Failf(t, "channel unexpectedly has a ready element", "element: %v", v)
default:
// channel isn't ready, yay
}
})

t.Run("multiple transitions get suppressed, end at true", func(t *testing.T) {
init := true
ch := make(chan bool, 5)
ch <- false
ch <- true
ch <- false
ch <- true

leader, suppressed := suppressLeadershipFlaps(init, ch)
require.True(t, leader)
require.True(t, suppressed)

select {
case v := <-ch:
require.Failf(t, "channel unexpectedly has a ready element", "element: %v", v)
default:
// channel isn't ready, yay
}

})
t.Run("multiple transitions get suppressed, end at false", func(t *testing.T) {
init := true
ch := make(chan bool, 5)
ch <- false
ch <- true
ch <- false

leader, suppressed := suppressLeadershipFlaps(init, ch)
require.False(t, leader)
require.True(t, suppressed)

select {
case v := <-ch:
require.Failf(t, "channel unexpectedly has a ready element", "element: %v", v)
default:
// channel isn't ready, yay
}

})
}
4 changes: 2 additions & 2 deletions nomad/server.go
@@ -1234,8 +1234,8 @@ func (s *Server) setupRaft() error {
}
}

// Setup the leader channel
leaderCh := make(chan bool, 1)
// Set up a channel for reliable leader notifications.
leaderCh := make(chan bool, 10)
Contributor (@tantra35, Jan 20, 2020):
:-) Hm, in many cases this will be enough, but why did you extend this value? You already added the suppressLeadershipFlaps function, which keeps leaderCh from overflowing. Just in case?

s.config.RaftConfig.NotifyCh = leaderCh
s.leaderCh = leaderCh
