Skip to content

Commit

Permalink
storage: quiesce less aggressively
Browse files Browse the repository at this point in the history
PR cockroachdb#26911 added a mechanism which during quiescence decisions ignored
nodes which (at some recent point in time) seemed non-live.
Unfortunately, the corresponding mechanism introduced to unquiesce when
the node becomes live is quite racy, at least when tormented with the
wild timings we throw at it in unit testing.  The basic discrepancy is
that unquiescing is triggered by an explicit liveness event (i.e. the kv
record being overwritten), whereas a node may seem non-live for mere
reasons of timing inbetween. In effect this means that quiescence is
more aggressive than unquiescence.

I explored various venues to address this (going as far as introducing
an unquiesce message to the Raft scheduler and serializing unquiescence
through the store Raft ticker goroutine), but it's hopeless. It's an
equally idle diversion to try to adjust the test liveness durations.

So, in this commit, the liveness check becomes more lenient: even when a
node seems non-live, we count it as live if there's a healthy connection
established to it. This effectively deals with issues in tests but also
seems like a good addition for real world workloads: when the liveness
range breaks down, we don't want to aggressively quiesce.

Fixes cockroachdb#27545.
Fixes cockroachdb#27607.

Release note: None
  • Loading branch information
tbg committed Jul 16, 2018
1 parent cb1f3a5 commit 960868d
Showing 1 changed file with 18 additions and 1 deletion.
19 changes: 18 additions & 1 deletion pkg/storage/store.go
Original file line number Diff line number Diff line change
Expand Up @@ -3795,7 +3795,24 @@ func (s *Store) raftTickLoop(ctx context.Context) {
rangeIDs = rangeIDs[:0]
// Update the liveness map.
if s.cfg.NodeLiveness != nil {
s.livenessMap.Store(s.cfg.NodeLiveness.GetIsLiveMap())
nextMap := s.cfg.NodeLiveness.GetIsLiveMap()
for nodeID, isLive := range nextMap {
if isLive {
continue
}
// Liveness claims that this node is down, but ConnHealth gets the last say
// because we'd rather quiesce a range too little than one too often.
//
// NB: This has false negatives. If a node doesn't have a conn open to it
// when ConnHealth is called, then ConnHealth will return
// rpc.ErrNotHeartbeated regardless of whether the node is up or not. That
// said, for the nodes that matter, we're likely talking to them via the
// Raft transport, so ConnHealth should usually indicate a real problem if
// it gives us an error back. The check can also have false positives if the
// node goes down after populating the map, but that matters even less.
nextMap[nodeID] = (s.cfg.NodeDialer.ConnHealth(nodeID) == nil)
}
s.livenessMap.Store(nextMap)
}

s.unquiescedReplicas.Lock()
Expand Down

0 comments on commit 960868d

Please sign in to comment.