Merge #26910
26910: storage: tick only raft groups which are not quiesced r=bdarnell a=spencerkimball

Previously we were ticking every replica, regardless of whether
or not it was part of a quiesced range, at each tick interval.
For 10,000 ranges this was noticeable. Each `Replica.tick()`
operation enqueues on a scheduler queue, is serviced by a
goroutine which calls `time.Now()`, and acquires several mutexes
before discovering it's quiesced and exiting.

We happen to have a carefully maintained map of unquiesced
replicas, which this change uses instead of all of the store's
replicas, when populating the `rangeIDs` slice used to enqueue
Raft ticks.

Release note (performance improvement): less CPU utilization
with many ranges.

Co-authored-by: Spencer Kimball <[email protected]>
craig[bot] and spencerkimball committed Jun 26, 2018
2 parents 3c112e8 + 296629c commit 2ab0f31
Showing 1 changed file with 10 additions and 9 deletions.
19 changes: 10 additions & 9 deletions pkg/storage/store.go
@@ -3716,15 +3716,16 @@ func (s *Store) raftTickLoop(ctx context.Context) {
 case <-ticker.C:
 rangeIDs = rangeIDs[:0]

-s.mu.replicas.Range(func(k int64, v unsafe.Pointer) bool {
-// Why do we bother to ever queue a Replica on the Raft scheduler for
-// tick processing? Couldn't we just call Replica.tick() here? Yes, but
-// then a single bad/slow Replica can disrupt tick processing for every
-// Replica on the store which cascades into Raft elections and more
-// disruption.
-rangeIDs = append(rangeIDs, roachpb.RangeID(k))
-return true
-})
+s.unquiescedReplicas.Lock()
+// Why do we bother to ever queue a Replica on the Raft scheduler for
+// tick processing? Couldn't we just call Replica.tick() here? Yes, but
+// then a single bad/slow Replica can disrupt tick processing for every
+// Replica on the store which cascades into Raft elections and more
+// disruption.
+for rangeID := range s.unquiescedReplicas.m {
+rangeIDs = append(rangeIDs, rangeID)
+}
+s.unquiescedReplicas.Unlock()

 s.scheduler.EnqueueRaftTick(rangeIDs...)
 s.metrics.RaftTicks.Inc(1)
