Skip to content

Commit

Permalink
kvserver: fix clearrange/* tests
Browse files Browse the repository at this point in the history
Fixes cockroachdb#104696.
Fixes cockroachdb#104697.
Fixes cockroachdb#104698.
Part of cockroachdb#98703.

In 072c16d (added as part of cockroachdb#95637) we re-worked the locking
structure around the RaftTransport's per-RPC class level send queues.
When new send queues are instantiated or old ones deleted, we now also
maintain the kvflowcontrol connection tracker, so such maintenance now
needs to happen while holding a kvflowcontrol mutex. When rebasing
\cockroachdb#95637 onto master, we accidentally included earlier queue deletion
code without holding the appropriate mutex. Queue deletions now happened
twice which made it possible to hit a RaftTransport assertion about
expecting the right send queue to already exist.

Specifically, the following sequence was possible:
- (*RaftTransport).SendAsync is invoked, observes no queue for
  <nodeid,class>, creates it, and tracks it in the queues map.
  - It invokes an async worker W1 to process that send queue through
    (*RaftTransport).startProcessNewQueue. The async worker is
    responsible for clearing the tracked queue in the queues map once
    done.
- W1 expects to find the tracked queue in the queues map, finds it,
  proceeds.
- W1 is done processing. On its way out, W1 clears <nodeid,class> from
  the queues map the first time.
- (*RaftTransport).SendAsync is invoked by another goroutine, observes
  no queue for <nodeid,class>, creates it, and tracks it in the queues
  map.
  - It invokes an async worker W2 to process that send queue through
    (*RaftTransport).startProcessNewQueue. The async worker is
    responsible for clearing the tracked queue in the queues map once
    done.
- W1 blindly clears the <nodeid,class> raft send queue the second time.
- W2 expects to find the queue in the queues map, but doesn't, and
  fatals.

Release note: None
  • Loading branch information
irfansharif committed Jun 10, 2023
1 parent 61806f4 commit 154b8d2
Showing 1 changed file with 0 additions and 1 deletion.
1 change: 0 additions & 1 deletion pkg/kv/kvserver/raft_transport.go
Original file line number Diff line number Diff line change
Expand Up @@ -916,7 +916,6 @@ func (t *RaftTransport) startProcessNewQueue(
}
}()
defer cleanup(q)
defer t.queues[class].Delete(int64(toNodeID))
defer func() {
t.kvflowControl.mu.Lock()
t.queues[class].Delete(int64(toNodeID))
Expand Down

0 comments on commit 154b8d2

Please sign in to comment.