
storage: use RWMutex in NodeLiveness #36316

Merged

Conversation

@ajwerner (Contributor)

When looking at Mutex profiles for heavily loaded TPC-C clusters, we noticed that a lot
of time was being spent blocked on an RWMutex held by Replica.leaseGoodToGo, which
underneath was reading NodeLiveness state in a read-only way. This PR adds an
RWMutex to NodeLiveness to eliminate that contention. Prior to this change we
observed nearly 60% of lock contention on leaseGoodToGo; after, we observe
closer to 20%.

Release note: None
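
At its core the change swaps NodeLiveness's plain mutex for a sync.RWMutex so that read-only accessors take a shared lock. A minimal sketch of the pattern, with simplified, hypothetical types (Liveness, selfID, nodes, and callbacks stand in for storagepb.Liveness and the fields of NodeLiveness.mu; this is not the actual cockroach code):

```go
package liveness

import "sync"

// Liveness is a simplified stand-in for storagepb.Liveness.
type Liveness struct {
	NodeID int32
	Epoch  int64
}

// IsLiveCallback mirrors the real callback type in shape only.
type IsLiveCallback func(nodeID int32)

// NodeLiveness sketches the locking change: mu embeds a sync.RWMutex so
// that read-only accessors can take a shared lock instead of
// serializing behind an exclusive one.
type NodeLiveness struct {
	selfID int32 // this node's ID (hypothetical field for the sketch)
	mu     struct {
		sync.RWMutex
		nodes     map[int32]Liveness
		callbacks []IsLiveCallback
	}
}

// GetLiveness only reads nl.mu state, so the shared (read) lock is
// enough; concurrent readers such as lease checks no longer block each
// other.
func (nl *NodeLiveness) GetLiveness(nodeID int32) (Liveness, bool) {
	nl.mu.RLock()
	defer nl.mu.RUnlock()
	l, ok := nl.mu.nodes[nodeID]
	return l, ok
}

// Writers mutate state and therefore still take the exclusive lock.
func (nl *NodeLiveness) setLiveness(l Liveness) {
	nl.mu.Lock()
	defer nl.mu.Unlock()
	if nl.mu.nodes == nil {
		nl.mu.nodes = map[int32]Liveness{}
	}
	nl.mu.nodes[l.NodeID] = l
}
```

Writers are unchanged; the win is that read-mostly paths such as leaseGoodToGo no longer serialize on a single exclusive lock.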

@ajwerner requested review from nvanbenschoten and a team on March 29, 2019 00:37
@cockroach-teamcity (Member)

This change is Reviewable

@ajwerner (Contributor, Author)

Before: [mutex profile screenshot: Screenshot_2019-03-28 cockroach delay(1)]

After: [mutex profile screenshot: Screenshot_2019-03-28 cockroach delay]

@nvanbenschoten (Member) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @nvanbenschoten)


pkg/storage/node_liveness.go, line 598 at r1 (raw file):

// a former liveness update on restart.
func (nl *NodeLiveness) Self() (*storagepb.Liveness, error) {
	nl.mu.Lock()

Read lock here as well?
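
For illustration, the suggested change on the simplified sketch above (the real Self returns (*storagepb.Liveness, error) and looks up this node's own record; selfID is the hypothetical stand-in for that lookup):

```go
// Self returns this node's own liveness record. It only reads nl.mu
// state, so the shared lock suffices.
func (nl *NodeLiveness) Self() (Liveness, bool) {
	nl.mu.RLock()
	defer nl.mu.RUnlock()
	l, ok := nl.mu.nodes[nl.selfID]
	return l, ok
}
```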


pkg/storage/node_liveness.go, line 619 at r1 (raw file):

	nl.mu.RLock()
	defer nl.mu.RUnlock()
	lMap := IsLiveMap{}

Let's pull this allocation out of the lock.
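
Concretely: allocate the map before taking the read lock so the critical section only covers the copy. A sketch on the simplified types above (IsLiveMap is approximated as a plain map[int32]bool, and the liveness predicate is a placeholder):

```go
// GetIsLiveMap returns a snapshot of node liveness. The map is
// allocated before acquiring the lock, so the read-locked region is
// limited to copying entries out of nl.mu.nodes.
func (nl *NodeLiveness) GetIsLiveMap() map[int32]bool {
	lMap := make(map[int32]bool)
	nl.mu.RLock()
	defer nl.mu.RUnlock()
	for id, l := range nl.mu.nodes {
		// Placeholder predicate: the real code checks the record's
		// expiration against the current time.
		lMap[id] = l.Epoch > 0
	}
	return lMap
}
```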


pkg/storage/node_liveness.go, line 845 at r1 (raw file):

	}

	nl.mu.Lock()

Read lock?


pkg/storage/node_liveness.go, line 857 at r1 (raw file):

// registered callbacks if the node became live in the process.
func (nl *NodeLiveness) maybeUpdate(new storagepb.Liveness) {
	nl.mu.Lock()

While we're here, I wonder if we should improve this. There are a few easy wins. For starters, if shouldReplaceLiveness is false some non-negligible percent of time (I'm not sure about this) then it's probably worth initially grabbing a read-lock and only upgrading to a write-lock if necessary. Doing this would also give us the ability to optimistically allocate the callbacks slice outside of the lock.
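
A sketch of the proposed optimistic pattern on the simplified types above. Go's sync.RWMutex has no in-place upgrade, so the read lock must be released, the write lock acquired, and the condition re-checked once it is held; shouldReplaceLiveness here is a stand-in for the real comparison:

```go
// shouldReplaceLiveness is a stand-in; the real function compares
// epochs and expirations (among other fields).
func shouldReplaceLiveness(old, new Liveness) bool {
	return new.Epoch > old.Epoch
}

// maybeUpdate installs new if it should replace the stored record and
// then fires registered callbacks (the real code only fires them when
// the node actually became live). The fast path takes only the read
// lock.
func (nl *NodeLiveness) maybeUpdate(new Liveness) {
	nl.mu.RLock()
	old := nl.mu.nodes[new.NodeID]
	numCallbacks := len(nl.mu.callbacks)
	nl.mu.RUnlock()

	if !shouldReplaceLiveness(old, new) {
		return // no-op update: the write lock was never taken
	}

	// Optimistically allocate outside the lock; worst case the slice
	// was allocated for nothing.
	callbacks := make([]IsLiveCallback, 0, numCallbacks)

	nl.mu.Lock()
	// Re-check under the exclusive lock: the record may have changed
	// while no lock was held.
	old = nl.mu.nodes[new.NodeID]
	if shouldReplaceLiveness(old, new) {
		nl.mu.nodes[new.NodeID] = new
		callbacks = append(callbacks, nl.mu.callbacks...)
	}
	nl.mu.Unlock()

	// Invoke callbacks without holding the lock.
	for _, cb := range callbacks {
		cb(new.NodeID)
	}
}
```

Whether this helps depends on how often the no-op fast path is hit, which is exactly the question raised (and answered) later in this thread.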


pkg/storage/node_liveness.go, line 935 at r1 (raw file):

	maxOffset := nl.clock.MaxOffset()

	nl.mu.Lock()

Read lock?

@ajwerner force-pushed the ajwerner/rw-mutex-for-nodeliveness branch from acd7c16 to d4e9c9c on April 2, 2019 13:08
@ajwerner (Contributor, Author) left a comment

Updated, PTAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@nvanbenschoten (Member) left a comment

:lgtm: mod the question about whether the optimistic read-locking in maybeUpdate could be a pessimization.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/storage/node_liveness.go, line 866 at r2 (raw file):

	// Note that this works fine even if `old` is empty.
	should := shouldReplaceLiveness(old, new)
	if !should {

Nice! Any idea how frequently we hit this path? Running a roachtest manually with a bit of instrumentation would be enough to get some feel for this.


pkg/storage/node_liveness.go, line 872 at r2 (raw file):

	callbacks := make([]IsLiveCallback, 0, numCallbacks)
	nl.mu.Lock()
	old = nl.mu.nodes[new.NodeID]

Drop a comment above this that we check shouldReplaceLiveness again now that we're under an exclusive lock.
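
For example (a suggested comment only, not necessarily the wording that landed):

```go
	callbacks := make([]IsLiveCallback, 0, numCallbacks)
	nl.mu.Lock()
	// Re-check shouldReplaceLiveness now that we hold the exclusive
	// lock: the record may have changed since the optimistic check
	// above, which only held the read lock.
	old = nl.mu.nodes[new.NodeID]
```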

@ajwerner (Contributor, Author) commented Apr 2, 2019

> Nice! Any idea how frequently we hit this path? Running a roachtest manually with a bit of instrumentation would be enough to get some feel for this.

It does indeed seem to be a pessimization. We seem to hit the fast path roughly 1/3 of the time on a 3-node cluster and 1/9 of the time on a 9-node cluster (pattern?).

I'll rip it out.

@ajwerner force-pushed the ajwerner/rw-mutex-for-nodeliveness branch from d4e9c9c to a134caa on April 2, 2019 22:21
@nvanbenschoten (Member) left a comment

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)

@ajwerner (Contributor, Author) commented Apr 2, 2019

bors r+

craig bot pushed a commit that referenced this pull request on Apr 2, 2019:

36316: storage: use RWMutex in NodeLiveness r=ajwerner a=ajwerner

Co-authored-by: Andrew Werner <[email protected]>
craig bot (Contributor) commented Apr 2, 2019

Build succeeded

craig bot merged commit a134caa into cockroachdb:master on Apr 2, 2019