-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: switch Store.{replicas,replicaQueues} from sync.Map to IntMap #17625
Conversation
IntMap is a fork of sync.Map which models map[int64]unsafe.Pointer rather than map[interface{}]interface{}. Using a concrete type for the key and unsafe.Pointer for the value provides a significant speedup. Changed IntMap.read from an atomic.Value to an unsafe.Pointer that is updated atomically. The use of atomic.Value was unnecessary because we always know the concrete type. Will investigate upstreaming this change. name old time/op new time/op delta LoadMostlyHits/*syncutil.DeepCopyMap-8 18.5ns ±21% 7.1ns ±24% -61.69% (p=0.000 n=20+20) LoadMostlyHits/*syncutil.RWMutexMap-8 60.5ns ± 4% 63.5ns ± 1% +4.88% (p=0.000 n=20+19) LoadMostlyHits/*syncutil.Map-8 20.6ns ± 8% 5.9ns ± 5% -71.49% (p=0.000 n=20+19) LoadMostlyMisses/*syncutil.DeepCopyMap-8 15.0ns ±20% 6.9ns ± 1% -54.02% (p=0.000 n=20+20) LoadMostlyMisses/*syncutil.RWMutexMap-8 63.0ns ± 3% 61.3ns ± 6% ~ (p=0.082 n=20+20) LoadMostlyMisses/*syncutil.Map-8 15.3ns ±25% 5.1ns ±35% -66.41% (p=0.000 n=20+20) LoadOrStoreBalanced/*syncutil.RWMutexMap-8 424ns ± 3% 252ns ± 1% -40.66% (p=0.000 n=20+18) LoadOrStoreBalanced/*syncutil.Map-8 433ns ± 1% 263ns ± 3% -39.34% (p=0.000 n=18+19) LoadOrStoreUnique/*syncutil.RWMutexMap-8 688ns ± 2% 356ns ± 1% -48.22% (p=0.000 n=18+19) LoadOrStoreUnique/*syncutil.Map-8 855ns ± 3% 431ns ± 2% -49.62% (p=0.000 n=20+20) LoadOrStoreCollision/*syncutil.DeepCopyMap-8 7.82ns ± 3% 6.61ns ± 2% -15.53% (p=0.000 n=20+20) LoadOrStoreCollision/*syncutil.RWMutexMap-8 161ns ± 3% 164ns ± 5% ~ (p=0.050 n=19+20) LoadOrStoreCollision/*syncutil.Map-8 8.78ns ± 1% 4.56ns ±50% -48.13% (p=0.000 n=18+20) Range/*syncutil.DeepCopyMap-8 4.42µs ± 2% 4.30µs ± 2% -2.54% (p=0.000 n=19+20) Range/*syncutil.RWMutexMap-8 67.9µs ± 6% 62.7µs ± 3% -7.64% (p=0.000 n=19+19) Range/*syncutil.Map-8 4.94µs ± 3% 4.65µs ± 3% -5.90% (p=0.000 n=20+19) AdversarialAlloc/*syncutil.DeepCopyMap-8 1.05µs ± 2% 0.67µs ± 3% -36.59% (p=0.000 n=19+20) AdversarialAlloc/*syncutil.RWMutexMap-8 67.0ns ± 4% 68.1ns ± 5% +1.57% (p=0.012 n=20+19) AdversarialAlloc/*syncutil.Map-8 368ns ± 2% 265ns ± 1% -27.84% (p=0.000 n=20+15) AdversarialDelete/*syncutil.DeepCopyMap-8 222ns ± 1% 148ns ± 0% -33.27% (p=0.000 n=20+20) AdversarialDelete/*syncutil.RWMutexMap-8 81.1ns ± 8% 78.4ns ± 9% ~ (p=0.058 n=20+20) AdversarialDelete/*syncutil.Map-8 70.0ns ± 1% 71.0ns ± 3% +1.49% (p=0.000 n=17+14)
Motivated by seeing sync.Map.Load() high on profiles when there are large numbers of replicas on a node. See cockroachdb#17609. name old time/op new time/op delta StoreGetReplica-8 9.12ns ±10% 3.22ns ± 5% -64.65% (p=0.000 n=10+8)
1025742
to
017f716
Compare
The switch of Also, we might as well use this for Reviewed 4 of 4 files at r1, 4 of 4 files at r2, 2 of 2 files at r3. Comments from Reviewable |
The
Yes, we can use it for everything except Review status: all files reviewed at latest revision, all discussions resolved, some commit checks failed. Comments from Reviewable |
Btw, do you happen to know how to run benchmarks in the Go runtime:
Review status: all files reviewed at latest revision, all discussions resolved, some commit checks failed. Comments from Reviewable |
Add 2 more commits to convert |
Review status: 6 of 8 files reviewed at latest revision, all discussions resolved, some commit checks failed. Comments from Reviewable |
I'm not sure about running benchmarks in the Go runtime. I guess you could just create a wrapper benchmark around whatever you want to measure, but it seems strange that you'd need to. |
Motivated by seeing sync.Map.Load() high on profiles when there are large
numbers of replicas on a node. See #17609.