High tail latency around Index compactions #9506

Closed
jcalvert opened this issue Mar 28, 2018 · 4 comments

@jcalvert
Contributor

jcalvert commented Mar 28, 2018

While using a cluster on 3.3.1 with millions of keys in etcd, we have seen very long durations (5-10 seconds) on the index compaction stat measured here. This coincided with increased request latency and client-side timeouts. We have further observed that almost all of the time spent in this function is in the tree index compaction here. Looking into that function, we observed the following comment. Since the function holds a lock for the entire duration of compacting the B-tree, this affects overall throughput. To validate the run time in isolation, we wrote the benchmark below; our results follow it. Unfortunately, these results do not line up with the O(10ms) comment. We ran the benchmarks on a machine with 16 logical CPUs (Xeon E5-2670 v2 @ 2.50GHz). Is this an expected bottleneck?

package mvcc

import (
        "testing"
)

func BenchmarkIndexCompact1(b *testing.B)       { benchmarkIndexCompact(b, 1) }
func BenchmarkIndexCompact100(b *testing.B)     { benchmarkIndexCompact(b, 100) }
func BenchmarkIndexCompact10000(b *testing.B)   { benchmarkIndexCompact(b, 10000) }
func BenchmarkIndexCompact100000(b *testing.B)  { benchmarkIndexCompact(b, 100000) }
func BenchmarkIndexCompact1000000(b *testing.B) { benchmarkIndexCompact(b, 1000000) }

func benchmarkIndexCompact(b *testing.B, size int) {
        plog.SetLevel(0) // suppress log entries

        kvindex := newTreeIndex()

        // Populate the index with `size` keys, one revision per key.
        bytesN := 64
        keys := createBytesSlice(bytesN, size)
        for i := 0; i < size; i++ {
                kvindex.Put(keys[i], revision{main: int64(i), sub: int64(i)})
        }

        b.ResetTimer()
        for i := 0; i < b.N; i++ {
                kvindex.Compact(int64(i))
        }
}
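
These numbers can be reproduced with an invocation along the following lines from the repository root (the exact command is not part of the report above and the flags may vary):

        go test -run '^$' -bench 'BenchmarkIndexCompact' ./mvcc/
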
BenchmarkIndexCompact1-16          2000000           666 ns/op
BenchmarkIndexCompact100-16         100000         18643 ns/op
BenchmarkIndexCompact10000-16         2000        866163 ns/op
BenchmarkIndexCompact100000-16         100      10112223 ns/op
BenchmarkIndexCompact1000000-16        100     385412535 ns/op
@xiang90
Contributor

xiang90 commented Mar 28, 2018

@jcalvert

You can do something similar to https://github.com/coreos/etcd/pull/9384/files#diff-d741eeb0ba73b4c9fdb36742f975395dR88 to fix the problem. I might have time to take a look in the next couple of weeks.
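
Roughly, the idea is to stop holding the index lock for the whole tree traversal and instead compact in fixed-size batches, releasing the lock between batches so reads and writes only ever wait for one batch. A minimal sketch of that pattern (the types and names below are illustrative, not the actual mvcc code from the linked diff):

package main

import (
        "sync"
        "time"
)

// index is an illustrative stand-in for the tree index.
type index struct {
        mu   sync.Mutex
        keys [][]byte
}

// compactBatched snapshots the key list under the lock, then compacts the
// keys in fixed-size batches, reacquiring the lock per batch so concurrent
// readers and writers are only ever blocked for the duration of one batch.
func (ix *index) compactBatched(rev int64, batchSize int, pause time.Duration) {
        ix.mu.Lock()
        snapshot := make([][]byte, len(ix.keys))
        copy(snapshot, ix.keys)
        ix.mu.Unlock()

        for start := 0; start < len(snapshot); start += batchSize {
                end := start + batchSize
                if end > len(snapshot) {
                        end = len(snapshot)
                }

                ix.mu.Lock()
                for _, k := range snapshot[start:end] {
                        compactKey(k, rev) // drop revisions <= rev for this key
                }
                ix.mu.Unlock()

                time.Sleep(pause) // let blocked readers/writers acquire the lock
        }
}

// compactKey is a placeholder for the per-key compaction work.
func compactKey(key []byte, rev int64) {}

func main() {
        ix := &index{keys: [][]byte{[]byte("a"), []byte("b"), []byte("c")}}
        ix.compactBatched(2, 100, 10*time.Microsecond)
}

The batch size and the pause between batches bound how long any single read or write can be blocked by a compaction.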

@xiang90
Contributor

xiang90 commented Mar 28, 2018

While using a cluster on 3.3.1 with millions of keys in etcd, we have seen very long durations (5-10 seconds) on the index compaction stat measured here

The benchmark does not explain why the pause is at the level of seconds. Maybe there is something else going on?

@jcalvert
Contributor Author

The benchmark shows that compaction scales worse than linearly. Adding a benchmark case for 5,000,000 entries on the same machine gives a result in excess of 2 seconds. Extrapolating to 10 million index entries seems sufficient to explain a pause of that length.
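
Back-of-the-envelope, using the numbers above (the last line is an extrapolation, not a measurement):

 1,000,000 entries   ~0.39 s   measured
 5,000,000 entries   >2 s      measured
10,000,000 entries   roughly 4 s or more, extrapolated, which is in line with the 5-10 second pauses we observed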

@xiang90
Contributor

xiang90 commented Mar 29, 2018

@jcalvert OK, that makes sense if you have 10 million keys.
