BUG REPORT

Created test case per #28376
v2.1.0-alpha.20180730 on osx

What did you do?
for i in `seq 10000`; do cockroach sql --insecure --database test -e "create database if not exists db${i}; use db${i}; create table if not exists test${i} (id int primary key); insert into test${i} values (1),(2),(3);"; done
Stopped the test after creating 8,000 databases.
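Since each table ends up in its own range, a run like this creates thousands of ranges. One way to confirm how many (a sketch; it assumes the crdb_internal.ranges virtual table is available in this build) is:

cockroach sql --insecure -e "select count(*) from crdb_internal.ranges;"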
What did you expect to see?
Even on a small MacBook, the database was operational.
What did you see instead?
Checked for abnormal metrics and saw that the consistency checker queue continued to increase after the workload stopped.
I180808 13:11:07.180262 175 storage/replica_proposal.go:222 [n1,s1,r22/1:/Table/5{1-2}] new range lease repl=(n1,s1):1 seq=4 start=1533733867.175762791,0 epo=1 pro=1533733867.175800343,0 following repl=(n3,s3):2 seq=3 start=1533647315.171325321,1 epo=1 pro=1533647316.120851806,0
I180808 13:11:07.464286 185 storage/replica_proposal.go:222 [n1,s1,r11463/1:/Table/1145{7-8}] new range lease repl=(n1,s1):1 seq=5 start=1533733867.461558025,0 epo=1 pro=1533733867.461561871,0 following repl=(n3,s3):2 seq=4 start=1533725275.949570639,0 epo=2 pro=1533725275.949596737,0
I180808 13:11:08.520167 142 storage/replica_proposal.go:222 [n1,s1,r5279/1:/Table/527{3-4}] new range lease repl=(n1,s1):1 seq=5 start=1533733868.516460494,0 epo=1 pro=1533733868.516464838,0 following repl=(n3,s3):2 seq=4 start=1533725275.949570639,0 epo=2 pro=1533725275.949596737,0
I180808 13:11:09.537130 145 storage/replica_proposal.go:222 [n1,s1,r5310/1:/Table/530{4-5}] new range lease repl=(n1,s1):1 seq=5 start=1533733869.533432123,0 epo=1 pro=1533733869.533435833,0 following repl=(n3,s3):2 seq=4 start=1533725275.949570639,0 epo=2 pro=1533725275.949596737,0
I180808 13:11:10.636281 183 storage/replica_proposal.go:222 [n1,s1,r12234/1:/Table/1222{8-9}] new range lease repl=(n1,s1):1 seq=5 start=1533733870.632955126,0 epo=1 pro=1533733870.632959801,0 following repl=(n3,s3):2 seq=4 start=1533725275.949570639,0 epo=2 pro=1533725275.949596737,0
I180808 13:11:15.481832 229 server/status/runtime.go:425 [n1] runtime stats: 606 MiB RSS, 173 goroutines, 311 MiB/1.2 GiB/1.8 GiB GO alloc/idle/total, 226 MiB/270 MiB CGO alloc/total, 190.00cgo/sec, 0.13/0.03 %(u/s)time, 0.00 %gc (1x)
I180808 13:11:16.042702 177 storage/replica_proposal.go:222 [n1,s1,r11195/1:/Table/111{89-90}] new range lease repl=(n1,s1):1 seq=5 start=1533733876.040034707,0 epo=1 pro=1533733876.040038989,0 following repl=(n3,s3):2 seq=4 start=1533725275.949570639,0 epo=2 pro=1533725275.949596737,0
I180808 13:11:18.227249 180 storage/replica_proposal.go:222 [n1,s1,r6809/1:/Table/680{3-4}] new range lease repl=(n1,s1):1 seq=5 start=1533733878.223059716,0 epo=1 pro=1533733878.223064142,0 following repl=(n3,s3):2 seq=4 start=1533725275.949570639,0 epo=2 pro=1533725275.949596737,0
I180808 13:11:20.962705 522080 rpc/nodedialer/nodedialer.go:90 [n1] connection to n2 established
I180808 13:11:22.432859 140 storage/replica_proposal.go:222 [n1,s1,r13805/1:/Table/13{799-800}] new range lease repl=(n1,s1):1 seq=5 start=1533733882.428899568,0 epo=1 pro=1533733882.428905510,0 following repl=(n3,s3):2 seq=4 start=1533725275.949570639,0 epo=2 pro=1533725275.949596737,0
I180808 13:11:23.440705 141 storage/replica_proposal.go:222 [n1,s1,r14739/1:/Table/1473{3-4}] new range lease repl=(n1,s1):1 seq=5 start=1533733883.436914841,0 epo=1 pro=1533733883.436919395,0 following repl=(n3,s3):2 seq=4 start=1533725275.949570639,0 epo=2 pro=1533725275.949596737,0
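One way to watch the queue depth directly while this is happening (a sketch; it assumes the pending count is exported as queue_consistency_pending on the node's /_status/vars Prometheus endpoint, with the default HTTP port 8080) is:

curl -s http://localhost:8080/_status/vars | grep queue_consistency_pending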
What was the impact?
The consistency checker queue looks to be declining slowly. The decline seems to coincide with the absence of leaseholder transfers (from the Range Operations graph). What is the relationship between the two metrics? Is this normal? Is there any downside to having a high consistency checker queue?
I haven't checked the specific numbers, but the consistency checker works differently from most other queues: it tries to spread its load out over its full checking cycle (~24h), yet replicas are queued as soon as they become eligible for a check (24h after their last one). So when lots of ranges are created around the same time, they all become eligible at roughly the same time, and the resulting bump should only disperse after ~1 day. If you're seeing this in a cluster that has had an approximately constant number of ranges for multiple days, there's a problem. Otherwise, it's expected. From your description ("created 8k databases") it seems expected.
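To make the shape of that bump concrete, here is a toy simulation (illustrative only, not CockroachDB's actual consistencyQueue code; the ~24h interval and the per-range pacing are taken from the description above). It queues every range whose last check is at least one interval old and only processes a paced number of checks per hour, so 8,000 ranges created together show up as a pending spike at ~24h that drains over roughly the next day:

// Toy simulation only -- not CockroachDB's consistencyQueue implementation.
// It illustrates why ranges created at the same time produce a bump in the
// consistency checker's pending count that drains over ~one check interval.
package main

import (
	"fmt"
	"time"
)

const (
	checkInterval = 24 * time.Hour // assumed cycle length, per the comment above
	numRanges     = 8000           // ranges created at roughly the same time
)

func main() {
	// Every range was last "checked" at its creation time, t=0.
	lastChecked := make([]time.Duration, numRanges)

	// Pace a full pass over all ranges across one interval.
	perCheck := checkInterval / numRanges
	checksPerHour := int(time.Hour / perCheck)

	processed := 0
	for now := time.Duration(0); now <= 2*checkInterval; now += time.Hour {
		// A range is queued once its last check is at least one interval old.
		pending := 0
		for _, lc := range lastChecked {
			if now-lc >= checkInterval {
				pending++
			}
		}
		fmt.Printf("t=%3.0fh  pending=%5d  checked so far=%5d\n", now.Hours(), pending, processed)

		// The paced processor only gets through a limited number of checks per hour.
		budget := checksPerHour
		for i := range lastChecked {
			if budget == 0 {
				break
			}
			if now-lastChecked[i] >= checkInterval {
				lastChecked[i] = now
				processed++
				budget--
			}
		}
	}
}

Running it shows pending jumping to 8,000 at t=24h and then dropping by roughly 330 per hour, which matches a slow, roughly linear decline over about a day.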
Read through #21824; wanted to confirm the expected behavior.