Skip to content

Commit

Permalink
Merge #25471 #25474
Browse files Browse the repository at this point in the history
25471: backport-2.0: rocksdb: use max_manifest_file_size option r=bdarnell a=tschottdorf

Backport 1/1 commits from #25341.

/cc @cockroachdb/release

---

Cockroach uses a single long running rocksdb instance for the entire
process lifetime, which could be many months. By default, rocksdb tracks
filesystem state changes in a log file called the MANIFEST, which grows
without bound until the instance is re-opened. We should bound the
maximum file size of rocksdb MANIFEST using the corresponding rocksdb
option to prevent unbounded growth.

The MANIFEST file grew to several GBs in size in a customer bug report
but that was probably because of some other bad behavior in rocksdb
state management. We do want to bound the MANIFEST size in such cases as
well.

Release note: None


25474: backport-2.0: storage: fix deadlock in consistency queue r=bdarnell a=tschottdorf

Backport 1/1 commits from #25456.

/cc @cockroachdb/release

---

When `CheckConsistency` returns an error, the queue checks whether the
store is draining to decide whether the error is worth logging.

Unfortunately this check was incorrect and would block until the store
actually started draining.

A toy example of this problem is below (this will deadlock). The dual
return form of chan receive isn't non-blocking -- the second parameter
indicates whether the received value corresponds to a closing of the
channel.

Switch to a `select` instead.

```go
package main

import (
	"fmt"
)

func main() {
	ch := make(chan struct{})
	_, ok := <-ch
	fmt.Println(ok)
}
```

Touches #21824.

Release note (bug fix): Prevent the consistency checker from
deadlocking. This would previously manifest itself as a steady number of
replicas queued for consistency checking on one or more nodes and would
resolve by restarting the affected nodes.


Co-authored-by: Garvit Juniwal <[email protected]>
Co-authored-by: Tobias Schottdorf <[email protected]>
  • Loading branch information
3 people committed May 14, 2018
3 parents c3b55ba + 83ff0d2 + d9ba885 commit 8b1dab0
Show file tree
Hide file tree
Showing 3 changed files with 52 additions and 5 deletions.
6 changes: 6 additions & 0 deletions c-deps/libroach/options.cc
Original file line number Diff line number Diff line change
Expand Up @@ -264,6 +264,12 @@ rocksdb::Options DBMakeOptions(DBOptions db_opts) {
options.target_file_size_base = 4 << 20; // 4 MB
options.target_file_size_multiplier = 2;

// Because we open a long running rocksdb instance, we do not want the
// manifest file to grow unbounded. Assuming each manifest entry is about 1
// KB, this allows for 128 K entries. This could account for several hours to
// few months of runtime without rolling based on the workload.
options.max_manifest_file_size = 128 << 20; // 128 MB

rocksdb::BlockBasedTableOptions table_options;
if (db_opts.cache != nullptr) {
table_options.block_cache = db_opts.cache->rep;
Expand Down
21 changes: 16 additions & 5 deletions pkg/storage/consistency_queue.go
Original file line number Diff line number Diff line change
Expand Up @@ -102,19 +102,30 @@ func (q *consistencyQueue) process(
if q.interval() <= 0 {
return nil
}

// Call setQueueLastProcessed because the consistency checker targets a much
// longer cycle time than other queues. That it ignores errors is likely a
// historical accident that should be revisited.
if err := repl.setQueueLastProcessed(ctx, q.name, repl.store.Clock().Now()); err != nil {
log.VErrEventf(ctx, 2, "failed to update last processed time: %v", err)
}

req := roachpb.CheckConsistencyRequest{}
if _, pErr := repl.CheckConsistency(ctx, req); pErr != nil {
_, shouldQuiesce := <-repl.store.Stopper().ShouldQuiesce()
var shouldQuiesce bool
select {
case <-repl.store.Stopper().ShouldQuiesce():
shouldQuiesce = true
default:
}

if !shouldQuiesce || !grpcutil.IsClosedConnection(pErr.GoError()) {
// Suppress noisy errors about closed GRPC connections when the
// server is quiescing.
log.Error(ctx, pErr.GoError())
return pErr.GoError()
}
}
// Update the last processed time for this queue.
if err := repl.setQueueLastProcessed(ctx, q.name, repl.store.Clock().Now()); err != nil {
log.VErrEventf(ctx, 2, "failed to update last processed time: %v", err)
}
return nil
}

Expand Down
30 changes: 30 additions & 0 deletions pkg/storage/replica_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -9050,6 +9050,36 @@ func TestReplicaRecomputeStats(t *testing.T) {
}
}

// TestConsistencyQueueErrorFromCheckConsistency exercises the case in which
// the queue receives an error from CheckConsistency.
func TestConsistenctQueueErrorFromCheckConsistency(t *testing.T) {
defer leaktest.AfterTest(t)()

ctx := context.Background()
stopper := stop.NewStopper()
defer stopper.Stop(ctx)

cfg := TestStoreConfig(nil)
cfg.TestingKnobs = StoreTestingKnobs{
TestingRequestFilter: func(ba roachpb.BatchRequest) *roachpb.Error {
if _, ok := ba.GetArg(roachpb.ComputeChecksum); ok {
return roachpb.NewErrorf("boom")
}
return nil
},
}
tc := testContext{}
tc.StartWithStoreConfig(t, stopper, cfg)

for i := 0; i < 2; i++ {
// Do this twice because it used to deadlock. See #25456.
sysCfg, _ := tc.store.Gossip().GetSystemConfig()
if err := tc.store.consistencyQueue.process(ctx, tc.repl, sysCfg); !testutils.IsError(err, "boom") {
t.Fatal(err)
}
}
}

// TestReplicaLocalRetries verifies local retry logic for transactional
// and non transactional batches. Verifies the timestamp cache is updated
// to reflect the timestamp at which retried batches are executed.
Expand Down

0 comments on commit 8b1dab0

Please sign in to comment.