libroach: Migrate IncrementalIterator logic to C++ #37496
Conversation
Also tagging @ajkr on this as he's familiar with both C++ and RocksDB.
Force-pushed d2ee230 to e7e6203
Force-pushed 997083d to f508ca2
var rows bulk.RowCounter
// TODO(dan): Move all this iteration into cpp to avoid the cgo calls.
// TODO(dan): Consider checking ctx periodically during the MVCCIterate call.
iter := engineccl.NewMVCCIncrementalIterator(batch, engineccl.IterOptions{
Suggestion: you could pull this code into a Go function that takes the same args as your new C++ function (and keep mvcc.go around for now). Then you could write a unit test that calls both functions on the same engine and compares the results?
Done, but still investigating why it is taking the test so long to run.
Done.
Reviewed 3 of 3 files at r1, 12 of 14 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru27 and @ajkr)
c-deps/libroach/db.h, line 104 at r2 (raw file):
// preferred function for Go callers.
DBStatus DBSstFileWriterAddRaw(DBSstFileWriter* fw, const ::rocksdb::Slice key, const ::rocksdb::Slice val);
nit: newline at end of file
c-deps/libroach/db.cc, line 950 at r2 (raw file):
// Skip tombstone (len=0) records when start time is zero (non-incremental)
// and we are not exporting all versions.
if (!export_all_revisions && iter.value().size() == 0 && start.wall_time == 0 &&
2¢: I might pull the three conditions that go into "are we skipping deletes?" into a bool before the loop, so we only check "are we skipping deletes && is this a delete"
c-deps/libroach/db.cc, line 974 at r2 (raw file):
  return res;
}
nit: newline at end of file
c-deps/libroach/incremental_iterator.h, line 1 at r2 (raw file):
// Copyright 2018 The Cockroach Authors.
nit: 2019
c-deps/libroach/incremental_iterator.h, line 52 at r2 (raw file):
  cockroach::util::hlc::LegacyTimestamp start_time;
  cockroach::util::hlc::LegacyTimestamp end_time;
};
nit: newline at end of file
c-deps/libroach/incremental_iterator.cc, line 1 at r2 (raw file):
// Copyright 2018 The Cockroach Authors.
nit: 2019
c-deps/libroach/incremental_iterator.cc, line 223 at r2 (raw file):
const rocksdb::Slice DBIncrementalIterator::key() { return iter.get()->rep->key(); }
const rocksdb::Slice DBIncrementalIterator::value() { return iter.get()->rep->value(); }
nit: newline at end of file
pkg/storage/engine/rocksdb.go, line 3201 at r1 (raw file):
func ExportToSst(
	ctx context.Context, e Reader, start, end MVCCKey, exportAllRevisions bool, io IterOptions,
) ([]byte, C.int64_t, error) {
I'd return an int64 instead of the C type.
pkg/storage/engine/rocksdb.go, line 3213 at r1 (raw file):
var intentErr C.DBString
err := statusToError(C.DBExportToSst(goToCKey(start), goToCKey(end), C.bool(exportAllRevisions),
I think you might have inverted the order of the commits?
pkg/storage/engine/rocksdb.go, line 3223 at r1 (raw file):
}
return nil, dataSize, &e
I'd return 0 instead of the undefined value (right?) in dataSize.
Force-pushed 1d49170 to d8df6cd
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajkr and @dt)
c-deps/libroach/db.cc, line 950 at r2 (raw file):
Previously, dt (David Taylor) wrote…
2¢: I might pull the three conditions that go into "are we skipping deletes?" into a bool before the loop, so we only check "are we skipping deletes && is this a delete"
Done.
c-deps/libroach/db.cc, line 974 at r2 (raw file):
Previously, dt (David Taylor) wrote…
nit: newline at end of file
Done.
c-deps/libroach/incremental_iterator.h, line 1 at r2 (raw file):
Previously, dt (David Taylor) wrote…
nit: 2019
Done.
c-deps/libroach/incremental_iterator.h, line 52 at r2 (raw file):
Previously, dt (David Taylor) wrote…
nit: newline at end of file
Done.
c-deps/libroach/incremental_iterator.cc, line 1 at r2 (raw file):
Previously, dt (David Taylor) wrote…
nit: 2019
Done.
c-deps/libroach/incremental_iterator.cc, line 223 at r2 (raw file):
Previously, dt (David Taylor) wrote…
nit: newline at end of file
Done.
pkg/storage/engine/rocksdb.go, line 3201 at r1 (raw file):
Previously, dt (David Taylor) wrote…
I'd return an int64 instead of the C type.
Done.
pkg/storage/engine/rocksdb.go, line 3213 at r1 (raw file):
Previously, dt (David Taylor) wrote…
I think you might have inverted the order of the commits?
Done.
pkg/storage/engine/rocksdb.go, line 3223 at r1 (raw file):
Previously, dt (David Taylor) wrote…
I'd return 0 instead of the undefined value (right?) in dataSize.
Done.
Force-pushed 964369c to d6628ec
@ajkr thanks for waiting -- this is now ready for review!
Thanks, taking a look. Are there any benchmark results on export time/CPU with vs without this PR?
Hey @ajkr! Not yet, but I am working on compiling some results from roachprod and existing export/backup benchmarks. I shall update the PR as soon as I have compiled them.
LGTM, all my comments are minor.
A couple high-level questions.
- why does ExportToSst return a byte array rather than take a filename as an argument and export to that file?
- what limits the size of the SST created by ExportToSst?
Reviewed 1 of 18 files at r3, 14 of 14 files at r4, 4 of 4 files at r5, 1 of 1 files at r7, 1 of 1 files at r8.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru27 and @dt)
c-deps/libroach/db.h, line 103 at r8 (raw file):
// DBSstFileWriterAddRaw is used internally -- DBSstFileWriterAdd is the
// preferred function for Go callers.
DBStatus DBSstFileWriterAddRaw(DBSstFileWriter* fw, const ::rocksdb::Slice key,
Do we need this declaration in db.h? I think it's only used in db.cc, so we should be able to declare and define it in an anonymous namespace there. (Similar to DBIterGetState(), for example.)
c-deps/libroach/db.cc, line 939 at r8 (raw file):
  DBSstFileWriterClose(writer);
  return state.status;
} else if (!state.valid || kComparator.Compare(iter.key(), EncodeKey(end)) >= 0) {
We could avoid an allocation/memcpy on each iteration by getting the result of EncodeKey(end) before the for-loop. I don't know whether it'll make a noticeable difference, but typically I have seen memcpy/malloc/free near the top of CPU profiles for other range scan workloads.
c-deps/libroach/incremental_iterator.h, line 28 at r8 (raw file):
DBString* write_intent);
~DBIncrementalIterator();
void advanceKey();
can advanceKey() be private?
c-deps/libroach/incremental_iterator.cc, line 52 at r8 (raw file):
assert(!EmptyTimestamp(opts.max_timestamp_hint));
DBIterOptions nontimebound_opts = DBIterOptions();
nontimebound_opts.upper_bound = opts.upper_bound;
I'm kind of curious about the relationship between opts.upper_bound and this->end. Will they be the same?
pkg/ccl/storageccl/export.go, line 164 at r8 (raw file):
}
rows.BulkOpSummary.DataSize += int64(len(it.UnsafeKey().Key)) + int64(len(it.UnsafeValue()))
Is the result being computed here different from what's stored in the variable dataSize?
In some cases we return the content in the response rather than write a file at all, and even when we're writing a "file" it is usually to remote, cloud storage, not the local file system. The cloud storage accessors are in Go, so passing the content to write back up and letting Go decide what to do with it seems to make sense. FWIW, prior to this change, we were also passing the same slice back from C++ to Go when we called
Range size.
Force-pushed d6628ec to 24765a1
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajkr and @dt)
c-deps/libroach/db.h, line 103 at r8 (raw file):
Previously, ajkr (Andrew Kryczka) wrote…
Do we need this declaration in db.h? I think it's only used in db.cc, so we should be able to declare and define it in an anonymous namespace there. (Similar to DBIterGetState(), for example.)
Done.
c-deps/libroach/db.cc, line 939 at r8 (raw file):
Previously, ajkr (Andrew Kryczka) wrote…
We could avoid an allocation/memcpy on each iteration by getting the result of EncodeKey(end) before the for-loop. I don't know whether it'll make a noticeable difference, but typically I have seen memcpy/malloc/free near the top of CPU profiles for other range scan workloads.
makes sense, done.
c-deps/libroach/incremental_iterator.h, line 28 at r8 (raw file):
Previously, ajkr (Andrew Kryczka) wrote…
can advanceKey() be private?
yes it can, done.
c-deps/libroach/incremental_iterator.cc, line 52 at r8 (raw file):
Previously, ajkr (Andrew Kryczka) wrote…
I'm kind of curious about the relationship between opts.upper_bound and this->end. Will they be the same?
opts.upper_bound is a DBKey with empty wall_time and logical fields (i.e. it only enforces a key upper bound). this->end has those fields populated, and is used in advanceKey() for the purpose of checking that we are within the required time bounds.
pkg/ccl/storageccl/export.go, line 164 at r8 (raw file):
Previously, ajkr (Andrew Kryczka) wrote…
Is the result being computed here different from what's stored in the variable dataSize?
So using dataSize instead of recomputing it here was causing existing backup tests to fail, as the computed bytes were more than the expected bytes. I initially thought this could be because we are iterating more keys than we should on the C++ side, but the number of rows and the SSTable data itself matched up to what was expected.
I use dataSize to ensure that we are in fact exporting some data (otherwise the levelDB iterator collecting stats after the cgo call panics) and then recompute the actual rows.BulkOpSummary.DataSize as was the case in the previous implementation.
Eventually, we would like to move the stats collection (including RowCounter) to C++, but decided it was not in the scope of this change.
Reviewed 1 of 20 files at r9, 18 of 19 files at r10.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru27 and @dt)
c-deps/libroach/db.cc, line 937 at r10 (raw file):
bool skip_current_key_versions = !export_all_revisions;
DBIterState state;
const rocksdb::Slice end_key = EncodeKey(end);
I think we need to store the string result, otherwise the memory pointed to by the Slice can be freed.
pkg/ccl/storageccl/export.go, line 164 at r8 (raw file):
Previously, adityamaru27 (Aditya Maru) wrote…
So using dataSize instead of recomputing it here was causing existing backup tests to fail, as the computed bytes were more than the expected bytes. I initially thought this could be because we are iterating more keys than we should on the C++ side, but the number of rows and the SSTable data itself matched up to what was expected.
I use dataSize to ensure that we are in fact exporting some data (otherwise the levelDB iterator collecting stats after the cgo call panics) and then recompute the actual rows.BulkOpSummary.DataSize as was the case in the previous implementation.
Eventually, we would like to move the stats collection (including RowCounter) to C++, but decided it was not in the scope of this change.
Got it, thanks for the explanation.
Force-pushed 24765a1 to 6418148
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru27, @ajkr, and @dt)
c-deps/libroach/db.cc, line 937 at r10 (raw file):
Previously, ajkr (Andrew Kryczka) wrote…
I think we need to store the string result, otherwise the memory pointed to by the Slice can be freed.
Thanks for this catch! I was scratching my head as to why the tests were failing with these changes.
Force-pushed 6418148 to 84b0d2b
@dt changed the headers of the newly added files to BSL. Will merge once you have had a quick look!
Force-pushed 84b0d2b to 0bc1776
The export logic requires access to the C.DBEngine of an engine.Reader. In the case of a spanSetReader this requires us to recursively access the engine.Reader member until we reach either a RocksDB or rocksDBReadOnly engine. A type switch construct is deemed necessary in storageccl/export.go, requiring the SpanSetReadWriter type to be exported.

Note: lint fails if we keep the name SpanSetReadWriter after exporting, as other packages would have to access it as spanset.SpanSetReadWriter, which introduces a "stutter". It has been renamed to ReadWriter.

Release note: None
All of the IncrementalIterator logic was previously written in Go. This change migrates all the required implementations to C++ and allows the DBExportToSst method to use this iterator to export the required keys to an SSTable.

Release note: Performance improvement to the BACKUP process.
Export was previously executed using an MVCCIncrementalIterator where the logic handling iteration over the diff of the key range [startKey, endKey) and time range (startTime, endTime] was written in Go. This iteration involved making a cgo call to find every key, along with another cgo call for writing each key to the sstable.

This migration resolves the aforementioned performance bottleneck (cockroachdb#18884), by pushing all the required logic to C++ and exposing a single export method.

Release note: Performance improvement to the BACKUP process.
Force-pushed 0bc1776 to a82cb8c
Added a test which generates random KVPs and timestamps, subsequently attempting to export the data between provided key and time ranges. Correctness of the new export logic is checked against the old implementation which used the now non-existent MVCCIncrementalIterator.

Release note: None
Force-pushed a82cb8c to 959f577
bors r=dt
🔒 Permission denied. Existing reviewers: click here to make adityamaru27 a reviewer
bors r=dt
37496: libroach: Migrate IncrementalIterator logic to C++ r=dt a=adityamaru27

Export was previously executed using an `MVCCIncrementalIterator` where the logic handling iteration over the diff of the key range `[startKey, endKey)` and time range `(startTime, endTime]` was written in Go. This iteration involved making a cgo call to find every key, along with another cgo call for writing each key to the sstable.

This migration resolves the aforementioned performance bottleneck (#18884), by pushing all the required logic to C++ and exposing a single export method.

Based on a performance benchmark by running BACKUP on a tpcc database with 1000 warehouses we observe the following:
- Over an average of 3 runs we see a 1.1x improvement in time performance. While the original binary took ~32m04s the changed implementation took ~28m55s. This is due to the elimination of a cgo call per key.

Co-authored-by: Aditya Maru <[email protected]>
Build succeeded
Export was previously executed using an MVCCIncrementalIterator where the logic handling iteration over the diff of the key range [startKey, endKey) and time range (startTime, endTime] was written in Go. This iteration involved making a cgo call to find every key, along with another cgo call for writing each key to the sstable.

This migration resolves the aforementioned performance bottleneck (#18884), by pushing all the required logic to C++ and exposing a single export method.

Based on a performance benchmark by running BACKUP on a tpcc database with 1000 warehouses we observe the following: