hot/cold blockstore segregation (aka. splitstore) #4992

Merged
merged 150 commits into from
Mar 8, 2021
150 commits
6577cc8
splitstore struct and Blockstore interface implementation
vyzo Nov 24, 2020
c8f1139
compaction algorithm
vyzo Nov 24, 2020
b192adf
trigger compaction from head changes
vyzo Nov 24, 2020
fd08786
track base epoch in metadata ds
vyzo Nov 24, 2020
c2cc198
fix off by 1 in marking
vyzo Nov 24, 2020
2bed6c9
use dual live set marking algorithm to keep all hotly reachable objec…
vyzo Nov 24, 2020
b945747
satisfy linter
vyzo Nov 24, 2020
101e5c6
close keys channel when done emitting keys
vyzo Nov 25, 2020
3083d80
no need to import go-ipfs-blockstore, lib/blockstore will do
vyzo Nov 25, 2020
c1b1a9c
avoid race with compacting state variable
vyzo Nov 25, 2020
2c9b58a
add some logging
vyzo Nov 25, 2020
17bc5fc
move splitstore implementation to its own directory
vyzo Nov 26, 2020
0bf1a78
stubs for tracking store and live set
vyzo Nov 26, 2020
df856b7
gomod: get lmdb-go
vyzo Nov 26, 2020
3f92a00
implement lmdb-backed LiveSet
vyzo Nov 26, 2020
5043f31
liveset unit test
vyzo Nov 26, 2020
83f8a0a
quiet linter
vyzo Nov 26, 2020
0d7476c
implement LMDB-backed tracking store
vyzo Nov 26, 2020
4763397
add tracking store test
vyzo Nov 26, 2020
da47883
quiet linter
vyzo Nov 26, 2020
d20cbc0
protect against potential data races
vyzo Nov 29, 2020
5db314f
fallback to coldstore if snooping fails.
vyzo Nov 29, 2020
37e391f
add TODO note about map size
vyzo Nov 29, 2020
0af7b16
simplify Has
vyzo Nov 29, 2020
b0f48b5
use CAS for compacting state
vyzo Nov 29, 2020
e87ce6c
go get go-bs-lmdb
vyzo Dec 1, 2020
e07c6c7
splitstore constructor
vyzo Dec 1, 2020
622b4f7
hook splitstore into DI
vyzo Dec 1, 2020
3912694
fix lotus-shed build
vyzo Dec 1, 2020
facdc55
add nil check for curTs -- some tests don't have chain state
vyzo Dec 1, 2020
f44cf0f
appease linter
vyzo Dec 1, 2020
843fd09
deal with MDB_KEYEXIST errors
vyzo Dec 1, 2020
ce41e39
handle MDB_KEYEXIST in liveset marking
vyzo Dec 1, 2020
3f8da19
go get [email protected]
vyzo Dec 1, 2020
6e51e6d
better handling of MDB_KEYEXIST in Put
vyzo Dec 1, 2020
1a23b1f
make CompactionThreshold a var to fix lotus-soup build
vyzo Dec 1, 2020
76d6edb
fix max readers for tracking store
vyzo Dec 1, 2020
8b00875
adjust walk boundaries for marking
vyzo Dec 1, 2020
58a8434
temporary log level for splitstore to DEBUG
vyzo Jan 13, 2021
5b4e6b7
don't set max readers for livesets
vyzo Jan 20, 2021
877ecab
update go-bs-lmdb and migrate to ledgerwatch/lmdb-go.
raulk Jan 25, 2021
5872f24
go get [email protected]
vyzo Jan 26, 2021
2080e46
don't set MaxReaders for tracking store
vyzo Jan 29, 2021
c89ab1a
retry on MDB_READERS_FULL errors
vyzo Feb 1, 2021
b9f8a3d
log MDB_READERS_FULL retries
vyzo Feb 1, 2021
d91b60d
fix potential panic with max readers retry and cursor channel
vyzo Feb 1, 2021
ea05fd9
use xerrors instead of fmt.Errorf
vyzo Feb 10, 2021
cdf5bd0
return annotated xerrors where appropriate
vyzo Feb 10, 2021
69a88d4
fix snoop test
vyzo Feb 10, 2021
ca8a673
adjust hot store options
vyzo Feb 11, 2021
874ecd3
adjust hot store options, redux.
vyzo Feb 11, 2021
723e48b
gomod:update go-bs-lmdb to v1.0.3
vyzo Feb 11, 2021
95befa1
set lmdb max readers retry delay to 1ms
vyzo Feb 11, 2021
f6c930d
crank up blockstore max readers to 16K, reduce retry delays to 10us
vyzo Feb 13, 2021
7044e62
flag to enable GC during compaction, disabled for now
vyzo Feb 26, 2021
a586d42
make hot store DI injectable in the split store, default to badger.
vyzo Feb 26, 2021
842ec43
get rid of goroutine iteration in tracking store; long live ForEach
vyzo Feb 26, 2021
d44719d
amend confusing comment
vyzo Feb 26, 2021
5068d51
use CompactionCold epochs for delineating the cold epoch cliff
vyzo Feb 26, 2021
31268ba
walk snapshot the same way snapshot exporting does; skip old msgs and…
vyzo Feb 26, 2021
8e12377
handle consistency edge case
vyzo Feb 26, 2021
99c7d8e
more informative names for the hotstore directories
vyzo Feb 26, 2021
ee751f8
refactor lmdb specific snoop/liveset code into their own files
vyzo Feb 26, 2021
9977f5c
rewrite sweep logic to avoid doing writes/deletes nested in a read txn
vyzo Feb 26, 2021
e794451
handle MDB_KEYEXIST in tracking store Puts
vyzo Feb 27, 2021
8f0ddac
add comment
vyzo Feb 27, 2021
923a3db
abstract tracking store and live set construction
vyzo Feb 27, 2021
68b6f91
propagate useLMDB option to splitstore through DI
vyzo Feb 27, 2021
cb1789e
gomod: use bolt
vyzo Feb 27, 2021
27a9b97
implement bolt-backed liveset
vyzo Feb 27, 2021
2c1a978
add test for bolt liveset
vyzo Feb 27, 2021
b839947
separate LMDB options for hotstore and tracking stores
vyzo Feb 27, 2021
f1c61c4
implement bolt backed tracking store
vyzo Feb 27, 2021
2e4d45e
test for bolt backed tracking store
vyzo Feb 27, 2021
73259aa
add configuration for splitstore and default to a simple compaction a…
vyzo Feb 27, 2021
364076c
set NoSync option for bolt livesets
vyzo Feb 27, 2021
783dcda
add Sync to the tracking store
vyzo Feb 27, 2021
2f26026
compactSimple should walk the cold epoch at depth 1
vyzo Feb 27, 2021
2426ffb
better logging plus moving some code around
vyzo Feb 27, 2021
e52c709
more accurate setting of skip params
vyzo Feb 27, 2021
09cd117
structured log for beginning of compaction
vyzo Feb 27, 2021
aba6530
batch deletion for purging the tracking store
vyzo Feb 27, 2021
97abbe1
add (salted) bloom filter liveset
vyzo Feb 28, 2021
4cc672d
batch move objects from coldstore to hotstore
vyzo Feb 28, 2021
f4c6bc6
comment nomenclature
vyzo Feb 28, 2021
f5ce795
size bloom filter for 50M objects
vyzo Feb 28, 2021
8884920
fix tests
vyzo Feb 28, 2021
44aadb9
rehash salted keys in bloom filter
vyzo Feb 28, 2021
f62999d
use named constants for bloom filter parameters
vyzo Feb 28, 2021
05fee27
remove stale references to lmdb from splitstore implementation
vyzo Feb 28, 2021
e582f0b
remove references to splitstore from lotus-shed
vyzo Feb 28, 2021
7587ab6
quiet the stupid linter
vyzo Feb 28, 2021
5639261
make compaction parameters variable
vyzo Feb 28, 2021
cae5ddc
dynamically size bloom filters
vyzo Feb 28, 2021
99c6e4f
adjust min bloom filter size
vyzo Feb 28, 2021
3282f85
fix tests
vyzo Feb 28, 2021
0fc2f3a
fix post-rebase compilation errors
vyzo Mar 1, 2021
3733456
go mod tidy
vyzo Mar 1, 2021
1b51c10
split off lmdb support to a different branch.
raulk Mar 1, 2021
1a804fb
move splitstore into blockstore package.
raulk Mar 1, 2021
cb36d5b
warm up splitstore at first head change notification
vyzo Mar 1, 2021
748dd96
snake current tipset from head change notification
vyzo Mar 1, 2021
e612fff
also estimate liveset size during warm up
vyzo Mar 1, 2021
b9400c5
use crypto/rand for bloom salt
vyzo Mar 1, 2021
b1b452b
remove dependency from blockstore/splitstore => chain/store.
raulk Mar 1, 2021
8cfba5b
renames and polish.
raulk Mar 1, 2021
ce68b9b
batch writes during warm up
vyzo Mar 1, 2021
48f2533
increase batch size to 16K
vyzo Mar 1, 2021
4b1e1f4
rename liveset => markset; rename snoop => tracking store; docs.
raulk Mar 2, 2021
f651f43
improve comment accuracy
vyzo Mar 2, 2021
35d466d
use sha256 for bloom key rehashing
vyzo Mar 2, 2021
68213a9
use ioutil.TempDir for test directories
vyzo Mar 2, 2021
5184bc5
log consistency for full compaction
vyzo Mar 2, 2021
c762536
deduplicate code
vyzo Mar 2, 2021
6014273
storage miner doesn't need a splitstore
vyzo Mar 2, 2021
dd0c308
move Blockstore config to FullNode, rename to Chainstore and add defa…
vyzo Mar 2, 2021
86b73d6
add DeleteMany to Blockstore interface
vyzo Mar 2, 2021
8a55b73
fix the situation with WrapIDStore
vyzo Mar 2, 2021
2ff5aec
satisfy linter, use Prefix for common path of non inline CIDs
vyzo Mar 2, 2021
86fdad2
fix typo
vyzo Mar 2, 2021
ab52e34
add comment
vyzo Mar 2, 2021
4c05ec2
fix FromDatastore to not do double adapting
vyzo Mar 2, 2021
06d8ea1
batch delete during the cold purge
vyzo Mar 2, 2021
006c55a
add startup log
vyzo Mar 2, 2021
70ebb2a
improve startup log
vyzo Mar 2, 2021
d2d0980
don't delete in one giant batch, use smaller chunks of batchSize
vyzo Mar 2, 2021
6b8c60a
don't ID wrap the hotstore
vyzo Mar 2, 2021
6b680d1
do tracker purge in smaller batches
vyzo Mar 2, 2021
11b2f41
overestimate markSetSize a bit
vyzo Mar 3, 2021
47d8c87
fix log
vyzo Mar 3, 2021
508fcb9
properly close snoop at shutdown
vyzo Mar 3, 2021
fdd8775
walk at boundary epoch, 2 finalities from current epoch, to find live…
vyzo Mar 3, 2021
98a7b88
implement DeleteMany in union blockstore
vyzo Mar 3, 2021
5fb6a90
fix loop condition in batch deletion
vyzo Mar 3, 2021
aff0f1e
deduplicate code for batch deletion
vyzo Mar 3, 2021
17be7d3
save markSetSize
vyzo Mar 5, 2021
9bd009d
use atomics to demarcate critical section and limit close delay
vyzo Mar 5, 2021
c58df3f
don't panic on compaction errors
vyzo Mar 5, 2021
99d2157
remove DEBUG log spam
vyzo Mar 5, 2021
2b32c2e
add some metrics
vyzo Mar 5, 2021
0a2f2cf
use the right condition for triggering the miss metric
vyzo Mar 5, 2021
09f5ba1
add splitstore unit test
vyzo Mar 5, 2021
e85391b
quiet stupid linter
vyzo Mar 5, 2021
8562a9b
garbage collect hotstore after compaction
vyzo Mar 8, 2021
52de95d
also gc in compactFull, not just compactSimple
vyzo Mar 8, 2021
3d1b855
rename GC to CollectGarbage, ignore badger.ErrNoRewrite
vyzo Mar 8, 2021
3bd7770
deduplicate code
vyzo Mar 8, 2021
c52dae4
Merge pull request #5744 from filecoin-project/feat/splitstore-gc
vyzo Mar 8, 2021
90741da
tune badger gc to repeatedly gc the value log until there is no rewrite
vyzo Mar 8, 2021
51ed4c7
Merge pull request #5745 from filecoin-project/feat/splitstore-gc-tuning
magik6k Mar 8, 2021
57 changes: 57 additions & 0 deletions blockstore/badger/blockstore.go
@@ -131,6 +131,25 @@ func (b *Blockstore) Close() error {
	return b.DB.Close()
}

// CollectGarbage runs garbage collection on the value log
func (b *Blockstore) CollectGarbage() error {
	if atomic.LoadInt64(&b.state) != stateOpen {
		return ErrBlockstoreClosed
	}

	var err error
	for err == nil {
		err = b.DB.RunValueLogGC(0.125)
	}

	if err == badger.ErrNoRewrite {
		// not really an error in this case
		return nil
	}

	return err
}

// View implements blockstore.Viewer, which leverages zero-copy read-only
// access to values.
func (b *Blockstore) View(cid cid.Cid, fn func([]byte) error) error {
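Note on the CollectGarbage hunk above: the loop keeps calling the value log GC until a pass reports that nothing was rewritten, and treats that final report as success. A minimal sketch of that control flow follows, using a fake database so it runs without badger. The names `valueLogGC`, `fakeDB`, and `errNoRewrite` are hypothetical stand-ins and are not part of this PR or of badger.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoRewrite is a hypothetical stand-in for badger.ErrNoRewrite.
var errNoRewrite = errors.New("value log GC did not rewrite anything")

// valueLogGC is a hypothetical one-method view of the database.
type valueLogGC interface {
	RunValueLogGC(discardRatio float64) error
}

// fakeDB pretends that the first `rewritable` GC passes each rewrite a file
// and that every later pass finds nothing to do.
type fakeDB struct {
	rewritable int
	calls      int
}

func (f *fakeDB) RunValueLogGC(discardRatio float64) error {
	f.calls++
	if f.calls > f.rewritable {
		return errNoRewrite
	}
	return nil
}

// collectGarbage repeats GC passes until one fails. A "no rewrite" failure
// only means there is nothing left to collect, so it is reported as success;
// any other error is returned to the caller.
func collectGarbage(db valueLogGC, ratio float64) error {
	var err error
	for err == nil {
		err = db.RunValueLogGC(ratio)
	}
	if errors.Is(err, errNoRewrite) {
		return nil
	}
	return err
}

func main() {
	db := &fakeDB{rewritable: 3}
	err := collectGarbage(db, 0.125)
	fmt.Println("passes:", db.calls, "err:", err)
}
```

With three rewritable passes, the loop makes a fourth call that ends it, so the number of calls is always one more than the number of successful passes.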
@@ -318,6 +337,44 @@ func (b *Blockstore) DeleteBlock(cid cid.Cid) error {
	})
}

func (b *Blockstore) DeleteMany(cids []cid.Cid) error {
	if atomic.LoadInt64(&b.state) != stateOpen {
		return ErrBlockstoreClosed
	}

	batch := b.DB.NewWriteBatch()
	defer batch.Cancel()

	// toReturn tracks the byte slices to return to the pool, if we're using key
	// prefixing. we can't return each slice to the pool after each Set, because
	// badger holds on to the slice.
	var toReturn [][]byte
	if b.prefixing {
		toReturn = make([][]byte, 0, len(cids))
		defer func() {
			for _, b := range toReturn {
				KeyPool.Put(b)
			}
		}()
	}

	for _, cid := range cids {
		k, pooled := b.PooledStorageKey(cid)
		if pooled {
			toReturn = append(toReturn, k)
		}
		if err := batch.Delete(k); err != nil {
			return err
		}
	}

	err := batch.Flush()
	if err != nil {
		err = fmt.Errorf("failed to delete blocks from badger blockstore: %w", err)
	}
	return err
}

// AllKeysChan implements Blockstore.AllKeysChan.
func (b *Blockstore) AllKeysChan(ctx context.Context) (<-chan cid.Cid, error) {
	if atomic.LoadInt64(&b.state) != stateOpen {
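Note on the DeleteMany hunk above: the comment explains that pooled key slices cannot go back to the pool right after each batch operation, because the batch still holds them until it is flushed. The sketch below illustrates that ordering with a fake batch and a `sync.Pool`. Everything here (`keyPool`, `fakeBatch`, `deleteMany`, the string keys) is a hypothetical simplification and not the PR's code.

```go
package main

import (
	"fmt"
	"sync"
)

// keyPool is a hypothetical pool of reusable key buffers.
var keyPool = sync.Pool{New: func() interface{} { return make([]byte, 0, 64) }}

// fakeBatch stands in for a write batch that retains the key slices it is
// given until Flush is called.
type fakeBatch struct {
	pending [][]byte
	store   map[string]bool
}

func (b *fakeBatch) Delete(k []byte) { b.pending = append(b.pending, k) }

func (b *fakeBatch) Flush() {
	for _, k := range b.pending {
		delete(b.store, string(k))
	}
	b.pending = nil
}

// deleteMany builds prefixed keys in pooled buffers and queues them for
// deletion. The buffers are handed back only when the function returns, after
// Flush, because the batch still references them. Returning a buffer earlier
// would let a later Get reuse and overwrite a key that is still queued.
func deleteMany(store map[string]bool, prefix string, names []string) {
	batch := &fakeBatch{store: store}

	toReturn := make([][]byte, 0, len(names))
	defer func() {
		for _, k := range toReturn {
			keyPool.Put(k[:0])
		}
	}()

	for _, n := range names {
		k := keyPool.Get().([]byte)
		k = append(k, prefix...)
		k = append(k, n...)
		toReturn = append(toReturn, k)
		batch.Delete(k)
	}

	batch.Flush()
}

func main() {
	store := map[string]bool{"/blocks/a": true, "/blocks/b": true, "/blocks/c": true}
	deleteMany(store, "/blocks/", []string{"a", "b"})
	fmt.Println(len(store), store["/blocks/c"])
}
```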
33 changes: 31 additions & 2 deletions blockstore/blockstore.go
@@ -1,7 +1,7 @@
package blockstore

import (
-	"github.com/ipfs/go-cid"
	cid "github.com/ipfs/go-cid"
	ds "github.com/ipfs/go-datastore"
	logging "github.com/ipfs/go-log/v2"

@@ -18,20 +18,38 @@ var ErrNotFound = blockstore.ErrNotFound
type Blockstore interface {
	blockstore.Blockstore
	blockstore.Viewer
	BatchDeleter
}

// BasicBlockstore is an alias to the original IPFS Blockstore.
type BasicBlockstore = blockstore.Blockstore

type Viewer = blockstore.Viewer

type BatchDeleter interface {
	DeleteMany(cids []cid.Cid) error
}

// WrapIDStore wraps the underlying blockstore in an "identity" blockstore.
// The ID store filters out all puts for blocks with CIDs using the "identity"
// hash function. It also extracts inlined blocks from CIDs using the identity
// hash function and returns them on get/has, ignoring the contents of the
// blockstore.
func WrapIDStore(bstore blockstore.Blockstore) Blockstore {
-	return blockstore.NewIdStore(bstore).(Blockstore)
	if is, ok := bstore.(*idstore); ok {
		// already wrapped
		return is
	}

	if bs, ok := bstore.(Blockstore); ok {
		// we need to wrap our own because we don't want to neuter the DeleteMany method
		// the underlying blockstore has implemented an (efficient) DeleteMany
		return NewIDStore(bs)
	}

	// The underlying blockstore does not implement DeleteMany, so we need to shim it.
	// This is less efficient as it'll iterate and perform single deletes.
	return NewIDStore(Adapt(bstore))
}

// FromDatastore creates a new blockstore backed by the given datastore.
@@ -53,6 +71,17 @@ func (a *adaptedBlockstore) View(cid cid.Cid, callback func([]byte) error) error
	return callback(blk.RawData())
}

func (a *adaptedBlockstore) DeleteMany(cids []cid.Cid) error {
	for _, cid := range cids {
		err := a.DeleteBlock(cid)
		if err != nil {
			return err
		}
	}

	return nil
}

// Adapt adapts a standard blockstore to a Lotus blockstore by
// enriching it with the extra methods that Lotus requires (e.g. View, Sync).
//
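Note on the WrapIDStore and Adapt hunks above: the pattern is to keep a store's own batch delete when it has one, and otherwise wrap it in a shim that loops over single deletes. A minimal sketch of that interface upgrade follows, with string keys instead of CIDs. The types `basicStore`, `batchStore`, `mapStore`, `batchMapStore`, `adapted`, and the function `wrap` are all hypothetical and only illustrate the type-assertion fallback.

```go
package main

import "fmt"

// basicStore only knows how to delete one key at a time.
type basicStore interface {
	Delete(key string) error
}

// batchStore additionally offers a batch delete.
type batchStore interface {
	basicStore
	DeleteMany(keys []string) error
}

// mapStore is a basic store without a batch delete of its own.
type mapStore map[string]bool

func (m mapStore) Delete(k string) error { delete(m, k); return nil }

// batchMapStore has a native batch delete and counts how often it is used.
type batchMapStore struct {
	mapStore
	batches int
}

func (b *batchMapStore) DeleteMany(keys []string) error {
	b.batches++
	for _, k := range keys {
		delete(b.mapStore, k)
	}
	return nil
}

// adapted shims DeleteMany on top of single deletes. It is less efficient
// than a native batch because it issues one delete per key.
type adapted struct{ basicStore }

func (a adapted) DeleteMany(keys []string) error {
	for _, k := range keys {
		if err := a.Delete(k); err != nil {
			return err
		}
	}
	return nil
}

// wrap returns the store unchanged when it already supports batch deletes,
// and falls back to the shim otherwise.
func wrap(s basicStore) batchStore {
	if bs, ok := s.(batchStore); ok {
		return bs
	}
	return adapted{s}
}

func main() {
	plain := mapStore{"a": true, "b": true}
	_ = wrap(plain).DeleteMany([]string{"a", "b"})

	native := &batchMapStore{mapStore: mapStore{"x": true}}
	_ = wrap(native).DeleteMany([]string{"x"})

	fmt.Println(len(plain), len(native.mapStore), native.batches)
}
```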
8 changes: 8 additions & 0 deletions blockstore/buffered.go
@@ -96,6 +96,14 @@ func (bs *BufferedBlockstore) DeleteBlock(c cid.Cid) error {
	return bs.write.DeleteBlock(c)
}

func (bs *BufferedBlockstore) DeleteMany(cids []cid.Cid) error {
	if err := bs.read.DeleteMany(cids); err != nil {
		return err
	}

	return bs.write.DeleteMany(cids)
}

func (bs *BufferedBlockstore) View(c cid.Cid, callback func([]byte) error) error {
	// both stores are viewable.
	if err := bs.write.View(c, callback); err == ErrNotFound {
174 changes: 174 additions & 0 deletions blockstore/idstore.go
@@ -0,0 +1,174 @@
package blockstore

import (
	"context"
	"io"

	"golang.org/x/xerrors"

	blocks "github.com/ipfs/go-block-format"
	cid "github.com/ipfs/go-cid"
	mh "github.com/multiformats/go-multihash"
)

var _ Blockstore = (*idstore)(nil)

type idstore struct {
	bs Blockstore
}

func NewIDStore(bs Blockstore) Blockstore {
	return &idstore{bs: bs}
}

func decodeCid(cid cid.Cid) (inline bool, data []byte, err error) {
	if cid.Prefix().MhType != mh.IDENTITY {
		return false, nil, nil
	}

	dmh, err := mh.Decode(cid.Hash())
	if err != nil {
		return false, nil, err
	}

	if dmh.Code == mh.IDENTITY {
		return true, dmh.Digest, nil
	}

	return false, nil, err
}

func (b *idstore) Has(cid cid.Cid) (bool, error) {
	inline, _, err := decodeCid(cid)
	if err != nil {
		return false, xerrors.Errorf("error decoding Cid: %w", err)
	}

	if inline {
		return true, nil
	}

	return b.bs.Has(cid)
}

func (b *idstore) Get(cid cid.Cid) (blocks.Block, error) {
	inline, data, err := decodeCid(cid)
	if err != nil {
		return nil, xerrors.Errorf("error decoding Cid: %w", err)
	}

	if inline {
		return blocks.NewBlockWithCid(data, cid)
	}

	return b.bs.Get(cid)
}

func (b *idstore) GetSize(cid cid.Cid) (int, error) {
	inline, data, err := decodeCid(cid)
	if err != nil {
		return 0, xerrors.Errorf("error decoding Cid: %w", err)
	}

	if inline {
		return len(data), err
	}

	return b.bs.GetSize(cid)
}

func (b *idstore) View(cid cid.Cid, cb func([]byte) error) error {
	inline, data, err := decodeCid(cid)
	if err != nil {
		return xerrors.Errorf("error decoding Cid: %w", err)
	}

	if inline {
		return cb(data)
	}

	return b.bs.View(cid, cb)
}

func (b *idstore) Put(blk blocks.Block) error {
	inline, _, err := decodeCid(blk.Cid())
	if err != nil {
		return xerrors.Errorf("error decoding Cid: %w", err)
	}

	if inline {
		return nil
	}

	return b.bs.Put(blk)
}

func (b *idstore) PutMany(blks []blocks.Block) error {
	toPut := make([]blocks.Block, 0, len(blks))
	for _, blk := range blks {
		inline, _, err := decodeCid(blk.Cid())
		if err != nil {
			return xerrors.Errorf("error decoding Cid: %w", err)
		}

		if inline {
			continue
		}
		toPut = append(toPut, blk)
	}

	if len(toPut) > 0 {
		return b.bs.PutMany(toPut)
	}

	return nil
}

func (b *idstore) DeleteBlock(cid cid.Cid) error {
	inline, _, err := decodeCid(cid)
	if err != nil {
		return xerrors.Errorf("error decoding Cid: %w", err)
	}

	if inline {
		return nil
	}

	return b.bs.DeleteBlock(cid)
}

func (b *idstore) DeleteMany(cids []cid.Cid) error {
	toDelete := make([]cid.Cid, 0, len(cids))
	for _, cid := range cids {
		inline, _, err := decodeCid(cid)
		if err != nil {
			return xerrors.Errorf("error decoding Cid: %w", err)
		}

		if inline {
			continue
		}
		toDelete = append(toDelete, cid)
	}

	if len(toDelete) > 0 {
		return b.bs.DeleteMany(toDelete)
	}

	return nil
}

func (b *idstore) AllKeysChan(ctx context.Context) (<-chan cid.Cid, error) {
	return b.bs.AllKeysChan(ctx)
}

func (b *idstore) HashOnRead(enabled bool) {
	b.bs.HashOnRead(enabled)
}

func (b *idstore) Close() error {
	if c, ok := b.bs.(io.Closer); ok {
		return c.Close()
	}
	return nil
}
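Note on idstore.go above: every method first asks decodeCid whether the block is inlined in the CID itself, which is the case when the multihash uses the identity function (code 0x00) and the "digest" is the raw block data. The sketch below decodes only the multihash portion by hand with the standard library, to show what "inline" means at the byte level. It is a simplification under stated assumptions: `decodeInline` is a hypothetical helper, it does not parse a full CID, and it is not the go-multihash API used in the PR.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// identityCode is the multihash code of the identity function.
const identityCode = 0x00

// decodeInline parses a raw multihash laid out as
// <varint code><varint length><digest> and reports whether it is an identity
// multihash. For identity multihashes the digest is the block data itself, so
// no store lookup is needed.
func decodeInline(mhBytes []byte) (inline bool, data []byte, err error) {
	code, n := binary.Uvarint(mhBytes)
	if n <= 0 {
		return false, nil, errors.New("malformed multihash: bad code varint")
	}

	rest := mhBytes[n:]
	length, m := binary.Uvarint(rest)
	if m <= 0 {
		return false, nil, errors.New("malformed multihash: bad length varint")
	}

	digest := rest[m:]
	if uint64(len(digest)) != length {
		return false, nil, errors.New("malformed multihash: digest length mismatch")
	}

	if code != identityCode {
		// A real hash: the data lives in the underlying store.
		return false, nil, nil
	}
	return true, digest, nil
}

func main() {
	inline, data, err := decodeInline([]byte{0x00, 0x03, 'a', 'b', 'c'})
	fmt.Println(inline, string(data), err)
}
```

A wrapper in the style of idstore would answer Has, Get, GetSize, and View from `data` when `inline` is true, skip Put and Delete for such keys, and delegate everything else to the wrapped store.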
7 changes: 7 additions & 0 deletions blockstore/mem.go
@@ -20,6 +20,13 @@ func (m MemBlockstore) DeleteBlock(k cid.Cid) error {
	return nil
}

func (m MemBlockstore) DeleteMany(ks []cid.Cid) error {
	for _, k := range ks {
		delete(m, k)
	}
	return nil
}

func (m MemBlockstore) Has(k cid.Cid) (bool, error) {
	_, ok := m[k]
	return ok, nil
38 changes: 38 additions & 0 deletions blockstore/splitstore/markset.go
@@ -0,0 +1,38 @@
package splitstore

import (
	"path/filepath"

	"golang.org/x/xerrors"

	cid "github.com/ipfs/go-cid"
)

// MarkSet is a utility to keep track of seen CIDs, and later query for them.
//
// * If the expected dataset is large, it can be backed by a datastore (e.g. bbolt).
// * If a probabilistic result is acceptable, it can be backed by a bloom filter (default).
type MarkSet interface {
	Mark(cid.Cid) error
	Has(cid.Cid) (bool, error)
	Close() error
}

// markBytes is deliberately a non-nil empty byte slice for serialization.
var markBytes = []byte{}

type MarkSetEnv interface {
	Create(name string, sizeHint int64) (MarkSet, error)
	Close() error
}

func OpenMarkSetEnv(path string, mtype string) (MarkSetEnv, error) {
	switch mtype {
	case "", "bloom":
		return NewBloomMarkSetEnv()
	case "bolt":
		return NewBoltMarkSetEnv(filepath.Join(path, "markset.bolt"))
	default:
		return nil, xerrors.Errorf("unknown mark set type %s", mtype)
	}
}
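Note on markset.go above: the default mark set is a bloom filter, and the commit list mentions a random salt from crypto/rand and sha256 rehashing of salted keys. The PR's bloom implementation is not shown in this part of the diff, so the following is only a sketch of how a salted bloom filter could satisfy the Mark/Has/Close shape. It uses byte-slice keys instead of CIDs, and the type `bloomMarkSet`, its fields, and the double-hashing probe scheme are assumptions made for illustration.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// bloomMarkSet is a hypothetical salted bloom filter with Mark/Has/Close.
type bloomMarkSet struct {
	salt []byte
	bits []uint64
	m    uint64 // number of bits in the filter
	k    int    // number of probes per key
}

func newBloomMarkSet(mBits uint64, k int) (*bloomMarkSet, error) {
	if mBits == 0 {
		mBits = 64
	}
	if k < 1 {
		k = 1
	}
	salt := make([]byte, 4)
	if _, err := rand.Read(salt); err != nil {
		return nil, err
	}
	words := (mBits + 63) / 64
	return &bloomMarkSet{salt: salt, bits: make([]uint64, words), m: words * 64, k: k}, nil
}

// probes hashes salt||key with sha256 and derives k bit positions from two
// halves of the digest (double hashing).
func (s *bloomMarkSet) probes(key []byte) []uint64 {
	h := sha256.New()
	h.Write(s.salt)
	h.Write(key)
	sum := h.Sum(nil)

	h1 := binary.BigEndian.Uint64(sum[0:8])
	h2 := binary.BigEndian.Uint64(sum[8:16]) | 1
	out := make([]uint64, s.k)
	for i := range out {
		out[i] = (h1 + uint64(i)*h2) % s.m
	}
	return out
}

func (s *bloomMarkSet) Mark(key []byte) error {
	for _, p := range s.probes(key) {
		s.bits[p/64] |= uint64(1) << (p % 64)
	}
	return nil
}

// Has never returns false for a marked key, but it may return true for a key
// that was never marked (a false positive).
func (s *bloomMarkSet) Has(key []byte) (bool, error) {
	for _, p := range s.probes(key) {
		if s.bits[p/64]&(uint64(1)<<(p%64)) == 0 {
			return false, nil
		}
	}
	return true, nil
}

func (s *bloomMarkSet) Close() error { return nil }

func main() {
	s, err := newBloomMarkSet(1<<16, 7)
	if err != nil {
		panic(err)
	}
	defer s.Close()

	_ = s.Mark([]byte("block-1"))
	has, _ := s.Has([]byte("block-1"))
	fmt.Println("marked key found:", has)
}
```

The trade-off matches the MarkSet comment: a false positive only causes an unreachable object to be treated as live, which costs space but never drops a marked object. When exact answers are needed, a datastore-backed set such as the bolt variant is the alternative.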