splitstore sortless compaction #8008

vyzo · 2022-01-28T13:57:25Z

This implements sortless compaction, as outlined in #7137.
Closes #7137

The compaction algorithm is modified to store the coldset on disk and checkpoint deletions. Once the critical section starts, the markset is consulted on reads: an object is considered missing from the hotstore if it is not in the markset, while writes synchronously update the markset.
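
For readers unfamiliar with the critical-section rule, here is a minimal sketch of the read and write paths described above. It is not the code from this PR: the `markSet` and `store` types, the in-memory maps, and the `beginCriticalSection` helper are all hypothetical stand-ins, and the fall-through to the cold store on a hot miss is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// markSet is a hypothetical in-memory stand-in for the splitstore markset.
type markSet struct {
	mu   sync.RWMutex
	live map[string]struct{}
}

func newMarkSet() *markSet { return &markSet{live: make(map[string]struct{})} }

// Mark records key as live.
func (m *markSet) Mark(key string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.live[key] = struct{}{}
}

// Has reports whether key has been marked live.
func (m *markSet) Has(key string) bool {
	m.mu.RLock()
	defer m.mu.RUnlock()
	_, ok := m.live[key]
	return ok
}

// store is a toy hot/cold pair (hypothetical, for illustration only).
type store struct {
	mu       sync.RWMutex
	hot      map[string][]byte
	cold     map[string][]byte
	marks    *markSet
	critical bool
}

func newStore() *store {
	return &store{
		hot:   make(map[string][]byte),
		cold:  make(map[string][]byte),
		marks: newMarkSet(),
	}
}

// beginCriticalSection switches reads and writes to markset-aware mode.
func (s *store) beginCriticalSection() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.critical = true
}

// Put writes to the hot map and, inside the critical section, marks the key
// before returning, so the write is never mistaken for a purge candidate.
func (s *store) Put(key string, data []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.hot[key] = data
	if s.critical {
		s.marks.Mark(key)
	}
}

// Get treats an unmarked key as missing from the hot map while the critical
// section is active. Falling through to cold is an assumption of this sketch.
func (s *store) Get(key string) ([]byte, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if !s.critical || s.marks.Has(key) {
		if data, ok := s.hot[key]; ok {
			return data, true
		}
	}
	data, ok := s.cold[key]
	return data, ok
}

func main() {
	s := newStore()
	s.hot["a"] = []byte("stale") // unmarked, so it may be purged at any moment
	s.cold["a"] = []byte("moved")
	s.beginCriticalSection()
	s.Put("b", []byte("fresh"))

	a, _ := s.Get("a")
	b, _ := s.Get("b")
	fmt.Printf("a=%s b=%s\n", a, b)
}
```

The point of the rule is that a reader never depends on a hot object that the purge is free to delete: anything unmarked is served as if it were already gone.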

This results in significantly faster compaction (40% of the time was spent in sorting).
In addition, coupled with the badger markset, compaction now uses very little memory, independent of coldset size, which makes it possible to use the splitstore in memory-constrained systems.

codecov · 2022-01-30T13:54:56Z

Codecov Report

Merging #8008 (e129ae3) into master (ff10e0e) will decrease coverage by 0.14%.
The diff coverage is 46.31%.

@@            Coverage Diff             @@
##           master    #8008      +/-   ##
==========================================
- Coverage   39.27%   39.12%   -0.15%     
==========================================
  Files         660      662       +2     
  Lines       71436    71922     +486     
==========================================
+ Hits        28054    28143      +89     
- Misses      38580    38887     +307     
- Partials     4802     4892      +90

Impacted Files	Coverage Δ
blockstore/splitstore/splitstore_check.go	`0.00% <0.00%> (ø)`
blockstore/splitstore/splitstore.go	`22.97% <2.17%> (-4.12%)`	⬇️
blockstore/splitstore/splitstore_compact.go	`41.36% <28.51%> (-11.08%)`	⬇️
blockstore/splitstore/markset_map.go	`61.74% <57.14%> (-22.04%)`	⬇️
blockstore/splitstore/checkpoint.go	`67.74% <67.74%> (ø)`
blockstore/splitstore/markset_badger.go	`71.06% <73.25%> (+2.99%)`	⬆️
blockstore/splitstore/coldset.go	`82.00% <82.00%> (ø)`
blockstore/splitstore/markset.go	`75.00% <100.00%> (ø)`
blockstore/splitstore/splitstore_warmup.go	`52.63% <100.00%> (ø)`
node/config/def.go	`97.46% <100.00%> (ø)`
... and 33 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ff10e0e...e129ae3.

magik6k

Looks simpler/easier to understand than the previous thing.

Just one tiny nit.

blockstore/splitstore/markset_badger.go

magik6k · 2022-02-09T17:17:52Z

⛵

vyzo added 18 commits January 28, 2022 15:41

refactor marksets for critical section on-disk persistence

45c2f34

add test for markset persistence

1bf396f

immediately flush pending writes when entering critical section

730acea

improve peristence test

67fbf9e

use temporary dir for splitstore test path

f9fd47e

add MarkMany to MarkSet interface

d140909

check for existence of badger db in recover

a4c1a34

moar markset tests

cf09dd0

make markSets synchronous in critical section

322b858

on disk checkpoints

c94eee5

checkpoint test

6ede77b

on disk coldsets

64cda4a

prettier checkpoint close

7233314

fix buffered reads

4b8369c

coldset test

a4f720d

sortless compaction

dbc8903

fix mockStore for splitstore tests

20b7502

fix lint

7931f1f

vyzo added 11 commits January 30, 2022 19:00

reinstante waitForMissingRefs

7b8447a

use both hot and cold when doing fetches for markset positive objects

a9d4495

account for missing refs in the markset in Has

1900c90

fix race in protectView

ee63be2

mark tipset references to protect them during critical section

c9bd5ec

fix comment

1abfc5b

recursively mark puts during the critical section

2b14bda

fix putmany marking

710fda4

avoid races in beginCriticalSection

5b9ea1b

hold the lock in the second protect call

877dfbe

use walkObjectIncomplete for marking live refs

7896af7

vyzo added 7 commits February 1, 2022 11:13

fix test

cd95892

improve robustness of waitForSync

9c92d77

unblock waitForSync on close

b13aa8f

fix comment

4b4104e

share a concurrent visitor between workers in markLiveRefs

c1d8368

badger markset option tweaks

75ad0c3

add note about compaction algorithm changes in README

049b489

vyzo requested a review from magik6k February 2, 2022 12:45

vyzo marked this pull request as ready for review February 2, 2022 12:46

vyzo requested a review from a team as a code owner February 2, 2022 12:46

vyzo changed the title ~~[WIP] splitstore sortless compaction~~ splitstore sortless compaction Feb 2, 2022

This was referenced Feb 4, 2022

Implement splitstore cold object reification #6914

Closed

splistore cold object reification redux #8029

Merged

vyzo added this to the Blockstore Improvements milestone Feb 4, 2022

vyzo added 2 commits February 6, 2022 11:21

make badger the default splitstore markset type

03352ea

update README for map as the default

d45e207

vyzo mentioned this pull request Feb 6, 2022

splitstore: set badger as the default default markset type #8034

Merged

vyzo added 2 commits February 6, 2022 12:28

make gen

1221c0b

moar make gen

0ad1f0e

This was referenced Feb 7, 2022

Issues with syncing from scratch and long resyncs in the splitstore #6769

Open

Splitstore: the road to production readiness #8037

Open

Merge pull request #8034 from filecoin-project/feat/splitstore-defaul…

966071d

…t-markset-badger splitstore: set badger as the default default markset type

travisperson mentioned this pull request Feb 7, 2022

Setup testing for vyzo filecoin-project/lotus-infra#546

Merged

update README

8ddf476

Adds note about 3k IOPs requirement with badger markset, updates the memory requirement for map to 48G based on observed behaviour of test nodes.

jennijuju added the P1 P1: Must be resolved label Feb 8, 2022

magik6k approved these changes Feb 9, 2022

blockstore/splitstore/markset_badger.go (outdated)

refactor nextBatch in badger markset

e129ae3

magik6k merged commit 44fd0e3 into master Feb 9, 2022

magik6k deleted the feat/splitstore-sortless-compaction branch February 9, 2022 17:17
