
use cache for storing block offsets #1336

Merged: 8 commits, May 28, 2020
Conversation

@poonai (Contributor) commented May 21, 2020

Signed-off-by: Tiger [email protected]



@jarifibrahim (Contributor) left a comment

Good job, @balajijinnah. I have a few comments.

Reviewable status: 0 of 4 files reviewed, 3 unresolved discussions (waiting on @ashish-goswami, @balajijinnah, @jarifibrahim, and @manishrjain)


table/table.go, line 101 at r1 (raw file):

	bfLock    sync.Mutex

	blockIndex []*pb.BlockOffset

We should keep the blockIndex in memory and keep index in cache based on the options. For instance, bloom filters will be kept in memory or in the cache based on the option.

We need support for both.


table/table.go, line 444 at r1 (raw file):

		// Set the block index in the cache.
		key := t.blockIndexCacheKey(idx)
		t.opt.Cache.Set(key, ko, calculateOffsetSize(ko))

We're inserting a single entry from the index in the cache. I think inserting the entire index in the cache would be better.

When someone is iterating over a table, every call to block(idx) function would call readIndex (which would read and unmarshal the index). Instead, if we push the entire index in the cache, it would be faster to use it.

Also, each unmarshal index call would take up some memory. Keeping the entire index in the cache would also help in reducing this memory as well.
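The trade-off above can be sketched with a toy cache: store the whole unmarshalled index under a single key so that readIndex runs once per table rather than once per block(idx) call. The BlockOffset type, the offsets helper, and the map-backed cache below are illustrative stand-ins, not badger's actual types:

```go
package main

import "fmt"

// BlockOffset stands in for pb.BlockOffset; only the fields needed
// for the sketch are included (hypothetical layout).
type BlockOffset struct {
	Key    []byte
	Offset uint32
	Len    uint32
}

// indexCache stores the whole []*BlockOffset under one key, so a table
// iteration unmarshals the index once instead of once per block() call.
var indexCache = map[uint64][]*BlockOffset{}

// offsets returns the cached index for a table, reading and storing it
// on the first miss.
func offsets(tableID uint64, readIndex func() []*BlockOffset) []*BlockOffset {
	if idx, ok := indexCache[tableID]; ok {
		return idx // cache hit: no re-read, no re-unmarshal
	}
	idx := readIndex()
	indexCache[tableID] = idx
	return idx
}

func main() {
	reads := 0
	read := func() []*BlockOffset {
		reads++
		return []*BlockOffset{{Key: []byte("k"), Offset: 0, Len: 10}}
	}
	offsets(1, read)
	offsets(1, read)
	fmt.Println(reads) // the index was read only once
}
```

With per-entry caching, each miss would pay the full readIndex cost again; caching the whole slice amortizes one unmarshal across every block lookup on the table.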


table/table_test.go, line 960 at r1 (raw file):

// Use this benchmark to manually verify block offset size calculation
func BenchmarkBlockOffsetSizeCalculation(b *testing.B) {

What does this benchmark do? I see it is only creating a new blockOffset. Is it supposed to benchmark calculateOffsetSize?

Signed-off-by: Tiger <[email protected]>
@poonai (Contributor, Author) left a comment

Reviewable status: 0 of 5 files reviewed, 3 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, and @manishrjain)


table/table.go, line 101 at r1 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

We should keep the blockIndex in memory and keep index in cache based on the options. For instance, bloom filters will be kept in memory or in the cache based on the option.

We need support for both.

Done.


table/table.go, line 444 at r1 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

We're inserting a single entry from the index in the cache. I think inserting the entire index in the cache would be better.

When someone is iterating over a table, every call to block(idx) function would call readIndex (which would read and unmarshal the index). Instead, if we push the entire index in the cache, it would be faster to use it.

Also, each unmarshal index call would take up some memory. Keeping the entire index in the cache would also help in reducing this memory as well.

Done.


table/table_test.go, line 960 at r1 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

What does this benchmark do? I see it is only creating a new blockOffset. Is it supposed to benchmark calculateOffsetSize?

This is just to manually verify the actual size.
When you run it, you'll get output like this:

BenchmarkBlockOffsetSizeCalculation-12    	24037693	        82.6 ns/op	      66 B/op	       2 allocs/op

From the 66 B/op column you can see that the blockOffset struct takes 66 bytes.
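The same kind of measurement can be reproduced with testing.Benchmark. The blockOffset type below only mimics the field shapes (the real pb.BlockOffset is a generated struct with more fields), so the absolute B/op and allocs/op will differ from the output above:

```go
package main

import (
	"fmt"
	"testing"
)

// blockOffset is an illustrative stand-in for pb.BlockOffset.
type blockOffset struct {
	Key         []byte
	Offset, Len uint32
}

var sink *blockOffset // forces the allocation to escape to the heap

// allocsPerOffset measures how many heap allocations one blockOffset
// (struct plus its key slice) costs per iteration.
func allocsPerOffset() int64 {
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = &blockOffset{Key: make([]byte, 10)}
		}
	})
	return res.AllocsPerOp()
}

func main() {
	// One alloc for the struct, one for the key slice.
	fmt.Println(allocsPerOffset())
}
```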

@jarifibrahim (Contributor) left a comment

Reviewable status: 0 of 6 files reviewed, 5 unresolved discussions (waiting on @ashish-goswami, @balajijinnah, and @manishrjain)


options.go, line 106 at r3 (raw file):

	maxBatchSize  int64 // max batch size in bytes

	KeepBlockOffsetsInCache bool // if KeepBlockOffsetsInCache is true and cache is enabled then,

Maybe call it KeepIndexInCache?
Nit: This should be declared above the Flags for testing purposes section.


options.go, line 645 at r3 (raw file):

// Badger uses block offsets to find the blocks in the sst table. Block offsets also contain
// the starting key of each block. Badger uses those keys to find the right block for efficient
// key iteration. This option is used to reduce the in-memory usage of block offsets.

Please add the following line

When this option is set badger will store the block offsets in a cache along with
the blocks. The size of the cache is determined by the MaxBlockCacheSize option.
When indices are stored in the cache, the read performance might be affected but 
the cache limits the amount of memory used by the indices.

options.go, line 649 at r3 (raw file):

// The default value of KeepBlockOffsetInCache is false.
func (opt Options) WithKeepBlockOffsetsInCache(val bool) Options {
	opt.KeepBlockOffsetsInCache = val

I think we will need another option for the blocks. Let's say someone wants to keep indices in the cache but they don't want to keep the blocks in the cache because they're not using compression/encryption.
So if they do the following

KeepBlockOffsetsInCache = true
MaxBlockCache = 100 << 20

We will end up storing the blocks also in the cache which would be a waste of the cache space and will affect the read performance.

I think we should have something like

KeepBlocksInCache     bool
KeepIndicesInCache    bool
MaxCacheSize          int

So that the users can have any combination of block/index cache.
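As a sketch, the split proposed above could look like this; the field names follow the comment and may not match what finally landed in badger:

```go
package main

import "fmt"

// Options sketches the proposed split: blocks and indices can be cached
// independently, sharing one size budget. Field names are illustrative.
type Options struct {
	KeepBlocksInCache  bool
	KeepIndicesInCache bool
	MaxCacheSize       int64 // shared budget for whatever is cached
}

func main() {
	// Indices in the cache, blocks in memory: the combination wanted by
	// users who run without compression or encryption.
	opt := Options{KeepIndicesInCache: true, MaxCacheSize: 100 << 20}
	fmt.Println(opt.KeepBlocksInCache, opt.KeepIndicesInCache, opt.MaxCacheSize)
}
```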


table/table.go, line 456 at r2 (raw file):

	val, ok := t.opt.Cache.Get(t.blockOffsetsCacheKey())
	if ok {

You might want to check if val is nil before we do type assertion. Ristretto documentation says

The value can be nil and the boolean can be true at the same time.

https://godoc.org/github.com/dgraph-io/ristretto#Cache.Get

if val, ok := ...; ok && val != nil { ... }
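Expanded into a runnable sketch, with a toy map-backed cache standing in for ristretto (whose Get, per the docs quoted above, can return ok == true while the stored value is nil):

```go
package main

import "fmt"

// cache mimics ristretto's Get contract: a stored nil value comes back
// with ok == true, so callers must nil-check before type-asserting.
type cache struct{ m map[string]interface{} }

func (c *cache) Get(key string) (interface{}, bool) {
	v, ok := c.m[key]
	return v, ok
}

// loadIndex guards against both a miss and the nil-with-ok case before
// doing the type assertion.
func loadIndex(c *cache, key string) ([]int, bool) {
	val, ok := c.Get(key)
	if !ok || val == nil {
		return nil, false
	}
	idx, ok := val.([]int) // safe to assert now
	return idx, ok
}

func main() {
	c := &cache{m: map[string]interface{}{"a": []int{1, 2}, "b": nil}}
	_, okA := loadIndex(c, "a")
	_, okB := loadIndex(c, "b")
	fmt.Println(okA, okB)
}
```

Without the nil check, `val.([]int)` on key "b" would not panic (a failed two-value assertion returns ok == false), but a single-value assertion there would, which is why the guard is worth making explicit.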

table/table.go, line 53 at r3 (raw file):

	1*8 + // offset and len takes 1 word
	3*8 + // XXX_unrecognized array takes 3 word.
	1*8 // so far 7 words, in order to round the slab we're adding one more word.

Nit: far => for

Signed-off-by: Tiger <[email protected]>
@poonai (Contributor, Author) left a comment

Reviewable status: 0 of 6 files reviewed, 5 unresolved discussions (waiting on @ashish-goswami, @balajijinnah, @jarifibrahim, and @manishrjain)


options.go, line 106 at r3 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

Maybe call it KeepIndexInCache?
Nit: This should be declared above the Flags for testing purposes section.

Done.


options.go, line 645 at r3 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

Please add the following line

When this option is set badger will store the block offsets in a cache along with
the blocks. The size of the cache is determined by the MaxBlockCacheSize option.
When indices are stored in the cache, the read performance might be affected but 
the cache limits the amount of memory used by the indices.

Done.


options.go, line 649 at r3 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

I think we will need another option for the blocks. Let's say someone wants to keep indices in the cache but they don't want to keep the blocks in the cache because they're not using compression/encryption.
So if they do the following

KeepBlockOffsetsInCache = true
MaxBlockCache = 100 << 20

We will end up storing the blocks also in the cache which would be a waste of the cache space and will affect the read performance.

I think we should have something like

KeepBlocksInCache     bool
KeepIndicesInCache    bool
MaxCacheSize          int

So that the users can have any combination of block/index cache.

Done.


table/table.go, line 456 at r2 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

You might want to check if val is nil before we do type assertion. Ristretto documentation says

The value can be nil and the boolean can be true at the same time.

https://godoc.org/github.com/dgraph-io/ristretto#Cache.Get

if val, ok := ...; ok && val != nil { ... }

Done.

@jarifibrahim (Contributor) left a comment

:lgtm: Good work @balajijinnah 👏

Reviewable status: 0 of 6 files reviewed, 3 unresolved discussions (waiting on @ashish-goswami, @balajijinnah, @jarifibrahim, and @manishrjain)


options.go, line 653 at r4 (raw file):

// the cache limits the amount of memory used by the indices.
//
// The default value of KeepBlockOffsetInCache is false.

We should mention that this option also sets maxCacheSize to 100mb.


options.go, line 667 at r4 (raw file):

//
// When this option is set badger will store the block in the cache. The default value of
// KeepBlocksInCache is false.

We should mention that this option also sets maxCacheSize to 100mb.

@manishrjain (Contributor) left a comment

I have comments. Get it reviewed by @vvbalaji-dgraph as well in the daily reviews.

Reviewable status: 0 of 6 files reviewed, 5 unresolved discussions (waiting on @ashish-goswami, @balajijinnah, @jarifibrahim, and @manishrjain)


options.go, line 166 at r4 (raw file):

		EncryptionKey:                 []byte{},
		EncryptionKeyRotationDuration: 10 * 24 * time.Hour, // Default 10 days.
		KeepBlocksInCache:             false,

Might be better to use a number of bytes as the value. That way, a user has better control over how many bytes they are willing to use for the cache. If it is zero, there is no cache; if it is higher, that many bytes are used.

Also, do we really need two different options? KeepBlocksInCache and then keep indices in cache? Can we bring it down to just one option?


table/table.go, line 620 at r4 (raw file):

func (t *Table) blockOffsetsCacheKey() []byte {
	y.AssertTrue(t.id < math.MaxUint32)
	buf := make([]byte, 4)

This can be optimized. You can allocate 6 bytes upfront, set the first two bytes to "bo", and then do the PutUint32.

@poonai (Contributor, Author) left a comment

Reviewable status: 0 of 6 files reviewed, 5 unresolved discussions (waiting on @ashish-goswami, @balajijinnah, @jarifibrahim, and @manishrjain)


options.go, line 653 at r4 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

We should mention this option also set maxCacheSize to 100mb.

Done.


options.go, line 667 at r4 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

We should mention this option also set maxCacheSize to 100mb.

Done.


table/table.go, line 620 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This can be optimized. You can allocate 6 bytes upfront. set the first two bytes to bo, and then do the put uint32.

Done.

@@ -45,6 +45,13 @@ import (
const fileSuffix = ".sst"
const intSize = int(unsafe.Sizeof(int(0)))

// 1 word = 8 bytes
// sizeOfOffsetStruct is the size of pb.BlockOffset
const sizeOfOffsetStruct int64 = 3*8 + // key array take 3 words


How safe/reliable is this computation?

@@ -157,19 +163,23 @@ func DefaultOptions(path string) Options {
LogRotatesToFlush: 2,
EncryptionKey: []byte{},
EncryptionKeyRotationDuration: 10 * 24 * time.Hour, // Default 10 days.
KeepBlocksInCache: false,
KeepBlockIndicesInCache: false,


Is it the same cache that is referenced in 21735af?

@vvbalaji-dgraph left a comment

:lgtm

@poonai (Contributor, Author) left a comment

Reviewable status: 0 of 6 files reviewed, 7 unresolved discussions (waiting on @ashish-goswami, @balajijinnah, @jarifibrahim, and @manishrjain)


options.go, line 166 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Might be better to use number of bytes as a value. That way, a user has better control over how many bytes they are willing to use for the cache. If it is zero, then no cache. If it is higher, then that.

Also, do we really need two different options? KeepBlocksInCache and then keep indices in cache? Can we bring it down to just one option?

Initially it was like that, then we changed it. People who don't use encryption or compression might still want to store their offsets in the cache.

poonai and others added 2 commits May 29, 2020 00:05
This PR adds 3 new tests, each taking about 2 minutes, so the test suite cannot complete even in 15 minutes.
@manishrjain (Contributor) commented May 29, 2020

Emm.. I'm not sure if my comments have been addressed. Two things:

  1. The cache size can be configured via options, instead of just asking for a true/false and setting it to 100mb.

  2. You missed my earlier point ("This can be optimized. You can allocate 6 bytes upfront, set the first two bytes to bo, and then do the put uint32."), not sure why. Doing the append at the end would create a new byte slice. You can avoid that by just allocating the 6 bytes, setting the first two to have those two letters, and then setting the integer in the last 4.

@vvbalaji-dgraph -- Please ensure that my comments get addressed before approving.

@poonai (Contributor, Author) commented May 29, 2020

@manishrjain
I've made the cap 6 bytes and the len 4. So I can put the number at the beginning and append the letters at the end. That hardcoded suffix will be part of the stack either way.

I did it this way to stay consistent with the other keys.

jarifibrahim pushed a commit that referenced this pull request Oct 2, 2020