[dbnode] Use already encoded tags when writing time series to commit log #1898

robskillington · 2019-08-20T19:58:12Z

What this PR does / why we need it:

This improves raw commit log write speed by using the encoded tags from the RPC batch rather than encoding from the newly created series tags. This is achieved by extending the lifetime of the tags from the RPC write request.

It also adds a thriftBytesPoolAllocSize setting to the pooling config, which raises or lowers the size of the pooled bytes provided to thrift binary fields from the default value of 1024. This enables tuning for workloads with a lot of tags; in the future this should hopefully be more adaptive.
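
As a rough illustration of how an optional pooling setting like this could fall back to the default of 1024, here is a minimal sketch. The `poolingPolicy` struct and the `thriftBytesPoolAllocSizeOrDefault` helper are hypothetical names used for illustration only and are not taken from this PR.

```go
package main

import "fmt"

// defaultThriftBytesPoolAllocSize mirrors the default of 1024 mentioned above.
const defaultThriftBytesPoolAllocSize = 1024

// poolingPolicy is a hypothetical, trimmed-down stand-in for the pooling config.
type poolingPolicy struct {
	// ThriftBytesPoolAllocSize is optional; nil means "use the default".
	ThriftBytesPoolAllocSize *int
}

// thriftBytesPoolAllocSizeOrDefault is an assumed helper name, not part of the PR.
// It returns the configured value when set and the default otherwise.
func (p poolingPolicy) thriftBytesPoolAllocSizeOrDefault() int {
	if p.ThriftBytesPoolAllocSize != nil {
		return *p.ThriftBytesPoolAllocSize
	}
	return defaultThriftBytesPoolAllocSize
}

func main() {
	unset := poolingPolicy{}

	size := 4096 // example value for a workload with many tags
	tuned := poolingPolicy{ThriftBytesPoolAllocSize: &size}

	fmt.Println(unset.thriftBytesPoolAllocSizeOrDefault())
	fmt.Println(tuned.thriftBytesPoolAllocSizeOrDefault())
}
```

Using a pointer for the field lets an unset value be distinguished from an explicit zero, which matches the `*int` fields shown in the pooling config diff further down.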

Special notes for your reviewer:

Does this PR introduce a user-facing and/or backwards incompatible change?:

NONE

Does this PR require updating code package or user-facing documentation?:

NONE

richardartoul

LGTM minus the nits

richardartoul · 2019-08-21T13:45:04Z

glide.yaml

@@ -28,7 +28,7 @@ import:
    version: ^0.8

  - package: github.com/apache/thrift
-    version: 0.9.3-pool-read-binary-2
+    version: 0.9.3-pool-read-binary-3


Can we just merge this into master of our branch? Pretty confusing that we're pinned to a branch

Not opposed to it, problem is master is way ahead of where we are at, so we'd have to reset master all the way back to 0.9.3 on our fork which would be weird...

richardartoul · 2019-08-21T13:48:32Z

src/cmd/services/m3dbnode/config/pooling.go

@@ -252,6 +253,9 @@ type PoolingPolicy struct {
 	// The initial alloc size for a block.
 	BlockAllocSize *int `yaml:"blockAllocSize"`

+	// The thrift bytes pool max bytes slice allocation for a single binary field.
+	ThriftBytesPoolMaxAllocSize *int `yaml:"thriftBytesPoolMaxAllocSize"`


It's not really a max, is it? It's always this size?

The capacity is always this size, but the actual length/size is determined by the length of the bytes being copied.

Since we suffix it with "Alloc" I'm not opposed to just calling it AllocSize and dropping the Max.
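
The capacity versus length distinction discussed in this thread can be sketched with a plain Go slice. This is only an illustration under assumptions: `copyIntoPooled` is a hypothetical helper, and the real pooling lives in the forked thrift library, which is not reproduced here.

```go
package main

import "fmt"

// copyIntoPooled is a hypothetical helper: it copies src into a buffer taken
// from a pool, reusing the buffer's existing capacity when src fits.
func copyIntoPooled(buf, src []byte) []byte {
	// Reset the length to zero, keep the capacity, then copy the bytes in.
	return append(buf[:0], src...)
}

func main() {
	const allocSize = 1024 // the alloc size the pool was configured with

	pooled := make([]byte, 0, allocSize)
	tags := copyIntoPooled(pooled, []byte("city=sf"))

	// The length follows the copied bytes, the capacity follows the alloc size.
	fmt.Println(len(tags), cap(tags)) // prints "7 1024"
}
```

If the copied bytes exceed the capacity, `append` allocates a larger backing array, so in this sketch the configured size acts as the initial allocation, not a hard limit.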

richardartoul · 2019-08-21T13:49:56Z

src/dbnode/network/server/tchannelthrift/node/service.go

@@ -1764,7 +1769,11 @@ func (r *writeBatchPooledReq) Finalize() {
 	if r.writeTaggedReq != nil {
 		for _, elem := range r.writeTaggedReq.Elements {
 			apachethrift.BytesPoolPut(elem.ID)
-			apachethrift.BytesPoolPut(elem.EncodedTags)
+			// Ownership of the encoded tagts has been transferred to the BatchWriter


tagts -> tags
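
The ownership transfer described in the diff comment above can be sketched as follows. All types here (`bytesPool`, `batchWriter`, `request`) are hypothetical stand-ins, not the actual m3 or thrift types; the point is only that the request returns the ID bytes to the pool itself, while the encoded tags are returned later by whoever took ownership of them.

```go
package main

import "fmt"

// bytesPool is a hypothetical pool that only counts returned buffers.
type bytesPool struct{ returned int }

func (p *bytesPool) Put(b []byte) {
	if b != nil {
		p.returned++
	}
}

// element is a hypothetical write element carrying pooled byte slices.
type element struct {
	ID          []byte
	EncodedTags []byte
}

// batchWriter is a hypothetical owner of encoded tags after the hand-off.
type batchWriter struct {
	pool  *bytesPool
	owned [][]byte
}

// TakeEncodedTags records that the writer is now responsible for b.
func (w *batchWriter) TakeEncodedTags(b []byte) { w.owned = append(w.owned, b) }

// Finalize returns every owned buffer to the pool once the writer is done.
func (w *batchWriter) Finalize() {
	for _, b := range w.owned {
		w.pool.Put(b)
	}
	w.owned = w.owned[:0]
}

// request is a hypothetical pooled RPC request.
type request struct {
	pool     *bytesPool
	elements []element
}

// Finalize returns the IDs to the pool but only drops its reference to the
// encoded tags, because their ownership was transferred to the batch writer.
func (r *request) Finalize() {
	for i := range r.elements {
		r.pool.Put(r.elements[i].ID)
		r.elements[i].ID = nil
		r.elements[i].EncodedTags = nil
	}
}

func main() {
	pool := &bytesPool{}
	req := &request{pool: pool, elements: []element{{ID: []byte("id"), EncodedTags: []byte("tags")}}}
	writer := &batchWriter{pool: pool}

	writer.TakeEncodedTags(req.elements[0].EncodedTags)
	req.Finalize()
	fmt.Println(pool.returned) // only the ID has been returned so far

	writer.Finalize()
	fmt.Println(pool.returned) // the encoded tags are returned by their new owner
}
```

Returning each buffer exactly once matters with pooled bytes: if both the request and the writer returned the encoded tags, the same backing array could be handed out to two callers at the same time.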

richardartoul · 2019-08-21T13:52:17Z

src/dbnode/ts/types.go

@@ -72,9 +76,13 @@ type Series struct {
 	// ID is the series identifier.
 	ID ident.ID

-	// Tags are the series tags.
+	// Tags is the series tags.
 	Tags ident.Tags


Is this still used anywhere?

Yeah, it's used on the read side and it was going to be very difficult to remove unfortunately.

richardartoul · 2019-08-21T13:53:08Z

src/dbnode/ts/write_batch.go

 	"time"

 	"github.com/m3db/m3/src/x/ident"
 	xtime "github.com/m3db/m3/src/x/time"
 )

+var (
+	errTagsAndEncodedTagsRequired = errors.New("tags iterator and encoded tags required to be provided")


"required to be provided" sounds funny. Maybe: "must be provided"

codecov · 2019-08-21T15:51:58Z

Codecov Report

Merging #1898 into master will increase coverage by 33.6%.
The diff coverage is 71.7%.

@@            Coverage Diff            @@
##           master   #1898      +/-   ##
=========================================
+ Coverage    32.1%   65.7%   +33.6%     
=========================================
  Files           4     773     +769     
  Lines         342   71560   +71218     
=========================================
+ Hits          110   47081   +46971     
- Misses        222   20924   +20702     
- Partials       10    3555    +3545

Flag	Coverage Δ
#aggregator	`81% <ø> (?)`
#cluster	`85% <ø> (?)`
#collector	`54.9% <ø> (+22.8%)`	⬆️
#dbnode	`68% <71.7%> (?)`
#m3em	`62% <ø> (?)`
#m3ninx	`68.7% <ø> (?)`
#m3nsch	`28.4% <ø> (?)`
#metrics	`17.5% <ø> (?)`
#msg	`74.9% <ø> (?)`
#query	`70.7% <ø> (?)`
#x	`78.7% <ø> (?)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 57d533f...63c1bf2.

robskillington and others added 9 commits August 20, 2019 15:56

[dbnode] Use already encoded tags when writing time series to commit log

dbd2a92

Fix config test

58e51a7

Revert vagrantfile changes

f519dc1

Merge branch 'master' into r/fast-encode-tags

6baa76f

Fix build

2a08f53

Merge branch 'r/fast-encode-tags' of github.com:m3db/m3 into r/fast-encode-tags

40ded18

Merge branch 'master' into r/fast-encode-tags

286314a

Generate mocks

de08040

Merge branch 'r/fast-encode-tags' of github.com:robskillington/m3 into r/fast-encode-tags

c86c69b

richardartoul approved these changes Aug 21, 2019

robskillington added 2 commits August 21, 2019 11:50

Feedback and fix tests

dedfc89

Merge branch 'master' into r/fast-encode-tags

fe1475a

Richard Artoul and others added 7 commits August 21, 2019 15:16

Merge branch 'master' into r/fast-encode-tags

63c1bf2

Fix tests

d0dfa3c

Merge branch 'r/fast-encode-tags' of github.com:m3db/m3 into r/fast-encode-tags

892da81

Merge branch 'master' into r/fast-encode-tags

ff0248c

Use n1 type nodes

d0efa4a

Merge branch 'r/fast-encode-tags' of github.com:m3db/m3 into r/fast-encode-tags

a35b6ad

Use normal benchmark vagrantfile

6c44a91

robskillington merged commit 12ed1c2 into master Aug 21, 2019

robskillington deleted the r/fast-encode-tags branch August 21, 2019 22:27

robskillington mentioned this pull request Aug 23, 2019

[WIP][DBNode] - Optimize Tag Encoder #1866

Closed
