exact slice allocation for fixed size repeated packed fields #437
Conversation
This is a regression test for issue 436: gogo#436
Suppose we have the protobuf definition

```proto
message Foo {
    repeated type Stuff = 1;
}
```

that we deserialize in a fairly common way:

```go
f := &Foo{}
f.Unmarshal(blob)
```

Before the call to Unmarshal, `f.Stuff` will be a slice of length 0, so the Unmarshal operation will more or less be:

```go
for _, x := range xs {
    f.Stuff = append(f.Stuff, x)
}
```

If we don't know beforehand how many elements we're going to deserialize, this is the best we can do. Suppose, however, that we know we're going to deserialize n elements. If k is such that 2^k < n <= 2^{k+1}, then the Go runtime's exponential doubling strategy for resizing the arrays that back slices will cause us to allocate memory for 1 + 2 + ... + 2^{k+1} = 2^{k+2} - 1 elements in total across the resizes, which is usually more than double what we actually need.

When we deserialize packed fields, we know how many bytes we're going to deserialize before we start the default append loop. If we furthermore know how many elements those bytes correspond to, which we do when the protobuf wire type corresponding to `type` has fixed length [1], we can precede the default append loop with

```go
f.Stuff = make([]type, 0, n)
```

and ask for exactly the memory we're going to use. This results in considerable memory savings, between 50 and 80 percent, compared with the default strategy. These savings matter to people who use protobuf to communicate things like time series between services, which consist almost entirely of large arrays of floats and doubles.

This fixes gogo#436.

It's conceivable to implement something similar for packed types of non-fixed length. They're encoded with varints, and we _could_ run through the byte stream we're going to deserialize and count how many bytes don't have the most significant bit set, but the performance implications of that seem less predictable than those of the simple division we can perform here.

[1] https://developers.google.com/protocol-buffers/docs/encoding#structure
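To make the mechanics concrete, here is a minimal, self-contained sketch of the fixed-size case. The function name and error handling are illustrative, not the generated gogo/protobuf code: for a packed `repeated double` field, each element occupies exactly 8 bytes of little-endian fixed64 data on the wire, so the element count is just `len(payload) / 8`.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// unmarshalPackedDoubles decodes the payload of a packed repeated double
// field. Because fixed64 elements are exactly 8 bytes on the wire, we can
// compute the element count up front and allocate the slice exactly once.
func unmarshalPackedDoubles(payload []byte) ([]float64, error) {
	if len(payload)%8 != 0 {
		return nil, fmt.Errorf("payload length %d is not a multiple of 8", len(payload))
	}
	n := len(payload) / 8
	out := make([]float64, 0, n) // exact allocation instead of append-driven doubling
	for i := 0; i < len(payload); i += 8 {
		bits := binary.LittleEndian.Uint64(payload[i : i+8])
		out = append(out, math.Float64frombits(bits))
	}
	return out, nil
}

func main() {
	// Encode three doubles the way they appear inside a packed field.
	var payload []byte
	for _, v := range []float64{1.5, 2.5, 3.5} {
		var b [8]byte
		binary.LittleEndian.PutUint64(b[:], math.Float64bits(v))
		payload = append(payload, b[:]...)
	}
	vals, err := unmarshalPackedDoubles(payload)
	fmt.Println(vals, err) // [1.5 2.5 3.5] <nil>
}
```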
Travis CI is unhappy with me.
I thought a bit more about the throwaway comment about packed types of non-fixed length. I think it's worth a shot to do that, and tune it in such a way that maybe it only kicks in if we're sure there are enough varints to deserialize to be worth it. It sounds like a nice project for junior developers, so I'm fishing for some at $work to see if anyone wants to have a go at this. It might take a little while to organize, so I think we'll do a separate PR for that.
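For reference, the counting pass itself is simple in isolation; the open question is whether the extra scan over the payload pays for itself. A sketch, assuming the payload contains only well-formed varints:

```go
// countVarints returns the number of varint-encoded values in payload.
// Every varint ends at, and only at, a byte whose most significant bit
// is clear, so counting those bytes counts the elements.
func countVarints(payload []byte) int {
	n := 0
	for _, b := range payload {
		if b&0x80 == 0 {
			n++
		}
	}
	return n
}
```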
Thank you so much @gunnihinn, this looks very promising. @donaldgraham would you mind reviewing this?
This looks great. I like the way the test expresses the expectation that we allocate less than half of the memory we had allocated before...
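The shape of that expectation is easy to reproduce outside the generated code. The following self-contained comparison (not the PR's actual test) measures append-driven doubling against an exact `make` using `testing.Benchmark`; on a typical 64-bit Go runtime the doubling variant allocates well over twice the bytes per operation:

```go
package main

import (
	"fmt"
	"testing"
)

func main() {
	const n = 100000 // element count, known up front in the packed fixed-size case

	// Doubling: start from a nil slice and let append grow the backing array.
	grow := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			var s []float64
			for j := 0; j < n; j++ {
				s = append(s, float64(j))
			}
		}
	})

	// Exact: allocate capacity n once, the strategy this PR generates.
	exact := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			s := make([]float64, 0, n)
			for j := 0; j < n; j++ {
				s = append(s, float64(j))
			}
		}
	})

	fmt.Printf("doubling: %d bytes/op\nexact:    %d bytes/op\n",
		grow.AllocedBytesPerOp(), exact.AllocedBytesPerOp())
}
```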
Thank you so much @gunnihinn and @donaldgraham
... and generate the protobuf files with the latest version. We really want to run at least commit 5440baf of gogo/protobuf, because it contains a pull request we got merged upstream: gogo/protobuf#437 The effect of the PR is to reduce the memory allocated by the carbonapi_v3_pb.MultiFetchResponse.Unmarshal method by at least 50% and up to 80%. This is very relevant to Graphite's interests.
See comments in commit message and issue 436.