Allocate exactly the memory needed for packed fields
Suppose we have the protobuf definition

    message Foo {
        repeated type Stuff = 1;
    }

that we deserialize in a fairly common way:

    f := &Foo{}
    f.Unmarshal(blob)

Before the call to Unmarshal, `f.Stuff` will be a slice of length 0, so the Unmarshal operation will more or less be:

    for _, x := range xs {
        f.Stuff = append(f.Stuff, x)
    }

If we don't know how many elements we're going to deserialize beforehand, this is the best we can do.

Suppose, however, that we know that we're going to deserialize n elements. If k is such that 2^k < n <= 2^{k+1}, then the Go runtime's exponential doubling strategy for resizing the arrays that back slices will cause us to allocate memory for at least 1 + 2 + ... + 2^{k+1} = 2^{k+2} - 1 elements, or roughly 2 * 2^{k+1}, which is usually more than double what we actually need.

When we deserialize packed fields, we know how many bytes we're going to deserialize before we start the default append loop. If we furthermore know how many elements those bytes correspond to, which we do when the protobuf wire type corresponding to `type` has fixed length [1], we can precede the default append loop with

    f.Stuff = make([]type, 0, n)

and ask for exactly the memory we're going to use.

This results in considerable memory savings, between 50 and 80 percent, compared with the default strategy. These savings are important to people who use protobuf to communicate things like time series between services, which consist almost entirely of large arrays of floats and doubles.

This fixes gogo#436.

It's conceivable to implement something similar for packed types of non-fixed length. They're encoded as varints, and we _could_ run through the byte stream we're going to deserialize and count how many bytes don't have the most significant bit set (each such byte ends one varint), but the performance implications of that seem less predictable than those of the simple division we can perform here.

[1] https://developers.google.com/protocol-buffers/docs/encoding#structure
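As a rough illustration of the preallocation described above, here is a minimal sketch for a packed `repeated double` field. It is not the code generated by gogo/protobuf: the names `decodePackedDoubles` and `encodePackedDoubles` are hypothetical, and the sketch assumes the caller has already consumed the field tag and length prefix, so `payload` holds only the packed values (8 little-endian bytes each).

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// decodePackedDoubles is a hypothetical helper, not part of any protobuf
// library. It appends the doubles packed in payload to dst.
func decodePackedDoubles(dst []float64, payload []byte) ([]float64, error) {
	const width = 8 // a double occupies 8 bytes on the wire
	if len(payload)%width != 0 {
		return dst, fmt.Errorf("packed payload length %d is not a multiple of %d", len(payload), width)
	}
	// The "simple division": the byte length fixes the element count.
	n := len(payload) / width
	if n != 0 && len(dst) == 0 {
		// Ask for exactly the memory we are going to use.
		dst = make([]float64, 0, n)
	}
	// The usual append loop; it no longer needs to grow the backing array.
	for i := 0; i < len(payload); i += width {
		bits := binary.LittleEndian.Uint64(payload[i : i+width])
		dst = append(dst, math.Float64frombits(bits))
	}
	return dst, nil
}

// encodePackedDoubles builds a payload for the example; also hypothetical.
func encodePackedDoubles(xs []float64) []byte {
	buf := make([]byte, 8*len(xs))
	for i, x := range xs {
		binary.LittleEndian.PutUint64(buf[i*8:], math.Float64bits(x))
	}
	return buf
}

func main() {
	payload := encodePackedDoubles([]float64{1.5, -2.25, 3})
	got, err := decodePackedDoubles(nil, payload)
	if err != nil {
		panic(err)
	}
	// The capacity matches the length, so nothing was over-allocated.
	fmt.Println(got, len(got), cap(got))
}
```

The preallocation is skipped when `dst` already holds elements, since a packed field may arrive in several chunks and the existing contents must be kept; in that case the plain append path applies.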