
Implement BinaryAppender #47

Merged — 1 commit, Jan 6, 2025
Conversation

axw (Contributor) commented Jan 6, 2025

Implement the BinaryAppender interface in sketch types, so they can be marshalled to an existing slice.

Fixes #46
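For illustration, the pattern the PR describes — implementing Go 1.24's `encoding.BinaryAppender` so that marshalling can reuse an existing slice — could be sketched roughly as below. The `Sketch` struct and its single field here are hypothetical stand-ins for the real hyperloglog types, not the actual implementation:

```go
package main

import "fmt"

// Sketch is a hypothetical stand-in for the package's sketch types; the
// single byte-slice field only illustrates the AppendBinary pattern.
type Sketch struct {
	data []byte
}

// AppendBinary implements Go 1.24's encoding.BinaryAppender: it appends
// the marshalled form to b and returns the extended slice, so callers
// that pass a buffer with spare capacity avoid any allocation.
func (s *Sketch) AppendBinary(b []byte) ([]byte, error) {
	return append(b, s.data...), nil
}

// MarshalBinary implements encoding.BinaryMarshaler in terms of
// AppendBinary, marshalling into a fresh slice.
func (s *Sketch) MarshalBinary() ([]byte, error) {
	return s.AppendBinary(nil)
}

func main() {
	s := &Sketch{data: []byte{0x01, 0x02}}
	buf := make([]byte, 0, 64) // caller-owned, reusable buffer
	buf, _ = s.AppendBinary(buf)
	fmt.Println(len(buf), cap(buf)) // prints "2 64": no reallocation occurred
}
```

The methods match the `encoding.BinaryAppender` and `encoding.BinaryMarshaler` interfaces structurally, so the snippet builds on older toolchains too.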

lukasmalkmus (Contributor):

Hi @axw, thanks for the PR!

This looks promising so far. Although I'm not the one to pull the trigger on this (@seiflotfy is), some general feedback:

  1. Most of our Go open-source modules are compatible with the latest two Go releases (and most likely even more, if we're not going all-in on the latest language/module additions). This PR would break that promise. My suggestion is to use build tags and keep the old implementation of MarshalBinary around for < Go 1.24. For anything >= Go 1.24, MarshalBinary can call AppendBinary under the hood.
  2. Tests would be great, but you pointed that out already.
  3. Normally I would ask for benchmarks. But looking at the code, it looks like there is no measurable impact on allocations? Unless AppendBinary is called with a large enough buffer, allocations happen just like before, with the exception that compressedList.MarshalBinary/AppendBinary now only allocates 8 bytes initially? Correct me if I'm wrong.
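The build-tag split suggested in point 1 could look roughly like the following two files. The file names are placeholders and the pre-1.24 body is elided; only the `//go:build` constraints and the delegation in the 1.24 file are the point of the sketch:

```go
// marshal_go124.go — compiled only on Go 1.24 and newer.
//go:build go1.24

package hyperloglog

// On Go 1.24+, MarshalBinary simply delegates to AppendBinary (which
// also satisfies the new encoding.BinaryAppender interface).
func (sk *Sketch) MarshalBinary() ([]byte, error) {
	return sk.AppendBinary(nil)
}
```

```go
// marshal_compat.go — compiled on toolchains older than Go 1.24.
//go:build !go1.24

package hyperloglog

// On older toolchains, keep the pre-existing implementation so the
// module still builds on the previous Go releases.
func (sk *Sketch) MarshalBinary() ([]byte, error) {
	// ... original MarshalBinary body ...
}
```

With this layout, both files define the same method, but the build constraints ensure exactly one is compiled for any given toolchain version.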

lukasmalkmus (Contributor):

Ah, speaking of tests, there is one thing I forgot: ideally a test would also include an Example, just to showcase potential memory savings via buffer reuse. We might even want to add comments around MarshalBinary pointing users to AppendBinary.

axw (Contributor, Author) commented Jan 6, 2025

> Ah, speaking of tests, there is one thing I forgot: ideally a test would also include an Example, just to showcase potential memory savings via buffer reuse. We might even want to add comments around MarshalBinary pointing users to AppendBinary.

I added a doc comment, but haven't added an Example test. There aren't any existing ones, and adding one just for AppendBinary feels a bit contrived without making a very complicated example. Let me know if you have any ideas for an example you'd like added.

I did end up adding a benchmark, because I found a win for existing callers of MarshalBinary: compressedList, variableLengthList, and set now all implement only AppendBinary (which doesn't matter externally, as they're not exported), and these are used internally by Sketch.AppendBinary. This saves some intermediate allocations and copying even when not passing a buffer in. Comparing to main (I commented out the AppendBinary sub-benchmark):

$ benchstat /tmp/main.txt /tmp/pr.txt 
goos: linux
goarch: amd64
pkg: github.com/axiomhq/hyperloglog
cpu: AMD Ryzen 7 PRO 5850U with Radeon Graphics     
                                                 │ /tmp/main.txt │            /tmp/pr.txt             │
                                                 │    sec/op     │   sec/op     vs base               │
_HLL_Marshal/precision16_sparse/MarshalBinary-16     3.585µ ± 9%   3.245µ ± 7%   -9.48% (p=0.015 n=6)
_HLL_Marshal/precision16_dense/MarshalBinary-16      40.46µ ± 1%   30.54µ ± 7%  -24.51% (p=0.002 n=6)
_HLL_Marshal/precision16_sparse/AppendBinary-16                    2.350µ ± 5%
_HLL_Marshal/precision16_dense/AppendBinary-16                     24.13µ ± 4%
geomean                                              12.04µ        8.658µ       -17.34%

                                                 │ /tmp/main.txt │              /tmp/pr.txt              │
                                                 │     B/op      │     B/op      vs base                 │
_HLL_Marshal/precision16_sparse/MarshalBinary-16    9.883Ki ± 0%   4.508Ki ± 0%  -54.39% (p=0.002 n=6)
_HLL_Marshal/precision16_dense/MarshalBinary-16     72.00Ki ± 0%   72.00Ki ± 0%        ~ (p=1.000 n=6) ¹
_HLL_Marshal/precision16_sparse/AppendBinary-16                      0.000 ± 0%
_HLL_Marshal/precision16_dense/AppendBinary-16                       1.000 ± 0%
geomean                                             26.68Ki                      -32.46%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                                 │ /tmp/main.txt │             /tmp/pr.txt             │
                                                 │   allocs/op   │ allocs/op   vs base                 │
_HLL_Marshal/precision16_sparse/MarshalBinary-16      6.000 ± 0%   3.000 ± 0%  -50.00% (p=0.002 n=6)
_HLL_Marshal/precision16_dense/MarshalBinary-16       1.000 ± 0%   1.000 ± 0%        ~ (p=1.000 n=6) ¹
_HLL_Marshal/precision16_sparse/AppendBinary-16                    0.000 ± 0%
_HLL_Marshal/precision16_dense/AppendBinary-16                     0.000 ± 0%
geomean                                               2.449                    -29.29%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

Stats for just this PR:

$ benchstat /tmp/pr.txt 
goos: linux
goarch: amd64
pkg: github.com/axiomhq/hyperloglog
cpu: AMD Ryzen 7 PRO 5850U with Radeon Graphics     
                                                 │ /tmp/pr.txt │
                                                 │   sec/op    │
_HLL_Marshal/precision16_sparse/MarshalBinary-16   3.245µ ± 7%
_HLL_Marshal/precision16_sparse/AppendBinary-16    2.350µ ± 5%
_HLL_Marshal/precision16_dense/MarshalBinary-16    30.54µ ± 7%
_HLL_Marshal/precision16_dense/AppendBinary-16     24.13µ ± 4%
geomean                                            8.658µ

                                                 │  /tmp/pr.txt   │
                                                 │      B/op      │
_HLL_Marshal/precision16_sparse/MarshalBinary-16   4.508Ki ± 0%
_HLL_Marshal/precision16_sparse/AppendBinary-16      0.000 ± 0%
_HLL_Marshal/precision16_dense/MarshalBinary-16    72.00Ki ± 0%
_HLL_Marshal/precision16_dense/AppendBinary-16       1.000 ± 0%
geomean                                                         ¹
¹ summaries must be >0 to compute geomean

                                                 │ /tmp/pr.txt  │
                                                 │  allocs/op   │
_HLL_Marshal/precision16_sparse/MarshalBinary-16   3.000 ± 0%
_HLL_Marshal/precision16_sparse/AppendBinary-16    0.000 ± 0%
_HLL_Marshal/precision16_dense/MarshalBinary-16    1.000 ± 0%
_HLL_Marshal/precision16_dense/AppendBinary-16     0.000 ± 0%
geomean                                                       ¹
¹ summaries must be >0 to compute geomean

axw marked this pull request as ready for review on January 6, 2025, 09:27
axiomhq deleted a comment from axw on Jan 6, 2025
lukasmalkmus (Contributor) commented Jan 6, 2025

@axw Cool, just checked all the changes and this looks very promising!

Sorry, I had to delete your first comment answering my questions: I messed up when trying to quote your response and ended up overwriting it with my answer 🤦 No undo button on GitHub. Uff, what a rough first day back on the internet after the holidays 😅 Sorry about that.

Actually, there is an undo button on GitHub, but I was too fast deleting your comment 🤦

seiflotfy merged commit ccca8cb into axiomhq:main on Jan 6, 2025
axw (Contributor, Author) commented Jan 7, 2025

Haha no worries @lukasmalkmus :)
Thanks for the feedback!

axw deleted the binaryappender branch on January 7, 2025, 02:21
Successfully merging this pull request may close these issues:

Implement BinaryAppender interface coming in 1.24