-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove mallocs from Compress/Decompress #31
Remove mallocs from Compress/Decompress #31
Conversation
Go allocates a copy of slice headers when they are passed to C functions. See golang/go#24450 for details. Avoid this by passing unsafe.Pointer as C.uintptr_t into C functions. Benchmark results: $ PAYLOAD=ZSTD_LICENSE go test -run=111 -bench='k[CD]' -benchmem -count=10 name old time/op new time/op delta Compression-4 21.2µs ± 5% 20.8µs ± 6% ~ (p=0.190 n=10+10) Decompression-4 7.37µs ± 5% 6.93µs ± 6% -5.95% (p=0.001 n=10+10) name old speed new speed delta Compression-4 62.5MB/s ± 5% 63.7MB/s ± 6% ~ (p=0.190 n=10+10) Decompression-4 180MB/s ± 5% 191MB/s ± 6% +6.35% (p=0.001 n=10+10) name old alloc/op new alloc/op delta Compression-4 96.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10) Decompression-4 96.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10) name old allocs/op new allocs/op delta Compression-4 4.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10) Decompression-4 4.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm.
I did some simple tests and benchmarks and it looks good.
I just want to benchmark against bigger payloads over the weekend.
Do you mind doing the tiny change on the other comment ?
@@ -139,6 +139,8 @@ func BenchmarkStreamCompression(b *testing.B) { | |||
if err != nil { | |||
b.Fatalf("Failed writing to compress object: %s", err) | |||
} | |||
// Prevent from unbound buffer growth. | |||
intermediate.Reset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would do the Reset()
outside the for loop though because it skew the results if it's in the loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Reset
must be in the loop, since otherwise:
- The performance of the code inside loop will depend on the value of
b.N
. This contradicts the benchmarking rules and makes the results non-determenistic. See https://dave.cheney.net/2013/06/30/how-to-write-benchmarks-in-go . - The amount of required memory for each benchmark run will be proportional to
b.N
. This may lead to out of memory error on a system with low amount of memory.
The bytes.Buffer.Reset skews the benchmark results by a few nanoseconds. This is a noise comparing to 20 microseconds spent on the data compression at the line 138.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just want to benchmark against bigger payloads over the weekend.
This makes little sense, since the time required for compressing bigger payloads will be much bigger comparing to the time spent on the (eliminated) memory allocations. This patch improves performance for small payloads only.
@@ -139,6 +139,8 @@ func BenchmarkStreamCompression(b *testing.B) { | |||
if err != nil { | |||
b.Fatalf("Failed writing to compress object: %s", err) | |||
} | |||
// Prevent from unbound buffer growth. | |||
intermediate.Reset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Reset
must be in the loop, since otherwise:
- The performance of the code inside loop will depend on the value of
b.N
. This contradicts the benchmarking rules and makes the results non-determenistic. See https://dave.cheney.net/2013/06/30/how-to-write-benchmarks-in-go . - The amount of required memory for each benchmark run will be proportional to
b.N
. This may lead to out of memory error on a system with low amount of memory.
The bytes.Buffer.Reset skews the benchmark results by a few nanoseconds. This is a noise comparing to 20 microseconds spent on the data compression at the line 138.
Yes sorry, I meant test on a large payload corpus over the weekend (vs. a bigger single payload). |
Go allocates a copy of slice headers when they are passed to C functions.
See golang/go#24450 for details.
Avoid this by passing unsafe.Pointer as C.uintptr_t into C functions.