deflate: Move advanced compression state #107

Merged: 3 commits, Jun 5, 2019

@klauspost klauspost commented Jun 1, 2019

Only allocate advanced compression state when actually needed.

This significantly reduces upfront allocations for levels < 5 (by approximately 750K).

Much less risky than #70, but not as clean.

Of course this is only relevant when Reset() is not used.
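
A minimal sketch of the idea, not the code from this PR: the large state used by the slower levels sits behind a pointer and is allocated only when the chosen level needs it. The names `advancedState`, `compressor`, and `newCompressor` are hypothetical, the table sizes are illustrative, and the sketch ignores the special negative levels.

```go
package main

import "fmt"

// advancedState stands in for the large tables that only the slower,
// higher-ratio levels need. The name and the sizes are assumptions.
type advancedState struct {
	hashHead [1 << 17]uint32
	hashPrev [1 << 15]uint32
}

// compressor is a hypothetical writer core. The advanced state is held
// behind a pointer so that the fast levels never pay for it.
type compressor struct {
	level int
	state *advancedState // nil unless the level needs it
}

// newCompressor allocates the advanced state only for levels 5 and up,
// following the "levels < 5" cutoff mentioned in the description.
func newCompressor(level int) *compressor {
	c := &compressor{level: level}
	if level >= 5 {
		c.state = new(advancedState)
	}
	return c
}

func main() {
	for _, level := range []int{1, 4, 5, 9} {
		c := newCompressor(level)
		fmt.Printf("level %d: advanced state allocated = %v\n", level, c.state != nil)
	}
}
```

With an embedded value instead of a pointer, every writer would carry the tables regardless of level, which is the upfront cost this change is meant to avoid.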

Benchmark: https://gist.github.com/klauspost/f5df3a3522ac4bcb3bcde448872dffe6
Before:

BenchmarkCompressAllocations/level(-2)/flate-8   	    5000	    306197 ns/op	 1000880 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(-2)/gzip-8    	    5000	    279000 ns/op	 1001056 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(-1)/flate-8   	    5000	    284000 ns/op	 1003184 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(-1)/gzip-8    	    5000	    312400 ns/op	 1003360 B/op	      16 allocs/op
BenchmarkCompressAllocations/level(0)/flate-8    	   10000	    177000 ns/op	  996273 B/op	      12 allocs/op
BenchmarkCompressAllocations/level(0)/gzip-8     	   10000	    207600 ns/op	  996448 B/op	      13 allocs/op
BenchmarkCompressAllocations/level(1)/flate-8    	    5000	    248399 ns/op	 1003184 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(1)/gzip-8     	   10000	    219800 ns/op	 1003360 B/op	      16 allocs/op
BenchmarkCompressAllocations/level(2)/flate-8    	    5000	    252599 ns/op	 1207984 B/op	      17 allocs/op
BenchmarkCompressAllocations/level(2)/gzip-8     	    5000	    261799 ns/op	 1208160 B/op	      18 allocs/op
BenchmarkCompressAllocations/level(3)/flate-8    	    5000	    336599 ns/op	 1339056 B/op	      17 allocs/op
BenchmarkCompressAllocations/level(3)/gzip-8     	    5000	    298599 ns/op	 1339232 B/op	      18 allocs/op
BenchmarkCompressAllocations/level(4)/flate-8    	    5000	    338400 ns/op	 1339056 B/op	      17 allocs/op
BenchmarkCompressAllocations/level(4)/gzip-8     	    5000	    298207 ns/op	 1339232 B/op	      18 allocs/op
BenchmarkCompressAllocations/level(5)/flate-8    	    5000	    273799 ns/op	 1003184 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(5)/gzip-8     	    5000	    285599 ns/op	 1003360 B/op	      16 allocs/op
BenchmarkCompressAllocations/level(6)/flate-8    	    5000	    274599 ns/op	 1003184 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(6)/gzip-8     	    5000	    300200 ns/op	 1003360 B/op	      16 allocs/op
BenchmarkCompressAllocations/level(7)/flate-8    	    5000	    277201 ns/op	 1003184 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(7)/gzip-8     	    5000	    282399 ns/op	 1003360 B/op	      16 allocs/op
BenchmarkCompressAllocations/level(8)/flate-8    	    5000	    273199 ns/op	 1003184 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(8)/gzip-8     	    5000	    278199 ns/op	 1003360 B/op	      16 allocs/op
BenchmarkCompressAllocations/level(9)/flate-8    	    5000	    273807 ns/op	 1003184 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(9)/gzip-8     	    5000	    274001 ns/op	 1003360 B/op	      16 allocs/op

After:

BenchmarkCompressAllocations/level(-2)/flate-8   	   20000	     65653 ns/op	  345520 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(-2)/gzip-8    	   20000	     66400 ns/op	  345696 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(-1)/flate-8   	    3000	    355331 ns/op	 1011377 B/op	      16 allocs/op
BenchmarkCompressAllocations/level(-1)/gzip-8    	    3000	    364333 ns/op	 1011554 B/op	      17 allocs/op
BenchmarkCompressAllocations/level(0)/flate-8    	   20000	     56000 ns/op	  340912 B/op	      12 allocs/op
BenchmarkCompressAllocations/level(0)/gzip-8     	   30000	     57966 ns/op	  341088 B/op	      13 allocs/op
BenchmarkCompressAllocations/level(1)/flate-8    	   20000	     65050 ns/op	  347824 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(1)/gzip-8     	   20000	     66250 ns/op	  348000 B/op	      16 allocs/op
BenchmarkCompressAllocations/level(2)/flate-8    	   10000	    107800 ns/op	  552628 B/op	      17 allocs/op
BenchmarkCompressAllocations/level(2)/gzip-8     	   10000	    100600 ns/op	  552804 B/op	      18 allocs/op
BenchmarkCompressAllocations/level(3)/flate-8    	   10000	    125400 ns/op	  683703 B/op	      17 allocs/op
BenchmarkCompressAllocations/level(3)/gzip-8     	   10000	    122300 ns/op	  683878 B/op	      18 allocs/op
BenchmarkCompressAllocations/level(4)/flate-8    	   10000	    120200 ns/op	  683701 B/op	      17 allocs/op
BenchmarkCompressAllocations/level(4)/gzip-8     	   10000	    116500 ns/op	  683877 B/op	      18 allocs/op
BenchmarkCompressAllocations/level(5)/flate-8    	    3000	    398334 ns/op	 1011378 B/op	      16 allocs/op
BenchmarkCompressAllocations/level(5)/gzip-8     	    5000	    384200 ns/op	 1011554 B/op	      17 allocs/op
BenchmarkCompressAllocations/level(6)/flate-8    	    5000	    440799 ns/op	 1011378 B/op	      16 allocs/op
BenchmarkCompressAllocations/level(6)/gzip-8     	    3000	    399333 ns/op	 1011553 B/op	      17 allocs/op
BenchmarkCompressAllocations/level(7)/flate-8    	    5000	    346799 ns/op	 1011377 B/op	      16 allocs/op
BenchmarkCompressAllocations/level(7)/gzip-8     	    5000	    316600 ns/op	 1011552 B/op	      17 allocs/op
BenchmarkCompressAllocations/level(8)/flate-8    	    5000	    408000 ns/op	 1011378 B/op	      16 allocs/op
BenchmarkCompressAllocations/level(8)/gzip-8     	    5000	    351599 ns/op	 1011553 B/op	      17 allocs/op
BenchmarkCompressAllocations/level(9)/flate-8    	    5000	    307600 ns/op	 1011377 B/op	      16 allocs/op
BenchmarkCompressAllocations/level(9)/gzip-8     	    5000	    317199 ns/op	 1011553 B/op	      17 allocs/op
PASS
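
To make the two listings easier to compare, the sketch below subtracts the B/op column for a handful of the flate rows. The figures are copied from the Before and After output above; the `row` type and `delta` helper are only illustrative. For the levels that shrank, the difference comes out at roughly 655,000 B/op, while level -1 and levels 5 and up grow by about 8 KB.

```go
package main

import "fmt"

// row holds the B/op figures for one flate benchmark line, copied from
// the Before and After listings.
type row struct {
	level         int
	before, after int64
}

// delta returns the change in bytes per operation (negative means less memory).
func (r row) delta() int64 { return r.after - r.before }

var flateRows = []row{
	{-2, 1000880, 345520},
	{-1, 1003184, 1011377},
	{0, 996273, 340912},
	{1, 1003184, 347824},
	{2, 1207984, 552628},
	{3, 1339056, 683703},
	{4, 1339056, 683701},
	{5, 1003184, 1011378},
	{9, 1003184, 1011377},
}

func main() {
	for _, r := range flateRows {
		fmt.Printf("level(%d)/flate: %+d B/op\n", r.level, r.delta())
	}
}
```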

@klauspost klauspost changed the title from "Move advanced compression state" to "deflate: Move advanced compression state" Jun 1, 2019
@klauspost klauspost merged commit 8665cc6 into master Jun 5, 2019
@klauspost klauspost deleted the leaner-interface-pt2 branch June 5, 2019 17:04

nhooyr commented Feb 15, 2020

Was this change reverted?

These are the results I'm getting on master:

$ go test -bench=BenchmarkCompressAllocations
goos: darwin
goarch: amd64
pkg: github.com/klauspost/compress
BenchmarkCompressAllocations/level(-2)/flate-8   	   29196	     41301 ns/op	  342272 B/op	      11 allocs/op
BenchmarkCompressAllocations/level(-2)/gzip-8    	   28676	     56400 ns/op	  342448 B/op	      12 allocs/op
BenchmarkCompressAllocations/level(-1)/flate-8   	    2926	    400765 ns/op	 3235206 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(-1)/gzip-8    	    4066	    386490 ns/op	 3235383 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(0)/flate-8    	   30462	     48217 ns/op	  339968 B/op	       9 allocs/op
BenchmarkCompressAllocations/level(0)/gzip-8     	   21348	     49214 ns/op	  340144 B/op	      10 allocs/op
BenchmarkCompressAllocations/level(1)/flate-8    	    5511	    213282 ns/op	 2186627 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(1)/gzip-8     	    5860	    212243 ns/op	 2186803 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(2)/flate-8    	    2802	    389814 ns/op	 3759528 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(2)/gzip-8     	    2666	    391217 ns/op	 3759700 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(3)/flate-8    	    4395	    291127 ns/op	 2710922 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(3)/gzip-8     	    4207	    302403 ns/op	 2711098 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(4)/flate-8    	    3687	    360733 ns/op	 2710924 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(4)/gzip-8     	    3758	    475015 ns/op	 2711098 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(5)/flate-8    	    2434	    455862 ns/op	 3235206 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(5)/gzip-8     	    2836	    366831 ns/op	 3235382 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(6)/flate-8    	    3193	    386058 ns/op	 3235205 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(6)/gzip-8     	    3214	    398171 ns/op	 3235383 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(7)/flate-8    	    8619	    139871 ns/op	 1006982 B/op	      13 allocs/op
BenchmarkCompressAllocations/level(7)/gzip-8     	    8251	    151450 ns/op	 1007158 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(8)/flate-8    	   10000	    129895 ns/op	 1006982 B/op	      13 allocs/op
BenchmarkCompressAllocations/level(8)/gzip-8     	   10000	    109056 ns/op	 1007157 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(9)/flate-8    	   10000	    122040 ns/op	 1006981 B/op	      13 allocs/op
BenchmarkCompressAllocations/level(9)/gzip-8     	   10000	    127054 ns/op	 1007157 B/op	      14 allocs/op
PASS
ok  	github.com/klauspost/compress	37.086s

These are much higher than the numbers reported in this PR.

I got similar results testing on my Linux VM.

@klauspost

@nhooyr This PR is about static allocations (allocations kept in the writer between writes/resets), though this seems rather high.

Could you share the benchmark code?

nhooyr commented Feb 15, 2020

It's the same benchmark as in this PR.

func BenchmarkCompressAllocations(b *testing.B) {

@klauspost

Ah, thanks :)

@klauspost

@nhooyr It changed as part of the rewrite in #105.

I added #223.
