generate in goroutines #51

Closed
wants to merge 14 commits into from

@aspacca aspacca commented Feb 1, 2023

main branch:

goos: darwin
goarch: amd64
pkg: github.com/elastic/elastic-integration-corpus-generator-tool/pkg/genlib
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Benchmark_GeneratorCustomTemplateJSONContent-16    	  235588	     47286 ns/op	     432 B/op	      14 allocs/op
Benchmark_GeneratorTextTemplateJSONContent-16      	   24507	    487438 ns/op	   48725 B/op	    2252 allocs/op
Benchmark_GeneratorCustomTemplateVPCFlowLogs-16    	 6558933	      1840 ns/op	      64 B/op	       2 allocs/op
Benchmark_GeneratorTextTemplateVPCFlowLogs-16      	  526669	     22920 ns/op	    2322 B/op	      95 allocs/op
PASS
ok  	github.com/elastic/elastic-integration-corpus-generator-tool/pkg/genlib	57.004s

this branch:

goos: darwin
goarch: amd64
pkg: github.com/elastic/elastic-integration-corpus-generator-tool/pkg/genlib
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Benchmark_GeneratorCustomTemplateJSONContent-16    	   97422	    145236 ns/op	   18980 B/op	     972 allocs/op
Benchmark_GeneratorTextTemplateJSONContent-16      	   18703	    820674 ns/op	   49159 B/op	    2265 allocs/op
Benchmark_GeneratorCustomTemplateVPCFlowLogs-16    	 1504225	      8486 ns/op	     501 B/op	      35 allocs/op
Benchmark_GeneratorTextTemplateVPCFlowLogs-16      	  299754	     43765 ns/op	    2540 B/op	     111 allocs/op
PASS
ok  	github.com/elastic/elastic-integration-corpus-generator-tool/pkg/genlib	73.641s

@aspacca aspacca self-assigned this Feb 1, 2023
@aspacca aspacca requested a review from endorama February 1, 2023 07:25

aspacca commented Feb 1, 2023

@endorama you were right

tweaking the channel size changes the performance behaviour and shows some improvement, but I'd say the result is not reliable, since it probably depends heavily on the host machine
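To make the channel-size experiment concrete, here is a minimal sketch of how different buffer sizes could be timed. It is not the code from this branch: `produce` and the buffer sizes tried are hypothetical, and the timings it prints will vary from one host to another, which is the point made above.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// produce sends n values through a channel with the given buffer size and
// returns how many values the consumer goroutine received.
// (Hypothetical helper, not part of genlib.)
func produce(n, buf int) int {
	ch := make(chan int, buf) // buf == 0 gives an unbuffered channel
	var wg sync.WaitGroup
	received := 0
	wg.Add(1)
	go func() {
		defer wg.Done()
		for range ch {
			received++
		}
	}()
	for i := 0; i < n; i++ {
		ch <- i
	}
	close(ch)
	wg.Wait() // the consumer is done, so reading received is safe
	return received
}

func main() {
	const n = 1_000_000
	// Buffer sizes chosen only for illustration.
	sizes := []int{0, runtime.GOMAXPROCS(0) / 2, 1024}
	for _, buf := range sizes {
		start := time.Now()
		got := produce(n, buf)
		fmt.Printf("buffer=%d received=%d elapsed=%v\n", buf, got, time.Since(start))
	}
}
```

Running this on two different machines is a quick way to check whether a given buffer size helps consistently or only on one host.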


elasticmachine commented Feb 1, 2023

💚 Build Succeeded


Build stats

  • Start Time: 2023-02-27T08:13:37.515+0000

  • Duration: 3 min 46 sec

Test stats 🧪

Test Results
Failed 0
Passed 65
Skipped 0
Total 65

aspacca commented Feb 15, 2023

@endorama I've built different binaries across the refactoring, at various commits in the branch

here are some outcomes

  • comparing legacy (pre custom template) with current main and with goroutines using an unbuffered channel (gen-with-custom_template-goroutines). each binary was built three ways: with the default behaviour of writing to a file, writing to /dev/null, and writing to io.Discard (I wanted to assess the overhead coming from disk access). main is still the fastest
./gen-with-custom_template-goroutines generate aws dynamodb 1.28.3 -t 20G
85.81user 24.40system 1:48.29elapsed 101%CPU (0avgtext+0avgdata 34880maxresident)k
0inputs+39434472outputs (0major+15962minor)pagefaults 0swaps

./gen-with-custom_template-goroutines-dev.null generate aws dynamodb 1.28.3 -t 20G
79.84user 7.85system 1:26.60elapsed 101%CPU (0avgtext+0avgdata 35412maxresident)k
0inputs+0outputs (0major+9166minor)pagefaults 0swaps

./gen-with-custom_template-goroutines-io.discard generate aws dynamodb 1.28.3 -t 20G
74.79user 5.07system 1:18.87elapsed 101%CPU (0avgtext+0avgdata 34884maxresident)k
0inputs+0outputs (0major+10965minor)pagefaults 0swaps

./gen-with-custom_template-main generate aws dynamodb 1.28.3 -t 20G
81.36user 24.34system 1:43.85elapsed 101%CPU (0avgtext+0avgdata 33904maxresident)k
0inputs+39062504outputs (0major+20773minor)pagefaults 0swaps

./gen-with-custom_template-main-dev.null generate aws dynamodb 1.28.3 -t 20G
76.97user 7.62system 1:23.50elapsed 101%CPU (0avgtext+0avgdata 35452maxresident)k
0inputs+0outputs (0major+12925minor)pagefaults 0swaps

./gen-with-custom_template-main-io.discard generate aws dynamodb 1.28.3 -t 20G
72.74user 4.72system 1:16.48elapsed 101%CPU (0avgtext+0avgdata 36680maxresident)k
0inputs+0outputs (0major+17456minor)pagefaults 0swaps

./gen-with-legacy generate aws dynamodb 1.28.3 -t 20G
102.62user 26.73system 2:06.55elapsed 102%CPU (0avgtext+0avgdata 41588maxresident)k
128inputs+39062504outputs (1major+20325minor)pagefaults 0swaps

./gen-with-legacy-dev.null generate aws dynamodb 1.28.3 -t 20G
94.32user 8.19system 1:40.53elapsed 101%CPU (0avgtext+0avgdata 34824maxresident)k
328inputs+0outputs (4major+19482minor)pagefaults 0swaps

./gen-with-legacy-io.discard generate aws dynamodb 1.28.3 -t 20G
90.58user 5.30system 1:33.94elapsed 102%CPU (0avgtext+0avgdata 40156maxresident)k
328inputs+0outputs (4major+23288minor)pagefaults 0swaps
  • comparing text template on current main with the branch: using goroutines with an unbuffered channel (./gen-with-text_template-goroutines-unbufferedchan), with a channel buffered to half of runtime.GOMAXPROCS(0) (./gen-with-text_template-goroutines-chansize), and without a channel but with a state local to every field (./gen-with-text_template-goroutines-nochan). as before, each binary was built three ways: with the default behaviour of writing to a file, writing to /dev/null, and writing to io.Discard. main is still the fastest
./gen-with-text_template-goroutines-chansize generate aws dynamodb 1.28.3 -t 20G
1866.58user 297.94system 17:14.26elapsed 209%CPU (0avgtext+0avgdata 41056maxresident)k
160inputs+40217872outputs (3major+2666517minor)pagefaults 0swaps

./gen-with-text_template-goroutines-chansize-dev.null generate aws dynamodb 1.28.3 -t 20G
1800.80user 240.11system 16:12.13elapsed 209%CPU (0avgtext+0avgdata 42172maxresident)k
160inputs+0outputs (3major+2898400minor)pagefaults 0swaps

./gen-with-text_template-goroutines-chansize-io.discard generate aws dynamodb 1.28.3 -t 20G
1796.17user 236.99system 16:03.14elapsed 211%CPU (0avgtext+0avgdata 45364maxresident)k
160inputs+0outputs (3major+3265729minor)pagefaults 0swaps

./gen-with-text_template-goroutines-nochan generate aws dynamodb 1.28.3 -t 20G
910.16user 74.99system 14:27.17elapsed 113%CPU (0avgtext+0avgdata 37504maxresident)k
368inputs+39185808outputs (4major+538156minor)pagefaults 0swaps

./gen-with-text_template-goroutines-nochan-dev.null generate aws dynamodb 1.28.3 -t 20G
896.23user 51.32system 13:50.42elapsed 114%CPU (0avgtext+0avgdata 46144maxresident)k
376inputs+0outputs (4major+536475minor)pagefaults 0swaps

./gen-with-text_template-goroutines-nochan-io.discard generate aws dynamodb 1.28.3 -t 20G
892.24user 43.21system 13:39.13elapsed 114%CPU (0avgtext+0avgdata 36460maxresident)k
368inputs+0outputs (4major+491776minor)pagefaults 0swaps

./gen-with-text_template-goroutines-unbufferedchan generate aws dynamodb 1.28.3 -t 20G
1801.91user 292.29system 16:47.15elapsed 207%CPU (0avgtext+0avgdata 44076maxresident)k
160inputs+39289992outputs (3major+2758282minor)pagefaults 0swaps

./gen-with-text_template-goroutines-unbufferedchan-dev.null generate aws dynamodb 1.28.3 -t 20G
1835.62user 242.94system 16:36.15elapsed 208%CPU (0avgtext+0avgdata 36960maxresident)k
160inputs+0outputs (3major+2822023minor)pagefaults 0swaps

./gen-with-text_template-goroutines-unbufferedchan-io.discard generate aws dynamodb 1.28.3 -t 20G
1873.59user 240.32system 16:49.39elapsed 209%CPU (0avgtext+0avgdata 42716maxresident)k
160inputs+0outputs (3major+2730760minor)pagefaults 0swaps

./gen-with-text_template-main generate aws dynamodb 1.28.3 -t 20G
767.99user 52.43system 12:50.66elapsed 106%CPU (0avgtext+0avgdata 38352maxresident)k
328inputs+39062728outputs (4major+371663minor)pagefaults 0swaps

./gen-with-text_template-main-dev.null generate aws dynamodb 1.28.3 -t 20G
747.52user 25.22system 12:04.20elapsed 106%CPU (0avgtext+0avgdata 42136maxresident)k
448inputs+0outputs (5major+357297minor)pagefaults 0swaps

./gen-with-text_template-main-io.discard generate aws dynamodb 1.28.3 -t 20G
736.85user 22.40system 11:51.84elapsed 106%CPU (0avgtext+0avgdata 40688maxresident)k
320inputs+0outputs (4major+374720minor)pagefaults 0swaps

none of the runs above used a specific generator configuration
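For reference, the three output variants compared in the runs above (file, /dev/null, io.Discard) can be sketched as follows. This is only an illustration of the idea, assuming events are written one at a time with no buffering: `openSink`, `writeEvents` and the sink labels are hypothetical names, not the tool's actual code.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// openSink returns the destination for generated events plus a close function.
// The kind values ("file", "devnull", "discard") are hypothetical labels.
func openSink(kind, path string) (io.Writer, func() error, error) {
	switch kind {
	case "discard":
		// io.Discard drops writes in user space, with no syscall involved.
		return io.Discard, func() error { return nil }, nil
	case "devnull":
		// Writing to os.DevNull still issues write syscalls, but no disk I/O.
		f, err := os.OpenFile(os.DevNull, os.O_WRONLY, 0)
		if err != nil {
			return nil, nil, err
		}
		return f, f.Close, nil
	case "file":
		f, err := os.Create(path)
		if err != nil {
			return nil, nil, err
		}
		return f, f.Close, nil
	default:
		return nil, nil, fmt.Errorf("unknown sink %q", kind)
	}
}

// writeEvents writes n copies of event to w and reports the bytes written.
func writeEvents(w io.Writer, event []byte, n int) (int64, error) {
	var total int64
	for i := 0; i < n; i++ {
		m, err := w.Write(event)
		total += int64(m)
		if err != nil {
			return total, err
		}
	}
	return total, nil
}

func main() {
	event := []byte(`{"message":"sample event"}` + "\n")
	for _, kind := range []string{"discard", "devnull"} {
		w, closeFn, err := openSink(kind, "")
		if err != nil {
			fmt.Println("open:", err)
			return
		}
		n, err := writeEvents(w, event, 1000)
		closeFn()
		fmt.Println(kind, n, err)
	}
}
```

Comparing the same generator against each sink separates the cost of generation from the cost of syscalls and of disk access, which is what the system-time column in the runs above reflects.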

I've run another test using the ec2_metrics template (which has a high cardinality on several fields).
it compares text template on current main with three variants: using goroutines with channels buffered to half of runtime.GOMAXPROCS(0) (./gen-with-text_template-goroutines-chansize), without goroutines/channels but with a state local to every field (./gen-with-text_template-goroutines-nochan), and without goroutines/channels with a global state slightly refactored from its version in main (./gen-with-text_template-goroutines-globalstate). the last binary is up to date with the latest commit in the branch. It also pre-calculates the number of events to generate, based on the requested output size and the size of an initial template rendering.

./gen-with-text_template-goroutines-chansize generate-with-template gotext.tpl fields.yml -c configs.yml -y gotext -t 30G
3469.86user 361.14system 33:09.41elapsed 192%CPU (0avgtext+0avgdata 16740maxresident)k
17504inputs+51098160outputs (110major+3464263minor)pagefaults 0swaps

./gen-with-text_template-goroutines-globalstate generate-with-template gotext.tpl fields.yml -c configs.yml -y gotext -t 30G
2827.63user 636.22system 30:03.28elapsed 192%CPU (0avgtext+0avgdata 16676maxresident)k
352inputs+51874752outputs (5major+5466338minor)pagefaults 0swaps

./gen-with-text_template-goroutines-nochan generate-with-template gotext.tpl fields.yml -c configs.yml -y gotext -t 30G
2018.64user 172.34system 31:27.58elapsed 116%CPU (0avgtext+0avgdata 15556maxresident)k
104inputs+52320520outputs (6major+2643605minor)pagefaults 0swaps

./gen-with-text_template-main generate-with-template gotext.tpl fields.yml -c configs.yml -y gotext -t 30G
3043.93user 633.86system 33:16.69elapsed 184%CPU (0avgtext+0avgdata 16052maxresident)k
5968inputs+58594376outputs (43major+6518558minor)pagefaults 0swaps

in this case the performance of gen-with-text_template-main and gen-with-text_template-goroutines-globalstate seems very similar: the roughly 3 extra minutes taken by gen-with-text_template-main are due to it generating more events
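A quick way to check this is to divide the "outputs" counter by the elapsed time for the two runs. The sketch below uses only the figures reported by time above, and assumes that the outputs counter is proportional to the amount of data written; the `run` type is a hypothetical helper.

```go
package main

import "fmt"

// run holds two figures copied from the time output above.
type run struct {
	name    string
	elapsed float64 // wall-clock seconds
	outputs float64 // "outputs" counter reported by time
}

// rate returns outputs per elapsed second.
func (r run) rate() float64 { return r.outputs / r.elapsed }

func main() {
	mainRun := run{"main", 33*60 + 16.69, 58594376}      // 33:16.69 elapsed
	global := run{"globalstate", 30*60 + 3.28, 51874752} // 30:03.28 elapsed
	for _, r := range []run{mainRun, global} {
		fmt.Printf("%-12s %.0f outputs/s\n", r.name, r.rate())
	}
	fmt.Printf("rate ratio main/globalstate: %.3f\n", mainRun.rate()/global.rate())
}
```

The two rates come out within a few percent of each other, so the longer wall-clock time of main is consistent with it simply writing more data.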

while the final goal of improving the performance was not reached, I would keep a few elements of the refactoring:

  • pre-calculating the number of events to generate
  • removing the error return from EmitF
  • some general cleanup of the code (like writing the custom template prefix in emit() rather than in the bind functions)

@aspacca aspacca marked this pull request as ready for review February 27, 2023 08:13
@aspacca aspacca closed this Mar 6, 2023