Create experimental output using go-elasticsearch #6002

Closed
simitt opened this issue Aug 20, 2021 · 2 comments
Comments

@simitt
Contributor

simitt commented Aug 20, 2021

Introduce an experimental output using go-elasticsearch, avoiding libbeat output.

see #5970

Goal:

  • Get an understanding of the effort involved, potential issues, and shortcomings
  • Get a rough understanding of performance implications
@simitt simitt added the v8.0.0 label Aug 20, 2021
@simitt simitt added this to the 8.0 milestone Aug 20, 2021
@zube zube bot added the [zube]: Backlog label Aug 20, 2021
@axw axw self-assigned this Oct 7, 2021
@axw axw modified the milestones: 8.0, 7.16 Oct 7, 2021
@axw axw added v7.16.0 and removed v8.0.0 labels Oct 7, 2021
@axw
Member

axw commented Oct 12, 2021

Closed by #5970

@marclop
Contributor

marclop commented Oct 20, 2021

Benchmark results

I benchmarked against an ECE installation comparing the performance of:

  • APM standalone (libbeat output) vs APM standalone (experimental output)
  • APM standalone vs APM integration

No output settings were tuned.

Environment

Single node ECE installation on an r5d.2xlarge (64GB of RAM).

Each of the benchmarks was run against a deployment with the following topology:

  • 16GB single node Elasticsearch.
  • 1GB Kibana instance.
  • 4GB APM server.

All benchmarks were run using systemtest/cmd/apmbench with the following benchmark
scenarios:

func Benchmark1000Transactions(b *testing.B) {
	b.RunParallel(func(pb *testing.PB) {
		tracer := benchtest.NewTracer(b)
		for pb.Next() {
			for i := 0; i < 1000; i++ {
				tracer.StartTransaction("name", "type").End()
			}
			// TODO(axw) implement a transport that enables streaming
			// events in a way that we can block when the queue is full,
			// without flushing. Alternatively, make this an option in
			// TracerOptions?
			tracer.Flush(nil)
		}
	})
}

func BenchmarkOTLPTraces(b *testing.B) {
	b.RunParallel(func(pb *testing.PB) {
		exporter := benchtest.NewOTLPExporter(b)
		tracerProvider := sdktrace.NewTracerProvider(
			sdktrace.WithSampler(sdktrace.AlwaysSample()),
			sdktrace.WithBatcher(exporter, sdktrace.WithBlocking()),
		)
		tracer := tracerProvider.Tracer("tracer")
		for pb.Next() {
			_, span := tracer.Start(context.Background(), "name")
			span.End()
		}
		tracerProvider.ForceFlush(context.Background())
	})
}

Summary

Standalone with libbeat output vs experimental output

There is a significant performance gain when running apm-server in standalone mode with the
experimental go-elasticsearch output enabled vs the default libbeat output: time/op drops by
roughly 62-67%, and events/sec increases by roughly 2.6x to 3.1x.

$ benchstat benchresult/7.16.0-standalone.txt benchresult/7.16.0-standalone-experimental.txt
name              old time/op              new time/op              delta
1000Transactions               515ms ± 2%               196ms ± 3%   -61.98%  (p=0.002 n=6+6)
OTLPTraces                     908µs ± 2%               295µs ± 4%   -67.44%  (p=0.002 n=6+6)

name              old events/sec           new events/sec           delta
1000Transactions               1.94k ± 2%               5.11k ± 3%  +163.05%  (p=0.002 n=6+6)
OTLPTraces                     1.09k ± 3%               3.38k ± 4%  +210.20%  (p=0.002 n=6+6)

name              old alloc/op             new alloc/op             delta
1000Transactions               315kB ± 1%               265kB ± 0%   -15.89%  (p=0.002 n=6+6)
OTLPTraces                    1.74kB ± 0%              1.73kB ± 0%    -0.27%  (p=0.004 n=6+5)

name              old allocs/op            new allocs/op            delta
1000Transactions               5.18k ± 0%               5.11k ± 0%    -1.35%  (p=0.002 n=6+6)
OTLPTraces                      15.0 ± 0%                15.0 ± 0%      ~     (all equal)
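
To make the delta column easier to read, here is a minimal sketch of the arithmetic behind it, using the rounded means shown in the table above. The helper names percentDelta and speedup are made up for illustration and are not part of benchstat. Because benchstat computes its deltas from the unrounded means, the values from this sketch (about -61.9% and about 2.6x for 1000Transactions time/op) only approximate the reported figures.

```go
package main

import "fmt"

// percentDelta returns the relative change from oldVal to newVal, in percent.
// Negative values mean newVal is smaller than oldVal.
func percentDelta(oldVal, newVal float64) float64 {
	return (newVal - oldVal) / oldVal * 100
}

// speedup returns how many times smaller newVal is than oldVal.
// For time/op, a value above 1 means the new run is faster.
func speedup(oldVal, newVal float64) float64 {
	return oldVal / newVal
}

func main() {
	// Rounded time/op means from the benchstat summary (libbeat vs experimental).
	fmt.Printf("1000Transactions time/op: %+.1f%% (%.2fx)\n",
		percentDelta(515, 196), speedup(515, 196)) // milliseconds
	fmt.Printf("OTLPTraces time/op:       %+.1f%% (%.2fx)\n",
		percentDelta(908, 295), speedup(908, 295)) // microseconds

	// Rounded events/sec means from the same summary.
	fmt.Printf("1000Transactions events/sec: %+.1f%%\n", percentDelta(1940, 5110))
	fmt.Printf("OTLPTraces events/sec:       %+.1f%%\n", percentDelta(1090, 3380))
}
```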

Standalone vs APM integration with libbeat output

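There is no meaningful performance difference between apm-server running in standalone mode and
running under the Elastic Agent in managed mode: 1000Transactions is statistically
indistinguishable (p=0.714), and OTLPTraces is about 3% faster in the integration (p=0.024, with
only 3 integration samples).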

$ benchstat benchresult/7.16.0-standalone.txt benchresult/7.16.0-integration.txt
name              old time/op              new time/op              delta
1000Transactions               515ms ± 2%               514ms ± 4%    ~     (p=0.714 n=6+3)
OTLPTraces                     908µs ± 2%               878µs ± 0%  -3.29%  (p=0.024 n=6+3)

name              old events/sec           new events/sec           delta
1000Transactions               1.94k ± 2%               1.95k ± 4%    ~     (p=0.714 n=6+3)
OTLPTraces                     1.09k ± 3%               1.12k ± 0%  +3.18%  (p=0.024 n=6+3)

name              old alloc/op             new alloc/op             delta
1000Transactions               315kB ± 1%               315kB ± 1%    ~     (p=0.714 n=6+3)
OTLPTraces                    1.74kB ± 0%              1.73kB ± 0%    ~     (p=0.167 n=6+3)

name              old allocs/op            new allocs/op            delta
1000Transactions               5.18k ± 0%               5.20k ± 0%  +0.34%  (p=0.048 n=6+3)
OTLPTraces                      15.0 ± 0%                15.0 ± 0%    ~     (all equal)

Detailed results

APM standalone 7.16.0-SNAPSHOT libbeat output

$ ./apmbench -benchtime=1m
Benchmark1000Transactions	     139	 505583279 ns/op	         0 error_responses/sec	      1983 events/sec	  313232 B/op	    5180 allocs/op
Benchmark1000Transactions	     139	 509048043 ns/op	         0 error_responses/sec	      1963 events/sec	  310805 B/op	    5173 allocs/op
Benchmark1000Transactions	     134	 524066035 ns/op	         0 error_responses/sec	      1902 events/sec	  316557 B/op	    5184 allocs/op
BenchmarkOTLPTraces      	   85532	    896748 ns/op	         0 error_responses/sec	      1108 events/sec	    1736 B/op	      15 allocs/op
BenchmarkOTLPTraces      	   80754	    889358 ns/op	         0 error_responses/sec	      1102 events/sec	    1734 B/op	      15 allocs/op
BenchmarkOTLPTraces      	   79603	    890350 ns/op	         0 error_responses/sec	      1117 events/sec	    1736 B/op	      15 allocs/op
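
As a rough sanity check on these rows, the events/sec column can be estimated from ns/op. This sketch assumes each Benchmark1000Transactions operation sends 1000 events and each BenchmarkOTLPTraces operation sends one, which is how the scenario code reads. The helper name eventsPerSecond is hypothetical, and how apmbench actually measures events/sec is not stated here, so a small mismatch is expected. For the first rows above, the estimates come out near 1978 and 1115, within roughly 1% of the reported 1983 and 1108.

```go
package main

import "fmt"

// eventsPerSecond estimates throughput from a Go benchmark's ns/op figure,
// assuming every benchmark operation sends exactly eventsPerOp events.
func eventsPerSecond(eventsPerOp int, nsPerOp float64) float64 {
	secondsPerOp := nsPerOp / 1e9
	return float64(eventsPerOp) / secondsPerOp
}

func main() {
	// First Benchmark1000Transactions row: 505583279 ns/op, reported 1983 events/sec.
	fmt.Printf("1000Transactions estimate: %.0f events/sec\n",
		eventsPerSecond(1000, 505583279))

	// First BenchmarkOTLPTraces row: 896748 ns/op, reported 1108 events/sec.
	fmt.Printf("OTLPTraces estimate:       %.0f events/sec\n",
		eventsPerSecond(1, 896748))
}
```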

APM standalone 7.16.0-SNAPSHOT experimental output

$ ./apmbench -benchtime=1m -count=3
Benchmark1000Transactions	     330	 198143096 ns/op	         0 error_responses/sec	      5053 events/sec	  265972 B/op	    5111 allocs/op
Benchmark1000Transactions	     355	 191521761 ns/op	         0 error_responses/sec	      5211 events/sec	  263638 B/op	    5107 allocs/op
Benchmark1000Transactions	     354	 192381587 ns/op	         0 error_responses/sec	      5199 events/sec	  263758 B/op	    5112 allocs/op
BenchmarkOTLPTraces      	  275701	    284977 ns/op	         0 error_responses/sec	      3500 events/sec	    1730 B/op	      15 allocs/op
BenchmarkOTLPTraces      	  258784	    290756 ns/op	         0 error_responses/sec	      3430 events/sec	    1732 B/op	      15 allocs/op
BenchmarkOTLPTraces      	  256189	    295984 ns/op	         0 error_responses/sec	      3376 events/sec	    1732 B/op	      15 allocs/op

APM integration 7.16.0-SNAPSHOT libbeat output

$ ./apmbench -benchtime=1m -count=3
Benchmark1000Transactions	     135	 507918451 ns/op	         0 error_responses/sec	      1975 events/sec	  318295 B/op	    5204 allocs/op
Benchmark1000Transactions	     136	 535332092 ns/op	         0 error_responses/sec	      1868 events/sec	  316076 B/op	    5195 allocs/op
Benchmark1000Transactions	     140	 497921516 ns/op	         0 error_responses/sec	      2004 events/sec	  311776 B/op	    5192 allocs/op
BenchmarkOTLPTraces      	   81349	    878723 ns/op	         0 error_responses/sec	      1128 events/sec	    1734 B/op	      15 allocs/op
BenchmarkOTLPTraces      	   79958	    874865 ns/op	         0 error_responses/sec	      1122 events/sec	    1734 B/op	      15 allocs/op
BenchmarkOTLPTraces      	   75822	    879549 ns/op	         0 error_responses/sec	      1124 events/sec	    1736 B/op	      15 allocs/op

APM integration 7.16.0-SNAPSHOT experimental output

TODO. Output settings are not being applied to the Elastic Cloud policy.

Since the integration and standalone modes performed almost identically with the libbeat output, I would not expect to see any significant difference with the experimental output either.

@zube zube bot removed the [zube]: Done label Jan 10, 2022