
[processor/groupbyattrsprocessor] allow empty keys for compaction #7793

Merged: 7 commits merged into open-telemetry:main on Mar 2, 2022

Conversation

@pmm-sumo (Contributor) commented on Feb 10, 2022:

Description:

Extension of the groupbyattrsprocessor for compacting data when it is spread across multiple ResourceSpans/ResourceMetrics/ResourceLogs entries with matching Resource and InstrumentationLibrary

Link to tracking Issue: #2265 in core

Testing: Several unit tests added

Documentation: Clarified the docs on the usage of empty keys and provided an example of compaction
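
For reference, a minimal sketch of the configuration this change enables (an empty keys list means nothing is regrouped; the processor only compacts matching entries):

    processors:
      groupbyattrs:
        # empty list: no grouping attributes, only compaction of entries
        # with matching Resource and InstrumentationLibrary
        keys: []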

Benchmark results (compacting 100 spans in different layouts):

BenchmarkCompacting
BenchmarkCompacting/instrumentation_library_count=1,_spans_per_library_count=100
BenchmarkCompacting/instrumentation_library_count=1,_spans_per_library_count=100-16         	   27966	     42352 ns/op
BenchmarkCompacting/instrumentation_library_count=10,_spans_per_library_count=10
BenchmarkCompacting/instrumentation_library_count=10,_spans_per_library_count=10-16         	   23912	     50266 ns/op
BenchmarkCompacting/instrumentation_library_count=100,_spans_per_library_count=1
BenchmarkCompacting/instrumentation_library_count=100,_spans_per_library_count=1-16         	   16819	     71327 ns/op

@pkositsyn (Contributor) commented:

I have two insights:

  1. There is a comment in config.go about prohibiting empty keys; it should be changed as well
  2. I have an example with the jaegerexporter, which sends different ResourceSpans independently, in separate calls. Given that, I feel like in many cases setting batch processor -> groupbyattrs (with an empty set of keys) -> jaeger exporter will benefit performance a lot (see the sketch after this list). Might be a good idea to run a performance test and add something to the jaeger exporter docs as well
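
A hypothetical sketch of that pipeline ordering (the receiver setup and the Jaeger endpoint are placeholders, not taken from this thread):

    receivers:
      otlp:
        protocols:
          grpc:

    processors:
      batch:
      groupbyattrs:
        keys: []   # compact matching Resource/InstrumentationLibrary entries after batching

    exporters:
      jaeger:
        endpoint: jaeger-all-in-one:14250   # placeholder endpoint
        tls:
          insecure: true

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch, groupbyattrs]
          exporters: [jaeger]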

@pkositsyn (Contributor) commented:

I have done some tests with adding groupbyattrs with empty keys in front of the jaeger exporter.

The following screenshots compare two versions: the first window is without groupbyattrs, the second is with this processor.

[Screenshot: spans rate in the jaeger exporter, 2022-02-13 20-55-38]

[Screenshot: CPU usage of the collector, 2022-02-13 20-55-45]

The benchmark was conducted via the tracegen docker-compose example, applying this diff to the current branch:

diff --git a/examples/tracegen/docker-compose.yml b/examples/tracegen/docker-compose.yml
index 015068b5e..d6842c6ce 100644
--- a/examples/tracegen/docker-compose.yml
+++ b/examples/tracegen/docker-compose.yml
@@ -5,7 +5,16 @@ services:
     image: jaegertracing/all-in-one:latest
     ports:
       - "16686:16686"
-      - "14250"
+      - "14250" 
+
+  prometheus:
+    image: prom/prometheus
+    ports:
+      - "9090:9090"
+    volumes:
+      - ./prometheus-config.yml:/etc/prometheus/prometheus.yml
+    depends_on:
+      - otel-collector
 
   otel-collector:
     build:
@@ -25,8 +34,11 @@ services:
       - otel-collector:4317
       - -otlp-insecure
       - -rate
-      - "1"
+      - "1000"
+      - -workers
+      - "4"
       - -duration
-      - 10000000s
+      - 600s
     depends_on:
       - otel-collector
+      - prometheus
diff --git a/examples/tracegen/otel-collector-config.yml b/examples/tracegen/otel-collector-config.yml
index e07abfae1..c3518e0ae 100644
--- a/examples/tracegen/otel-collector-config.yml
+++ b/examples/tracegen/otel-collector-config.yml
@@ -12,10 +12,13 @@ exporters:
 
 processors:
   batch:
+  groupbytrace:
+  groupbyattrs:
+    keys: []
 
 service:
   pipelines:
     traces:
       receivers: [otlp]
       exporters: [jaeger, logging]
-      processors: [batch]
+      processors: [groupbytrace, batch, groupbyattrs]

I added groupbytrace to split the batches (possibly) sent by tracegen and make every pdata.ResourceSpans contain only one span. (It is actually a common situation for groupbytrace to be set before batching.) The runs on the screenshots differ only by adding/removing groupbyattrs in the pipeline

@jpkrohling self-requested a review on February 15, 2022 09:56
@pmm-sumo (Contributor, Author) commented:

@pkositsyn @jpkrohling I updated the README, added some test cases and examples. Let me know what you think

@jpkrohling (Member) commented:

I'll add this to my review queue and should provide some feedback soon.

@pmm-sumo marked this pull request as ready for review on February 17, 2022 14:27
@pmm-sumo requested a review from a team on February 17, 2022 14:27
@jpkrohling (Member) left a review comment:

So, basically, there wasn't a change in the processing itself, only in the constraint checking, right? I like it :-)

Because this feels like an esoteric feature, I recommend documenting it well, including when and how it should be used and when/how it should not be used.

@@ -83,6 +86,64 @@ Notes:
* The specified "grouping" attributes that are set on the new *Resources* are also **removed** from the metric *DataPoints*
* While not shown in the above example, the processor also merges collections of records under matching InstrumentationLibrary

### Compaction

In some cases, the data might come in single requests to the collector, and even after batching there might be multiple duplicated ResourceSpans/ResourceLogs/ResourceMetrics objects, which leads to additional memory consumption and increased processing costs. As a remedy, the `groupbyattrs` processor can be used to compact data that has matching Resource and InstrumentationLibrary properties.
@jpkrohling (Member) commented on this section:

To me, the appealing aspect of this feature is getting better performance while sending data out. Without calling this out explicitly, people might not realize that the advantages apply on the transport side as well.

@pmm-sumo (Contributor, Author) replied:

If I understand correctly, this will reduce the size of the message for many of the formats (e.g. OTLP). In some cases (Jaeger) it will also reduce the number of RPC calls, since the Jaeger model maps one batch to one ResourceSpans - if I got it right. Perhaps it would be worth calling out in the jaegerexporter docs that groupbyattrs is recommended (or worth considering)?
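
For illustration, a hypothetical before/after of this compaction, with the pdata rendered as a YAML-like pseudostructure (attribute and library names invented for the example):

    # before: two ResourceSpans carrying an identical Resource
    resourceSpans:
      - resource: {attributes: {host.name: host-a}}
        instrumentationLibrarySpans:
          - instrumentationLibrary: {name: mylib}
            spans: [span-1]
      - resource: {attributes: {host.name: host-a}}
        instrumentationLibrarySpans:
          - instrumentationLibrary: {name: mylib}
            spans: [span-2]

    # after groupbyattrs with keys: []
    resourceSpans:
      - resource: {attributes: {host.name: host-a}}
        instrumentationLibrarySpans:
          - instrumentationLibrary: {name: mylib}
            spans: [span-1, span-2]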

(Resolved review thread on processor/groupbyattrsprocessor/README.md)
@pmm-sumo force-pushed the group-by-attrs-compaction branch 2 times, most recently from c7b691c to 6600874 on February 18, 2022 15:41

## Example
It is recommended to use the `groupbyattrs` processor together with the [batch](https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor/batchprocessor) processor as a consecutive step, as this will reduce the fragmentation of data (by grouping records together under matching Resource/Instrumentation Library)
@pmm-sumo (Contributor, Author) commented on this section:

@jpkrohling @pkositsyn I put the note on the batch processor in this section, also updated the wording and included it in the examples. If you have any suggestions on how to express this better, you are more than welcome :)

@jpkrohling (Member) commented:

Looks good to me!

@pmm-sumo (Contributor, Author) commented on Mar 2, 2022:

Thank you @jpkrohling! Just rebased

@jpkrohling merged commit 40b09b2 into open-telemetry:main on Mar 2, 2022