Optimize advice with FilteredAttributes #6633

jack-berg · 2024-08-08T20:20:04Z

During the 8/7/2024 APAC Java SIG, we discussed options for improving agent performance, including the impact of the AdviceAttributesProcessor (#6601).

This PR improves performance by wrapping the Attributes implementation in a FilteredAttributes. FilteredAttributes uses the same memory allocation for keys / values as its backing Attributes, but also retains a list of indices which have been filtered out. FilteredAttributes implements the Attributes contract with logic similar to the backing ImmutableKeyValueAttributes, but incorporating knowledge of the indices which should be filtered.
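To make the idea concrete, here is a minimal sketch of a filtered view over a shared key/value array. It is not the SDK's FilteredAttributes: the class name `FilteredViewSketch`, the use of plain `String` keys, and the `keysToKeep` parameter are all assumptions made for illustration, and a `java.util.BitSet` stands in for whatever index tracking the real class uses.

```java
import java.util.BitSet;
import java.util.Set;
import java.util.function.BiConsumer;

/** Hypothetical, simplified filtered view; not the SDK's actual FilteredAttributes class. */
final class FilteredViewSketch {
  // Alternating keys and values ([k0, v0, k1, v1, ...]), shared with the source, never copied.
  private final Object[] sourceData;
  // Bit p is set when the pair starting at sourceData[2 * p] is hidden from the view.
  private final BitSet filteredPairs = new BitSet();
  // Number of pairs that remain visible.
  private final int size;

  FilteredViewSketch(Object[] sourceData, Set<String> keysToKeep) {
    this.sourceData = sourceData;
    int visible = 0;
    for (int i = 0; i < sourceData.length; i += 2) {
      if (keysToKeep.contains(sourceData[i])) {
        visible++;
      } else {
        filteredPairs.set(i / 2);
      }
    }
    this.size = visible;
  }

  int size() {
    return size;
  }

  /** Returns the value for a visible key, or null if the key is absent or filtered out. */
  Object get(String key) {
    for (int i = 0; i < sourceData.length; i += 2) {
      if (!filteredPairs.get(i / 2) && key.equals(sourceData[i])) {
        return sourceData[i + 1];
      }
    }
    return null;
  }

  /** Visits only the visible pairs, in source order. */
  void forEach(BiConsumer<String, Object> consumer) {
    for (int i = 0; i < sourceData.length; i += 2) {
      if (!filteredPairs.get(i / 2)) {
        consumer.accept((String) sourceData[i], sourceData[i + 1]);
      }
    }
  }
}
```

The only per-view allocations in this sketch are the view object and its index set; the key/value array itself is reused as-is.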

This PR adds MetricAdviceBenchmark with a variety of scenarios illustrating different interesting aspects of advice. The scenario of interest is ADVICE_ALL_ATTRIBUTES, which records the full set of HTTP server span attributes and applies attribute advice to narrow them to the HTTP server metric attributes. This is meant to be representative of a typical HTTP server instrumentation in opentelemetry-java-instrumentation.

The performance improvement is quite good:

Before (baseline):

Benchmark                                                      (instrumentParam)  Mode  Cnt     Score     Error   Units
MetricAdviceBenchmark.record                               ADVICE_ALL_ATTRIBUTES  avgt   10   954.733 ±  14.308   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate                 ADVICE_ALL_ATTRIBUTES  avgt   10  1445.940 ±  21.237  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm            ADVICE_ALL_ATTRIBUTES  avgt   10  1447.989 ±   0.133    B/op
MetricAdviceBenchmark.record:gc.count                      ADVICE_ALL_ATTRIBUTES  avgt   10    24.000            counts
MetricAdviceBenchmark.record:gc.time                       ADVICE_ALL_ATTRIBUTES  avgt   10    15.000                ms

After:

Benchmark                                                      (instrumentParam)  Mode  Cnt     Score     Error   Units
MetricAdviceBenchmark.record                               ADVICE_ALL_ATTRIBUTES  avgt   10   624.285 ±  11.537   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate                 ADVICE_ALL_ATTRIBUTES  avgt   10  1246.238 ±  22.510  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm            ADVICE_ALL_ATTRIBUTES  avgt   10   816.010 ±   0.022    B/op
MetricAdviceBenchmark.record:gc.count                      ADVICE_ALL_ATTRIBUTES  avgt   10    21.000            counts
MetricAdviceBenchmark.record:gc.time                       ADVICE_ALL_ATTRIBUTES  avgt   10    16.000                ms

CPU down from 954 ns/op to 624 ns/op
Memory allocation down from 1447 B/op to 816 B/op
Of the 816 bytes allocated, 784 are the baseline needed to store the HTTP server span attributes. We're able to create a filtered view of the span attributes with just 816-784=32 bytes and 125ns of CPU time!
Creating a copy of the metric attributes (i.e. what we do today) requires 512 bytes and 405 ns. So we've managed a 1-(32/512)=~94% reduction in memory, and 1-(125/405)=~70% CPU time reduction. Nice!

cc @wgy035, @open-telemetry/java-instrumentation-maintainers

jack-berg · 2024-08-12T18:27:15Z

sdk/metrics/src/main/java/io/opentelemetry/sdk/metrics/internal/view/FilteredAttributes.java

+  // excluded. overflowFilteredIndices is used when more than 32 key-value pairs are present in
+  // sourceData.
+  private final int filteredIndices;
+  @Nullable private final BitSet overflowFilteredIndices;


I experimented with all sorts of different mechanisms to track which attributes from the original source data should be filtered out and landed on this. We use the 32 bits of an int to track whether or not each of the first 32 attribute key value pairs are included or excluded, which covers most attribute use cases we'll see in the wild. For larger attribute sets, we overflow into BitSet, which is less efficient for typical use cases.
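The tracking scheme described above can be sketched roughly as follows. The class `FilteredIndexTracker` and its method names are hypothetical; only the field names `filteredIndices` and `overflowFilteredIndices` come from the diff, and the real class may build and query them differently.

```java
import java.util.BitSet;

/** Hypothetical tracker for excluded key-value pair indices; a sketch, not the SDK's code. */
final class FilteredIndexTracker {
  private static final int BITS_PER_INT = Integer.SIZE; // 32

  // Bit i equal to 1 means pair i (for i in 0..31) is excluded.
  private int filteredIndices;
  // Allocated lazily, and only used for pair indices 32 and above.
  private BitSet overflowFilteredIndices;

  void markFiltered(int pairIndex) {
    if (pairIndex < 0) {
      throw new IllegalArgumentException("pairIndex must be non-negative");
    }
    if (pairIndex < BITS_PER_INT) {
      filteredIndices |= 1 << pairIndex;
    } else {
      if (overflowFilteredIndices == null) {
        overflowFilteredIndices = new BitSet();
      }
      overflowFilteredIndices.set(pairIndex - BITS_PER_INT);
    }
  }

  boolean isFiltered(int pairIndex) {
    if (pairIndex < 0) {
      throw new IllegalArgumentException("pairIndex must be non-negative");
    }
    if (pairIndex < BITS_PER_INT) {
      return (filteredIndices & (1 << pairIndex)) != 0;
    }
    return overflowFilteredIndices != null
        && overflowFilteredIndices.get(pairIndex - BITS_PER_INT);
  }

  boolean usesOverflow() {
    return overflowFilteredIndices != null;
  }

  int filteredCount() {
    int overflowCount = overflowFilteredIndices == null ? 0 : overflowFilteredIndices.cardinality();
    return Integer.bitCount(filteredIndices) + overflowCount;
  }
}
```

For attribute sets of 32 pairs or fewer, the common case, this sketch never allocates the BitSet, so the tracking cost is a single int field.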

codecov · 2024-08-12T18:46:06Z

Codecov Report

Attention: Patch coverage is 84.00000% with 16 lines in your changes missing coverage. Please review.

Project coverage is 90.05%. Comparing base (0132d5d) to head (677e51c).
Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
.../sdk/metrics/internal/view/FilteredAttributes.java	83.67%	9 Missing and 7 partials ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #6633      +/-   ##
============================================
- Coverage     90.09%   90.05%   -0.05%     
- Complexity     6390     6424      +34     
============================================
  Files           711      712       +1     
  Lines         19333    19422      +89     
  Branches       1891     1919      +28     
============================================
+ Hits          17418    17490      +72     
- Misses         1335     1344       +9     
- Partials        580      588       +8


laurit · 2024-09-06T12:37:11Z

sdk/metrics/src/main/java/io/opentelemetry/sdk/metrics/internal/view/FilteredAttributes.java

+  // excluded in the output. A bit equal to 1 indicates the sourceData key at i*2 should be
+  // excluded. overflowFilteredIndices is used when more than 32 key-value pairs are present in
+  // sourceData.
+  private final int filteredIndices;


This is a bit similar to EnumSet in the JDK. RegularEnumSet uses the bits in a long and JumboEnumSet uses a long[] to track state.
Instead of overflowing, we could have dedicated implementations for small and large filter sets that subclass FilteredAttributes. Another difference is that for the small case the JDK uses a long instead of an int.
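A rough sketch of that EnumSet-style split, under the assumption that a factory picks the implementation from the number of pairs. The names `IndexFilter`, `SmallIndexFilter`, and `LargeIndexFilter` are invented for illustration and are not the classes that ended up in the PR.

```java
/** Hypothetical index filter split by size, loosely modeled on the JDK's EnumSet. */
abstract class IndexFilter {
  abstract boolean isFiltered(int index);

  /** Picks the single-long variant when every index fits in 64 bits. */
  static IndexFilter of(int totalPairs, int... filteredIndices) {
    return totalPairs <= Long.SIZE
        ? new SmallIndexFilter(filteredIndices)
        : new LargeIndexFilter(totalPairs, filteredIndices);
  }

  /** Like RegularEnumSet: all state lives in one long. */
  private static final class SmallIndexFilter extends IndexFilter {
    private final long bits;

    SmallIndexFilter(int[] filteredIndices) {
      long b = 0;
      for (int index : filteredIndices) {
        b |= 1L << index;
      }
      this.bits = b;
    }

    @Override
    boolean isFiltered(int index) {
      return index >= 0 && index < Long.SIZE && (bits & (1L << index)) != 0;
    }
  }

  /** Like JumboEnumSet: one long per 64 indices. */
  private static final class LargeIndexFilter extends IndexFilter {
    private final long[] words;

    LargeIndexFilter(int totalPairs, int[] filteredIndices) {
      words = new long[(totalPairs + Long.SIZE - 1) / Long.SIZE];
      for (int index : filteredIndices) {
        // A long shift uses only the low 6 bits of the distance, so this is index % 64.
        words[index >>> 6] |= 1L << index;
      }
    }

    @Override
    boolean isFiltered(int index) {
      int word = index >>> 6;
      return index >= 0 && word < words.length && (words[word] & (1L << index)) != 0;
    }
  }
}
```

With this shape, the small case carries no nullable overflow field and no branch on it, which is one plausible reason such a split reads more simply than a single class with an overflow path.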

Nice suggestion. I made the change, and in addition to making the code easier to understand, it improved the benchmark:

Benchmark                                                      (instrumentParam)  Mode  Cnt     Score     Error   Units
MetricAdviceBenchmark.record                            NO_ADVICE_ALL_ATTRIBUTES  avgt   10   629.574 ±  24.667   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate              NO_ADVICE_ALL_ATTRIBUTES  avgt   10  1187.849 ±  43.848  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm         NO_ADVICE_ALL_ATTRIBUTES  avgt   10   784.009 ±   0.023    B/op
MetricAdviceBenchmark.record:gc.count                   NO_ADVICE_ALL_ATTRIBUTES  avgt   10    20.000            counts
MetricAdviceBenchmark.record:gc.time                    NO_ADVICE_ALL_ATTRIBUTES  avgt   10    16.000                ms
MetricAdviceBenchmark.record                       NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10   410.956 ±  55.624   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate         NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10  1196.643 ± 167.851  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm    NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10   512.006 ±   0.015    B/op
MetricAdviceBenchmark.record:gc.count              NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10    20.000            counts
MetricAdviceBenchmark.record:gc.time               NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10    14.000                ms
MetricAdviceBenchmark.record                     NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10    17.710 ±   0.880   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate       NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10     0.014 ±   0.034  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm  NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10    ≈ 10⁻⁴              B/op
MetricAdviceBenchmark.record:gc.count            NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10       ≈ 0            counts
MetricAdviceBenchmark.record                               ADVICE_ALL_ATTRIBUTES  avgt   10   632.802 ±  27.453   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate                 ADVICE_ALL_ATTRIBUTES  avgt   10  1230.183 ±  51.932  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm            ADVICE_ALL_ATTRIBUTES  avgt   10   816.010 ±   0.022    B/op
MetricAdviceBenchmark.record:gc.count                      ADVICE_ALL_ATTRIBUTES  avgt   10    21.000            counts
MetricAdviceBenchmark.record:gc.time                       ADVICE_ALL_ATTRIBUTES  avgt   10    21.000                ms
MetricAdviceBenchmark.record                          ADVICE_FILTERED_ATTRIBUTES  avgt   10   362.083 ±  15.526   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate            ADVICE_FILTERED_ATTRIBUTES  avgt   10  1391.021 ±  58.555  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm       ADVICE_FILTERED_ATTRIBUTES  avgt   10   528.006 ±   0.013    B/op
MetricAdviceBenchmark.record:gc.count                 ADVICE_FILTERED_ATTRIBUTES  avgt   10    23.000            counts
MetricAdviceBenchmark.record:gc.time                  ADVICE_FILTERED_ATTRIBUTES  avgt   10    15.000                ms
MetricAdviceBenchmark.record                        ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10   125.661 ±   3.040   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate          ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10   242.832 ±   5.761  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm     ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10    32.002 ±   0.005    B/op
MetricAdviceBenchmark.record:gc.count               ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10     4.000            counts
MetricAdviceBenchmark.record:gc.time                ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10     7.000                ms

Notably, for ADVICE_ALL_ATTRIBUTES, the B/op decreases from 816 to 528, and ns/op decreases from 624 to 362. Not exactly sure why we see such improvement. 🤔

Something changed about the baseline. Current baseline before following your recommendation:

Benchmark                                                      (instrumentParam)  Mode  Cnt     Score     Error   Units
MetricAdviceBenchmark.record                            NO_ADVICE_ALL_ATTRIBUTES  avgt   10   637.389 ±  23.386   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate              NO_ADVICE_ALL_ATTRIBUTES  avgt   10  1173.210 ±  41.638  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm         NO_ADVICE_ALL_ATTRIBUTES  avgt   10   784.010 ±   0.025    B/op
MetricAdviceBenchmark.record:gc.count                   NO_ADVICE_ALL_ATTRIBUTES  avgt   10    20.000            counts
MetricAdviceBenchmark.record:gc.time                    NO_ADVICE_ALL_ATTRIBUTES  avgt   10    16.000                ms
MetricAdviceBenchmark.record                       NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10   420.190 ±  68.320   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate         NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10  1174.458 ± 200.960  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm    NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10   512.007 ±   0.017    B/op
MetricAdviceBenchmark.record:gc.count              NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10    19.000            counts
MetricAdviceBenchmark.record:gc.time               NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10    13.000                ms
MetricAdviceBenchmark.record                     NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10    17.277 ±   0.856   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate       NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10     0.014 ±   0.034  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm  NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10    ≈ 10⁻⁴              B/op
MetricAdviceBenchmark.record:gc.count            NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10       ≈ 0            counts
MetricAdviceBenchmark.record                               ADVICE_ALL_ATTRIBUTES  avgt   10   620.605 ±   5.833   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate                 ADVICE_ALL_ATTRIBUTES  avgt   10  1253.519 ±  11.813  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm            ADVICE_ALL_ATTRIBUTES  avgt   10   816.010 ±   0.023    B/op
MetricAdviceBenchmark.record:gc.count                      ADVICE_ALL_ATTRIBUTES  avgt   10    21.000            counts
MetricAdviceBenchmark.record:gc.time                       ADVICE_ALL_ATTRIBUTES  avgt   10    16.000                ms
MetricAdviceBenchmark.record                          ADVICE_FILTERED_ATTRIBUTES  avgt   10   384.805 ±  62.749   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate            ADVICE_FILTERED_ATTRIBUTES  avgt   10  1359.608 ± 187.333  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm       ADVICE_FILTERED_ATTRIBUTES  avgt   10   544.006 ±   0.013    B/op
MetricAdviceBenchmark.record:gc.count                 ADVICE_FILTERED_ATTRIBUTES  avgt   10    23.000            counts
MetricAdviceBenchmark.record:gc.time                  ADVICE_FILTERED_ATTRIBUTES  avgt   10    15.000                ms
MetricAdviceBenchmark.record                        ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10   125.548 ±   1.769   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate          ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10   243.013 ±   3.395  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm     ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10    32.002 ±   0.005    B/op
MetricAdviceBenchmark.record:gc.count               ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10     4.000            counts
MetricAdviceBenchmark.record:gc.time                ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10     6.000                ms

Modest improvement in ADVICE_FILTERED_ATTRIBUTES: the B/op decreases from 544 to 528, and ns/op decreases from 384 to 362.

laurit · 2024-09-06T13:08:44Z

sdk/metrics/src/main/java/io/opentelemetry/sdk/metrics/internal/view/FilteredAttributes.java

+  @Override
+  public Map<AttributeKey<?>, Object> asMap() {
+    Map<AttributeKey<?>, Object> result = new LinkedHashMap<>(size);
+    for (int i = 0; i < sourceData.length; i += 2) {
+      if (includeIndexInOutput(i)) {
+        result.put((AttributeKey<?>) sourceData[i], sourceData[i + 1]);
+      }
+    }
+    return Collections.unmodifiableMap(result);
+  }
+
+  @Override
+  public AttributesBuilder toBuilder() {
+    AttributesBuilder builder = Attributes.builder();
+    for (int i = 0; i < sourceData.length; i += 2) {
+      if (includeIndexInOutput(i)) {
+        putInBuilder(builder, (AttributeKey<? super Object>) sourceData[i], sourceData[i + 1]);
+      }
+    }
+    return builder;
+  }


I guess an alternative could be to implement a method that produces a filtered List<Object> and feed the result to ReadOnlyArrayMap.wrap and ArrayBackedAttributesBuilder.

Yeah, I think there are ways to optimize this, but I don't believe this is on the hot path, so it's not a priority. We can always follow up.
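For reference, the filtering step of that alternative might look something like the sketch below. It only builds the filtered, alternating key/value list; handing the result to ReadOnlyArrayMap.wrap or ArrayBackedAttributesBuilder is left out because those are internal SDK types. The class `FilteredListSketch` and the `IntPredicate` parameter are assumptions; the predicate plays the role of `includeIndexInOutput(i)` from the diff and receives the array index of the key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical helper; not part of the SDK. */
final class FilteredListSketch {
  private FilteredListSketch() {}

  /**
   * Copies the included pairs of an alternating [key, value, key, value, ...] array into a
   * new list with the same alternating layout.
   */
  static List<Object> filteredKeyValues(Object[] sourceData, IntPredicate includeIndexInOutput) {
    List<Object> result = new ArrayList<>();
    for (int i = 0; i < sourceData.length; i += 2) {
      if (includeIndexInOutput.test(i)) {
        result.add(sourceData[i]);
        result.add(sourceData[i + 1]);
      }
    }
    return result;
  }
}
```

Both asMap() and toBuilder() could then share this one loop, at the cost of an intermediate list.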

jack-berg added 8 commits August 12, 2024 12:56

Optimize advice with FilteredAttributes

0da6fab

Use bitset to track filtered indices

65fb907

Compute hashcode up front

e227bd2

Custom BitSet implementation

6ddf43a

Merge bitset into FilteredAttributes to reduce memory, rework benchmarks

2a2f1f3

Encode filtered attributes into int

56a6778

clean up

2a9ad4b

Remove metric advice benchmark test

69f3c30

jack-berg force-pushed the advice-filtered-attributes branch from 217339c to 69f3c30 Compare August 12, 2024 18:18

jack-berg marked this pull request as ready for review August 12, 2024 18:24

jack-berg requested a review from a team August 12, 2024 18:24

Add javadoc to ImmutableKeyValuePairs

ba582ae

laurit approved these changes Sep 6, 2024

jack-berg added 2 commits September 18, 2024 11:43

PR feedback

7e86ba9

Merge branch 'main' of https://github.com/open-telemetry/opentelemetr…

677e51c

…y-java into advice-filtered-attributes

jack-berg force-pushed the advice-filtered-attributes branch from ae99630 to 677e51c Compare September 18, 2024 17:24

jack-berg merged commit 82b9e9b into open-telemetry:main Sep 18, 2024
18 checks passed

jack-berg mentioned this pull request Sep 18, 2024

Any way to reduce performance overhead in DefaultSynchronousMetricStorage? #6601

Closed
