Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize advice with FilteredAttributes #6633

Merged
merged 11 commits into from
Sep 18, 2024

Conversation

jack-berg
Copy link
Member

@jack-berg jack-berg commented Aug 8, 2024

During the 8/7/2024 APAC java SIG, we discussed options for improving agent performance, including the impact of the AdviceAttributesProcessor (#6601).

This PR improve performance by wrapping Attributes implementation in a FilteredAttributes. FilteredAttributes uses the same memory allocation for key / values as its backing Attributes, but also retains a list of indices which have been filtered out. FilteredAttributes implements the Attributes contract with logic similar to the backing ImmutableKeyValueAttributes, but incorporating knowledge of indexes which should be filtered.

The performance improvement is quite good.

This PR adds MetricAdviceBenchmark with a variety of scenarios illustrating different interesting aspects of advice. The scenario of interest is ADVICE_ALL_ATTRIBUTES, which records the full set of HTTP server span attributes and applies attribute advice to narrow them to the HTTP server metric attributes. This is meant to be representative of a typical HTTP server instrumentation in opentelemetry-java-instrumentation.

The performance improvement is quite good:

Before (baseline):

Benchmark                                                      (instrumentParam)  Mode  Cnt     Score     Error   Units
MetricAdviceBenchmark.record                               ADVICE_ALL_ATTRIBUTES  avgt   10   954.733 ±  14.308   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate                 ADVICE_ALL_ATTRIBUTES  avgt   10  1445.940 ±  21.237  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm            ADVICE_ALL_ATTRIBUTES  avgt   10  1447.989 ±   0.133    B/op
MetricAdviceBenchmark.record:gc.count                      ADVICE_ALL_ATTRIBUTES  avgt   10    24.000            counts
MetricAdviceBenchmark.record:gc.time                       ADVICE_ALL_ATTRIBUTES  avgt   10    15.000                ms

After:

Benchmark                                                      (instrumentParam)  Mode  Cnt     Score     Error   Units
MetricAdviceBenchmark.record                               ADVICE_ALL_ATTRIBUTES  avgt   10   624.285 ±  11.537   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate                 ADVICE_ALL_ATTRIBUTES  avgt   10  1246.238 ±  22.510  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm            ADVICE_ALL_ATTRIBUTES  avgt   10   816.010 ±   0.022    B/op
MetricAdviceBenchmark.record:gc.count                      ADVICE_ALL_ATTRIBUTES  avgt   10    21.000            counts
MetricAdviceBenchmark.record:gc.time                       ADVICE_ALL_ATTRIBUTES  avgt   10    16.000                ms
  • CPU down from from 954 ns/op to 624 ns/op
  • Memory allocation down from 1447 B/op to 816 B/op
  • Of the 816 bytes allocated, 784 are the baseline needed to store the HTTP server span attributes. We're able to create a filtered view of the span attributes with just 816-784=32 bytes and 125ns of CPU time!
  • Creating a copy of the metric attributes (i.e. what we do today) requires 512 bytes and 405 ns. So we've managed a 1-(32/512)=~94% reduction in memory, and 1-(125/405)=~70% CPU time reduction. Nice!

cc @wgy035, @open-telemetry/java-instrumentation-maintainers

@jack-berg jack-berg force-pushed the advice-filtered-attributes branch from 217339c to 69f3c30 Compare August 12, 2024 18:18
@jack-berg jack-berg marked this pull request as ready for review August 12, 2024 18:24
@jack-berg jack-berg requested a review from a team August 12, 2024 18:24
// excluded. overflowFilteredIndices is used when more than 32 key-value pairs are present in
// sourceData.
private final int filteredIndices;
@Nullable private final BitSet overflowFilteredIndices;
Copy link
Member Author

@jack-berg jack-berg Aug 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I experimented with all sorts of different mechanisms to track which attributes from the original source data should be filtered out and landed on this. We use the 32 bits of an int to track whether or not each of the first 32 attribute key value pairs are included or excluded, which covers most attribute use cases we'll see in the wild. For larger attribute sets, we overflow into BitSet, which is less efficient for typical use cases.

Copy link

codecov bot commented Aug 12, 2024

Codecov Report

Attention: Patch coverage is 84.00000% with 16 lines in your changes missing coverage. Please review.

Project coverage is 90.05%. Comparing base (0132d5d) to head (677e51c).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
.../sdk/metrics/internal/view/FilteredAttributes.java 83.67% 9 Missing and 7 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #6633      +/-   ##
============================================
- Coverage     90.09%   90.05%   -0.05%     
- Complexity     6390     6424      +34     
============================================
  Files           711      712       +1     
  Lines         19333    19422      +89     
  Branches       1891     1919      +28     
============================================
+ Hits          17418    17490      +72     
- Misses         1335     1344       +9     
- Partials        580      588       +8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

// excluded in the output. A bit equal to 1 indicates the sourceData key at i*2 should be
// excluded. overflowFilteredIndices is used when more than 32 key-value pairs are present in
// sourceData.
private final int filteredIndices;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit similar to EnumSet in jdk. RegularEnumSet uses bits in a long and JumboEnumSet uses a long[] to track state.
Instead of overflowing could have dedicated implementation for small and large filter set that subclass FilteredAttributes. Another difference is that for small case jdk uses long instead of int.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice suggestion. Made it and addition to making the code easier to understand, it improved the benchmark:

Benchmark                                                      (instrumentParam)  Mode  Cnt     Score     Error   Units
MetricAdviceBenchmark.record                            NO_ADVICE_ALL_ATTRIBUTES  avgt   10   629.574 ±  24.667   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate              NO_ADVICE_ALL_ATTRIBUTES  avgt   10  1187.849 ±  43.848  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm         NO_ADVICE_ALL_ATTRIBUTES  avgt   10   784.009 ±   0.023    B/op
MetricAdviceBenchmark.record:gc.count                   NO_ADVICE_ALL_ATTRIBUTES  avgt   10    20.000            counts
MetricAdviceBenchmark.record:gc.time                    NO_ADVICE_ALL_ATTRIBUTES  avgt   10    16.000                ms
MetricAdviceBenchmark.record                       NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10   410.956 ±  55.624   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate         NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10  1196.643 ± 167.851  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm    NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10   512.006 ±   0.015    B/op
MetricAdviceBenchmark.record:gc.count              NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10    20.000            counts
MetricAdviceBenchmark.record:gc.time               NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10    14.000                ms
MetricAdviceBenchmark.record                     NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10    17.710 ±   0.880   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate       NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10     0.014 ±   0.034  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm  NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10    ≈ 10⁻⁴              B/op
MetricAdviceBenchmark.record:gc.count            NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10       ≈ 0            counts
MetricAdviceBenchmark.record                               ADVICE_ALL_ATTRIBUTES  avgt   10   632.802 ±  27.453   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate                 ADVICE_ALL_ATTRIBUTES  avgt   10  1230.183 ±  51.932  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm            ADVICE_ALL_ATTRIBUTES  avgt   10   816.010 ±   0.022    B/op
MetricAdviceBenchmark.record:gc.count                      ADVICE_ALL_ATTRIBUTES  avgt   10    21.000            counts
MetricAdviceBenchmark.record:gc.time                       ADVICE_ALL_ATTRIBUTES  avgt   10    21.000                ms
MetricAdviceBenchmark.record                          ADVICE_FILTERED_ATTRIBUTES  avgt   10   362.083 ±  15.526   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate            ADVICE_FILTERED_ATTRIBUTES  avgt   10  1391.021 ±  58.555  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm       ADVICE_FILTERED_ATTRIBUTES  avgt   10   528.006 ±   0.013    B/op
MetricAdviceBenchmark.record:gc.count                 ADVICE_FILTERED_ATTRIBUTES  avgt   10    23.000            counts
MetricAdviceBenchmark.record:gc.time                  ADVICE_FILTERED_ATTRIBUTES  avgt   10    15.000                ms
MetricAdviceBenchmark.record                        ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10   125.661 ±   3.040   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate          ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10   242.832 ±   5.761  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm     ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10    32.002 ±   0.005    B/op
MetricAdviceBenchmark.record:gc.count               ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10     4.000            counts
MetricAdviceBenchmark.record:gc.time                ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10     7.000                ms

Notably, for ADVICE_ALL_ATTRIBUTES, the B/op decreases from 816 to 528, and ns/op increases from 624 to 362. Not exactly sure why we see such improvement. 🤔

Copy link
Member Author

@jack-berg jack-berg Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something changed about the baseline. Current baseline before following your recommendation:

Benchmark                                                      (instrumentParam)  Mode  Cnt     Score     Error   Units
MetricAdviceBenchmark.record                            NO_ADVICE_ALL_ATTRIBUTES  avgt   10   637.389 ±  23.386   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate              NO_ADVICE_ALL_ATTRIBUTES  avgt   10  1173.210 ±  41.638  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm         NO_ADVICE_ALL_ATTRIBUTES  avgt   10   784.010 ±   0.025    B/op
MetricAdviceBenchmark.record:gc.count                   NO_ADVICE_ALL_ATTRIBUTES  avgt   10    20.000            counts
MetricAdviceBenchmark.record:gc.time                    NO_ADVICE_ALL_ATTRIBUTES  avgt   10    16.000                ms
MetricAdviceBenchmark.record                       NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10   420.190 ±  68.320   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate         NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10  1174.458 ± 200.960  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm    NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10   512.007 ±   0.017    B/op
MetricAdviceBenchmark.record:gc.count              NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10    19.000            counts
MetricAdviceBenchmark.record:gc.time               NO_ADVICE_FILTERED_ATTRIBUTES  avgt   10    13.000                ms
MetricAdviceBenchmark.record                     NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10    17.277 ±   0.856   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate       NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10     0.014 ±   0.034  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm  NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10    ≈ 10⁻⁴              B/op
MetricAdviceBenchmark.record:gc.count            NO_ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10       ≈ 0            counts
MetricAdviceBenchmark.record                               ADVICE_ALL_ATTRIBUTES  avgt   10   620.605 ±   5.833   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate                 ADVICE_ALL_ATTRIBUTES  avgt   10  1253.519 ±  11.813  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm            ADVICE_ALL_ATTRIBUTES  avgt   10   816.010 ±   0.023    B/op
MetricAdviceBenchmark.record:gc.count                      ADVICE_ALL_ATTRIBUTES  avgt   10    21.000            counts
MetricAdviceBenchmark.record:gc.time                       ADVICE_ALL_ATTRIBUTES  avgt   10    16.000                ms
MetricAdviceBenchmark.record                          ADVICE_FILTERED_ATTRIBUTES  avgt   10   384.805 ±  62.749   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate            ADVICE_FILTERED_ATTRIBUTES  avgt   10  1359.608 ± 187.333  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm       ADVICE_FILTERED_ATTRIBUTES  avgt   10   544.006 ±   0.013    B/op
MetricAdviceBenchmark.record:gc.count                 ADVICE_FILTERED_ATTRIBUTES  avgt   10    23.000            counts
MetricAdviceBenchmark.record:gc.time                  ADVICE_FILTERED_ATTRIBUTES  avgt   10    15.000                ms
MetricAdviceBenchmark.record                        ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10   125.548 ±   1.769   ns/op
MetricAdviceBenchmark.record:gc.alloc.rate          ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10   243.013 ±   3.395  MB/sec
MetricAdviceBenchmark.record:gc.alloc.rate.norm     ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10    32.002 ±   0.005    B/op
MetricAdviceBenchmark.record:gc.count               ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10     4.000            counts
MetricAdviceBenchmark.record:gc.time                ADVICE_ALL_ATTRIBUTES_CACHED  avgt   10     6.000                ms

Modest improvement in ADVICE_ALL_ATTRIBUTES, the B/op decreases from 544 to 528, and ns/op increases from 384 to 362.

Comment on lines +157 to +177
@Override
public Map<AttributeKey<?>, Object> asMap() {
Map<AttributeKey<?>, Object> result = new LinkedHashMap<>(size);
for (int i = 0; i < sourceData.length; i += 2) {
if (includeIndexInOutput(i)) {
result.put((AttributeKey<?>) sourceData[i], sourceData[i + 1]);
}
}
return Collections.unmodifiableMap(result);
}

@Override
public AttributesBuilder toBuilder() {
AttributesBuilder builder = Attributes.builder();
for (int i = 0; i < sourceData.length; i += 2) {
if (includeIndexInOutput(i)) {
putInBuilder(builder, (AttributeKey<? super Object>) sourceData[i], sourceData[i + 1]);
}
}
return builder;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess an alternative could be to implement a method that produces filtered List<Object> and feed the result to ReadOnlyArrayMap.wrap and ArrayBackedAttributesBuilder

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think there are ways to optimize this but I don't believe this is on the hot path so not a priority. Can always followup.

@jack-berg jack-berg force-pushed the advice-filtered-attributes branch from ae99630 to 677e51c Compare September 18, 2024 17:24
@jack-berg jack-berg merged commit 82b9e9b into open-telemetry:main Sep 18, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants