Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize normalization caching and copying #890

Merged

Conversation

dashpole
Copy link
Contributor

@dashpole dashpole commented Sep 9, 2024

Fixes #888

Changes

This avoids creating new cached elements, updates existing cached datapoints if they are already there, and only copies the required fields into the cache. In particular, not storing attributes lowers memory usage significantly. The benchmark difference is:

goos: linux
goarch: amd64
pkg: github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/collector/internal/normalization
cpu: Intel(R) Xeon(R) CPU @ 2.20GHz
                                              │   main.txt   │            mincache.txt            │
                                              │    sec/op    │   sec/op     vs base               │
NormalizeNumberDataPoint-2                       606.5n ± 7%   585.2n ± 3%        ~ (p=0.132 n=6)
NormalizeHistogramDataPoint-2                    727.2n ± 3%   684.4n ± 7%   -5.89% (p=0.041 n=6)
NormalizeExopnentialHistogramDataPoint-2         776.1n ± 4%   801.5n ± 3%   +3.28% (p=0.009 n=6)
NormalizeSummaryDataPoint-2                      552.4n ± 5%   331.3n ± 2%  -40.02% (p=0.002 n=6)
ResetNormalizeNumberDataPoint-2                  904.4n ± 2%   591.8n ± 3%  -34.57% (p=0.002 n=6)
ResetNormalizeHistogramDataPoint-2              1198.5n ± 4%   868.4n ± 6%  -27.54% (p=0.002 n=6)
ResetNormalizeExponentialHistogramDataPoint-2    949.9n ± 4%   651.4n ± 3%  -31.43% (p=0.002 n=6)
ResetNormalizeSummaryDataPoint-2                 859.9n ± 4%   383.6n ± 3%  -55.39% (p=0.002 n=6)
geomean                                          799.8n        584.7n       -26.89%

                                              │  main.txt   │              mincache.txt              │
                                              │    B/op     │    B/op     vs base                    │
NormalizeNumberDataPoint-2                       16.00 ± 0%   16.00 ± 0%         ~ (p=1.000 n=6) ¹
NormalizeHistogramDataPoint-2                    32.00 ± 0%   32.00 ± 0%         ~ (p=1.000 n=6) ¹
NormalizeExopnentialHistogramDataPoint-2         48.00 ± 0%   48.00 ± 0%         ~ (p=1.000 n=6) ¹
NormalizeSummaryDataPoint-2                      16.00 ± 0%    0.00 ± 0%  -100.00% (p=0.002 n=6)
ResetNormalizeNumberDataPoint-2                 128.00 ± 0%   16.00 ± 0%   -87.50% (p=0.002 n=6)
ResetNormalizeHistogramDataPoint-2              256.00 ± 0%   64.00 ± 0%   -75.00% (p=0.002 n=6)
ResetNormalizeExponentialHistogramDataPoint-2   256.00 ± 0%   16.00 ± 0%   -93.75% (p=0.002 n=6)
ResetNormalizeSummaryDataPoint-2                 128.0 ± 0%     0.0 ± 0%  -100.00% (p=0.002 n=6)
geomean                                          67.33                    ?                      ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean

                                              │  main.txt  │              mincache.txt              │
                                              │ allocs/op  │ allocs/op   vs base                    │
NormalizeNumberDataPoint-2                      2.000 ± 0%   2.000 ± 0%         ~ (p=1.000 n=6) ¹
NormalizeHistogramDataPoint-2                   3.000 ± 0%   3.000 ± 0%         ~ (p=1.000 n=6) ¹
NormalizeExopnentialHistogramDataPoint-2        4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=6) ¹
NormalizeSummaryDataPoint-2                     1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.002 n=6)
ResetNormalizeNumberDataPoint-2                 4.000 ± 0%   1.000 ± 0%   -75.00% (p=0.002 n=6)
ResetNormalizeHistogramDataPoint-2              7.000 ± 0%   4.000 ± 0%   -42.86% (p=0.002 n=6)
ResetNormalizeExponentialHistogramDataPoint-2   4.000 ± 0%   1.000 ± 0%   -75.00% (p=0.002 n=6)
ResetNormalizeSummaryDataPoint-2                4.000 ± 0%   0.000 ± 0%  -100.00% (p=0.002 n=6)
geomean                                         3.191                    ?                      ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean

Note that this primarily improves the "reset" benchmarks. This is because we only store things in the cache when a reset happens. This is primarily capturing the fact that we are re-using data points, rather than making new ones.

Copy link

codecov bot commented Sep 9, 2024

Codecov Report

Attention: Patch coverage is 96.96970% with 2 lines in your changes missing coverage. Please review.

Project coverage is 63.37%. Comparing base (4caace7) to head (a48768a).
Report is 44 commits behind head on main.

Files with missing lines Patch % Lines
...lector/internal/datapointstorage/datapointcache.go 96.96% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #890      +/-   ##
==========================================
+ Coverage   61.03%   63.37%   +2.34%     
==========================================
  Files          56       57       +1     
  Lines        5903     6029     +126     
==========================================
+ Hits         3603     3821     +218     
+ Misses       2143     2046      -97     
- Partials      157      162       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@dashpole dashpole force-pushed the optimize_normalization_caching branch from 7210af7 to 65a705b Compare September 10, 2024 19:03
@dashpole
Copy link
Contributor Author

Presubmit failure will be fixed by #891

@dashpole dashpole marked this pull request as ready for review September 10, 2024 19:56
@dashpole dashpole requested a review from a team as a code owner September 10, 2024 19:56
@dashpole
Copy link
Contributor Author

Ready for review

@dashpole dashpole changed the title Don't create new data points when caching Optimize normalization caching and copying Sep 10, 2024
@dashpole dashpole force-pushed the optimize_normalization_caching branch from 65a705b to 89fb2cf Compare September 11, 2024 14:57
@aabmass
Copy link
Contributor

aabmass commented Sep 11, 2024

Since our exporter is marked non-mutating, we can't change the data point that is passed to us

Would the code be significantly simpler if we could mutate? And are there any negative consequences of marking the exporter as mutating?

Other changes make the code harder to read because we now need to pass around the "original" and the "normalized" point

Could we wrap the pdata struct to hide this complexity to the caller?

@dashpole
Copy link
Contributor Author

Would the code be significantly simpler if we could mutate? And are there any negative consequences of marking the exporter as mutating?

That might be a good idea to explore as well. It looks like it only matters if there are multiple exporters in the config: https://github.com/open-telemetry/opentelemetry-collector/blob/2166b11137ec473b01b6cc5a0b9bf932cb752cf2/internal/fanoutconsumer/metrics.go#L21. If there is only a single one, there is no additional clone involved.

We already are doing a full clone for GMP anyways:

metricsCopy := pmetric.NewMetrics()
. Given that, it seems like it would be better to just mark our exporter as mutating, and simplify those things.

Could we wrap the pdata struct to hide this complexity to the caller?

We probably could, but that is a lot of boilerplate.

@dashpole
Copy link
Contributor Author

Lets try #892 first, and then see how much more of a benefit we get from some of the optimizations here.

@dashpole dashpole marked this pull request as draft September 12, 2024 19:12
@dashpole dashpole force-pushed the optimize_normalization_caching branch 2 times, most recently from 6236d55 to a29cf5d Compare September 19, 2024 20:18
@dashpole dashpole force-pushed the optimize_normalization_caching branch from a29cf5d to a48768a Compare September 19, 2024 20:46
@dashpole
Copy link
Contributor Author

Ok, i've rebased this on the other change, and it now is much smaller. It still seems like a good improvement to make.

@dashpole dashpole marked this pull request as ready for review September 19, 2024 20:50
@dashpole dashpole added the enhancement New feature or request label Sep 19, 2024
@dashpole dashpole merged commit f28ea8c into GoogleCloudPlatform:main Sep 20, 2024
34 checks passed
@dashpole dashpole deleted the optimize_normalization_caching branch September 20, 2024 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate performance improvements for datapoint normalization
3 participants