
feat(metrics): Track memory footprint of metrics buckets [INGEST-1132] #1284

Merged
merged 8 commits into master from feat/track-metrics-footprint
Jun 3, 2022

Conversation

jjbayer
Member

@jjbayer jjbayer commented Jun 2, 2022

To be able to limit the memory footprint of metrics buckets in the aggregator, we need to keep track of the number of elements we store. Instead of measuring the actual memory consumption, we apply a simple model, roughly measuring the bytes needed to encode a bucket:

  • counter buckets: 8 bytes (f64)
  • set buckets: number of unique elements * 4 (u32)
  • distribution buckets: number of unique elements * 12 (f64 + u32)
  • gauge: 40 bytes (5 * f64)

To avoid iterating over all the buckets every time we want to query the memory footprint, we keep a map of counters per project key (plus one total count) that is incremented with the footprint delta on every insert.

Not in this PR: enforcement of limits.
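As a rough sketch of this model in code (the types below are simplified stand-ins, not the actual relay-metrics definitions):

use std::collections::BTreeSet;

// Illustrative stand-in for relay's BucketValue.
enum BucketValue {
    Counter(f64),
    Distribution(Vec<(f64, u32)>), // unique value plus its count
    Set(BTreeSet<u32>),            // unique 4-byte elements
    Gauge([f64; 5]),               // last, min, max, sum, count
}

impl BucketValue {
    // Approximate bytes needed to encode this bucket, per the model above.
    fn cost(&self) -> usize {
        match self {
            Self::Counter(_) => 8,                 // one f64
            Self::Set(s) => s.len() * 4,           // one u32 per unique element
            Self::Distribution(d) => d.len() * 12, // f64 + u32 per unique value
            Self::Gauge(_) => 40,                  // 5 * f64
        }
    }
}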

/// This is very similar to [`relative_size`], which can possibly be removed.
pub fn cost(&self) -> usize {
    match self {
        Self::Counter(c) => std::mem::size_of_val(c),
Member

I know we talked about tracking memory allocation, but thinking about it more, I would actually prefer to hardcode numbers here. If we accidentally increase the struct size, I don't think this should immediately and implicitly be reflected in abuse limits. If we change abuse limits and how they are calculated, I think it's better to do so explicitly.

Member Author

Agreed.

/// because data structures might have a memory overhead.
///
/// This is very similar to [`relative_size`], which can possibly be removed.
pub fn cost(&self) -> usize {
Member Author

Not sure if `cost` is the best name for what we're modeling here. Open to suggestions.

struct CostTracker {
    total_cost: usize,
    // Choosing a BTreeMap instead of a HashMap here, under the assumption that a BTreeMap
    // is still more efficient for the number of project keys we store.
Member Author

This reasoning is up for discussion.
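For illustration, a self-contained sketch of how such a tracker could accumulate cost deltas; the field and method names are assumptions, not necessarily the merged code:

use std::collections::BTreeMap;

// Illustrative stand-in; relay has its own ProjectKey type.
type ProjectKey = String;

#[derive(Default)]
struct CostTracker {
    total_cost: usize,
    // One counter per project key, so querying a project's footprint
    // does not require iterating over all of its buckets.
    cost_per_project_key: BTreeMap<ProjectKey, usize>,
}

impl CostTracker {
    // Charge the cost delta produced by inserting or merging a bucket.
    fn add_cost(&mut self, project_key: ProjectKey, cost: usize) {
        self.total_cost += cost;
        *self.cost_per_project_key.entry(project_key).or_insert(0) += cost;
    }
}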

let cost_before = bucket_value.cost();
value.merge_into(bucket_value)?;
let cost_after = bucket_value.cost();
added_cost = cost_after.saturating_sub(cost_before);
Member Author

We could probably optimize this by making `merge_into` return the actual cost delta, but I decided against it for the sake of simplicity.

Member

Wouldn't it be simpler to just add the cost of the single value here and have `merge_into` return whether or not something was added?

Member Author

Yeah, that could work.
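For comparison, a sketch of that alternative for set buckets, where merge_into reports whether a new element was actually inserted (the names here are illustrative, not relay's API):

use std::collections::BTreeSet;

// Stand-in for a set bucket; each unique element is modeled as 4 bytes.
struct SetBucket(BTreeSet<u32>);

impl SetBucket {
    // Returns true if the value was newly inserted, so the caller can
    // charge a fixed per-element cost instead of recomputing the whole
    // bucket's cost before and after the merge.
    fn merge_into(&mut self, value: u32) -> bool {
        self.0.insert(value)
    }
}

fn main() {
    let mut bucket = SetBucket(BTreeSet::new());
    let mut added_cost = 0;
    for value in [1u32, 2, 2, 3] {
        if bucket.merge_into(value) {
            added_cost += 4; // cost of one unique set element
        }
    }
    assert_eq!(added_cost, 12); // three unique elements were added
}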

@jjbayer jjbayer marked this pull request as ready for review June 2, 2022 13:25
@jjbayer jjbayer requested a review from a team June 2, 2022 13:25
@jjbayer jjbayer merged commit acb94fa into master Jun 3, 2022
@jjbayer jjbayer deleted the feat/track-metrics-footprint branch June 3, 2022 07:49
/// This is very similar to [`BucketValue::relative_size`], which can possibly be removed.
pub fn cost(&self) -> usize {
    match self {
        Self::Counter(_) => 8,
Member

Instead of hard-coding, use std::mem::size_of with typedefs for the values.

Also, note that this is wrong. The size of a bucket value is always size_of::<BucketValue> + the allocations that happen within. So a more correct implementation would be:

const DIST_SIZE: usize = mem::size_of::<f64>() + mem::size_of::<Count>();

let allocations = match self {
    Self::Counter(_) => 0,
    Self::Set(s) => s.len() * mem::size_of::<u32>(), // better to typedef this to `SetValue` now
    Self::Gauge(_) => 0,
    Self::Distribution(m) => m.internal_size() * DIST_SIZE,
};

mem::size_of::<Self>() + allocations
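A self-contained rendering of that suggestion (SetValue, Count, and the distribution representation are stand-ins based on the comment, not verified relay definitions):

use std::collections::{BTreeMap, BTreeSet};
use std::mem;

type SetValue = u32; // typedef instead of a bare u32
type Count = u32;    // per-value counter inside a distribution

enum BucketValue {
    Counter(f64),
    Set(BTreeSet<SetValue>),
    Gauge([f64; 5]),
    Distribution(BTreeMap<u64, Count>), // unique value (as bit pattern) -> count
}

impl BucketValue {
    fn cost(&self) -> usize {
        const DIST_SIZE: usize = mem::size_of::<f64>() + mem::size_of::<Count>();
        let allocations = match self {
            Self::Counter(_) => 0,
            Self::Set(s) => s.len() * mem::size_of::<SetValue>(),
            Self::Gauge(_) => 0,
            Self::Distribution(m) => m.len() * DIST_SIZE,
        };
        mem::size_of::<Self>() + allocations
    }
}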

Member Author

I had an implementation that used `size_of` (at least partially), but @untitaker argued that having an explicit model that has to be changed manually is better: #1284 (comment)

Member

This is still hard-coded and explicit; it just doesn't use magic numbers, which makes it more self-explanatory.

jjbayer added a commit that referenced this pull request Jun 7, 2022
#1284 introduced a cost model for measuring the memory footprint of metrics buckets stored in the aggregator. It has two flaws:

- It did not take into account the fixed size overhead of a BucketValue (it only looked at the values inside).
- It did not take into account the size overhead of storing the BucketKey.

This PR attempts to fix both issues.
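Sketched out, the corrected accounting looks roughly like this (the types and field names are assumptions based on the commit message, not the exact diff):

use std::mem;

// Hypothetical stand-ins to show the shape of the fix.
struct BucketKey {
    metric_name: String,
}

enum BucketValue {
    Counter(f64),
}

impl BucketValue {
    // Fixed enum size plus per-variant heap allocations (first flaw).
    fn cost(&self) -> usize {
        let allocations = match self {
            Self::Counter(_) => 0, // counters allocate nothing extra
        };
        mem::size_of::<Self>() + allocations
    }
}

// Footprint of one aggregator entry: the key's fixed size and its heap
// allocations are now counted as well (second flaw).
fn entry_cost(key: &BucketKey, value: &BucketValue) -> usize {
    mem::size_of::<BucketKey>() + key.metric_name.len() + value.cost()
}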
jan-auer added a commit that referenced this pull request Jun 9, 2022
* master:
  ref(metrics): Stop logging relative bucket size (#1302)
  fix(metrics): Rename misnamed aggregator option (#1298)
  fix(server): Avoid a panic in the Sentry middleware (#1301)
  build: Update dependencies with known vulnerabilities (#1294)
  fix(metrics): Stop logging statsd metric per project key (#1295)
  feat(metrics): Limits on bucketing cost in aggregator [INGEST-1132] (#1287)
  fix(metrics): Track memory footprint more accurately (#1288)
  build(deps): Bump dependencies (#1293)
  feat(aws): Add relay-aws-extension crate which implements AWS extension as an actor (#1277)
  fix(meta): Update codeowners for the release actions (#1286)
  feat(metrics): Track memory footprint of metrics buckets (#1284)