feat(metrics): Track memory footprint of metrics buckets [INGEST-1132] #1284
Conversation
relay-metrics/src/aggregation.rs (Outdated)
/// This is very similar to [`relative_size`], which can possibly be removed.
pub fn cost(&self) -> usize {
    match self {
        Self::Counter(c) => std::mem::size_of_val(c),
I know we talked about tracking memory allocation, but thinking about it more, I would actually prefer to hardcode numbers here. If we accidentally increase the struct size, I don't think that should immediately and implicitly be reflected in abuse limits. If we change abuse limits and how they are calculated, I think it's probably better to do so explicitly.
Agreed.
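For illustration, a minimal sketch of the two options being weighed, using a hypothetical single-variant BucketValue; the names and numbers are examples, not the crate's actual definitions:

// Hypothetical, trimmed-down BucketValue used only to contrast the two approaches.
enum BucketValue {
    Counter(f64),
}

impl BucketValue {
    /// Implicit model: follows the real field size, so changing the type
    /// silently changes the accounted cost (and any limits derived from it).
    fn cost_implicit(&self) -> usize {
        match self {
            Self::Counter(c) => std::mem::size_of_val(c),
        }
    }

    /// Explicit model: a hardcoded number that only changes when someone
    /// deliberately updates the cost calculation.
    fn cost_explicit(&self) -> usize {
        match self {
            Self::Counter(_) => 8,
        }
    }
}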
/// because data structures might have a memory overhead.
///
/// This is very similar to [`relative_size`], which can possibly be removed.
pub fn cost(&self) -> usize {
Not sure if `cost` is the best name for what we're modeling here. Open for suggestions.
struct CostTracker {
    total_cost: usize,
    // Choosing a BTreeMap instead of a HashMap here, under the assumption that a BTreeMap
    // is still more efficient for the number of project keys we store.
This reasoning is up for discussion.
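As a rough sketch of the idea, assuming a string stand-in for the project key type and that callers pass in the cost delta they computed on insert (names are illustrative, not the actual implementation):

use std::collections::BTreeMap;

// Illustrative stand-in for relay's project key type.
type ProjectKey = String;

#[derive(Default)]
struct CostTracker {
    total_cost: usize,
    // A BTreeMap is assumed to be efficient enough for the small number of
    // project keys an aggregator typically holds.
    cost_per_project_key: BTreeMap<ProjectKey, usize>,
}

impl CostTracker {
    /// Record the footprint delta of an insert for one project key.
    fn add_cost(&mut self, project_key: ProjectKey, cost: usize) {
        self.total_cost += cost;
        *self.cost_per_project_key.entry(project_key).or_insert(0) += cost;
    }
}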
let cost_before = bucket_value.cost();
value.merge_into(bucket_value)?;
let cost_after = bucket_value.cost();
added_cost = cost_after.saturating_sub(cost_before);
We could probably optimize this by making `merge_into` return the actual cost delta, but I decided against it for the sake of simplicity.
Wouldn't it be simpler to just add the cost of the single value here and have `merge_into` return whether or not something was added?
Yeah that could work.
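A sketch of what that simpler variant could look like at the call site, assuming a hypothetical `merge_into` that reports whether it actually added a new element (the signature is illustrative, not the crate's API):

// Hypothetical call site: `value` is the incoming metric value and
// `bucket_value` the existing bucket it is merged into.
// merge_into is assumed to return Ok(true) only if a new element was added.
let added_cost = if value.merge_into(bucket_value)? {
    // Charge only the cost of the single incoming value.
    value.cost()
} else {
    0
};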
/// This is very similar to [`BucketValue::relative_size`], which can possibly be removed.
pub fn cost(&self) -> usize {
    match self {
        Self::Counter(_) => 8,
Instead of hard-coding, use `std::mem::size_of` with typedefs for the values.

Also, note that this is wrong. The size of a bucket value is always `size_of::<BucketValue>()` plus the allocations that happen within. So a more correct implementation would be:
const DIST_SIZE: usize = mem::size_of::<f64>() + mem::size_of::<Count>();
let allocations = match self {
Self::Counter(_) => 0,
Self::Set(s) => s.len() * mem::size_of::<u32>(), // better to typedef this to `SetValue` now
Self::Gauge(_) => 0,
Self::Distribution(m) => m.internal_size() * DIST_SIZE,
};
mem::size_of::<Self>() + allocations
I had an implementation that used `size_of` (at least partially), but @untitaker argued that having an explicit model that has to be changed manually is better: #1284 (comment)
This is still hard-coded and explicit; it just avoids magic numbers to be more self-explanatory.
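Put together, a sketch of that self-explanatory variant, with illustrative type aliases and a simplified BucketValue standing in for the real one (the aliases and variants are assumptions, not the crate's actual definitions):

use std::collections::{BTreeMap, BTreeSet};
use std::mem;

// Illustrative type aliases; the real crate may name these differently.
type SetType = u32;
type DistributionType = f64;
type Count = u32;

// Simplified stand-in for relay's BucketValue.
enum BucketValue {
    Counter(f64),
    Set(BTreeSet<SetType>),
    Gauge([f64; 5]),
    Distribution(BTreeMap<u64, Count>), // value bits mapped to their count
}

impl BucketValue {
    pub fn cost(&self) -> usize {
        // Bytes per deduplicated distribution entry: one value plus its count.
        const DIST_SIZE: usize = mem::size_of::<DistributionType>() + mem::size_of::<Count>();

        // Heap allocations made on top of the enum itself.
        let allocations = match self {
            Self::Counter(_) => 0,
            Self::Set(s) => s.len() * mem::size_of::<SetType>(),
            Self::Gauge(_) => 0,
            Self::Distribution(m) => m.len() * DIST_SIZE,
        };

        // The enum always occupies its own fixed size, plus whatever it allocates.
        mem::size_of::<Self>() + allocations
    }
}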
#1284 introduced a cost model for measuring the memory footprint of metrics buckets stored in the aggregator. It has two flaws:
- It did not take into account the fixed size overhead of a BucketValue (it only looked at the values inside).
- It did not take into account the size overhead of storing the BucketKey.
This PR attempts to fix both issues.
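A rough sketch of how both fixes could look, with a simplified BucketKey whose field names are made up for illustration (the value side would add size_of::<BucketValue>() on top of its inner allocations, as in the review discussion above):

use std::collections::BTreeMap;
use std::mem;

// Simplified, illustrative stand-in for the aggregator's bucket key.
struct BucketKey {
    metric_name: String,
    tags: BTreeMap<String, String>,
}

impl BucketKey {
    /// Fixed size of the key struct plus the bytes of its strings.
    fn cost(&self) -> usize {
        mem::size_of::<Self>()
            + self.metric_name.len()
            + self
                .tags
                .iter()
                .map(|(k, v)| k.len() + v.len())
                .sum::<usize>()
    }
}

// The total cost charged for a bucket would then be:
// let total = key.cost() + value.cost();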
* master:
  - ref(metrics): Stop logging relative bucket size (#1302)
  - fix(metrics): Rename misnamed aggregator option (#1298)
  - fix(server): Avoid a panic in the Sentry middleware (#1301)
  - build: Update dependencies with known vulnerabilities (#1294)
  - fix(metrics): Stop logging statsd metric per project key (#1295)
  - feat(metrics): Limits on bucketing cost in aggregator [INGEST-1132] (#1287)
  - fix(metrics): Track memory footprint more accurately (#1288)
  - build(deps): Bump dependencies (#1293)
  - feat(aws): Add relay-aws-extension crate which implements AWS extension as an actor (#1277)
  - fix(meta): Update codeowners for the release actions (#1286)
  - feat(metrics): Track memory footprint of metrics buckets (#1284)
To be able to limit the memory footprint of metrics buckets in the aggregator, we need to keep track of the number of elements we store. Instead of measuring the actual memory consumption, we apply a simple model, roughly measuring the bytes needed to encode a bucket:
- Counter: the size of a single value (f64)
- Set: the number of unique entries times the size of an entry (u32)
- Distribution: the number of unique values times the size of a value plus its count (f64 + u32)
- Gauge: a fixed number of statistics, each the size of a value (f64)

To avoid iterating over all the buckets every time we want to query the memory footprint, we keep a map of counters per project key (plus one total count) that is incremented with the footprint delta on every insert.
Not in this PR: enforcement of limits.