-
Notifications
You must be signed in to change notification settings - Fork 814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[core] Submitting metrics to the api endpoint #3180
Conversation
Add a test Add support for device
formatted_sample = [s['metric'], s['points'][0][0], s['points'][0][1], attributes] | ||
legacy_payload_split['metrics'].append(formatted_sample) | ||
|
||
self.assertEqual(legacy_payload, legacy_payload_split) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will always be True
, split_payload
modifies the original payload by reference, in other words
# &legacy_payload_split == &legacy_payload
legacy_payload_split, metrics_payload = split_payload(legacy_payload)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, and so that things are clearer maybe split_payload
should either return a legacy_payload_split
and not mutate the legacy_payload
passed as argument or not return a legacy_payload_split
and mutate the legacy_payload
(which probably makes more sense here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch 🤦♂️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'll just pass a copy of the payload to the test for 2 reasons.
- Better to be explicit, if the function is mutating the dict, having it being returned is better than silently mutating it.
- We could make a copy of the dict in the split payload function but this would come with a memory cost.
So i'll just make a copy of the payload in the test function.
Good catch
emitter.py
Outdated
|
||
metrics_payload["series"].append(sample) | ||
|
||
return legacy_payload, metrics_payload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming nitpick: legacy_payloads
refers to both the original input, and the modified/stripped output. Shall we use two different names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't matter and it's still a legacy payload
sample['tags'] = ts[3]['tags'] | ||
if ts[3].get('device_name'): | ||
sample['device'] = ts[3]['device_name'] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not handling any exception in this block.
Considering that metrics are strongly formatted, I believe it is "safe" to accept it and avoid a try ... except ... continue
overhead.
@olivielpeau any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not handling exceptions here looks safe to me, the format is well-defined and the code is safe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this code is safe.
emitter.py
Outdated
|
||
|
||
def post_headers(agentConfig, payload): | ||
def split_payload(legacy_payload): | ||
metrics = list(legacy_payload['metrics']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: the method performs reads only on metrics
, thus, it is not required to make a copy of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test needs a re-write but the overall logic looks good to me.
I'd also perform a simple benchmark to measure what overhead this adds in the collector.
if ts[3].get('tags'): | ||
sample['tags'] = ts[3]['tags'] | ||
if ts[3].get('device_name'): | ||
sample['device'] = ts[3]['device_name'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could change those keys directly in the MetricAggregator
formatter to speed things up.
sample.update(ts[3])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should device_name
be changed to device
? Dogstatsd uses device_name
in its API payload (https://github.com/DataDog/dd-agent/blob/5.11.3/aggregator.py#L991)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yann i didn't want to change the formatter as i thought it was used by the legacy checks. Turns out it's not the case. I'll update this.
@olivielpeau what do you mean ? legacy payload processing will use device_name
, new payload expects device
hence my change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually i want to keep this logic kinda isolated as it's a hack. Better to keep it in a single place to keep the code cleaner. Agent 6 will get rid of all of this anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, added a few comments.
Just as a side note, it looks like handling the api series
format directly in the new-style and old-style check base classes would be a bit tricky since the unix system metrics can't use the api series
endpoint
emitter.py
Outdated
for ts in metrics: | ||
sample = { | ||
"metric": ts[0], | ||
"points": [[ts[1], ts[2]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could be a tuple
instead of a list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed.
if ts[3].get('tags'): | ||
sample['tags'] = ts[3]['tags'] | ||
if ts[3].get('device_name'): | ||
sample['device'] = ts[3]['device_name'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should device_name
be changed to device
? Dogstatsd uses device_name
in its API payload (https://github.com/DataDog/dd-agent/blob/5.11.3/aggregator.py#L991)
sample['tags'] = ts[3]['tags'] | ||
if ts[3].get('device_name'): | ||
sample['device'] = ts[3]['device_name'] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not handling exceptions here looks safe to me, the format is well-defined and the code is safe
formatted_sample = [s['metric'], s['points'][0][0], s['points'][0][1], attributes] | ||
legacy_payload_split['metrics'].append(formatted_sample) | ||
|
||
self.assertEqual(legacy_payload, legacy_payload_split) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, and so that things are clearer maybe split_payload
should either return a legacy_payload_split
and not mutate the legacy_payload
passed as argument or not return a legacy_payload_split
and mutate the legacy_payload
(which probably makes more sense here)
- Do not make a copy of the metrics - Pass a copy to the function in the test
Addressed some of your comments and fixed a few things. Let me know if there is other stuff but i think we're good now. |
What does this PR do?
This PR splits the collector payload into a
legacy
payload and ametrics
payload.It does so by extracting the metrics that are in the "new" format from the current payload, transforming them, a putting them into a new payload to be posted to the
api/v1/series
endpoint.Motivation
This will help up phasing out a backend service used by the old payload.
Testing Guidelines