input/otlp: Group Scope Metrics By Attributes Only ( Instead of timestamp + attrs ) #372
Comments
@anakineo your expected result document does not include a timestamp. These are time series, so naturally they need to have a timestamp. What timestamp would you expect to be recorded on the resulting document if we did not include that in the grouping key? (Bear in mind that data may arrive late, so the time at which it is received may be much different to the time it was exported.)
Would you be able to elaborate on this a little bit? How does having them in the same document make them easier to use for alerting and dashboarding? I'm asking because there are long-running discussions about the issues related to storing a single metric per document (e.g. storage overhead), and grouping (which can introduce other issues): elastic/elasticsearch#91775
@axw Thanks for getting back
You're right. I was wrong to suggest removing the timestamp. What I wanted to convey was setting the timestamp of the metrics from one scope to a single timestamp, so that all metrics will be grouped and sent as one document. Our case is that we have different scopes, each containing from a few to a few dozen metrics; within each scope, all the metrics are recorded with the same I feel like this might be best done on the OTel Collector side, or made configurable on the APM side.
For dashboarding, we often find the need to consolidate different documents into one view and do calculations based on metrics spanning a few documents. The same goes for alerting. Say we want to give whoever receives an alert complete context around what happened (it can be in the form of labels, but not all of it is labels): it's much easier if the related metrics are in one doc. Right now we need to join some docs to get the complete info.
@anakineo thanks for the additional info!
Got it, thanks. I wonder if the OTel metrics API should provide a means of specifying the timestamp when recording a data point. I think that would make sense for example when you take a snapshot of various aspects of your system, and then record each one as a separate metric. The timestamp would be the time at which you took the snapshot.
Without some kind of change to the OTel API, I think this will need to be handled with an OTel Collector processor before the data gets to APM Server. I can think of a couple of hypothetical options:
So you're calculating derivative metrics at query/aggregation time? Do you have a concrete example of that which you're able to share?
I haven't thought about this deeply, but it feels like a shortcoming of alerts if you're forced to group things in the same doc to get context in the alert. I mean, that context gathering/correlation could probably also be deferred to when an alert occurs. But anyway, understood.
Thanks for the pointers! I did look at the transform processor and it can do many things, including setting the timestamp for each datapoint at the time of processing. I was particularly interested in using the timestamp in the datapoint instead of the time of processing, which can be delayed by a lot if, for example, the collector is down. Rounding to the second sounds promising! I knew about the interval processor but thought it wasn't the processor I was looking for.
It's about correlating. Suppose we have documents containing 20 metrics plus some labels. We will use some of them to build a line graph that shows the result of metric#1/metric#2 (like a rate). Meanwhile, we want to add some references to this line using other metrics from the same documents, such as resource usage, disk I/O and other related system metrics (which we know from experience are, or used to be, the cause of the issue). The goal is that if we receive some alerts, we can also tell from the line graph whether other things are off. This might be an XY problem that has other solutions, but it's difficult to do if the metrics are scattered across different documents. The same goes for alerting: IMO, it's best to have everything (that we know is relevant to this particular case) pre-populated in alerts for quicker troubleshooting (say the rate reaching the threshold fires an alert, and there are resource usage metrics in the alert as well). Agreed, we can gather the context when an alert fires, but that goes back to the question of how easily we can get the data if it is in different documents and we don't want to do lots of joins. Having related metrics in one document makes it a bit easier.
@anakineo this is great, thanks. I will bring this information to the wider team here and see if anyone has some additional suggestions. If nothing else, it'll be useful for us to consider while building the solution. In the meantime I'd still recommend pre-processing with the collector.
If it turns out to be an XY problem, we certainly need to document how to do these sorts of things. It doesn't sound like a very unusual requirement.
Hi,
APM server seems to break Scope Metrics into different documents (as seen in Kibana) if metrics are instrumented via the SDK, instead of putting all metrics from the same scope in one document, because the metrics have different timestamps. Not sure if this is a feature or an issue.
Issue Statement
We're instrumenting via the OpenTelemetry Go SDK and have different types of metrics organized around scopes (i.e., different Meters). All metrics are then pipelined through the OTel Collector and on to APM server.
Below are snippets of the metrics from one scope, as output by the OTel Collector: 5 metrics under scope1 with the same attributes but different timestamps.
Current result seen on Kibana: 5 documents, each containing one metric
Expected result seen on Kibana: one document with 5 metrics
Possible cause
The group key is the combination of timestamp and attributes on Datapoint as seen here
key := metricsetKey{timestamp: timestamp, signature: signatureBuilder.String()}
So if the metrics come in with different timestamps, they will end up in different metricsets.
Version
8.15
Proposal
It makes sense for all metrics from one scope to be seen in one document. This makes it easier to use them later for alerting and dashboards. Without knowing how data is stored in Elastic, I feel like this may save some storage as well.
So it seems reasonable to use only the attributes on the Datapoint as the group key, since metrics will certainly have different timestamps if generated by the SDK. To my knowledge, there is no way to set the timestamp while instrumenting.
From:
To: