add exemplars support to prometheusremotewriteexporter #5578
Conversation
Signed-off-by: Anne-Elisabeth Lelievre <[email protected]>
This PR was marked stale due to lack of activity. It will be closed in 7 days.
@codeboten I think this PR is ready for review
@@ -327,7 +361,34 @@ func addSingleHistogramDataPoint(pt pdata.HistogramDataPoint, resource pdata.Res
		Timestamp: time,
	}
	infLabels := createAttributes(resource, pt.Attributes(), externalLabels, nameStr, baseName+bucketStr, leStr, pInfStr)
	addSample(tsMap, infBucket, infLabels, metric)

	promExemplar := getPromExemplar(pt, len(pt.BucketCounts())-1)
Can you clarify where len(pt.BucketCounts())-1 comes from? I could be misunderstanding it, but my reading of the proto doesn't indicate any relationship between bucket counts and exemplars.
Shouldn't we just go through all the exemplars, convert them and add them all?
@anuraaga according to my understanding, len(pt.BucketCounts())-1 seems to be the index of the +Inf bucket. From what I read, there does seem to be a relationship between the bucket index and the exemplars in Prometheus's exemplar support:
Thanks for clarifying! Can you add a comment referencing that we need to use the +inf bucket?
yes of course! done
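For illustration, here is a minimal sketch of what a getPromExemplar-style helper could look like under the assumptions discussed above: the last bucket index, len(pt.BucketCounts())-1, maps to the +Inf bucket, and an exemplar belongs to a bucket when its value exceeds the bucket's lower explicit bound. This is not the PR's actual code; the pdata accessors (Exemplars, DoubleVal, ExplicitBounds, Timestamp) and the import paths are assumptions about the collector API of that era.

package prometheusremotewrite

import (
	"math"

	"github.com/prometheus/prometheus/prompb"
	"go.opentelemetry.io/collector/model/pdata" // import path assumed for the collector version in use
)

// getPromExemplar (sketch) returns an exemplar for the bucket at bucketIndex,
// or nil if none of the data point's exemplars fall into that bucket. For the
// +Inf bucket (bucketIndex == len(pt.BucketCounts())-1) only the lower bound
// needs to be checked, since +Inf has no upper bound.
func getPromExemplar(pt pdata.HistogramDataPoint, bucketIndex int) *prompb.Exemplar {
	// Lower bound of the bucket: the previous explicit bound, or -Inf for the first bucket.
	lowerBound := math.Inf(-1)
	if bucketIndex > 0 {
		lowerBound = pt.ExplicitBounds()[bucketIndex-1]
	}

	for i := 0; i < pt.Exemplars().Len(); i++ {
		exemplar := pt.Exemplars().At(i)
		if exemplar.DoubleVal() > lowerBound {
			return &prompb.Exemplar{
				Value: exemplar.DoubleVal(),
				// pdata timestamps are assumed to be nanoseconds since epoch;
				// remote write timestamps are milliseconds.
				Timestamp: int64(exemplar.Timestamp()) / 1_000_000,
			}
		}
	}
	return nil
}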
Signed-off-by: Anne-Elisabeth Lelièvre <[email protected]>
Signed-off-by: Anne-Elisabeth Lelievre <[email protected]>
Signed-off-by: Anne-Elisabeth Lelièvre <[email protected]>
Thanks for addressing my comments. @anuraaga please review and approve if your comments have been addressed.
I'm super new to the code base, but I think making sure the traceid label is present on as many exemplars as possible is important. Otherwise the exported exemplars are not very useful in Prometheus.
Other than that, good job! I can't wait to see this merged!
labels := exemplar.FilteredAttributes().AsRaw()
for key, value := range labels {
	promLabel := prompb.Label{
		Name:  key,
		Value: value.(string),
	}

	promExemplar.Labels = append(promExemplar.Labels, promLabel)
}
I think we are doing this because of
exemplar.FilteredAttributes().Insert(traceIDKey, pdata.NewAttributeValueString(traceID.HexString()))
But this means the trace_id key will only be set on exemplars from the spanmetricsprocessor. I think we should set the trace_id on every exemplar. Instead of using FilteredAttributes, why not just do:
promLabel := prompb.Label{
Name: traceIDKey,
Value: exemplar.TraceID().HexString(),
}
Hi @gouthamve, initially I wanted to do it like this, but I think it is better to make it as generic as possible by going through all the labels rather than handling only the trace_id. From what I understood, the trace_id could be set in any processor, and the exporter just needs to copy what it received from the processor without checking whether the trace_id is really there. But tbh I'm not 100% sure; I understand your point and maybe I'm wrong. WDYT?
Hrm, so the current solution only works for exemplars emitted by spanmetricsprocessor. If a service sends an exemplar, it would be forwarded to Prometheus, but it would be ~useless because the only consumer of exemplars (Grafana) needs a traceID.
I think it's fair to add all the labels, but we should also make sure the traceID is one of the labels and, if it's not, add it.
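A minimal sketch of that suggestion, reusing the imports from the sketch above and assuming traceIDKey is the "trace_id" label name used by spanmetricsprocessor; the function name and the pdata Range/AsString/IsEmpty accessors are assumptions about the API of the time, not the PR's code:

// exemplarLabels copies every filtered attribute onto the Prometheus exemplar,
// then makes sure a trace_id label is present, adding it from the exemplar's
// TraceID when the attributes did not already carry one.
func exemplarLabels(exemplar pdata.Exemplar) []prompb.Label {
	labels := make([]prompb.Label, 0, exemplar.FilteredAttributes().Len()+1)

	hasTraceID := false
	exemplar.FilteredAttributes().Range(func(key string, value pdata.AttributeValue) bool {
		if key == traceIDKey {
			hasTraceID = true
		}
		labels = append(labels, prompb.Label{Name: key, Value: value.AsString()})
		return true
	})

	// Make sure the traceID is one of the labels; if it's not, add it.
	if !hasTraceID && !exemplar.TraceID().IsEmpty() {
		labels = append(labels, prompb.Label{
			Name:  traceIDKey,
			Value: exemplar.TraceID().HexString(),
		})
	}
	return labels
}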
IMHO, there are two use cases:
Use case (1):
+-----------------+          +------------------------+          +--------------------------+
| OTLP Receiver   | ------>  | Span Metrics Processor | ------>  | OTLP Exporter            |
+-----------------+          +------------------------+    |     +--------------------------+
                                                           |
                                                           |      +--------------------------+
                                                           +-->   | Prometheus Remote Write  |
                                                                  +--------------------------+
Use case (2):
+-----------------------+          +--------------------------+
| Prometheus Receiver   | ------>  | Prometheus Remote Write  |
+-----------------------+          +--------------------------+
In use case (1), metrics are generated from trace spans and the trace_id is injected into the exemplar in the spanmetricsprocessor.
However, in use case (2), metrics come from scraped instances (i.e. no traceID is associated with those metrics in pdata). IMHO we should honor exemplar(s) that are attached to those metrics and forward them.
WDYT?
@Anne-Eli we are currently having the following errors when sending metrics to Cortex:
maybe related to: #5263
@secat I'm on it
I created PR 6140 to fix this in the span metrics processor with the current changes of this PR, and everything seems to be ok now when sending metrics to Cortex.
Please do not merge this PR, I just found 2 little issues, I'll commit 2 fixes ASAP, thanks.
Blocking as requested #5578 (comment)
…the AsString conversion Signed-off-by: Anne-Elisabeth Lelièvre <[email protected]>
…sociated bucket bound Signed-off-by: Anne-Elisabeth Lelièvre <[email protected]>
I just pushed some new changes in this PR and here are the 2 changes I added:
before:
now:
Finally, I re-tested sending to Prometheus and Cortex and everything looks ok now. In Prometheus, the "le" values are ok now. I know some unit tests are failing, but I don't think this is related to my new changes (it seems to come from here https://github.com/anneelisabethlelievre/opentelemetry-collector-contrib/blob/main/extension/awsproxy/factory_test.go#L51, but I'm not sure it has anything to do with my changes). cc: @codeboten
Thanks for the fixes @anneelisabethlelievre! You're right the failures are unrelated.
/easycla
Signed-off-by: Anne-Elisabeth Lelievre [email protected]
Hello, this is the implementation of #5192 to support exemplars for the prometheus remote write exporter.
Thanks
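As background on what exemplar support means at the remote-write level: the exporter attaches prompb.Exemplar entries to the prompb.TimeSeries that carries the bucket sample. A hedged illustration of the shape only, with made-up metric names, label values, and timestamps (not the exporter's actual output), reusing the prompb import from the earlier sketch:

// exampleTimeSeries shows a histogram +Inf bucket series with one exemplar
// attached; all values are illustrative.
func exampleTimeSeries() prompb.TimeSeries {
	return prompb.TimeSeries{
		Labels: []prompb.Label{
			{Name: "__name__", Value: "http_request_duration_seconds_bucket"},
			{Name: "le", Value: "+Inf"},
		},
		Samples: []prompb.Sample{
			{Value: 42, Timestamp: 1634000000000}, // remote write timestamps are milliseconds since epoch
		},
		Exemplars: []prompb.Exemplar{
			{
				Labels:    []prompb.Label{{Name: "trace_id", Value: "0af7651916cd43dd8448eb211c80319c"}},
				Value:     0.123,
				Timestamp: 1634000000000,
			},
		},
	}
}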