-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
exporter/elasticexporter: add Elastic APM exporter #240
Conversation
Latest CI failure doesn't appear related to this PR, please let me know if I'm mistaken. Otherwise, this is now ready for review. Sorry for the noisy commits, happy to squash now or later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some smaller comments mainly related to the elastic translator.
for _, attr := range resource.GetAttributes() { | ||
switch k := attr.GetKey(); k { | ||
case conventions.AttributeServiceName: | ||
service.Name = cleanServiceName(attr.GetStringValue()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe prefix the service.Name
with conventions.AttributeServiceName
if set to ensure the final service.Name
is unique.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean prefix with namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes that's what I meant, sorry for the confusion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was all set to do that, but I see this comment in the spec:
Note: service.namespace and service.name are not intended to be concatenated for the purpose of forming a single globally unique name for the service. For example the following 2 sets of attributes actually describe 2 different services (despite the fact that the concatenation would result in the same string):
And looking at other exporters based on the old OpenCensus API (i.e. using ServiceInfo.Name), I see that they're not taking namespace into account. I'll stick with this for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing that out. I was thinking to concatenate with a -
instead of a .
to avoid the ambiguity. Fine with me though to move forward with this as is.
Codecov Report
@@ Coverage Diff @@
## master #240 +/- ##
==========================================
+ Coverage 78.56% 78.78% +0.22%
==========================================
Files 153 159 +6
Lines 7578 8048 +470
==========================================
+ Hits 5954 6341 +387
- Misses 1294 1351 +57
- Partials 330 356 +26
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me!
exporter/elasticexporter/exporter.go
Outdated
return exporterhelper.NewTraceExporter(cfg, func(ctx context.Context, traces pdata.Traces) (int, error) { | ||
var dropped int | ||
var errs []error | ||
for _, resourceSpans := range pdata.TracesToOtlp(traces) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason to operate on OTLP instead of internal data model here? Internal data interface should be suitable for use cases like this. If not, we'd need to cover the gaps later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking of adding support to receive OTLP directly in the Elastic APM Server later, to ease maintenance and minimise signal loss. That would obviate the need to translate to our wire protocol in opentelemetry-collector.
If/when that happens, I would move the internal/translator
code out of this repo, and the exporter would be a thin wrapper around the OTLP exporter; it would mostly just be about configuring things like auth headers.
If not, we'd need to cover the gaps later.
Could you please elaborate on that? What gaps are you referring to? The OTLP exporter will need to do this anyway won't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please elaborate on that? What gaps are you referring to? The OTLP exporter will need to do this anyway won't it?
We usually use internal data model consumer/pdata
not OTLP format directly. So I was thinking that you used the OTLP because the internal data interface doesn't work for your use case. In that case we would need to fill the gaps in the interface.
I see your point now. If you want to reuse translation from OTLP in the Elastic APM backend later on, it make sense to use OTLP instead of internal data model here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, thanks for the elaboration. If we don't end up going the route I described, I'll revisit this code and switch over to using the internal data model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are about to move OTLP generated code to our repository: open-telemetry/opentelemetry-collector#1037. Once we do it this code will no longer compile.
If it happens and you are not here to fix your component quickly we will have no choice but to disable it until you can fix it (according to the rules maintenance of contrib components is responsibility of contributors).
Once we have the generation on our side we may start modifying the code generation to produce more optimal in-memory structures. When we do so we will aim to hide the changes behind the public pdata
wrappers so that your code does not break. If you use generated OTLP structs directly again this can break your code.
I advise to stay with pdata
wrappers to avoid this sorts of problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tigrannajaryan fair enough, I wasn't aware that was happening. I can see now that it makes more sense to use pdata. I've updated the code accordingly. I suppose we could use that in the Elastic APM backend too.
I've removed the metrics translation for now, and will reinstate it when the switchover to pdata.Metrics is complete.
Thanks for the review @dmitryax! I've updated to use configtls, PTAL. If using the underlying OTLP types is going to be an issue I can rework the PR, but I'd prefer to keep it that way for the stated reasons. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@dmitryax what's the process for merging? Can you please do that for me? (I am not an approved committer.) |
I'm not authorized to do that as well. We need to wait for @tigrannajaryan or @bogdandrutu |
@dmitryax @tigrannajaryan I've updated to use pdata, in response to #240 (comment). Could you please take another look? |
Version dc17498 successfully tested |
@axw I've tried to run this PR on my setup and i could not get the spans to show on the APM UI, however i can see they are correctly written into ES. Is there anything that i need to enable on the Elastic cloud side to make the exporter works ? |
@vmarchaud you shouldn't need to do anything special in Elastic Cloud. It sounds like there could be a problem with the translator code. Do you have an example I can use to reproduce the issue? If that's not viable, then it might help to use the opentelemetry-collector logs exporter, and attach the log output after capturing a trace. Edit: just to be sure, what version of Kibana are you running? An old version might also explain why the UI doesn't pick them up. |
@axw I use the OT JS 0.8.3 version with some default plugins, i already got the debug log of the exporter enabled, do you want me to send them to you ? |
Yes please, just attach/paste the log here if you don't mind, and I'll take a look in the next few days. |
@vmarchaud I can see the traces with the OpenTelemetry Auto Instr Java as you can see on the screenshot I shared above. There may be something related to your setup. |
@axw I have some "sensitive" info that i wouldnt want to be on internet, could i send you an email instead ? @cyrille-leclerc I guess it could be related to my setup but it's a new elastic cloud environment (i created it ~10 days ago and didnt use the apm until today). The weird things is that i can see some spans (<5%) but not all. Some services aren't even shown in the UI even though there is data inside the index. |
@vmarchaud sure, you can send it to me at [email protected] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Please fix the merge conflict.
@tigrannajaryan done, and squashed down to one commit. Thanks! |
Metrics are currently not exported; we'll wait for the data model changes to settle, so we can build the translation off the OTLP representation. Not all of the OpenTelemetry model is covered by Elastic APM. In particular, there's currently no support for links or events. We'll add support for events later, and most likely links too (see elastic/apm#122).
I just pushed a very small change to the translator, to fix an issue that @vmarchaud helped diagnose. |
@axw thank you! |
This PR introduces an exporter for [Elastic APM](https://www.elastic.co/apm). The exporter works by translating spans and metrics into the ND-JSON format expected by Elastic APM Server, and sending over HTTP. Currently only spans are supported. Code for translating metrics exists, but is not yet wired up to the exporter; we'll do that once the switch over to the new metrics model is done. Not all of the OpenTelemetry model is covered by Elastic APM. In particular, there's currently no support for links or span events. We'll add support for events later, and most likely links too (see elastic/apm#122). **Testing:** Unit tests added for translating resources, spans, and metrics to the Elastic APM model. This has been tested using a mock in-memory Elastic APM Server. Coverage is > 80%. Manually tested, sending to an [Elastic Cloud](https://cloud.elastic.co/) deployment. **Documentation:** Added a README, which describes the exporter's config. Metrics are currently not exported; we'll wait for the data model changes to settle, so we can build the translation off the OTLP representation.
Span Rename Processor exposes functionality to rename a span using attribute values from a span. This commit only adds: - the configuration - tests for the configuration - config documentation in the code and test yaml. - Updated processor/readme.md to list all processors. A follow up PR will add processor functionality and documentation to processor/readme.md
* move global trace provider api to global package. * fix doc.
Description:
This PR introduces an exporter for Elastic APM. The exporter works by translating spans and metrics into the ND-JSON format expected by Elastic APM Server, and sending over HTTP.
Currently only spans are supported. Code for translating metrics exists, but is not yet wired up to the exporter; we'll do that once the switch over to the new metrics model is done.
Not all of the OpenTelemetry model is covered by Elastic APM. In particular, there's currently no support for links or span events. We'll add support for events later, and most likely links too (see elastic/apm#122).
Testing:
Unit tests added for translating resources, spans, and metrics to the Elastic APM model. This has been tested using a mock in-memory Elastic APM Server. Coverage is > 80%.
Manually tested, sending to an Elastic Cloud deployment.
Documentation:
Added a README, which describes the exporter's config.