exporter/elasticexporter: add Elastic APM exporter #240

Merged 1 commit on Jun 1, 2020

@axw axw commented May 16, 2020

Description:

This PR introduces an exporter for Elastic APM. The exporter works by translating spans and metrics into the ND-JSON format expected by Elastic APM Server, and sending them over HTTP.

Currently only spans are supported. Code for translating metrics exists, but is not yet wired up to the exporter; we'll do that once the switch over to the new metrics model is done.

Not all of the OpenTelemetry model is covered by Elastic APM. In particular, there's currently no support for links or span events. We'll add support for events later, and most likely links too (see elastic/apm#122).
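
As a rough illustration of the translate-and-send flow described above, here is a minimal Go sketch that frames records as ND-JSON (one JSON object per line) and posts them over HTTP. The `event` type, the field names, and the `send` helper are assumptions made for this example; they are not the exporter's actual code, and the real Elastic APM schema has many more fields.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// event is a hypothetical, heavily simplified stand-in for one record in the
// stream. It is not the real Elastic APM model.
type event map[string]interface{}

// encodeNDJSON writes one JSON object per line. json.Encoder appends a
// newline after every encoded value, which gives the ND-JSON framing.
func encodeNDJSON(events []event) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	for _, ev := range events {
		if err := enc.Encode(ev); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

// send posts an encoded stream to a URL supplied by the caller and treats
// any non-2xx status as an error.
func send(client *http.Client, url string, body []byte) error {
	resp, err := client.Post(url, "application/x-ndjson", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return nil
}

func main() {
	// Illustrative values only.
	body, err := encodeNDJSON([]event{
		{"metadata": map[string]interface{}{"service": map[string]interface{}{"name": "checkout"}}},
		{"span": map[string]interface{}{"name": "GET /cart", "duration": 12.5}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(string(body))
}
```

Encoding into a buffer first keeps the translation step separate from the transport, so the same bytes can be inspected in a test or retried on failure.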

Testing:

Unit tests added for translating resources, spans, and metrics to the Elastic APM model. This has been tested using a mock in-memory Elastic APM Server. Coverage is > 80%.

Manually tested, sending to an Elastic Cloud deployment.
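
For readers unfamiliar with the "mock in-memory server" approach mentioned above, the following Go sketch shows the general idea using the standard library's `net/http/httptest`. The `mockServer` type and `postLines` helper are hypothetical names for this example and are not taken from the PR's test code; the 202 response status is likewise chosen only for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// mockServer records every non-empty line it receives, so a test can assert
// on what the client actually sent. Hypothetical type for this sketch.
type mockServer struct {
	mu    sync.Mutex
	lines []string
}

func (m *mockServer) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	scanner := bufio.NewScanner(r.Body)
	m.mu.Lock()
	defer m.mu.Unlock()
	for scanner.Scan() {
		if line := scanner.Text(); line != "" {
			m.lines = append(m.lines, line)
		}
	}
	// Status chosen for illustration only.
	w.WriteHeader(http.StatusAccepted)
}

// received returns a copy of the lines seen so far.
func (m *mockServer) received() []string {
	m.mu.Lock()
	defer m.mu.Unlock()
	return append([]string(nil), m.lines...)
}

// postLines starts an in-memory server, posts the payload to it, and returns
// the lines the server observed.
func postLines(payload string) ([]string, error) {
	mock := &mockServer{}
	srv := httptest.NewServer(mock)
	defer srv.Close()

	resp, err := http.Post(srv.URL, "application/x-ndjson", strings.NewReader(payload))
	if err != nil {
		return nil, err
	}
	resp.Body.Close()
	return mock.received(), nil
}

func main() {
	lines, err := postLines("{\"metadata\":{}}\n{\"span\":{}}\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(lines), "lines received")
}
```

Because the server runs inside the test process, assertions can be made on the exact payload without any external deployment.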

Documentation:

Added a README, which describes the exporter's config.

@axw axw requested a review from a team May 16, 2020 11:54
@axw
Contributor Author

axw commented May 16, 2020

Latest CI failure doesn't appear related to this PR, please let me know if I'm mistaken. Otherwise, this is now ready for review. Sorry for the noisy commits, happy to squash now or later.

@simitt simitt left a comment

Left some smaller comments mainly related to the elastic translator.

for _, attr := range resource.GetAttributes() {
	switch k := attr.GetKey(); k {
	case conventions.AttributeServiceName:
		service.Name = cleanServiceName(attr.GetStringValue())

@simitt

Maybe prefix the service.Name with conventions.AttributeServiceName if set to ensure the final service.Name is unique.

Contributor Author

Did you mean prefix with namespace?

@simitt

yes that's what I meant, sorry for the confusion

Contributor Author

I was all set to do that, but I see this comment in the spec:

> Note: service.namespace and service.name are not intended to be concatenated for the purpose of forming a single globally unique name for the service. For example the following 2 sets of attributes actually describe 2 different services (despite the fact that the concatenation would result in the same string):

And looking at other exporters based on the old OpenCensus API (i.e. using ServiceInfo.Name), I see that they're not taking namespace into account. I'll stick with this for now.

@simitt

Thanks for pointing that out. I was thinking of concatenating with a - instead of a . to avoid the ambiguity. I'm fine with moving forward with this as is, though.
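
To make the ambiguity discussed in this thread concrete, here is a small Go sketch. The `serviceID` type, the `join` helper, and all attribute values are hypothetical and chosen only to show the collision; they are not the example from the spec. It also suggests that a different separator only helps as long as that separator cannot appear inside either component.

```go
package main

import "fmt"

// serviceID keeps namespace and name as separate fields, which is the
// unambiguous representation. Hypothetical type for this sketch.
type serviceID struct {
	Namespace string
	Name      string
}

// join concatenates namespace and name with the given separator.
func join(s serviceID, sep string) string {
	return s.Namespace + sep + s.Name
}

func main() {
	// Hypothetical values: two different services.
	a := serviceID{Namespace: "shop.eu", Name: "cart"}
	b := serviceID{Namespace: "shop", Name: "eu.cart"}

	fmt.Println(join(a, ".") == join(b, ".")) // true: both become "shop.eu.cart"
	fmt.Println(join(a, "-") == join(b, "-")) // false: "-" separates these two
	fmt.Println(a == b)                       // false: the structs stay distinct

	// The same collision reappears if "-" can occur inside a component.
	c := serviceID{Namespace: "shop-eu", Name: "cart"}
	d := serviceID{Namespace: "shop", Name: "eu-cart"}
	fmt.Println(join(c, "-") == join(d, "-")) // true: both become "shop-eu-cart"
}
```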

@codecov

codecov bot commented May 20, 2020

Codecov Report

Merging #240 into master will increase coverage by 0.22%.
The diff coverage is 82.76%.

@@            Coverage Diff             @@
##           master     #240      +/-   ##
==========================================
+ Coverage   78.56%   78.78%   +0.22%     
==========================================
  Files         153      159       +6     
  Lines        7578     8048     +470     
==========================================
+ Hits         5954     6341     +387     
- Misses       1294     1351      +57     
- Partials      330      356      +26     
Impacted Files                                         Coverage Δ
exporter/elasticexporter/exporter.go                   68.91% <68.91%> (ø)
...asticexporter/internal/translator/elastic/utils.go  68.96% <68.96%> (ø)
exporter/elasticexporter/factory.go                    83.33% <83.33%> (ø)
...icexporter/internal/translator/elastic/metadata.go  85.96% <85.96%> (ø)
...sticexporter/internal/translator/elastic/traces.go  86.73% <86.73%> (ø)
exporter/elasticexporter/config.go                     100.00% <100.00%> (ø)
receiver/carbonreceiver/transport/tcp_server.go        65.71% <0.00%> (-1.91%) ⬇️
... and 3 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e75df5b...8f6e041.

Member

@dmitryax dmitryax left a comment

Overall looks good to me!

return exporterhelper.NewTraceExporter(cfg, func(ctx context.Context, traces pdata.Traces) (int, error) {
	var dropped int
	var errs []error
	for _, resourceSpans := range pdata.TracesToOtlp(traces) {

@dmitryax
Member


Any particular reason to operate on OTLP instead of the internal data model here? The internal data interface should be suitable for use cases like this. If not, we'd need to cover the gaps later.

Contributor Author

I'm thinking of adding support to receive OTLP directly in the Elastic APM Server later, to ease maintenance and minimise signal loss. That would obviate the need to translate to our wire protocol in opentelemetry-collector.

If/when that happens, I would move the internal/translator code out of this repo, and the exporter would be a thin wrapper around the OTLP exporter; it would mostly just be about configuring things like auth headers.

> If not, we'd need to cover the gaps later.

Could you please elaborate on that? What gaps are you referring to? The OTLP exporter will need to do this anyway won't it?
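
The "thin wrapper" idea mentioned above, where the exporter would mostly configure things like auth headers, can be sketched in Go with a custom `http.RoundTripper`. This is only an illustration under stated assumptions: `headerTransport` and `authHeaderSeen` are hypothetical names, the `Bearer` token format is used purely as an example, and nothing here is taken from the OTLP exporter or from this PR.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// headerTransport adds fixed headers to every outgoing request and then
// delegates to the wrapped transport. Hypothetical type for this sketch.
type headerTransport struct {
	base    http.RoundTripper
	headers map[string]string
}

func (t headerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// A RoundTripper should not modify the caller's request, so clone it.
	clone := req.Clone(req.Context())
	for k, v := range t.headers {
		clone.Header.Set(k, v)
	}
	return t.base.RoundTrip(clone)
}

// authHeaderSeen sends one request through the wrapper to an in-memory
// server and returns the Authorization header that the server observed.
func authHeaderSeen(token string) (string, error) {
	seen := make(chan string, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.Header.Get("Authorization")
	}))
	defer srv.Close()

	client := &http.Client{Transport: headerTransport{
		base:    http.DefaultTransport,
		headers: map[string]string{"Authorization": "Bearer " + token},
	}}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	got, err := authHeaderSeen("example-token")
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```

Keeping the header logic in the transport means the code that builds and sends requests does not need to know about authentication at all.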

Member

@dmitryax dmitryax May 26, 2020

> Could you please elaborate on that? What gaps are you referring to? The OTLP exporter will need to do this anyway won't it?

We usually use the internal data model (consumer/pdata), not the OTLP format directly. So I was thinking that you used OTLP because the internal data interface doesn't work for your use case. In that case we would need to fill the gaps in the interface.

I see your point now. If you want to reuse the translation from OTLP in the Elastic APM backend later on, it makes sense to use OTLP instead of the internal data model here.

Contributor Author

Gotcha, thanks for the elaboration. If we don't end up going the route I described, I'll revisit this code and switch over to using the internal data model.

@tigrannajaryan
Member

We are about to move OTLP generated code to our repository: open-telemetry/opentelemetry-collector#1037. Once we do it this code will no longer compile.

If that happens and you are not around to fix your component quickly, we will have no choice but to disable it until you can fix it (according to the rules, maintenance of contrib components is the responsibility of contributors).

Once we have the generation on our side, we may start modifying the code generation to produce more optimal in-memory structures. When we do so, we will aim to hide the changes behind the public pdata wrappers so that your code does not break. If you use the generated OTLP structs directly, this can again break your code.

I advise staying with the pdata wrappers to avoid these sorts of problems.
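
The reasoning in this comment, that wrappers can absorb changes to generated structs, can be illustrated with a small self-contained Go sketch. Every type and function below (`generatedSpan`, `Span`, `SpanSlice`, `NewSpanSlice`, `spanNames`) is hypothetical and only loosely modelled on a `Len`/`At` accessor style; this is not the actual pdata API.

```go
package main

import "fmt"

// generatedSpan stands in for a code-generated struct whose fields may be
// renamed or restructured when the generator changes. Hypothetical type.
type generatedSpan struct {
	name string
}

// Span is a stable wrapper: callers see methods, never the struct layout.
type Span struct {
	orig *generatedSpan
}

// Name returns the span name without exposing how it is stored.
func (s Span) Name() string { return s.orig.name }

// SpanSlice wraps a slice of generated spans behind Len/At accessors.
type SpanSlice struct {
	orig []*generatedSpan
}

// NewSpanSlice builds a slice from names; it exists only for this sketch.
func NewSpanSlice(names ...string) SpanSlice {
	s := SpanSlice{}
	for _, n := range names {
		s.orig = append(s.orig, &generatedSpan{name: n})
	}
	return s
}

// Len reports how many spans the slice holds.
func (s SpanSlice) Len() int { return len(s.orig) }

// At returns the wrapper for the span at index i.
func (s SpanSlice) At(i int) Span { return Span{orig: s.orig[i]} }

// spanNames is the kind of code an exporter might write. It depends only on
// the wrapper methods, so a change to generatedSpan would not break it.
func spanNames(spans SpanSlice) []string {
	names := make([]string, 0, spans.Len())
	for i := 0; i < spans.Len(); i++ {
		names = append(names, spans.At(i).Name())
	}
	return names
}

func main() {
	fmt.Println(spanNames(NewSpanSlice("GET /cart", "SELECT orders")))
}
```

If `generatedSpan` later stored the name differently, only `Name()` would need to change; `spanNames` and any other caller would compile unchanged.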

Contributor Author

@tigrannajaryan fair enough, I wasn't aware that was happening. I can see now that it makes more sense to use pdata. I've updated the code accordingly. I suppose we could use that in the Elastic APM backend too.

I've removed the metrics translation for now, and will reinstate it when the switchover to pdata.Metrics is complete.

@axw
Contributor Author

axw commented May 26, 2020

Thanks for the review @dmitryax! I've updated to use configtls, PTAL.

If using the underlying OTLP types is going to be an issue I can rework the PR, but I'd prefer to keep it that way for the stated reasons.

Member

@dmitryax dmitryax left a comment

LGTM

@axw
Contributor Author

axw commented May 27, 2020

@dmitryax what's the process for merging? Can you please do that for me? (I am not an approved committer.)

@dmitryax
Member

I'm not authorized to do that either. We need to wait for @tigrannajaryan or @bogdandrutu.

@cyrille-leclerc
Member

Successfully tested with 3 Java apps using opentelemetry-auto-instr-java, sending traces to opentelemetry-collector, which exports them to Elastic APM.

image

@axw axw force-pushed the elasticexporter branch from 62cfc0c to bc2120a Compare May 29, 2020 02:47
@axw
Contributor Author

axw commented May 29, 2020

@dmitryax @tigrannajaryan I've updated to use pdata, in response to #240 (comment). Could you please take another look?

@cyrille-leclerc
Member

Version dc17498 successfully tested

image

@vmarchaud
Member

@axw I've tried to run this PR on my setup and I could not get the spans to show in the APM UI; however, I can see they are correctly written into ES. Is there anything that I need to enable on the Elastic Cloud side to make the exporter work?

@axw
Contributor Author

axw commented May 29, 2020

@vmarchaud you shouldn't need to do anything special in Elastic Cloud. It sounds like there could be a problem with the translator code.

Do you have an example I can use to reproduce the issue? If that's not viable, then it might help to use the opentelemetry-collector logs exporter, and attach the log output after capturing a trace.

Edit: just to be sure, what version of Kibana are you running? An old version might also explain why the UI doesn't pick them up.

@vmarchaud
Member

@axw I use OT JS version 0.8.3 with some default plugins. I already have the exporter's debug log enabled; do you want me to send the logs to you?

@axw
Contributor Author

axw commented May 29, 2020

Yes please, just attach/paste the log here if you don't mind, and I'll take a look in the next few days.

@cyrille-leclerc
Member

@vmarchaud I can see the traces with the OpenTelemetry Auto Instr Java as you can see on the screenshot I shared above. There may be something related to your setup.
I am currently looking for an example of OpenTelemetry instrumentation covering multiple languages. I have pinged @mtwo to see if he is aware of such a demo.

@vmarchaud
Member

@axw I have some "sensitive" info that I wouldn't want to be on the internet; could I send you an email instead?

@cyrille-leclerc I guess it could be related to my setup, but it's a new Elastic Cloud environment (I created it ~10 days ago and didn't use APM until today). The weird thing is that I can see some spans (<5%) but not all. Some services aren't even shown in the UI even though there is data inside the index.

@axw
Contributor Author

axw commented May 29, 2020

@vmarchaud sure, you can send it to me at [email protected]

Member

@tigrannajaryan tigrannajaryan left a comment

LGTM. Please fix the merge conflict.

@axw axw force-pushed the elasticexporter branch from dc17498 to 794f74a Compare May 30, 2020 01:36
@axw
Contributor Author

axw commented May 30, 2020

@tigrannajaryan done, and squashed down to one commit. Thanks!

axw Andrew Wilkins
Metrics are currently not exported; we'll wait for
the data model changes to settle, so we can build
the translation off the OTLP representation.

Not all of the OpenTelemetry model is covered by
Elastic APM. In particular, there's currently no
support for links or events. We'll add support for
events later, and most likely links too
(see elastic/apm#122).
@axw axw force-pushed the elasticexporter branch from 794f74a to 8f6e041 Compare June 1, 2020 01:17
@axw
Contributor Author

axw commented Jun 1, 2020

I just pushed a very small change to the translator, to fix an issue that @vmarchaud helped diagnose.

@tigrannajaryan
Member

@axw thank you!

@axw axw deleted the elasticexporter branch June 2, 2020 02:12
wyTrivail referenced this pull request in mxiamxia/opentelemetry-collector-contrib Jul 13, 2020
mxiamxia referenced this pull request in mxiamxia/opentelemetry-collector-contrib Jul 22, 2020
Span Rename Processor exposes functionality to rename a span using attribute values from a span.

This commit only adds:

- the configuration 
- tests for the configuration
- config documentation in the code and test yaml.
- Updated processor/readme.md to list all processors.

A follow up PR will add processor functionality and documentation to processor/readme.md
ljmsc referenced this pull request in ljmsc/opentelemetry-collector-contrib Feb 21, 2022
* move global trace provider api to global package.

* fix doc.