add opentelemetry env specific setup #9529

Merged: 17 commits, Feb 3, 2021

ericmustin
Contributor

What does this PR do?

This PR adds environment-specific configuration for the OpenTelemetry Collector and Datadog exporter, covering host-based, containerized, and Kubernetes environments.

Motivation

Some background:

  • A number of users are experiencing issues with onboarding. 😢
  • Our current documentation assumes that the user is more or less familiar with OpenTelemetry, has a collector configured, and has some prior art within their org around OTEL and Datadog APM. In practice we've found this is not the case: users are often new to both areas at the same time, so they get stuck setting up the collector.
  • Our docs were relying on links to the canonical opentelemetry docs, but we've found these to be subpar and a little bit unclear in practice.
  • Additionally, the collector and application configuration will vary depending on the env they're deployed in, and getting this correct also helps us get billing correct and linkage within the platform correct.

With that in mind, I've tried to mimic the datadog-agent section of our in-app APM onboarding docs and scenarios, but for the opentelemetry-collector instead. This PR does not include language-SDK-specific setup info like our in-app APM docs do, but I'd like to add that in the near future, as it will be important. For now though, what's most critical is making sure folks deploy the collector correctly, as accurate hostname resolution from the collector impacts billing, so this PR provides examples for the most common scenarios.

Preview

https://docs-staging.datadoghq.com/ericmustin/add_otel_collector_config/tracing/setup_overview/open_standards/

Additional Notes

  • hey! plz help make me write more good! any feedback is appreciated, if it's unclear to you, dear reader, it's unclear to our users!
  • Formatting is very rough, if you find yourself reviewing and noticing the same formatting issues over and over, just point me to the docs for the right way to do it and i'll clean up.
  • From a technical perspective, it's entirely possible i've gotten some details wrong, as I am not a K8s expert. If anything looks fishy to you, it's probably wrong, could be improved, etc. I will try to loop in some more official k8s help here for a review.
  • I wasn't sure how to split some of the example yaml details into their own pages, but perhaps that would make sense to reduce bloat. Additionally, if there's a better way to present these sections via our templating, please let me know.

cc'ing to start, @mx-psi , @andrewardito , @andrewsouthard1 , @albertvaka , @KSerrania , @kayayarai . Any feedback is super helpful.

Reviewer checklist

  • Review the changed files.
  • Review the URLs listed in the Preview section.
  • Review any mentions of "Contact Datadog support" for internal support documentation.

@ericmustin ericmustin requested a review from a team as a code owner January 20, 2021 16:45
@ericmustin ericmustin changed the title from "add opentelemtry env specific setup" to "add opentelemetry env specific setup" Jan 20, 2021
@kayayarai kayayarai self-requested a review January 20, 2021 17:04
@kayayarai kayayarai added the "Do Not Merge" label Jan 20, 2021
@@ -51,7 +51,7 @@ On each OpenTelemetry-instrumented application, set the resource attributes `dev

### Ingesting OpenTelemetry Traces with the Collector

The OpenTelemetry Collector is configured by adding a [pipeline][8] to your `otel-collector-configuration.yml` file. Supply the relative path to this configuration file when you start the collector by passing it in via the `--config=<path/to/configuration_file>` command line argument. For examples of supplying a configuration file, see the [OpenTelemetry Collector documentation][9].
The OpenTelemetry Collector is configured by adding a [pipeline][8] to your `otel-collector-configuration.yml` file. Supply the relative path to this configuration file when you start the collector by passing it in via the `--config=<path/to/configuration_file>` command line argument. For examples of supplying a configuration file, see the [environment specific setup](#environent-specific-setup) section below or the [OpenTelemetry Collector documentation][9].
Contributor

Suggested change
The OpenTelemetry Collector is configured by adding a [pipeline][8] to your `otel-collector-configuration.yml` file. Supply the relative path to this configuration file when you start the collector by passing it in via the `--config=<path/to/configuration_file>` command line argument. For examples of supplying a configuration file, see the [environment specific setup](#environent-specific-setup) section below or the [OpenTelemetry Collector documentation][9].
The OpenTelemetry Collector is configured by adding a [pipeline][8] to your `otel-collector-configuration.yml` file. Supply the relative path to this configuration file when you start the collector by passing it in via the `--config=<path/to/configuration_file>` command line argument. For examples of supplying a configuration file, see the [environment specific setup](#environment-specific-setup) section below or the [OpenTelemetry Collector documentation][9].
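
For context, here is a minimal sketch of the kind of `otel-collector-configuration.yml` pipeline this paragraph refers to. The `datadog/api` exporter name and the `<YOUR_API_KEY>` placeholder appear in this PR's diff; the receiver protocols and the overall layout are assumptions, so check them against the collector release in use.

```yaml
# Minimal collector configuration sketch (assumed layout, not the PR's exact file).
receivers:
  otlp:
    protocols:
      grpc:   # accept OTLP over gRPC
      http:   # accept OTLP over HTTP

processors:
  batch:
    timeout: 10s   # value taken from the standalone-collector guidance in this PR

exporters:
  datadog/api:
    api:
      key: <YOUR_API_KEY>   # placeholder, replace with a real Datadog API key

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [datadog/api]
```

The collector would then be started with `--config=<path/to/configuration_file>` pointing at this file.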

@andrewardito
Contributor

Unrelated to changes in this PR, but would it be worth adding one more item to the partials for Otel Collector (Language Agnostic) or something similar? The first paragraph is great but if I don't read it I end up clicking a language and not realizing there is a collector option.

Collaborator

@kayayarai kayayarai left a comment

An initial docs review for you. Feel free to chat with me in Slack if you want to discuss any of these or want pointers on fixing them.

@@ -93,6 +93,371 @@ service:
exporters: [datadog/api]
```

### Environment Specific Setup
Collaborator

Suggested change
### Environment Specific Setup
### Environment specific setup

All the headings (except the H1/page title) should be sentence-case. We probably aren't consistent on this, but any new headings and changed pages should be sentence-case.

@@ -93,6 +93,371 @@ service:
exporters: [datadog/api]
```

### Environment Specific Setup

#### Host:
Collaborator

Suggested change
#### Host:
#### Host


- Download the appropriate binary from [the project repository latest release](https://github.com/open-telemetry/opentelemetry-collector-contrib/releases/latest).

- Create a `otel_collector_config.yaml` file. Here is an example template to get started. It enables the collector's otlp receiver and datadog exporter.
Collaborator

"Here is an example" Should we have a link to a sample? Which sample do you have in mind?

Suggested change
- Create a `otel_collector_config.yaml` file. Here is an example template to get started. It enables the collector's otlp receiver and datadog exporter.
- Create a `otel_collector_config.yaml` file. Here is an example template to get started. It enables the collector's `otlp` receiver and the Datadog exporter.

or OTLP? are you using "otlp" as a code keyword or as an adjective?

- Run on the host with the configration yaml file set via the `--config` parameter. For example,

```
curl -L https://github.com/open-telemetry/opentelemetry-collector-contrib/releases/latest/download/otelcontribcol_linux_amd64 | otelcontribcol_linux_amd64 --config otel_collector_config.yaml
Collaborator

If they downloaded the binary, they wouldn't use curl right?

Contributor Author

yup good catch. this was combining downloading the release and running it...will remove the curl bit to make it more clear.


- Create a `otel_collector_config.yaml` file. [Here is an example template](https://docs.datadoghq.com/tracing/setup_overview/open_standards/#ingesting-opentelemetry-traces-with-the-collector) to get started. It enables the collector's otlp receiver and datadog exporter.

- Use a published docker image such as [`otel/opentelemetry-collector-contrib:latest`](https://hub.docker.com/r/otel/opentelemetry-collector-contrib/tags)
Collaborator

Suggested change
- Use a published docker image such as [`otel/opentelemetry-collector-contrib:latest`](https://hub.docker.com/r/otel/opentelemetry-collector-contrib/tags)
2. Choose a published docker image such as [`otel/opentelemetry-collector-contrib:latest`](https://hub.docker.com/r/otel/opentelemetry-collector-contrib/tags).

Collaborator

Also, move link to bottom of the file.
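
As an illustration of the Docker step under review, the published image can be run with the configuration file mounted in. This is only a sketch, written as a Compose file; the service name, mount path, and published port are assumptions and do not come from the PR.

```yaml
# docker-compose.yml sketch: run the contrib collector with a local config file.
services:
  otel-collector:                       # hypothetical service name
    image: otel/opentelemetry-collector-contrib:latest
    # Overrides the image's default arguments so the mounted file is used.
    command: ["--config=/etc/otel_collector_config.yaml"]
    volumes:
      # Mount the config created in the previous step (read-only).
      - ./otel_collector_config.yaml:/etc/otel_collector_config.yaml:ro
    ports:
      - "4317:4317"                     # OTLP gRPC; assumed, match your otlp receiver
```

A Compose file keeps the image tag, config path, and port in one reviewable place, which may be easier to document than a long `docker run` command.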


In order to accurately track the appropriate metadata in Datadog for information and billing purposes, it is recommended the OpenTelemetry Collector be run at least in agent mode on each of the Kubernetes Nodes.

- When deploying the OpenTelemetry Collector as a Daemonset, refer to [the example configuration below](#opentelemetry-kubernetes-example-collector-configuration) as a guide.
Collaborator

Suggested change
- When deploying the OpenTelemetry Collector as a Daemonset, refer to [the example configuration below](#opentelemetry-kubernetes-example-collector-configuration) as a guide.
When deploying the OpenTelemetry Collector as a daemonset, refer to [the example configuration below](#opentelemetry-kubernetes-example-collector-configuration) as a guide.


- When deploying the OpenTelemetry Collector as a Daemonset, refer to [the example configuration below](#opentelemetry-kubernetes-example-collector-configuration) as a guide.

- On the application container, use the downward API to pull the host IP; the application container needs an environment variable that points to status.hostIP. The OpenTelemetry Collector container Agent expects this to be named `OTEL_EXPORTER_OTLP_SPAN_ENDPOINT`. Use the [below example snippet](#opentelemetry-kubernetes-example-application-configuration) as a guide.
Collaborator

Suggested change
- On the application container, use the downward API to pull the host IP; the application container needs an environment variable that points to status.hostIP. The OpenTelemetry Collector container Agent expects this to be named `OTEL_EXPORTER_OTLP_SPAN_ENDPOINT`. Use the [below example snippet](#opentelemetry-kubernetes-example-application-configuration) as a guide.
On the application container, use the downward API to pull the host IP. The application container needs an environment variable that points to `status.hostIP`. The OpenTelemetry Collector container Agent expects this to be named `OTEL_EXPORTER_OTLP_SPAN_ENDPOINT`. Use the [below example snippet](#opentelemetry-kubernetes-example-application-configuration) as a guide.


- On the application container, use the downward API to pull the host IP; the application container needs an environment variable that points to status.hostIP. The OpenTelemetry Collector container Agent expects this to be named `OTEL_EXPORTER_OTLP_SPAN_ENDPOINT`. Use the [below example snippet](#opentelemetry-kubernetes-example-application-configuration) as a guide.

##### OpenTelemetry Kubernetes Example Collector Configuration
Collaborator

Suggested change
##### OpenTelemetry Kubernetes Example Collector Configuration
##### Example Kubernetes collector configuration

name: otel-collector-config-vol
```

##### Opentelemetry Kubernetes Example Application Configuration
Collaborator

Suggested change
##### Opentelemetry Kubernetes Example Application Configuration
##### Example Kubernetes application configuration

Member

@mx-psi mx-psi left a comment

We should keep these docs for Datadog specific configuration only, and link to the OpenTelemetry docs as much as possible. If not, we are going to end up having very long docs that are going to have to be constantly updated.

In particular, I think we should focus on answering these questions:

  1. What components do we require on each environment?
  2. What specific configuration (different from the one on the component docs) do we require on these components?

For example, for Kubernetes we should state that we require the k8s_tagger on Kubernetes and that the default configuration should be used.

Now, if some docs are missing upstream for a given component (like it is the case here) we should add it here now and contribute those docs upstream as soon as we can so that they are kept up to date and are useful for everyone

content/en/tracing/setup_overview/open_standards/_index.md Outdated Show resolved Hide resolved
curl -L https://github.com/open-telemetry/opentelemetry-collector-contrib/releases/latest/download/otelcontribcol_linux_amd64 | otelcontribcol_linux_amd64 --config otel_collector_config.yaml
```

#### Docker
Member

@mx-psi mx-psi Jan 21, 2021

IMO we can mostly get rid of this section and just link here instead: https://opentelemetry.io/docs/collector/getting-started/#docker, making clear that we are present in the contrib flavor.

I don't think we should explain how to use Docker or how to use the OpenTelemetry Collector in general in Datadog docs, we should keep this for Datadog exporter specific configuration

Contributor Author

These sections are quite misleading (note the kill $pid1; docker stop otelcol at the end of the docker command, for example) and have the user cloning the repo locally. i think it's better to have working examples in our docs and then contrib upstream to improve the upstream docs, and when those are improved we can point to them directly.

Member

Makes sense, let's keep it then. Can you open an issue for this on the https://github.com/open-telemetry/opentelemetry.io repo?

@ericmustin
Contributor Author

ericmustin commented Jan 26, 2021

@mx-psi @kayayarai I tried to address all the feedback here. Let me know what you think. The two main points seem to be:

  1. General process around using our own examples vs upstream otel-collector docs
  • So, yea i would like to use as much of the upstream stuff as possible but by and large their examples and docs fall short right now or are too general. I think @mx-psi and I have agreed to try to take what works here and contribute it upstream, and then when upstream is in a good place, update our docs to point to the upstream links. but for now it's better to host our own, clear, datadog specific examples.
  2. K8s stuff is too chonky
  • I've included the full example yaml manifest in a PR to our otel-collector-contrib section here: [exporter/datadog] add example k8s configs open-telemetry/opentelemetry-collector-contrib#2193, linked to it, and instead have tried to highlight the key, datadog specific, sections and steps users should enable in their own k8s setup. Tbh i'm not super happy with how this reads currently, i'm not sure if it's obvious to readers that these are partials that the full example yaml manifest already includes.

Lastly, I think @andrewardito makes a good point here around ordering and discoverability. It would be really nice if we could prioritize or elevate the collector setup section. I'm not sure how best to accomplish that, and I don't think it should be a blocker to getting these docs out, but if anyone has ideas I'd love to try them.

@ericmustin
Contributor Author

@kayayarai @mx-psi i uh..forgot to push up the changes, sorry about that, updated


#### Kubernetes

The OpenTelemetry Collector can be run in two types of [deployment scenarios][13].
Collaborator

Suggested change
The OpenTelemetry Collector can be run in two types of [deployment scenarios][13].
The OpenTelemetry Collector can be run in two types of [deployment scenarios][13]:


- As an OpenTelemetry Collector "agent" running on the same host as the application in a sidecar or daemonset; or

- As a standalone service, e.g. a container or deployment, typically per-cluster, -datacenter or -region.
Collaborator

Suggested change
- As a standalone service, e.g. a container or deployment, typically per-cluster, -datacenter or -region.
- As a standalone service, for example a container or deployment, typically per-cluster, per-datacenter, or per-region.


When deploying the OpenTelemetry Collector as a daemonset, refer to [the example configuration below](#opentelemetry-kubernetes-example-collector-configuration) as a guide.
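
A stripped-down sketch of what running the collector as a daemonset could look like is shown here. The resource names, labels, mount path, and port are hypothetical; only the image and the `otel-agent-conf` / `otel-agent-config` ConfigMap naming come from this PR, and the full manifest it links to remains the reference.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-agent                      # hypothetical name
spec:
  selector:
    matchLabels:
      app: otel-agent
  template:
    metadata:
      labels:
        app: otel-agent
    spec:
      containers:
        - name: otel-agent
          image: otel/opentelemetry-collector-contrib:latest
          # Relies on the image entrypoint; only the config path is passed.
          args: ["--config=/conf/otel-agent-config.yaml"]
          ports:
            - containerPort: 4317       # OTLP gRPC; assumed port
              hostPort: 4317            # makes the agent reachable on the node IP
          volumeMounts:
            - name: otel-agent-config-vol
              mountPath: /conf
      volumes:
        - name: otel-agent-config-vol
          configMap:
            name: otel-agent-conf
            items:
              - key: otel-agent-config
                path: otel-agent-config.yaml
```

The `hostPort` is what lets application pods reach the agent through `status.hostIP`, which is the mechanism the next paragraph relies on.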

On the application container, use the downward API to pull the host IP. The application container needs an environment variable that points to `status.hostIP`. The OpenTelemetry Application SDKs expects this to be named `OTEL_EXPORTER_OTLP_ENDPOINT`. Use the [below example snippet](#opentelemetry-kubernetes-example-application-configuration) as a guide.
Collaborator

is it "the SDKs expect" or "the SDK expects"?

Contributor Author

sdks expect...updated!


##### Example Kubernetes OpenTelemetry Collector configuration

A full example k8s manifest for deploying the OpenTelemetry Collector as both daemonset and standalone collector [can be found here][14]. Depending on your environment this example may be modified, however the important sections to note specific to Datadog are as follows.
Collaborator

Suggested change
A full example k8s manifest for deploying the OpenTelemetry Collector as both daemonset and standalone collector [can be found here][14]. Depending on your environment this example may be modified, however the important sections to note specific to Datadog are as follows.
A full example Kubernetes manifest for deploying the OpenTelemetry Collector as both daemonset and standalone collector [can be found here][14]. Modify the example to suit your environment. The key sections that are specific to Datadog are as follows:


2. Create a `otel_collector_config.yaml` file. [Here is an example template](#ingesting-opentelemetry-traces-with-the-collector) to get started. It enables the collector's OTLP Receiver and Datadog Exporter.

3. Run on the host with the configration yaml file set via the `--config` parameter. For example,
Collaborator

Suggested change
3. Run on the host with the configration yaml file set via the `--config` parameter. For example,
3. Run the download on the host, specifying the configuration YAML file in the `--config` parameter. For example:

# ...
```

3. For any OpenTelemetry-Collector's in "standalone collector" mode, which receive traces from downstream collectors and export to Datadog's backend, include a `batch` processor configured with a `timeout` of `10s` as well as the `k8s_tagger` enabled. These should be included along with the `datadog` exporter and added to the `traces` pipeline.
Collaborator

Suggested change
3. For any OpenTelemetry-Collector's in "standalone collector" mode, which receive traces from downstream collectors and export to Datadog's backend, include a `batch` processor configured with a `timeout` of `10s` as well as the `k8s_tagger` enabled. These should be included along with the `datadog` exporter and added to the `traces` pipeline.
3. For OpenTelemetry Collectors in standalone collector mode, which receive traces from downstream collectors and export to Datadog's backend, include a `batch` processor configured with a `timeout` of `10s` as well as the `k8s_tagger` enabled. These should be included along with the `datadog` exporter and added to the `traces` pipeline.


3. For any OpenTelemetry-Collector's in "standalone collector" mode, which receive traces from downstream collectors and export to Datadog's backend, include a `batch` processor configured with a `timeout` of `10s` as well as the `k8s_tagger` enabled. These should be included along with the `datadog` exporter and added to the `traces` pipeline.
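
Put together, the standalone-collector step might look roughly like the fragment below. It is a sketch only: the `otlp` receiver is assumed to be defined elsewhere in the same file, `k8s_tagger` is left at its defaults, and the processor order shown is illustrative rather than prescribed by the PR.

```yaml
# Fragment of the standalone collector's configuration (assumed layout).
processors:
  batch:
    timeout: 10s          # batching timeout called for in this step
  k8s_tagger:             # default settings assumed

exporters:
  datadog:
    api:
      key: <YOUR_API_KEY> # placeholder, replace with a real Datadog API key

service:
  pipelines:
    traces:
      receivers: [otlp]   # assumes an otlp receiver is configured
      processors: [batch, k8s_tagger]
      exporters: [datadog]
```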

- In the `otel-collector-conf` ConfigMap's `data.otel-collector-config` `processors` section
Collaborator

Suggested change
- In the `otel-collector-conf` ConfigMap's `data.otel-collector-config` `processors` section
In the `otel-collector-conf` ConfigMap's `data.otel-collector-config` `processors` section:

# ...
```

- In the `otel-collector-conf` ConfigMap's `data.otel-collector-config` `exporters` section
Collaborator

Suggested change
- In the `otel-collector-conf` ConfigMap's `data.otel-collector-config` `exporters` section
In the `otel-collector-conf` ConfigMap's `data.otel-collector-config` `exporters` section:

key: <YOUR_API_KEY>
```

- In the `otel-agent-conf` ConfigMap's `data.otel-agent-config` `service.pipelines.traces` section
Collaborator

Suggested change
- In the `otel-agent-conf` ConfigMap's `data.otel-agent-config` `service.pipelines.traces` section
In the `otel-agent-conf` ConfigMap's `data.otel-agent-config` `service.pipelines.traces` section:


##### Example Kubernetes OpenTelemetry application configuration

In addition to the OpenTelemetry Collector configuration, ensure OpenTelemetry SDKs installed in an application transmit telemetry data to the Collector by configuring the environment variable `OTEL_EXPORTER_OTLP_ENDPOINT` with the host IP. Use the downward API to pull the host IP, and set it as an environment variable, which is then interpolated when setting the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable.
Collaborator

Suggested change
In addition to the OpenTelemetry Collector configuration, ensure OpenTelemetry SDKs installed in an application transmit telemetry data to the Collector by configuring the environment variable `OTEL_EXPORTER_OTLP_ENDPOINT` with the host IP. Use the downward API to pull the host IP, and set it as an environment variable, which is then interpolated when setting the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable.
In addition to the OpenTelemetry Collector configuration, ensure that OpenTelemetry SDKs that are installed in an application transmit telemetry data to the collector, by configuring the environment variable `OTEL_EXPORTER_OTLP_ENDPOINT` with the host IP. Use the downward API to pull the host IP, and set it as an environment variable, which is then interpolated when setting the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable:
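
A sketch of that pattern on the application container follows. The `HOST_IP` variable name and the port are assumptions; `status.hostIP` and `OTEL_EXPORTER_OTLP_ENDPOINT` come from the text above. Whether the value also needs a scheme prefix depends on the SDK in use.

```yaml
# Container spec fragment for the application pod (names are illustrative).
env:
  - name: HOST_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP        # downward API: IP of the node running the pod
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    # $(HOST_IP) is expanded by Kubernetes because HOST_IP is declared earlier in the list.
    value: "$(HOST_IP):4317"            # port is an assumption; match the agent's OTLP receiver
```

The ordering matters here: Kubernetes only interpolates `$(VAR)` references to variables defined earlier in the same `env` list.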

@ericmustin
Contributor Author

@kayayarai Hey! added the last round of feedback, i think we should be ok here. Just noting this branch is definitely from my own fork, but i also have happened to push up a branch with identical naming ericmustin/branch_name + commits on the datadog/documentation repo which is why that staging link automagically works. I can close this PR and open up against that branch if u prefer and that helps with any workflows, but fwiw they should be identical (and would prefer not to lose the pr comments)

@kayayarai
Collaborator

@kayayarai Hey! added the last round of feedback...

Cool cool! I'll take a look at it this afternoon, no worries about the fork v. branch situation, you can leave it as is I think.

…aDog/documentation into ericmustin/add_otel_collector_config

merge upstream
@kayayarai kayayarai merged commit 7f9e917 into DataDog:master Feb 3, 2021
@kayayarai kayayarai removed the "Do Not Merge" label Feb 3, 2021