New component: processor/datadog #15689

Closed
2 tasks done
gbbr opened this issue Oct 27, 2022 · 5 comments · Fixed by #16607
Labels
Accepted Component (new component has been sponsored), Vendor-Specific Component (new component that interfaces with a vendor API and will be maintained by the vendor)

Comments

gbbr (Member) commented Oct 27, 2022

The purpose and use-cases of the new component

Collects pre-sampling trace metrics. Users of the probabilisticsamplerprocessor or the tailsamplingprocessor can prepend the "datadog" processor in their trace pipelines so that Datadog trace metrics are computed from all spans rather than only the sampled subset. Please see the example configuration below.

Example configuration for the component

Note the addition of the new datadog processor to the traces pipeline.

receivers:
  otlp:
    protocols:
      http:

processors:
  batch:
  k8sattributes:
  probabilistic_sampler:
    sampling_percentage: 10
  datadog:

exporters:
  datadog:
    api:
      key: ${DD_API_KEY}

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch, k8sattributes]
      exporters: [datadog]
    traces:
      receivers: [otlp]
      processors: [batch, k8sattributes, datadog, probabilistic_sampler]
      exporters: [datadog]

An exporter_name setting would also be available if one wishes to use a non-Datadog exporter. Otherwise, the datadog processor would automatically detect the presence of the Datadog exporter and use that.
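Purely as an illustration, a minimal sketch of how that setting might be written; the key name comes from the description above, but its exact placement and the otlphttp value are assumptions made for the sake of example:

processors:
  datadog:
    # Hypothetical: send the computed trace metrics through the named
    # (non-Datadog) exporter instead of auto-detecting a Datadog exporter.
    exporter_name: otlphttp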

Telemetry data types supported

Traces.

Is this a vendor-specific component?

  • This is a vendor-specific component
  • If this is a vendor-specific component, I am proposing to contribute this as a representative of the vendor.

Sponsor (optional)

@mx-psi

Additional context

Why don't you use the spanmetricsprocessor?

While this seems like the most natural solution, it has disadvantages and limitations that block this approach:

  • Stats accuracy is too low for Datadog: it uses Histogram instead of ExponentialHistogram.
  • The original instrumentation scope, which Datadog relies on, cannot be recovered because the processor overrides it with "spanmetricsprocessor".
  • It produces high-cardinality metrics, since many dimensions must be collected to derive the Datadog counterpart attributes.
  • It is marked as experimental.
  • It is meant for generic use cases rather than a specific vendor product.
  • It does not obfuscate SQL queries, which is required for aggregation keys to function correctly at Datadog.
  • Its setup requires explicitly specifying an exporter, whereas the processor proposed here detects the Datadog exporter automatically.
  • It cannot communicate to a potentially remote Datadog exporter (in the case of gateway deployments) that stats have already been computed and that it should skip recomputing them. The proposed processor can intercept the traces passing through it and add a resource attribute marking them as "computed".

Why don't you use two separate pipelines?

One could use two separate pipelines with two separate exporters:

  • One for traces with sampling
  • One for traces without sampling that exports only stats

This change involves adding a second exporter, which is fine (assuming it is accepted by the community), but raises several difficulties (a rough sketch of such a setup follows this list):

  • Memory usage is increased: traces are duplicated in memory (because the batch processor marks itself as mutating).
  • Configuration needs to be duplicated between the two exporters (API keys, etc.), which is error-prone. YAML anchors and aliases can mitigate this.
  • Worse user experience: the user has to duplicate not only the exporter configuration but also the pipelines (minus the sampler), which is also error-prone.
  • It does not work well with gateway deployments: if a local collector runs the Datadog trace metrics exporter, it cannot tell the existing Datadog exporter (for example, at the gateway) to skip the stats computation, resulting in duplicate computation.
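For illustration only, a minimal sketch of the dual-pipeline alternative described above. The pipeline and exporter names are placeholders, the second exporter stands in for a hypothetical trace-metrics-only exporter, and the duplicated API key shows the configuration duplication mentioned in the list:

receivers:
  otlp:
    protocols:
      http:

processors:
  batch:
  probabilistic_sampler:
    sampling_percentage: 10

exporters:
  # Exports the sampled traces.
  datadog/traces:
    api:
      key: ${DD_API_KEY}
  # Stand-in for a hypothetical trace-metrics-only exporter; note the
  # duplicated credentials.
  datadog/metrics:
    api:
      key: ${DD_API_KEY}

service:
  pipelines:
    traces/sampled:
      receivers: [otlp]
      processors: [batch, probabilistic_sampler]
      exporters: [datadog/traces]
    traces/unsampled:
      receivers: [otlp]
      processors: [batch]
      exporters: [datadog/metrics]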
@gbbr added the 'needs triage' label on Oct 27, 2022
@gbbr changed the title from 'New component: datadogtracemetrics' to 'New component: processor/datadogtracemetrics' on Oct 27, 2022
@evan-bradley added the 'Sponsor Needed' label and removed the 'needs triage' label on Nov 1, 2022
evan-bradley (Contributor) commented:

@mx-psi Can you confirm that you will be sponsoring this component?

mx-psi (Member) commented Nov 2, 2022

Yes, I will be sponsoring this, but we want to make sure that this is the right choice since we would be the first vendor to have a vendor-specific processor.

@mx-psi added the 'Accepted Component' and 'Vendor-Specific Component' labels and removed the 'Sponsor Needed' label on Nov 2, 2022
@gbbr changed the title from 'New component: processor/datadogtracemetrics' to 'New component: processor/datadog' on Nov 3, 2022
gbbr (Member, Author) commented Nov 14, 2022

Since there are no objections here, I'm going to start working on this.

gbbr added a commit to gbbr/opentelemetry-collector-contrib that referenced this issue Dec 12, 2022
This change adds the processor described in open-telemetry#15689. It is the initial PR
containing the structure and implementation.
mx-psi pushed a commit that referenced this issue Dec 13, 2022
* processor: add datadogprocessor

This change adds the processor described in #15689. It is the initial PR
containing the structure and implementation.

* Address PR comments

* go mod tidy

* Ensure Shutdown can be called even if Start fails

* Use component.ID

* Address PR comments

* Update linting errors

* make generate-gh-issue-templates
jpkrohling (Member) commented:

> Since there are no objections here, I'm going to start working on this.

Was this discussed during a SIG meeting?

mx-psi (Member) commented Jan 20, 2023

> Since there are no objections here, I'm going to start working on this.
>
> Was this discussed during a SIG meeting?

No, it was not. This component is vendor-specific and I sponsored it, but if you have thoughts on the design or approach, please comment now rather than later!

Note that it's possible that, once connectors are available, we may want to turn both this component and the spanmetricsprocessor into connectors instead, but this seemed like the best approach at the time we discussed it.
