Datadog and ingress-nginx #10082
AFAIK, Datadog can consume traces through a collector that receives OTLP over gRPC: https://kubernetes.github.io/ingress-nginx/user-guide/third-party-addons/opentelemetry/
/assign @esigo |
@longwuyuan Yes, that's one option. It even works today. One shortcoming of that option, though, as I explain in a related issue in the OpenTelemetry project, is that it prevents us from providing some features that are difficult or impossible to express in OpenTelemetry as it currently is. One such feature is trace sampling. Here are some ways we could go about continuing support for Datadog in the ingress controller:
|
Need to wait for @esigo comments I guess and I am not even a developer. Option 1 seems not right to degrade. Once again, just placeholder comments from me, my opinion, and I am not even a developer. |
cc @rikatz news on datadog here |
@dgoffredo thanks for the info. Could you please be specific about which DD features OTel doesn't support right now? The OpenTracing project is already archived. I can't see why we need to support an archived project. |
Removing my ingress-nginx maintainer hat, and putting on my "user" hat: I think it would be great if DD integration with OTEL is 100% covered. This would help me not get into a "lock in" where I should use solution A because my Ingress or whatever can just talk with A or B, and migrating would be a breaking change. Putting back my maintainer hat:
We could "kind of" isolate this on cloud providers as we generate manifests specific for one or another based on customizations, but not having a Cloud A or Cloud B specific feature. I think DD has a great opportunity here, already pointed out by @esigo above, which is getting the missing feature/integration as part of OTEL, and IIUC this would be a gain for the overall OTEL community :) Sorry for being harsh, I don't mean to be at all :) I'm just pointing out that we need to focus the project more on generic features. |
@esigo Sampling is a sticky example, but the one with which I'm the most familiar. I can think of three sampling-related features that Datadog provides and for which I am not aware of an OpenTelemetry equivalent. There's a description in the OpenTracing-based version of the Datadog tracer. Briefly:
There are other features, too, currently not implemented in C++, that we might like to add in the future. Examples include automatic profiling and application security monitoring.
I agree. We wrote dd-trace-cpp precisely so that Datadog tracing could exist without OpenTracing. |
@rikatz Perhaps contributing to opentelemetry-cpp is a great opportunity. One example of a tracing implementation that integrates with OpenTelemetry while adding a lot of its own code is the Event Tracing for Windows (ETW) tracer. It lives in the OpenTelemetry C++ code (i.e. code written in terms of opentelemetry-cpp's public API) can interface with this ETW tracer, but a client program must specify the implementation on startup by installing the specific Datadog might be able to integrate with opentelemetry-cpp in a similar way. All code that uses In this way, "company-specific features" can be provided entirely within the OpenTelemetry API, but a client must choose that implementation when first setting up tracing. Pending the relevant discussion with the opentelemetry-cpp maintainers, does adding the "OpenTelemetry provider" degree of freedom to otel_ngx_module and to ingress-nginx sound reasonable to you? |
Per open-telemetry/opentelemetry-cpp#2196, the opentelemetry-cpp project is on board with the idea of Datadog creating a library that implements the OpenTelemetry API using the OpenTelemetry SDK.
Then Is this course of action acceptable? |
I agree with the approach :) If this is just a simple OTEL module configuration after the change, so let's do it! Thanks for sticking with us @dgoffredo :) |
(and please keep me posted with it) |
Great, thanks @rikatz. I'll reach out to the |
To summarize: Datadog can integrate with Then Then You'll hear more from me when this work begins. Thanks for weighing in! |
@dgoffredo are you folks following https://github.com/nginxinc/nginx-otel ? What are your thoughts on it? From a project perspective, it would be great to use official nginx precompiled modules. Let me know how the DD integration is going and whether you folks are in conversation with F5 as well :) |
I wasn't aware of nginx-otel until you mentioned it, thanks for the heads up. My initial observations are:
Do you plan to move to using
I'm working on it, though recently we've been onboarding a new team member, so I've taken a short break from it. To reiterate the tentative overall plan:
Nope. We met with F5 a number of months ago to get a feel for what a partnership might hold. Both sides seemed open to the idea, but there was not then anything to work on together. |
@dgoffredo sorry for the delay, I missed the notification for this. Yes, the plan would be to use nginx-otel, as this would reduce our compilation scope a lot, and if we decide in the future to move back to packaged NGINX, this will probably already be a dynamic module that we can just use. Additionally, this module does not use curl or other libraries that we've been wanting to remove from the controller. Is it possible that this module can be extended with the F5 folks to support Datadog? I know they may have a lot of customer cases for DD + NGINX Plus as well. Thanks!! |
I was testing nginx-otel compilation here, but it also needs opentelemetry-cpp @esigo :) So I think for the experimental image I will drop everything else and leave the opentelemetry image as Ehsan has built it, then we can discuss what comes next. This means that as soon as the Datadog features are also supported by OTEL contrib, we can use it |
Ah, I missed that! Here I was trying to figure out our other options. I'll have another look at the source of nginx-otel to see whether the |
As far as I can see, nginx-otel does not use any of the types Rather than use opentelemetry-cpp directly, nginx-otel dips into opentelemetry-cpp for some concrete details (generating an ID, proto definitions) but otherwise contains a complete tracer implementation separate from anything in opentelemetry-cpp.
My main reason for implementing the interfaces from opentelemetry-cpp is to contribute the implementation to opentelemetry-cpp-contrib's otel_ngx_module, which ingress-nginx can pass configuration to (e.g. If/when ingress-nginx moves to nginx-otel (for the reasons @rikatz described in previous comments), I'd have to find another approach. At that point one option would be to give up on Datadog-specific code built into ingress-nginx. For example, nginx-otel has an interesting sampling feature that Datadog customers might be able to use in lieu of dd-trace-cpp's DD_TRACE_SAMPLING_RULES or nginx-datadog's datadog_sample_rate. I'm wondering, if I continue on the opentelemetry-cpp based path, how soon (if ever) we'll have to move to something else. |
Hi again, @rikatz. I see that OpenTracing (and thus Datadog) was removed from ingress-nginx in #10615. In order to determine how to support Datadog customers using ingress-nginx, I'd like to know from you which OpenTelemetry nginx module this project plans to use:
|
As there is no advantage in using the F5 one, my vote is to keep using otel-contrib |
/close Closing it for now, we are sticking with OTEL only as the solution for v1.10 @dgoffredo let me know once we have some movement on otel side to support the specifics of Datadog :) |
@rikatz Will do. We have some shifting priorities but will revisit this soon. |
Do I understand correctly that, after updating to Helm chart 4.10.0 and ingress-nginx 1.10.0, the Datadog APM (tracing) integration no longer works like before and is lacking a lot of features? |
@michael-mader We (Datadog) need to update our public documentation so that it either advises use of OpenTelemetry or some other interim solution until a Datadog-specific OpenTelemetry configuration option is available. It would have been better for us to do this before the most recent release, but here we are. My teammate or I will update this github issue when the new documentation is available. For now, you can use ingress-nginx's OpenTelemetry feature, and point it at the Datadog Agent's OTLP collector interface. |
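As a rough sketch of what that interim setup could look like in Helm values for the ingress-nginx chart: the ConfigMap keys `enable-opentelemetry`, `otlp-collector-host`, `otlp-collector-port`, and `otel-service-name` are the ones described in the ingress-nginx OpenTelemetry documentation, while the host name and service name below are placeholders and not values taken from this thread. It also assumes the Datadog Agent has OTLP ingestion over gRPC enabled and reachable on port 4317; depending on the chart version, further settings may be needed to load the OpenTelemetry module.

```yaml
controller:
  config:
    # Enable OpenTelemetry tracing in the controller's NGINX configuration.
    enable-opentelemetry: "true"
    # Placeholder: address of the Datadog Agent's OTLP gRPC receiver.
    otlp-collector-host: "datadog-agent.datadog.svc.cluster.local"
    otlp-collector-port: "4317"
    # Placeholder: service name attached to the exported spans.
    otel-service-name: "ingress-nginx"
```

If the Agent runs as a DaemonSet and should be reached through the node IP instead of a Service, the collector address would have to be supplied differently, for example through environment variables.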
@dgoffredo will this conflict with trace sampling rules? |
Datadog trace sampling rules, i.e. those configured by the At least for the most common use case, though, OpenTelemetry's @dmehala, my teammate, for visibility. |
For lack of a better place to put this, and to help others who might run into the same, I thought I'd call out a few issues we ran into while trying to upgrade from 1.9.x to 1.10.1:
|
Hi @mscrivo Thank you for sharing your feedback and experiences. I am taking the responsibility to update the public documentation and raise the We are currently investigating the possibility of reintroducing the Datadog tracer to ingress-nginx to address these concerns. In the meantime, I recommend submitting a feature request here for visibility. |
I managed to pass the environment by setting the following:
---
controller:
  config:
    main-snippet: |
      env OTEL_RESOURCE_ATTRIBUTES;
  extraEnvs:
    - name: HOST_IP
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP
    - name: OTEL_EXPORTER_OTLP_ENDPOINT
      value: "http://$(HOST_IP):4317"
    - name: OTEL_RESOURCE_ATTRIBUTES
      value: "deployment.environment=production"
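
The values above send OTLP to port 4317 on the node's IP, which presumably means a Datadog Agent on each node is listening there. A minimal sketch of the Agent side, assuming the Agent is installed with the Datadog Helm chart and that its `datadog.otlp.receiver` settings apply to your chart version (neither is stated in this thread):

```yaml
datadog:
  otlp:
    receiver:
      protocols:
        grpc:
          # Accept OTLP over gRPC; 4317 is the conventional OTLP gRPC port.
          enabled: true
          endpoint: "0.0.0.0:4317"
```

Whether that port is actually reachable from the controller pod through the host IP depends on how the Agent pod exposes it, so treat this as a starting point to check against the Datadog chart's documentation.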
|
@dmehala any update on the |
Update to my previous comment: turned out it works even this way:
controller:
  config:
    main-snippet: |
      env OTEL_RESOURCE_ATTRIBUTES=env=production;
|
In a recent pull request to update the version of the Datadog tracer used
within ingress-nginx, @esigo pointed out that the project plans to move
away from OpenTracing and towards OpenTelemetry. As a result, OpenTracing-only
tracing libraries, such as Datadog, will no longer be supported.
Datadog would like to continue integrating with ingress-nginx. I can think of
a few ways to do this, and would like to discuss our potential options here.