opentelemetry: create a new module #4361

Merged
mholt merged 76 commits into caddyserver:master from add-opentelemetry-module
Mar 8, 2022

Conversation

andriikushch
Contributor

What
This PR adds a new module to support OpenTelemetry tracing.

The new module wraps the implementation provided by github.com/open-telemetry/opentelemetry-go and makes it possible to append it to the chain of handlers.

This module is compatible with the specification: opentelemetry-specification.

How to configure

This module can be enabled in the Caddyfile and configured with the environment variables mentioned in the specification. For more information, see README.md in this PR.

Known limitations

  • The current implementation supports only the OTLP protocol over gRPC.
  • It supports only the following propagators: tracecontext, baggage.

Some other limitations are mentioned here: open-telemetry/opentelemetry-go#1698
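For readers unfamiliar with the two supported propagators, the following sketch shows roughly what they carry on the wire: the W3C `traceparent` header (tracecontext) and the W3C `baggage` header. It is a simplified, standard-library-only parser written for illustration, not the opentelemetry-go implementation. It handles only version `00` of `traceparent` and ignores baggage properties; the function and type names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// traceParent holds the fields of a version-00 W3C traceparent header.
type traceParent struct {
	TraceID  string // 32 lowercase hex characters
	ParentID string // 16 lowercase hex characters
	Sampled  bool   // lowest bit of the trace-flags field
}

// isLowerHex reports whether s is non-empty and uses only 0-9 and a-f.
func isLowerHex(s string) bool {
	for _, c := range s {
		if !(c >= '0' && c <= '9') && !(c >= 'a' && c <= 'f') {
			return false
		}
	}
	return s != ""
}

// parseTraceParent parses "00-<trace-id>-<parent-id>-<trace-flags>".
func parseTraceParent(header string) (traceParent, error) {
	parts := strings.Split(strings.TrimSpace(header), "-")
	if len(parts) != 4 || parts[0] != "00" {
		return traceParent{}, errors.New("traceparent: unsupported version or field count")
	}
	traceID, parentID, flags := parts[1], parts[2], parts[3]
	if len(traceID) != 32 || len(parentID) != 16 || len(flags) != 2 ||
		!isLowerHex(traceID) || !isLowerHex(parentID) || !isLowerHex(flags) {
		return traceParent{}, errors.New("traceparent: malformed field")
	}
	// All-zero IDs are defined as invalid.
	if strings.Trim(traceID, "0") == "" || strings.Trim(parentID, "0") == "" {
		return traceParent{}, errors.New("traceparent: all-zero ID")
	}
	f, err := strconv.ParseUint(flags, 16, 8)
	if err != nil {
		return traceParent{}, err
	}
	return traceParent{TraceID: traceID, ParentID: parentID, Sampled: f&1 == 1}, nil
}

// parseBaggage parses "k1=v1,k2=v2" into a map, dropping any ";property"
// suffixes and skipping members it cannot decode.
func parseBaggage(header string) map[string]string {
	out := map[string]string{}
	for _, member := range strings.Split(header, ",") {
		member = strings.SplitN(member, ";", 2)[0]
		kv := strings.SplitN(member, "=", 2)
		if len(kv) != 2 {
			continue
		}
		key := strings.TrimSpace(kv[0])
		val, err := url.PathUnescape(strings.TrimSpace(kv[1]))
		if key == "" || err != nil {
			continue
		}
		out[key] = val
	}
	return out
}

func main() {
	tp, err := parseTraceParent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	fmt.Println(tp.TraceID, tp.Sampled, err)
	fmt.Println(parseBaggage("userId=alice, serverNode=DF%2028;prop=1"))
}
```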

@CLAassistant

CLAassistant commented Sep 27, 2021

CLA assistant check
All committers have signed the CLA.

@andriikushch andriikushch marked this pull request as draft September 27, 2021 09:53
@andriikushch andriikushch marked this pull request as ready for review September 27, 2021 10:03
@hairyhenderson
Collaborator

Hi @andriikushch, thank you so much for working on this! Unfortunately I don't have much more time to review this today, and I'll be on vacation with limited Internet access for the rest of the week.

I think it'd be useful to see some of the attributes added by https://pkg.go.dev/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp added to this as well (though you probably won't be able to use that directly).

Also, I'm not sure how much I like the fact that certain environment variables are required. Most OTel-enabled applications I've dealt with will default to using OTLP with gRPC, so I think defaulting to that is totally reasonable.
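One way to avoid required environment variables is to fall back to defaults when they are unset. The sketch below assumes the specification's `OTEL_EXPORTER_OTLP_*` variable names and the conventional OTLP/gRPC endpoint `http://localhost:4317`; choosing `grpc` as the default protocol follows the suggestion above and is an assumption, as are the `exporterConfig` and `loadExporterConfig` names. It does not reflect how this PR reads its configuration.

```go
package main

import (
	"fmt"
	"os"
)

// Assumed defaults: gRPC as suggested in the review, and the conventional
// local OTLP/gRPC collector endpoint.
const (
	defaultProtocol = "grpc"
	defaultEndpoint = "http://localhost:4317"
)

// exporterConfig is a hypothetical container for the resolved settings.
type exporterConfig struct {
	Protocol string
	Endpoint string
}

// envOr returns the value for key, or fallback if it is unset or empty.
// lookup has the same signature as os.LookupEnv so it can be faked in tests.
func envOr(lookup func(string) (string, bool), key, fallback string) string {
	if v, ok := lookup(key); ok && v != "" {
		return v
	}
	return fallback
}

// loadExporterConfig resolves settings so that the traces-specific variable
// wins over the general one, which in turn wins over the built-in default.
func loadExporterConfig(lookup func(string) (string, bool)) exporterConfig {
	return exporterConfig{
		Protocol: envOr(lookup, "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL",
			envOr(lookup, "OTEL_EXPORTER_OTLP_PROTOCOL", defaultProtocol)),
		Endpoint: envOr(lookup, "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT",
			envOr(lookup, "OTEL_EXPORTER_OTLP_ENDPOINT", defaultEndpoint)),
	}
}

func main() {
	cfg := loadExporterConfig(os.LookupEnv)
	fmt.Printf("protocol=%s endpoint=%s\n", cfg.Protocol, cfg.Endpoint)
}
```

With this shape, a user who sets nothing still gets a working local setup, while any variable that is set takes precedence.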

Since many users of Caddy are not yet familiar with OpenTelemetry, or with Distributed Tracing in general, I think it would be very helpful to write a short tutorial article describing how to get started, similar to https://caddyserver.com/docs/metrics.

I had a few other vague thoughts but they escape me now, and I need to get going - again, thank you for working on this!

Review comments (outdated, resolved):
caddytest/integration/caddyfile_adapt/opentelemetry.txt
modules/caddyhttp/opentelemetry/README.md (3 comments)
@francislavoie francislavoie added the under review 🧐 Review is pending before merging label Sep 27, 2021
@francislavoie francislavoie added this to the v2.5.0 milestone Sep 27, 2021
@andriikushch
Contributor Author

JFYI: For the next few days, I will not have Internet access and I will be glad to implement/discuss all suggestions after the 9th of October.

@andriikushch
Contributor Author

I am excited about this PR and hope it will be merged (and released) soon. Thank you for your work!

Is it possible with this PR to configure Caddy for sampling? We use Caddy as a proxy to our services. It would be nice if Caddy could be configured to trace only some requests, for example ratio-based.

In the documentation of this PR I can't find anything about sampling. Do I have to use the environment variable OTEL_TRACES_SAMPLER? Would it be possible to add a configuration option to the module, so that sampling can also be configured in the Caddyfile? Or is this out of scope for this PR?

Hi, @ostcar,

Thank you for your interest in the topic. You are absolutely right: this PR does not support configuring sampling via the Caddyfile.

It was a design decision to use environment variables to configure the "opentelemetry" functionality.

As far as I know, the "opentelemetry" Go library does not currently support the OTEL_TRACES_SAMPLER environment variable. There is more information about why in open-telemetry/opentelemetry-go#1698.

Nevertheless, there is an open issue for this (open-telemetry/opentelemetry-go#2305) and a PR has already been created (open-telemetry/opentelemetry-go#2517). It looks like this support will be added soon.
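As background on what ratio-based sampling means, here is a small standard-library sketch of a trace-ID-based decision. It is an illustration only and not the library's sampler: `shouldSample` is a hypothetical name, and the choice of which trace ID bytes to compare is an assumption. The useful property is that the decision depends only on the trace ID, so every service that sees the same trace makes the same choice.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// shouldSample decides deterministically from the trace ID whether a trace
// is kept, so that roughly `ratio` of uniformly random trace IDs pass.
func shouldSample(traceID [16]byte, ratio float64) bool {
	if ratio >= 1 {
		return true
	}
	if ratio <= 0 {
		return false
	}
	// Map the ratio onto the range of 63-bit integers.
	bound := uint64(ratio * (1 << 63))
	// Use the last 8 bytes of the ID, shifted to 63 bits (an assumption).
	x := binary.BigEndian.Uint64(traceID[8:16]) >> 1
	return x < bound
}

func main() {
	var low, high [16]byte
	for i := 8; i < 16; i++ {
		high[i] = 0xff
	}
	fmt.Println(shouldSample(low, 0.5))  // prints "true"
	fmt.Println(shouldSample(high, 0.5)) // prints "false"
}
```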

Best regards,
Andrii

@andriikushch
Contributor Author

Hi friends 😃

Thanks a lot for your patience and support. 👍

I hope I have answered all the questions and made all the suggested changes and improvements 😃

The most significant are:

  1. As @mholt suggested, I moved the README to the public documentation in the website repository (Add documentation for tracing directive website#205).
  2. The PR, kindly provided by @vibhavp, was accepted and merged into the current PR. I think now the module looks simpler and more elegant.

I ran some manual integration tests, and the results were positive. Tracing headers were handled as expected.

Also, I think it is worth watching the follow-up stories of this issue (open-telemetry/opentelemetry-go#1698). Most of them are about adding support for more configuration environment variables. When they are ready, we just need to update the "opentelemetry" dependency.

Please check the latest changes; your feedback is more than welcome!

Best regards,
Andrii

Member

@mholt mholt left a comment


Looking much better, thank you 🎉

My remaining comments are minor nits, we can probably merge this shortly.

Review comments (outdated, resolved):
modules/caddyhttp/tracing/module.go
modules/caddyhttp/tracing/serve.go
@mholt
Member

mholt commented Jan 21, 2022

Ok, great, this is looking good. Thanks very much for building this. This basically has my approval (with the caveat that I'm not familiar with opentelemetry and/or the dependencies).

I pulled down the branch and built Caddy and this PR adds > 2 MB to the total size of the binary (about a 5% increase). I'm a bit sensitive to that now as we approach 50 MB 😬. That seems like kind of a lot for tracing requests but I guess I'm not too surprised.

I was surprised, however, to see that this feature doesn't change anything in existing Caddy files, let alone the core -- only adding files. (Yay for Caddy's extensible architecture!) I figured there'd have to be some code in the core of the HTTP module, like where the HTTP handler chain is compiled. (This isn't a bad thing, just an observation.)
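The "only adding files" property usually comes from a registration pattern: each module registers itself from its own file, and the core looks modules up by ID. The sketch below shows that general pattern with the standard library only. It is not Caddy's module API; `registry`, `registerModule`, `moduleIDs`, and the module ID are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
)

// registry maps module IDs to constructors. It stands in for the core's
// module table and is a hypothetical simplification.
var registry = map[string]func() http.Handler{}

// registerModule adds a constructor under id and rejects duplicates.
func registerModule(id string, ctor func() http.Handler) {
	if _, dup := registry[id]; dup {
		panic("module already registered: " + id)
	}
	registry[id] = ctor
}

// In a real plugin, this init function would live in the plugin's own file,
// so adding the plugin would not require editing any existing file.
func init() {
	registerModule("http.handlers.tracing_sketch", func() http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.Header().Set("X-Traced", "yes")
		})
	})
}

// moduleIDs returns the registered IDs in sorted order.
func moduleIDs() []string {
	ids := make([]string, 0, len(registry))
	for id := range registry {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	fmt.Println(moduleIDs()) // prints "[http.handlers.tracing_sketch]"
}
```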

Anyway, as I think on things, and as Caddy 2's standard distribution is finally beginning to stabilize, I wonder if this could be better had as a separate module first, and then we can bring it into the official/standard distribution if it's used by enough users. This will let us prove out the stability of dependencies (something I wish I had done before with other features) and reduce bloat and maintenance burden, while still serving everyone who wants to use tracing. (Especially since we already have logging and metrics in the standard distribution.)

We would still make it an official repo in the caddyserver org, and put you in charge of maintaining it. Then we can see how popular and necessary it is deemed by experience. It's much easier to bring something into the official repo than it is to deprecate/remove it if there are problems.

What do you think?

@cedricziel
Contributor

Hi @mholt , thanks for your extensive notes :)

I am a product manager with Instana and Andrii asked me for some feedback here. Technically we (and probably other observability vendors as well) think observability needs to be built into the core of modern-day software to make it observable by default. We created this module in order to enable everyone running Caddy in production to integrate with their observability suite out of the box without a custom build.

Our preference here would definitely be an in-tree plugin shipped by default, so users of your helm-chart (caddyserver/ingress#52) could enable it out of the box. However, if you feel uncomfortable, we'd also be willing to go the external-plugin route to gain some traction.

@proffalken

Just wanted to add my voice to this PR - I'm in full agreement with @cedricziel that observability should be a "core feature" rather than an "afterthought" or "optional extra"; however, I'd rather see tracing as a plugin than not at all!

The progress made here is fantastic, and if I can provide our customers with a recommendation for a lightweight, Open Source webserver that has both openmetrics and opentelemetry built in rather than having to filter out stuff from Nginx, Traefik, or Apache logs, then that would be amazing!

Member

@mholt mholt left a comment


Ok, thanks for the feedback.

Now obviously this is a big change and I haven't scrutinized every line, nor have I personally tested it, but as it's purely additive and there seems to be strong consensus it should ship by default, this has my approval.

However, I think I'd like to mark this module as EXPERIMENTAL in the godoc comments so that in the documentation on our website it's clear to users that it could be changed or removed (like, brought out to a separate repository) at any time (for example, if there are major problems discovered). Just until we get confident/comfortable with things.

@mholt mholt modified the milestones: v2.6.0, v2.5.0 Mar 5, 2022
@proffalken

proffalken commented Mar 5, 2022

Thank you @mholt, that's fantastic news!

FWIW, I've just rolled Caddy into the docker-compose for https://github.com/makemonmouth/mventory, so once this is available I'll set it up on my test network and see how it goes!

@mholt
Member

mholt commented Mar 5, 2022

If the merge conflicts are resolved before we tag 2.5 (in the next couple of days), we can include it in that release; otherwise it'll have to wait until 2.6.

@mholt
Member

mholt commented Mar 7, 2022

@andriikushch Do you think you could resolve the merge conflicts by tomorrow evening? Just really hoping to get the first beta out the door soon and would love to include this so it can be tested by more people.

@cedricziel
Contributor

Since Andrii is currently out, I took the liberty of resolving the conflicts :)

@mholt
Member

mholt commented Mar 8, 2022

Awesome, thanks @cedricziel -- merging in just a moment.

@mholt mholt merged commit d0b608a into caddyserver:master Mar 8, 2022
@mholt mholt removed the under review 🧐 Review is pending before merging label Mar 8, 2022
@cedricziel cedricziel deleted the add-opentelemetry-module branch April 21, 2022 11:42
Labels
feature ⚙️ New feature or request
10 participants