-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tracing service and configuration #12699
Conversation
5d81950
to
8277251
Compare
lib/observability/tracing/tracing.go
Outdated
return trace.BadParameter("exporter URL cannot be empty") | ||
} | ||
|
||
if !strings.Contains(c.ExporterURL, "://") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer a proper URL parse and then a check on Scheme
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially tried doing this; however url.Parse("localhost:4317")
will result in a url with the Scheme
set to localhost
(https://go.dev/play/p/VoYSAAhscPQ). Open to other ideas if you have them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could either try to parse it as an address first and then as a URL on error, or we could just always require a URL, and whatever error happens ("unsupported schema localhost:
") is the user's fault.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My goal here was to have an empty scheme default to grpc. This seems to be how most other tracing exporters operate. Forcing users to always provide one of grpc
, http
, https
would allow us to just do a url.Parse
here but might be annoying to users. I suppose that would be fine as long as I add that grpc://
is required in the yet to exist tracing docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To identify that no Scheme
was provided, we could check if the Host
is empty after parsing the URL (https://go.dev/play/p/K_m82MHG1OO). As the documentation states:
URLs that do not start with a slash after the scheme are interpreted as:
scheme:opaque[?query][#fragment]
After knowing no scheme was provided, having a default scheme is valuable if it is the most used, but it should also be documented.
sdktrace.WithResource(res), | ||
sdktrace.WithSpanProcessor(sdktrace.NewBatchSpanProcessor(exporter)), | ||
)} | ||
otel.SetTracerProvider(provider) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens during tests or during a CA rotation, when the global provider might get overwritten?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm good question. I will test this and possibly add a check to see if the global provider is already configured correctly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In most cases the global provider should only be used at creation time to provide a component specific tracer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we perhaps run the tracing configuration earlier than other components, and store this as a variable in TeleportProcess
rather than storing it globally? Or is otel really only supposed to work with a single global exporter for the whole program?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should do both. The default global provider is a noop which won't record any spans. I'm explicitly setting it here so that in the event a provider can't be provided as a dependency we can fallback to retrieving it from the global provider. Though I hope we can avoid using the global provider as much as possible
e90bff2
to
69376c7
Compare
lib/observability/tracing/tracing.go
Outdated
return trace.BadParameter("exporter URL cannot be empty") | ||
} | ||
|
||
if !strings.Contains(c.ExporterURL, "://") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could either try to parse it as an address first and then as a URL on error, or we could just always require a URL, and whatever error happens ("unsupported schema localhost:
") is the user's fault.
sdktrace.WithResource(res), | ||
sdktrace.WithSpanProcessor(sdktrace.NewBatchSpanProcessor(exporter)), | ||
)} | ||
otel.SetTracerProvider(provider) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we perhaps run the tracing configuration earlier than other components, and store this as a variable in TeleportProcess
rather than storing it globally? Or is otel really only supposed to work with a single global exporter for the whole program?
lib/service/cfg.go
Outdated
// KeyPairs are the key and certificate pairs that the tracing service will. | ||
// use for mTLS. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// KeyPairs are the key and certificate pairs that the tracing service will. | |
// use for mTLS. | |
// KeyPairs are the paths for key and certificate pairs that the tracing service will use for outbound TLS connections. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we ask Cloud if they'd like a way to specify these inline in the configuration file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cloud actually designed and implemented the metrics service and since this is a direct copy of that I don't think they will have any objections. I can ping them to be certain though
lib/service/service.go
Outdated
log.Info("Shutting down immediately.") | ||
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) | ||
defer cancel() | ||
warnOnErr(provider.Shutdown(ctx), log) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI the scram shutdown (on SIGINT/SIGTERM) will not wait for services to shutdown, so it's very likely that we won't have the full 5 seconds to perform the provider shutdown - is that going to be a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any queued spans which haven't been flushed yet will be lost
Provides a new tracing configuration block, which can be used to configure if and how spans are exported to a telemetry backend. In the example below, the tracing service is enabled and will export spans to `collector.example.com:4317` via gRPC with mTLS enabled. ```yaml tracing_service: enabled: yes exporter_url: collector.example.com:4317 sampling_rate_per_million: 1000000 ca_certs: - /certs/rootCA.pem keypairs: - key_file: /certs/example.com-client-key.pem cert_file: /certs/example.com-client.pem ``` This configuration ends up being consumed by the `TeleportProcess` and passed to `tracing.NewTraceProvider` which sets up the OpenTelemetry Exporter, TracerProvider, Propagator and Sampler. In order for spans to be exported, the `tracing_service` must be enabled **and** have a `sampling_rate_per_million` value > 0.
69376c7
to
b8332b7
Compare
* Add tracing service and configuration Provides a new tracing configuration block, which can be used to configure if and how spans are exported to a telemetry backend. In the example below, the tracing service is enabled and will export spans to `collector.example.com:4317` via gRPC with mTLS enabled. ```yaml tracing_service: enabled: yes exporter_url: collector.example.com:4317 sampling_rate_per_million: 1000000 ca_certs: - /certs/rootCA.pem keypairs: - key_file: /certs/example.com-client-key.pem cert_file: /certs/example.com-client.pem ``` This configuration ends up being consumed by the `TeleportProcess` and passed to `tracing.NewTraceProvider` which sets up the OpenTelemetry Exporter, TracerProvider, Propagator and Sampler. In order for spans to be exported, the `tracing_service` must be enabled **and** have a `sampling_rate_per_million` value > 0.
Add tracing service and configuration (gravitational#12699) * Add tracing service and configuration Provides a new tracing configuration block, which can be used to configure if and how spans are exported to a telemetry backend. In the example below, the tracing service is enabled and will export spans to `collector.example.com:4317` via gRPC with mTLS enabled. ```yaml tracing_service: enabled: yes exporter_url: collector.example.com:4317 sampling_rate_per_million: 1000000 ca_certs: - /certs/rootCA.pem keypairs: - key_file: /certs/example.com-client-key.pem cert_file: /certs/example.com-client.pem ``` This configuration ends up being consumed by the `TeleportProcess` and passed to `tracing.NewTraceProvider` which sets up the OpenTelemetry Exporter, TracerProvider, Propagator and Sampler. In order for spans to be exported, the `tracing_service` must be enabled **and** have a `sampling_rate_per_million` value > 0.
Provides a new tracing configuration block, which can be
used to configure if and how spans are exported to a
telemetry backend. In the example below, the tracing
service is enabled and will export spans to
collector.example.com:4317
via gRPC with mTLS enabled.This configuration ends up being consumed by the
TeleportProcess
and passed to
tracing.NewTraceProvider
which sets up the OpenTelemetryExporter, TracerProvider, Propagator and Sampler. In order for spans to
be exported, the
tracing_service
must be enabled and have asampling_rate_per_million
value > 0.#12241