-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OTLP*HTTP Exporter: Custom HTTP client #2632
Comments
Is this a request related to the |
The |
This is about the HTTP client that sends the traces to the collector, not the HTTP client that we want to trace. E.g. our roundtripper can read passwords from files, and I'd like to be able to use the same abilities to speak to the OTLP receivers. |
This issue has been raised a number of times. I'm not in favor of it because of the ease with which you can create loops. Let's say somewhere you use the contribs net/http to do:
And later someone comes along and does
Now, every time an HTTP request is made, it will use the instrumented client to make a request, which will generate more traces. Forever. Any solution to this should also include some protection that we don't create these loops, or that the instrumentation has some escape hatch to not add tracing to the exports. |
@lmolkova was working on specifying a mechanism for suppressing instrumentation from multiple layers of clients. I'm not sure if it's perfectly applicable here, but it would probably be worth looking at that OTEP to see if it aligns with the need to prevent instrumentation loops in exporters. |
looks like without this issue the oidc extension is mostly useless at it is not possible to update the used token maybe another possible solution could be to use a function as value for headers.. so it can be implemented more dynamically |
We're looking into adding OTLP endpoint with AAD (oauth2, etc) auth on Azure Monitor and would be intersted in passing custom client that would allow this. I.e. we'd want to enable the following flow:
This should be a common concern for any HTTP-based OTLP endpoint with auth enabled. If it helps, OTLP exporter in .NET provides a similar option: https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Exporter.OpenTelemetryProtocol#configure-httpclient |
@MadVikingGod can we add the feature to atleast provide a specifications (config) of the underlying custom This will eliminate the usage an already instrumented client as the passed config/spec will not be the exact |
We've hit this issue when trying to export traces from an environment without external connectivity (a secure enclave.) We need to create a custom Currently, the only workaround that works for us is to not use the |
OTel doesn't support injecting a transport, so we have to do this from the outside. :( open-telemetry/opentelemetry-go#2632 Signed-off-by: Alex Suraci <[email protected]>
* use official otel log sdk, dedupe telemetry pkg * dagger.io/dagger/telemetry is now the canonical package for integrating Dagger with OpenTelemetry (engine, CLI, Go SDK) * add engine/telemetry for engine and CLI specific telemetry (labels, cloud, engine pub/sub, etc) Signed-off-by: Alex Suraci <[email protected]> * log dialing errors Signed-off-by: Alex Suraci <[email protected]> * session: ignore SIGPIPE, don't panic if either of these happen ther CLI-side span won't be finished and telemetry won't drain. The same fix was applied recently for stdout, but I observed SIGPIPE happening on write(2) - stderr. I suspect this could still happen for stdout. My guess is it happens when the parent process goes away without waiting for the process. I couldn't find where that's happening (if it is happening), so I think the safest bet is to just drop the signal, since we should be shutting down around then anyway with stdin also being closed; might as well do it cleanly. Also, I didn't see the panic(err) become an issue, but it happens in a goroutine, which AFAIK is un-recoverable and won't run defers. Better to just log to stderr. Signed-off-by: Alex Suraci <[email protected]> * ci: test custom: configurable -count for tests Signed-off-by: Alex Suraci <[email protected]> * sdk/go: heartbeatinga (TODO replace) committing for now since I lost it at one point and it's better than nothing but really this responsibility should move outside of the SDK, probably with an exporter, in both the CLI and the engine Signed-off-by: Alex Suraci <[email protected]> * i retire Signed-off-by: Alex Suraci <[email protected]> * TODO write commit message Signed-off-by: Alex Suraci <[email protected]> * lints Signed-off-by: Alex Suraci <[email protected]> * disable test otel integration Cloud can't handle this yet, bring back later with testctx for proper instrumentation, ideally also with proper parallelism Signed-off-by: Alex Suraci <[email protected]> * wip: enable otel for tests, see if it passes after all Signed-off-by: Alex Suraci <[email protected]> * tui: show spans IDs with --debug very helpful when troubleshooting otel issues Signed-off-by: Alex Suraci <[email protected]> * plain frontend: show span IDs with --debug Signed-off-by: Alex Suraci <[email protected]> * ci: support building engine and CLI with -race for hack/dev, set RACE=1 Signed-off-by: Alex Suraci <[email protected]> * pubsub: finish draining if parent spans complete this will prevent hangs in situations where a process exits abruptly without cleaninly finishing all the spans it starts, as seems to happen with go test -failfast, but in principle could happen just about anywhere. Signed-off-by: Alex Suraci <[email protected]> * pubsub: emit to no one when no clients Signed-off-by: Alex Suraci <[email protected]> * add test for multiple clients this passes reliably against hack/dev but seems to be flaky in dagger call. still investigating. Signed-off-by: Alex Suraci <[email protected]> * assertion against error output otherwise this can false pass because --debug prints the span that sets up the code Signed-off-by: Alex Suraci <[email protected]> * frontend: show hashes with verbose=1 Signed-off-by: Alex Suraci <[email protected]> * show client ID, helps with troubleshooting Signed-off-by: Alex Suraci <[email protected]> * fix data race with ctx var capture this one is spooky - theoretically this would put a 10 minute timeout on the connection Signed-off-by: Alex Suraci <[email protected]> * withDeadlineConn: fix data race Signed-off-by: Alex Suraci <[email protected]> * include stderr when session exits poorly Signed-off-by: Alex Suraci <[email protected]> * fix frontend not printing final frame root cause of TestModuleConstructors/return_error/go Signed-off-by: Alex Suraci <[email protected]> * fix data race in heartbeating Signed-off-by: Alex Suraci <[email protected]> * move span heartbeating to Cloud exporter Signed-off-by: Alex Suraci <[email protected]> * remove unused exporter Signed-off-by: Alex Suraci <[email protected]> * lint Signed-off-by: Alex Suraci <[email protected]> * test that logs only get printed once Signed-off-by: Alex Suraci <[email protected]> * fix comments and client names Signed-off-by: Alex Suraci <[email protected]> * drain logs too, refactor logging infra * otel logging now respects Close() which sends an 'eof' log record, indicated by stdio.eof=true. * engine pub/sub now drains until it sees stdio.eof=true for each stream. this fixes the integration test which needed to seep(1s). * refactor logging helpers to reduce duplication. * retire ioctx; redundant with otel logging. still used for dagql tests though, so moved to internal. * use an attr to indicate global logs, rather than a different library using library made little sense since you'd still want to know that info from the different call sites. using an attr is the same amount of effort without overloading a field. Signed-off-by: Alex Suraci <[email protected]> * disable gocritic.ifElseChain this one is silly. switch carries subtle implications, and breaks break. don't want to have to use labels to make this linter happy. Signed-off-by: Alex Suraci <[email protected]> * client stream test: work around log limitations Signed-off-by: Alex Suraci <[email protected]> * const-ify Signed-off-by: Alex Suraci <[email protected]> * lint Signed-off-by: Alex Suraci <[email protected]> * remove accidentally-committed file Signed-off-by: Alex Suraci <[email protected]> * sdk/go: use env carrier Signed-off-by: Alex Suraci <[email protected]> * auto-refresh Cloud exporter token OTel doesn't support injecting a transport, so we have to do this from the outside. :( open-telemetry/opentelemetry-go#2632 Signed-off-by: Alex Suraci <[email protected]> * remove early client ID setup this isn't really load-bearing and raises more questions than it answers Signed-off-by: Alex Suraci <[email protected]> * fix heartbeating this was broken in two ways that canceled each other out: * deadlocked on first heartbeat * never cleared out active spans, so it'd constantly make everything 'running' Signed-off-by: Alex Suraci <[email protected]> * send function logs + spans to API call * retire dagger.io/ui.mask - white whale defeated * simplify TS/Go/Python SDKs - don't need to do the mask/passthrough dance anymore Signed-off-by: Alex Suraci <[email protected]> * pubsub: abandon child spans on completion way simpler implementation Signed-off-by: Alex Suraci <[email protected]> * python lint Signed-off-by: Alex Suraci <[email protected]> * untie ClientMetadataMetaKey knot Signed-off-by: Alex Suraci <[email protected]> * simplify otel proxy setup (*http.Server).Shutdown handles most of what we needed before Signed-off-by: Alex Suraci <[email protected]> * remove dead code Signed-off-by: Alex Suraci <[email protected]> * tui: move digests to -vvv Signed-off-by: Alex Suraci <[email protected]> * hide exec /runtime when redirecting telemetry Signed-off-by: Alex Suraci <[email protected]> * propagate log level config to engine client * reduce telemetry logs down to debug * fix debug var being clobbered via double-flag-parsing * --debug etc get parsed twice: once before function flags are learned, and again after. so we just need to pass in the current value. unfortunately --verbosity gets clobbered but that doesn't matter yet. Signed-off-by: Alex Suraci <[email protected]> * frontend: don't print blank line on EOF Signed-off-by: Alex Suraci <[email protected]> * use http.Error, nix comment Signed-off-by: Alex Suraci <[email protected]> * add sanity check to avoid infinite loop Signed-off-by: Alex Suraci <[email protected]> * link to SIGPIPE docs Signed-off-by: Alex Suraci <[email protected]> * plain: handle EOF log record for primary logs Signed-off-by: Alex Suraci <[email protected]> * plain: don't render logs if --silent Signed-off-by: Alex Suraci <[email protected]> * session: trigger exit on stdout write fail Signed-off-by: Alex Suraci <[email protected]> * simplify pub/sub client tracking * spans don't inherit their parent span's ID anymore. I don't think we actually need this, since we always step recursively through the span ancestry to collect the set of relevant client IDs. better not to funge data if we don't need to. * clean up spanDone state, not needed with new child abandonment scheme * clean up all state for a trace when all clients go away. this way we also clean up spans that don't have an associated client. Signed-off-by: Alex Suraci <[email protected]> * ci: add spans to go lint Signed-off-by: Alex Suraci <[email protected]> * fix: plain finalRender should disable the context hold Signed-off-by: Justin Chadwell <[email protected]> --------- Signed-off-by: Alex Suraci <[email protected]> Signed-off-by: Justin Chadwell <[email protected]> Co-authored-by: Justin Chadwell <[email protected]>
* use official otel log sdk, dedupe telemetry pkg * dagger.io/dagger/telemetry is now the canonical package for integrating Dagger with OpenTelemetry (engine, CLI, Go SDK) * add engine/telemetry for engine and CLI specific telemetry (labels, cloud, engine pub/sub, etc) Signed-off-by: Alex Suraci <[email protected]> * log dialing errors Signed-off-by: Alex Suraci <[email protected]> * session: ignore SIGPIPE, don't panic if either of these happen ther CLI-side span won't be finished and telemetry won't drain. The same fix was applied recently for stdout, but I observed SIGPIPE happening on write(2) - stderr. I suspect this could still happen for stdout. My guess is it happens when the parent process goes away without waiting for the process. I couldn't find where that's happening (if it is happening), so I think the safest bet is to just drop the signal, since we should be shutting down around then anyway with stdin also being closed; might as well do it cleanly. Also, I didn't see the panic(err) become an issue, but it happens in a goroutine, which AFAIK is un-recoverable and won't run defers. Better to just log to stderr. Signed-off-by: Alex Suraci <[email protected]> * ci: test custom: configurable -count for tests Signed-off-by: Alex Suraci <[email protected]> * sdk/go: heartbeatinga (TODO replace) committing for now since I lost it at one point and it's better than nothing but really this responsibility should move outside of the SDK, probably with an exporter, in both the CLI and the engine Signed-off-by: Alex Suraci <[email protected]> * i retire Signed-off-by: Alex Suraci <[email protected]> * TODO write commit message Signed-off-by: Alex Suraci <[email protected]> * lints Signed-off-by: Alex Suraci <[email protected]> * disable test otel integration Cloud can't handle this yet, bring back later with testctx for proper instrumentation, ideally also with proper parallelism Signed-off-by: Alex Suraci <[email protected]> * wip: enable otel for tests, see if it passes after all Signed-off-by: Alex Suraci <[email protected]> * tui: show spans IDs with --debug very helpful when troubleshooting otel issues Signed-off-by: Alex Suraci <[email protected]> * plain frontend: show span IDs with --debug Signed-off-by: Alex Suraci <[email protected]> * ci: support building engine and CLI with -race for hack/dev, set RACE=1 Signed-off-by: Alex Suraci <[email protected]> * pubsub: finish draining if parent spans complete this will prevent hangs in situations where a process exits abruptly without cleaninly finishing all the spans it starts, as seems to happen with go test -failfast, but in principle could happen just about anywhere. Signed-off-by: Alex Suraci <[email protected]> * pubsub: emit to no one when no clients Signed-off-by: Alex Suraci <[email protected]> * add test for multiple clients this passes reliably against hack/dev but seems to be flaky in dagger call. still investigating. Signed-off-by: Alex Suraci <[email protected]> * assertion against error output otherwise this can false pass because --debug prints the span that sets up the code Signed-off-by: Alex Suraci <[email protected]> * frontend: show hashes with verbose=1 Signed-off-by: Alex Suraci <[email protected]> * show client ID, helps with troubleshooting Signed-off-by: Alex Suraci <[email protected]> * fix data race with ctx var capture this one is spooky - theoretically this would put a 10 minute timeout on the connection Signed-off-by: Alex Suraci <[email protected]> * withDeadlineConn: fix data race Signed-off-by: Alex Suraci <[email protected]> * include stderr when session exits poorly Signed-off-by: Alex Suraci <[email protected]> * fix frontend not printing final frame root cause of TestModuleConstructors/return_error/go Signed-off-by: Alex Suraci <[email protected]> * fix data race in heartbeating Signed-off-by: Alex Suraci <[email protected]> * move span heartbeating to Cloud exporter Signed-off-by: Alex Suraci <[email protected]> * remove unused exporter Signed-off-by: Alex Suraci <[email protected]> * lint Signed-off-by: Alex Suraci <[email protected]> * test that logs only get printed once Signed-off-by: Alex Suraci <[email protected]> * fix comments and client names Signed-off-by: Alex Suraci <[email protected]> * drain logs too, refactor logging infra * otel logging now respects Close() which sends an 'eof' log record, indicated by stdio.eof=true. * engine pub/sub now drains until it sees stdio.eof=true for each stream. this fixes the integration test which needed to seep(1s). * refactor logging helpers to reduce duplication. * retire ioctx; redundant with otel logging. still used for dagql tests though, so moved to internal. * use an attr to indicate global logs, rather than a different library using library made little sense since you'd still want to know that info from the different call sites. using an attr is the same amount of effort without overloading a field. Signed-off-by: Alex Suraci <[email protected]> * disable gocritic.ifElseChain this one is silly. switch carries subtle implications, and breaks break. don't want to have to use labels to make this linter happy. Signed-off-by: Alex Suraci <[email protected]> * client stream test: work around log limitations Signed-off-by: Alex Suraci <[email protected]> * const-ify Signed-off-by: Alex Suraci <[email protected]> * lint Signed-off-by: Alex Suraci <[email protected]> * remove accidentally-committed file Signed-off-by: Alex Suraci <[email protected]> * sdk/go: use env carrier Signed-off-by: Alex Suraci <[email protected]> * auto-refresh Cloud exporter token OTel doesn't support injecting a transport, so we have to do this from the outside. :( open-telemetry/opentelemetry-go#2632 Signed-off-by: Alex Suraci <[email protected]> * remove early client ID setup this isn't really load-bearing and raises more questions than it answers Signed-off-by: Alex Suraci <[email protected]> * fix heartbeating this was broken in two ways that canceled each other out: * deadlocked on first heartbeat * never cleared out active spans, so it'd constantly make everything 'running' Signed-off-by: Alex Suraci <[email protected]> * send function logs + spans to API call * retire dagger.io/ui.mask - white whale defeated * simplify TS/Go/Python SDKs - don't need to do the mask/passthrough dance anymore Signed-off-by: Alex Suraci <[email protected]> * pubsub: abandon child spans on completion way simpler implementation Signed-off-by: Alex Suraci <[email protected]> * python lint Signed-off-by: Alex Suraci <[email protected]> * untie ClientMetadataMetaKey knot Signed-off-by: Alex Suraci <[email protected]> * simplify otel proxy setup (*http.Server).Shutdown handles most of what we needed before Signed-off-by: Alex Suraci <[email protected]> * remove dead code Signed-off-by: Alex Suraci <[email protected]> * tui: move digests to -vvv Signed-off-by: Alex Suraci <[email protected]> * hide exec /runtime when redirecting telemetry Signed-off-by: Alex Suraci <[email protected]> * propagate log level config to engine client * reduce telemetry logs down to debug * fix debug var being clobbered via double-flag-parsing * --debug etc get parsed twice: once before function flags are learned, and again after. so we just need to pass in the current value. unfortunately --verbosity gets clobbered but that doesn't matter yet. Signed-off-by: Alex Suraci <[email protected]> * frontend: don't print blank line on EOF Signed-off-by: Alex Suraci <[email protected]> * use http.Error, nix comment Signed-off-by: Alex Suraci <[email protected]> * add sanity check to avoid infinite loop Signed-off-by: Alex Suraci <[email protected]> * link to SIGPIPE docs Signed-off-by: Alex Suraci <[email protected]> * plain: handle EOF log record for primary logs Signed-off-by: Alex Suraci <[email protected]> * plain: don't render logs if --silent Signed-off-by: Alex Suraci <[email protected]> * session: trigger exit on stdout write fail Signed-off-by: Alex Suraci <[email protected]> * simplify pub/sub client tracking * spans don't inherit their parent span's ID anymore. I don't think we actually need this, since we always step recursively through the span ancestry to collect the set of relevant client IDs. better not to funge data if we don't need to. * clean up spanDone state, not needed with new child abandonment scheme * clean up all state for a trace when all clients go away. this way we also clean up spans that don't have an associated client. Signed-off-by: Alex Suraci <[email protected]> * ci: add spans to go lint Signed-off-by: Alex Suraci <[email protected]> * fix: plain finalRender should disable the context hold Signed-off-by: Justin Chadwell <[email protected]> --------- Signed-off-by: Alex Suraci <[email protected]> Signed-off-by: Justin Chadwell <[email protected]> Co-authored-by: Justin Chadwell <[email protected]>
Problem Statement
It would be useful to be able to pass a custom HTTP client to this instrumentation library.
Proposed Solution
Add a WithClient() option.
Alternatives
Configuring everything with options, but it's quite complex and it costs a lot when you already have a fully featured http client.
Additional Context
I would use this for Prometheus, which has a fully featured http client.
The text was updated successfully, but these errors were encountered: