This repository has been archived by the owner on Nov 1, 2023. It is now read-only.

Revisit logging to enhance observability #312

Open
ranweiler opened this issue Nov 16, 2020 · 2 comments
Labels: enhancement (New feature or request), Reviewed

Comments

@ranweiler
Member

ranweiler commented Nov 16, 2020

Revisit our logging, and move to a model that allows:

  • Distributed tracing concepts, such as spans with inclusion and correlation
    • Ideally this should map to spans at the service level, which would need to be implemented
  • Structured logging
  • Fan-out to multiple backends at different levels
    • Telemetry ingested by App Insights, ideally via OpenTelemetry
    • stderr, controlled by an env var
    • A circular VM-local log file

Since we use tokio, the tracing library with an OpenTelemetry backend would achieve all of the above.
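
The fan-out requirement above can be sketched without any external crates. This is not the `tracing` API; it is a minimal std-only model of the idea, assuming each backend declares its own minimum level. All names here (`Sink`, `RingSink`, `StderrSink`, `dispatch`, `parse_level`) are hypothetical, and the `VecDeque` only stands in for a circular VM-local log file.

```rust
use std::collections::VecDeque;

/// Severity levels, ordered from most to least verbose.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level { Trace, Debug, Info, Warn, Error }

/// A structured event: a message plus key/value fields.
pub struct Event {
    pub level: Level,
    pub message: String,
    pub fields: Vec<(String, String)>,
}

/// Render an event as one line: "Level message key=value ...".
pub fn render(event: &Event) -> String {
    let mut line = format!("{:?} {}", event.level, event.message);
    for (key, value) in &event.fields {
        line.push_str(&format!(" {}={}", key, value));
    }
    line
}

/// A backend that accepts events at or above its own minimum level.
pub trait Sink {
    fn min_level(&self) -> Level;
    fn write(&mut self, event: &Event);
}

/// Bounded buffer standing in for a circular VM-local log file.
pub struct RingSink {
    pub min: Level,
    pub capacity: usize,
    pub lines: VecDeque<String>,
}

impl RingSink {
    pub fn new(min: Level, capacity: usize) -> Self {
        RingSink { min, capacity, lines: VecDeque::with_capacity(capacity) }
    }
}

impl Sink for RingSink {
    fn min_level(&self) -> Level { self.min }
    fn write(&mut self, event: &Event) {
        if self.capacity == 0 {
            return;
        }
        // Drop the oldest line once the buffer is full.
        if self.lines.len() == self.capacity {
            self.lines.pop_front();
        }
        self.lines.push_back(render(event));
    }
}

/// Writes rendered events to stderr.
pub struct StderrSink { pub min: Level }

impl Sink for StderrSink {
    fn min_level(&self) -> Level { self.min }
    fn write(&mut self, event: &Event) {
        eprintln!("{}", render(event));
    }
}

/// Parse a level name, e.g. the value of an environment variable.
pub fn parse_level(value: &str) -> Option<Level> {
    match value.trim().to_ascii_lowercase().as_str() {
        "trace" => Some(Level::Trace),
        "debug" => Some(Level::Debug),
        "info" => Some(Level::Info),
        "warn" => Some(Level::Warn),
        "error" => Some(Level::Error),
        _ => None,
    }
}

/// Offer one event to every sink; return how many sinks accepted it.
pub fn dispatch(event: &Event, sinks: &mut [&mut dyn Sink]) -> usize {
    let mut delivered = 0;
    for sink in sinks.iter_mut() {
        if event.level >= sink.min_level() {
            sink.write(event);
            delivered += 1;
        }
    }
    delivered
}
```

The stderr level could be read with something like `std::env::var(name).ok().and_then(|v| parse_level(&v))`, where the variable name is whatever we settle on. In the real design, the per-sink filtering would presumably be handled by the logging library rather than hand-rolled like this.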

AB#36002

@ranweiler ranweiler added the enhancement (New feature or request) and Needs: triage labels Nov 16, 2020
@ranweiler ranweiler removed the backlog label Nov 8, 2021
@ranweiler ranweiler self-assigned this Nov 8, 2021
@ranweiler
Member Author

Current ecosystem support for OpenTelemetry + Application Insights:

Rust

No first-party OpenTelemetry/App Insights support here, even at the Preview level. There is a third-party Application Insights exporter for the opentelemetry SDK crate.

Together, we can use tracing, opentelemetry, tracing-opentelemetry, and opentelemetry-application-insights to generate and export async-compatible span data. We can even export log-style events as App Insights Trace telemetry, correctly associated with spans.
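
To make "correctly associated with spans" concrete, here is a std-only sketch of the data relationships involved. It does not use the crates listed above; `SpanContext` and `LogEvent` are hypothetical types, and the `u64` ids are a simplification (OpenTelemetry uses 128-bit trace ids and 64-bit span ids). The assumption is that a child span inherits its parent's trace id and records the parent's span id, and that an event carries the ids of the span it was emitted in.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Process-wide counter used to hand out unique ids (illustrative only).
static NEXT_ID: AtomicU64 = AtomicU64::new(1);

fn next_id() -> u64 {
    NEXT_ID.fetch_add(1, Ordering::Relaxed)
}

/// Identity of one span within a trace.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SpanContext {
    /// Shared by every span in one operation (correlation).
    pub trace_id: u64,
    /// Unique to this span.
    pub span_id: u64,
    /// The enclosing span, if any (inclusion).
    pub parent_id: Option<u64>,
    pub name: String,
}

/// A log-style event tagged with the span it was emitted in.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LogEvent {
    pub trace_id: u64,
    pub span_id: u64,
    pub message: String,
}

impl SpanContext {
    /// Start a new trace with a root span.
    pub fn root(name: &str) -> Self {
        SpanContext {
            trace_id: next_id(),
            span_id: next_id(),
            parent_id: None,
            name: name.to_string(),
        }
    }

    /// Start a span nested inside this one, in the same trace.
    pub fn child(&self, name: &str) -> Self {
        SpanContext {
            trace_id: self.trace_id,
            span_id: next_id(),
            parent_id: Some(self.span_id),
            name: name.to_string(),
        }
    }

    /// Emit an event that a backend can join back to this span.
    pub fn event(&self, message: &str) -> LogEvent {
        LogEvent {
            trace_id: self.trace_id,
            span_id: self.span_id,
            message: message.to_string(),
        }
    }
}
```

A backend that receives a `LogEvent` can then group it with its span by `span_id`, and with the whole operation by `trace_id`.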

I don't yet see an off-the-shelf mechanism for Custom Events, but it seems like it'd be easy to add. We could also have a separate telemetry channel that uses the appinsights crate just for specialized telemetry like Custom Events. This may be particularly preferable for the optional non-identifying global telemetry.

Python

First-party support, but only in preview. We may get some wins if we focus on spans (without events), or use libraries that are getting early attention for pervasive OpenTelemetry instrumentation (FastAPI?).

We can use OpenTelemetry with Python via opentelemetry-sdk/opentelemetry-api, and export spans to Application Insights via azure-monitor-opentelemetry-exporter. The latter is in preview, and it currently appears to drop all span-associated events (#21747). I haven't yet checked whether there's a way to auto-instrument logging to be span-aware, but it seems unlikely (especially since the OpenTelemetry logging spec is not yet stabilized).

@ranweiler
Member Author

> I don't yet see an off-the-shelf mechanism for Custom Events, but it seems like it'd be easy to add.

Confirmed, this was very easy to add to the exporter backend. The design question then becomes: how do we determine when a span-parented Event from tracing should be exported as Application Insights "Trace Telemetry", vs. a "Custom Event"? The presence/absence of a level field is not a viable cue, because all normally-created tracing events currently have a Level.
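
One possible answer, sketched here purely for discussion: have the call site opt in with an explicit marker field, since the level cannot discriminate. This is a std-only model, not the exporter's code; `Event`, `TelemetryKind`, `classify`, and the `custom_event` field name are all hypothetical.

```rust
/// Severity level; in this model every event has one, as in tracing.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level { Trace, Debug, Info, Warn, Error }

/// The two Application Insights item types under discussion.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TelemetryKind { TraceMessage, CustomEvent }

/// Simplified stand-in for a span-parented event.
pub struct Event<'a> {
    pub level: Level,
    pub fields: &'a [(&'a str, &'a str)],
}

/// Hypothetical reserved field name that a call site sets to opt in.
pub const CUSTOM_EVENT_MARKER: &str = "custom_event";

/// Decide how to export an event. The level is deliberately ignored,
/// because it is always present and so carries no signal here.
pub fn classify(event: &Event<'_>) -> TelemetryKind {
    let marked = event
        .fields
        .iter()
        .any(|(key, value)| *key == CUSTOM_EVENT_MARKER && *value == "true");
    if marked {
        TelemetryKind::CustomEvent
    } else {
        TelemetryKind::TraceMessage
    }
}
```

Two events with the same level then export differently depending only on the marker. A reserved target or event name prefix would be an alternative cue with the same shape.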

In the long run, OpenTelemetry Logging will make the "event"/"log message" distinction clear in a way that tracing libraries can propagate in a more principled way. In the short term, we wouldn't be any worse off than we already are (most telemetry would become "trace event" items). Exceptions are special-cased in the Rust backend. Also, Custom Events are not displayed in a nicer way in the Transaction Timeline view of Application Insights, nor are they in any way more queryable than Trace Events.

For our "Custom Events" that are more properly treated as metric data, there is a (now-frozen) OpenTelemetry Metrics API that has feature-flagged support in the Rust Application Insights exporter.

@ranweiler ranweiler removed their assignment Feb 17, 2023