[Docs] Telemetry page (#1556)
martintmk authored Sep 6, 2023
1 parent bc07c7c commit 7b95756
Showing 7 changed files with 256 additions and 191 deletions.
179 changes: 178 additions & 1 deletion docs/telemetry.md
@@ -1,3 +1,180 @@
# Telemetry

🚧 This documentation is being written as part of the Polly v8 release.
Starting with version 8, Polly provides telemetry for all built-in resilience strategies.

## Usage

To enable telemetry in Polly, add the `Polly.Extensions` package to your project:

```sh
dotnet add package Polly.Extensions
```

Afterwards, you can use the `ConfigureTelemetry(...)` extension method on the `ResiliencePipelineBuilder`:

<!-- snippet: configure-telemetry -->
```cs
var telemetryOptions = new TelemetryOptions
{
    // Configure logging
    LoggerFactory = LoggerFactory.Create(builder => builder.AddConsole())
};

// Configure enrichers
telemetryOptions.MeteringEnrichers.Add(new MyMeteringEnricher());

// Configure telemetry listeners
telemetryOptions.TelemetryListeners.Add(new MyTelemetryListener());

var builder = new ResiliencePipelineBuilder()
    .AddTimeout(TimeSpan.FromSeconds(1))
    .ConfigureTelemetry(telemetryOptions) // This method enables telemetry in the builder
    .Build();
```
<!-- endSnippet -->

The `MyTelemetryListener` and `MyMeteringEnricher` classes are implemented as follows:

<!-- snippet: telemetry-listeners -->
```cs
internal class MyTelemetryListener : TelemetryListener
{
    public override void Write<TResult, TArgs>(in TelemetryEventArguments<TResult, TArgs> args)
    {
        Console.WriteLine($"Telemetry event occurred: {args.Event.EventName}");
    }
}

internal class MyMeteringEnricher : MeteringEnricher
{
    public override void Enrich<TResult, TArgs>(in EnrichmentContext<TResult, TArgs> context)
    {
        context.Tags.Add(new("my-custom-tag", "custom-value"));
    }
}
```
<!-- endSnippet -->

Alternatively, you can use the `AddResiliencePipeline(...)` extension method, which automatically enables telemetry for the defined pipeline:

<!-- snippet: add-resilience-pipeline-with-telemetry -->
```cs
var serviceCollection = new ServiceCollection()
    .AddLogging(builder => builder.AddConsole())
    .AddResiliencePipeline("my-strategy", builder => builder.AddTimeout(TimeSpan.FromSeconds(1)))
    .Configure<TelemetryOptions>(options =>
    {
        // Configure enrichers
        options.MeteringEnrichers.Add(new MyMeteringEnricher());

        // Configure telemetry listeners
        options.TelemetryListeners.Add(new MyTelemetryListener());
    });
```
<!-- endSnippet -->
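
Once registered, the pipeline can be resolved from the container and executed like any other pipeline. The following is a minimal sketch rather than a definitive implementation. It assumes the `ResiliencePipelineProvider<string>` type from the `Polly.Registry` namespace and the `Microsoft.Extensions.Logging.Console` package for `AddConsole()`; exact type names may differ between preview builds:

```cs
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Polly;
using Polly.Registry;

var services = new ServiceCollection()
    .AddLogging(builder => builder.AddConsole())
    .AddResiliencePipeline("my-strategy", builder => builder.AddTimeout(TimeSpan.FromSeconds(1)));

await using var serviceProvider = services.BuildServiceProvider();

// Assumption: pipelines registered through AddResiliencePipeline are exposed
// via ResiliencePipelineProvider<TKey>, keyed by the name used at registration.
var provider = serviceProvider.GetRequiredService<ResiliencePipelineProvider<string>>();
ResiliencePipeline pipeline = provider.GetPipeline("my-strategy");

// Telemetry was enabled by AddResiliencePipeline, so this execution
// should be reported to the configured logger and meter.
await pipeline.ExecuteAsync(async token => await Task.Delay(100, token));
```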

## Metrics

Metrics are emitted under the `Polly` meter name. The following sections describe the metrics that Polly produces. Dimensions added by custom enrichers are not shown in the tables below.

Every telemetry event has the following dimensions:

- `pipeline-name`: Optional, comes from `ResiliencePipelineBuilder.Name`.
- `pipeline-instance`: Optional, comes from `ResiliencePipelineBuilder.InstanceName`.
- `strategy-name`: Optional, comes from the `Name` property of the strategy options, for example `RetryStrategyOptions.Name`.

The sample below demonstrates how to assign these dimensions:

<!-- snippet: telemetry-coordinates -->
```cs
var builder = new ResiliencePipelineBuilder();
builder.Name = "my-name";
builder.InstanceName = "my-instance-name";

builder.AddRetry(new RetryStrategyOptions
{
    // The default value is "Retry"
    Name = "my-retry-name"
});
```
<!-- endSnippet -->

These values are subsequently reflected in the metrics below:

### resilience-events

- Type: *Counter*
- Description: Emitted upon the occurrence of a resilience event.

Dimensions:

|Name|Description|
|---| ---|
|`event-name`| The name of the emitted event.|
|`event-severity`| The severity of the event (`Debug`, `Information`, `Warning`, `Error`, `Critical`).|
|`pipeline-name`| The name of the pipeline corresponding to the resilience pipeline.|
|`pipeline-instance`| The instance name of the pipeline corresponding to the resilience pipeline.|
|`strategy-name`| The name of the strategy generating this event.|
|`operation-key`| The operation key associated with the call site. |
|`exception-name`| The full name of the exception assigned to the execution result (`System.InvalidOperationException`). |

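The `operation-key` dimension is expected to stay empty unless the caller supplies an operation key. One way to do that is through a `ResilienceContext`. The sketch below assumes the `ResilienceContextPool.Shared.Get(string)` overload and uses a hypothetical key, `my-operation-key`; treat the exact API shape as an assumption for preview builds:

```cs
using Polly;
using Polly.Telemetry;

ResiliencePipeline pipeline = new ResiliencePipelineBuilder()
    .AddTimeout(TimeSpan.FromSeconds(1))
    .ConfigureTelemetry(new TelemetryOptions())
    .Build();

// Assumption: the pooled context carries the operation key for this call site.
ResilienceContext context = ResilienceContextPool.Shared.Get("my-operation-key");

try
{
    await pipeline.ExecuteAsync(
        async ctx => await Task.Delay(100, ctx.CancellationToken),
        context);
}
finally
{
    // Pooled contexts should be returned once the execution has finished.
    ResilienceContextPool.Shared.Return(context);
}
```
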
### execution-attempt-duration

- Type: *Histogram*
- Unit: *milliseconds*
- Description: Tracks the duration of execution attempts, produced by `Retry` and `Hedging` resilience strategies.

Dimensions:

|Name|Description|
|---| ---|
|`event-name`| The name of the emitted event.|
|`event-severity`| The severity of the event (`Debug`, `Information`, `Warning`, `Error`, `Critical`).|
|`pipeline-name`| The name of the pipeline corresponding to the resilience pipeline.|
|`pipeline-instance`| The instance name of the pipeline corresponding to the resilience pipeline.|
|`strategy-name`| The name of the strategy generating this event.|
|`operation-key`| The operation key associated with the call site. |
|`exception-name`| The full name of the exception assigned to the execution result (`System.InvalidOperationException`). |
|`attempt-number`| The execution attempt number, starting at 0 (0, 1, 2, etc.). |
|`attempt-handled`| Indicates if the execution outcome was handled. A handled outcome indicates execution failure and the need for retry (`true`, `false`). |

### pipeline-execution-duration

- Type: *Histogram*
- Unit: *milliseconds*
- Description: Measures the duration of resilience pipelines.

Dimensions:

|Name|Description|
|---| ---|
|`pipeline-name`| The name of the pipeline corresponding to the resilience pipeline.|
|`pipeline-instance`| The instance name of the pipeline corresponding to the resilience pipeline.|
|`operation-key`| The operation key associated with the call site. |
|`exception-name`| The full name of the exception assigned to the execution result (`System.InvalidOperationException`). |

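To inspect these metrics without a full monitoring stack, the `Polly` meter can be observed in-process with the standard `MeterListener` from `System.Diagnostics.Metrics`. This is a minimal sketch, not the way Polly itself consumes metrics. The instrument value types (`int` for the counter, `double` for the histograms) are taken from `TelemetryListenerImpl`, and the pipeline configuration is only illustrative:

```cs
using System.Diagnostics.Metrics;
using Polly;
using Polly.Telemetry;

using var meterListener = new MeterListener();

// Subscribe only to instruments published by the "Polly" meter.
meterListener.InstrumentPublished = (instrument, listener) =>
{
    if (instrument.Meter.Name == "Polly")
    {
        listener.EnableMeasurementEvents(instrument);
    }
};

// Histograms (execution-attempt-duration, pipeline-execution-duration) report double values.
meterListener.SetMeasurementEventCallback<double>((instrument, value, tags, _) =>
{
    Console.WriteLine($"{instrument.Name}: {value} {instrument.Unit}");
    foreach (var tag in tags)
    {
        Console.WriteLine($"  {tag.Key} = {tag.Value}");
    }
});

// The resilience-events counter reports int values.
meterListener.SetMeasurementEventCallback<int>((instrument, value, tags, _) =>
    Console.WriteLine($"{instrument.Name}: +{value}"));

meterListener.Start();

ResiliencePipeline pipeline = new ResiliencePipelineBuilder()
    .AddTimeout(TimeSpan.FromSeconds(1))
    .ConfigureTelemetry(new TelemetryOptions())
    .Build();

// A successful execution is expected to record pipeline-execution-duration.
await pipeline.ExecuteAsync(async token => await Task.Delay(100, token));
```
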
## Logs

Logs are registered under the `Polly` logger name. Here are some examples of the logs:

``` text
// This log is recorded whenever a resilience event occurs. EventId = 0
Resilience event occurred. EventName: '{EventName}', Source: '{PipelineName}/{PipelineInstance}/{StrategyName}', Operation Key: '{OperationKey}', Result: '{Result}'
// This log is recorded when a resilience pipeline begins executing. EventId = 1
Resilience pipeline executing. Source: '{PipelineName}/{PipelineInstance}', Operation Key: '{OperationKey}'
// This log is recorded when a resilience pipeline finishes execution. EventId = 2
Resilience pipeline executed. Source: '{PipelineName}/{PipelineInstance}', Operation Key: '{OperationKey}', Result: '{Result}', Execution Health: '{ExecutionHealth}', Execution Time: {ExecutionTime}ms
// This log is recorded upon the completion of every execution attempt. EventId = 3
Execution attempt. Source: '{PipelineName}/{PipelineInstance}/{StrategyName}', Operation Key: '{OperationKey}', Result: '{Result}', Handled: '{Handled}', Attempt: '{Attempt}', Execution Time: '{ExecutionTimeMs}'
```
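
Because these entries are written under the `Polly` category, they can be filtered with the standard `Microsoft.Extensions.Logging` category filters. The sketch below keeps only warnings and above; the `Warning` threshold is an arbitrary choice for illustration, and `AddConsole()` assumes the `Microsoft.Extensions.Logging.Console` package:

```cs
using Microsoft.Extensions.Logging;
using Polly;
using Polly.Telemetry;

using var loggerFactory = LoggerFactory.Create(logging =>
{
    logging.AddConsole();

    // Drop entries from the "Polly" category that are below Warning.
    logging.AddFilter("Polly", LogLevel.Warning);
});

var telemetryOptions = new TelemetryOptions
{
    LoggerFactory = loggerFactory
};

ResiliencePipeline pipeline = new ResiliencePipelineBuilder()
    .AddTimeout(TimeSpan.FromSeconds(1))
    .ConfigureTelemetry(telemetryOptions)
    .Build();
```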

## Emitting telemetry events

Each resilience strategy can generate telemetry data through the [`ResilienceStrategyTelemetry`](../src/Polly.Core/Telemetry/ResilienceStrategyTelemetry.cs) API. Polly encapsulates event details as [`TelemetryEventArguments`](../src/Polly.Core/Telemetry/TelemetryEventArguments.cs) and emits them via `TelemetryListener`.

To leverage this telemetry data, users should assign a `TelemetryListener` instance to `ResiliencePipelineBuilder.TelemetryListener` and then consume the `TelemetryEventArguments`.
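
As a minimal sketch of that approach, without `Polly.Extensions`, a listener can be assigned directly to the builder. The retry strategy and the deliberately failing callback are illustrative assumptions, chosen only so that at least one resilience event is raised; the second attempt runs after the strategy's default retry delay:

```cs
using Polly;
using Polly.Retry;
using Polly.Telemetry;

var builder = new ResiliencePipelineBuilder
{
    // Assign the listener directly; no logs or metrics are produced in this setup.
    TelemetryListener = new MyTelemetryListener()
};

ResiliencePipeline pipeline = builder
    .AddRetry(new RetryStrategyOptions { Name = "my-retry-name" })
    .Build();

var attempts = 0;

// The first attempt fails, so the retry strategy should report an event to the listener.
pipeline.Execute(() =>
{
    if (attempts++ == 0)
    {
        throw new InvalidOperationException("Simulated transient failure.");
    }
});

internal sealed class MyTelemetryListener : TelemetryListener
{
    public override void Write<TResult, TArgs>(in TelemetryEventArguments<TResult, TArgs> args)
    {
        Console.WriteLine($"Telemetry event occurred: {args.Event.EventName}");
    }
}
```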

For common scenarios, it is expected that users would make use of `Polly.Extensions`. This extension enables telemetry configuration through the `ResiliencePipelineBuilder.ConfigureTelemetry(...)` method, which processes `TelemetryEventArguments` to generate logs and metrics.
9 changes: 0 additions & 9 deletions src/Polly.Core/README.md
@@ -189,7 +189,6 @@ Recommended signatures for these delegates are:
- `Func<Args<TResult>, ValueTask<TValue>>` (Reactive)
- `Func<Args, ValueTask<TValue>>` (Proactive)


These delegates accept either `Args` or `Args<TResult>` arguments, which encapsulate event information. Note that all these delegates are asynchronous and return a `ValueTask`.

> [!NOTE]
@@ -248,11 +247,3 @@ new ResiliencePipelineBuilder<string>()
.Build();
```
<!-- endSnippet -->

## Telemetry

Each resilience strategy can generate telemetry data through the [`ResiliencePipelineTelemetry`](Telemetry/ResiliencePipelineTelemetry.cs) API. Polly encapsulates event details as [`TelemetryEventArguments`](Telemetry/TelemetryEventArguments.cs) and emits them via `TelemetryListener`.

To leverage this telemetry data, users should assign a `TelemetryListener` instance to `ResiliencePipelineBuilder.TelemetryListener` and then consume the `TelemetryEventArguments`.

For common scenarios, it is expected that users would make use of `Polly.Extensions`. This extension enables telemetry configuration through the `ResiliencePipelineBuilder.ConfigureTelemetry(...)` method, which processes `TelemetryEventArguments` to generate logs and metrics.
139 changes: 4 additions & 135 deletions src/Polly.Extensions/README.md
@@ -1,6 +1,6 @@
# Polly.Extensions Overview

`Polly.Extensions` provides a set of features that streamline the integration of Polly with the standard `IServiceCollection` Dependency Injection (DI) container. It further enhances telemetry by exposing a `ConfigureTelemetry` extension method that enables [logging](https://learn.microsoft.com/dotnet/core/extensions/logging?tabs=command-line) and [metering](https://learn.microsoft.com/dotnet/core/diagnostics/metrics) for all strategies created via DI extension points.
`Polly.Extensions` provides a set of features that streamline the integration of Polly with the standard `IServiceCollection` Dependency Injection (DI) container. It further enhances telemetry by exposing a `ConfigureTelemetry` extension method that enables [logging](https://learn.microsoft.com/dotnet/core/extensions/logging?tabs=command-line) and [metering](https://learn.microsoft.com/dotnet/core/diagnostics/metrics) for all strategies created via DI extension points.

Below is an example illustrating the usage of `AddResiliencePipeline` extension method:

@@ -36,141 +36,10 @@ await pipeline.ExecuteAsync(async cancellation => await Task.Delay(100, cancella
<!-- endSnippet -->

> [!NOTE]
> Telemetry is enabled by default when utilizing the `AddResiliencePipeline` extension method.
> Telemetry is enabled by default when utilizing the `AddResiliencePipeline(...)` extension method.
## Telemetry Features

Upon invoking the `ConfigureTelemetry` extension method, Polly begins to emit logs and metrics. Here's an example:
This project implements the `TelemetryListener` and uses it to bridge the Polly-native events into logs and metrics.

<!-- snippet: configure-telemetry -->
```cs
var telemetryOptions = new TelemetryOptions
{
    // Configure logging
    LoggerFactory = LoggerFactory.Create(builder => builder.AddConsole())
};

// Configure enrichers
telemetryOptions.MeteringEnrichers.Add(new MyMeteringEnricher());

// Configure telemetry listeners
telemetryOptions.TelemetryListeners.Add(new MyTelemetryListener());

var builder = new ResiliencePipelineBuilder()
    .AddTimeout(TimeSpan.FromSeconds(1))
    .ConfigureTelemetry(telemetryOptions) // This method enables telemetry in the builder
    .Build();
```
<!-- endSnippet -->

<!-- snippet: telemetry-listeners -->
```cs
internal class MyTelemetryListener : TelemetryListener
{
    public override void Write<TResult, TArgs>(in TelemetryEventArguments<TResult, TArgs> args)
    {
        Console.WriteLine($"Telemetry event occurred: {args.Event.EventName}");
    }
}

internal class MyMeteringEnricher : MeteringEnricher
{
    public override void Enrich<TResult, TArgs>(in EnrichmentContext<TResult, TArgs> context)
    {
        context.Tags.Add(new("my-custom-tag", "custom-value"));
    }
}
```
<!-- endSnippet -->

Alternatively, you can use the `AddResiliencePipeline` extension which automatically adds telemetry:

<!-- snippet: add-resilience-pipeline-with-telemetry -->
```cs
var serviceCollection = new ServiceCollection()
    .AddLogging(builder => builder.AddConsole())
    .AddResiliencePipeline("my-strategy", builder => builder.AddTimeout(TimeSpan.FromSeconds(1)))
    .Configure<TelemetryOptions>(options =>
    {
        // Configure enrichers
        options.MeteringEnrichers.Add(new MyMeteringEnricher());

        // Configure telemetry listeners
        options.TelemetryListeners.Add(new MyTelemetryListener());
    });
```
<!-- endSnippet -->

### Emitted Metrics

The emitted metrics are emitted under the `Polly` meter name. The subsequent sections provide insights into the metrics produced by Polly. Please note that any custom enriched dimensions are not depicted in the following tables.

#### resilience-events

- Type: *Counter*
- Description: Emitted upon the occurrence of a resilience event.

Dimensions:

|Name|Description|
|---| ---|
|`event-name`| The name of the emitted event.|
|`event-severity`| The severity of the event (`Debug`, `Information`, `Warning`, `Error`, `Critical`).|
|`pipeline-name`| The name of the pipeline corresponding to the resilience pipeline.|
|`pipeline-instance`| The instance name of the pipeline corresponding to the resilience pipeline.|
|`strategy-name`| The name of the strategy generating this event.|
|`operation-key`| The operation key associated with the call site. |
|`exception-name`| The full name of the exception assigned to the execution result (`System.InvalidOperationException`). |

#### execution-attempt-duration

- Type: *Histogram*
- Unit: *milliseconds*
- Description: Tracks the duration of execution attempts, produced by `Retry` and `Hedging` resilience strategies.

Dimensions:

|Name|Description|
|---| ---|
|`event-name`| The name of the emitted event.|
|`event-severity`| The severity of the event (`Debug`, `Information`, `Warning`, `Error`, `Critical`).|
|`pipeline-name`| The name of the pipeline corresponding to the resilience pipeline.|
|`pipeline-instance`| The instance name of the pipeline corresponding to the resilience pipeline.|
|`strategy-name`| The name of the strategy generating this event.|
|`operation-key`| The operation key associated with the call site. |
|`exception-name`| The full name of the exception assigned to the execution result (`System.InvalidOperationException`). |
|`attempt-number`| The execution attempt number, starting at 0 (0, 1, 2). |
|`attempt-handled`| Indicates if the execution outcome was handled. A handled outcome indicates execution failure and the need for retry (`true`, `false`). |

#### pipeline-execution-duration

- Type: *Histogram*
- Unit: *milliseconds*
- Description: Measures the duration and results of resilience pipelines.

Dimensions:

|Name|Description|
|---| ---|
|`pipeline-name`| The name of the pipeline corresponding to the resilience pipeline.|
|`pipeline-instance`| The instance name of the pipeline corresponding to the resilience pipeline.|
|`operation-key`| The operation key associated with the call site. |
|`exception-name`| The full name of the exception assigned to the execution result (`System.InvalidOperationException`). |

### Logs

Logs are registered under the `Polly` logger name. Here are some examples of the logs:

``` text
// This log is recorded whenever a resilience event occurs. EventId = 0
Resilience event occurred. EventName: '{EventName}', Source: '{PipelineName}/{PipelineInstance}/{StrategyName}', Operation Key: '{OperationKey}', Result: '{Result}'
// This log is recorded when a resilience pipeline begins executing. EventId = 1
Resilience pipeline executing. Source: '{PipelineName}/{PipelineInstance}', Operation Key: '{OperationKey}'
// This log is recorded when a resilience pipeline finishes execution. EventId = 2
Resilience pipeline executed. Source: '{PipelineName}/{PipelineInstance}', Operation Key: '{OperationKey}', Result: '{Result}', Execution Health: '{ExecutionHealth}', Execution Time: {ExecutionTime}ms
// This log is recorded upon the completion of every execution attempt. EventId = 3
Execution attempt. Source: '{PipelineName}/{PipelineInstance}/{StrategyName}', Operation Key: '{OperationKey}', Result: '{Result}', Handled: '{Handled}', Attempt: '{Attempt}', Execution Time: '{ExecutionTimeMs}'
```
Explore [telemetry documentation](../../docs/telemetry.md) for more details.
2 changes: 1 addition & 1 deletion src/Polly.Extensions/Telemetry/TelemetryListenerImpl.cs
@@ -32,7 +32,7 @@ public TelemetryListenerImpl(TelemetryOptions options)
ExecutionDuration = Meter.CreateHistogram<double>(
"pipeline-execution-duration",
unit: "ms",
description: "The execution duration and execution results of resilience pipelines.");
description: "The execution duration of resilience pipelines.");
}

public Counter<int> Counter { get; }