diff --git a/specification/metrics/api-user.md b/specification/metrics/api-user.md index 852366e68f4..6eeb84f57dd 100644 --- a/specification/metrics/api-user.md +++ b/specification/metrics/api-user.md @@ -12,10 +12,8 @@ + [Bound instrument calling convention](#bound-instrument-calling-convention) + [Direct instrument calling convention](#direct-instrument-calling-convention) + [RecordBatch calling convention](#recordbatch-calling-convention) - + [Label set re-use is encouraged](#label-set-re-use-is-encouraged) - [Missing label keys](#missing-label-keys) - - [Option: Convenience method to bypass `meter.Labels(...)`](#option-convenience-method-to-bypass-meterlabels) - - [Option: Ordered LabelSet construction](#option-ordered-labelset-construction) + - [Option: Ordered labels](#option-ordered-labels) - [Detailed specification](#detailed-specification) * [Instrument construction](#instrument-construction) + [Recommended label keys](#recommended-label-keys) @@ -85,13 +83,11 @@ include the instrument, a numerical value, and an optional set of labels. The instrument, discussed in detail below, contains the metric name and various optional settings. -Labels are key:value pairs associated with events describing various -dimensions or categories that describe the event. A "label key" -refers to the key component while "label value" refers to the -correlated value component of a label. Label refers to the pair of -label key and value. Labels are passed in to the metric event in the -form of a `LabelSet` argument, using several input methods discussed -below. +Labels are key:value pairs associated with events describing various dimensions +or categories that describe the event. A "label key" refers to the key +component while "label value" refers to the correlated value component of a +label. Label refers to the pair of label key and value. Labels are passed in +to the metric event at construction time. 
Metric events always have an associated component name, the name passed when constructing the corresponding `Meter`. Metric events are @@ -182,30 +178,32 @@ func newServer(meter metric.Meter) *server { func (s *server) operate(ctx context.Context) { // ... other work - s.instruments.counter1.Add(ctx, 1, s.meter.Labels( - label1.String("..."), - label2.String("..."))) + s.instruments.counter1.Add(ctx, 1, + key.String("label1", "..."), + key.String("label2", "...")) } ``` ### Metric calling conventions -This API is factored into three core types: instruments, bound instruments, -and label sets. In doing so, we provide several ways of capturing -measurements that are semantically equivalent and generate equivalent -metric events, but offer varying degrees of performance and -convenience. +The metrics API provides three semantically equivalent ways to capture measurements: + +- calling bound metric instruments +- calling unbound metric instruments with labels +- batch recording without a metric instrument + +All three methods generate equivalent metric events, but offer varying degrees +of performance and convenience. This section applies to calling conventions for counter, gauge, and measure instruments. -As described above, metric events consist of an instrument, a set of -labels, and a numerical value, plus associated context. The -performance of a metric API depends on the work done to enter a new -measurement. One approach to reduce cost is to aggregate intermediate -results in the SDK, so that subsequent events happening in the same -collection period, for the same label set, combine into the same -working memory. +As described above, metric events consist of an instrument, a set of labels, +and a numerical value, plus associated context. The performance of a metric +API depends on the work done to enter a new measurement. 
One approach to +reduce cost is to aggregate intermediate results in the SDK, so that subsequent +events happening in the same collection period, for the same set of labels, +combine into the same working memory. In this document, the term "aggregation" is used to describe the process of coalescing metric events for a complete set of labels, @@ -217,30 +215,28 @@ various trade-offs in terms of complexity and performance. #### Bound instrument calling convention In situations where performance is a requirement and a metric instrument is -repeatedly used with the same set of labels, the developer may elect -to use the _bound instrument_ calling convention as an optimization. -For bound instruments to be a benefit, it requires that a specific -instrument will be re-used with specific labels. If an instrument -will be used with the same label set more than once, obtaining an -bound instrument corresponding to the label set ensures the highest -performance available. - -To bind an instrument and label set, use the `Bind(LabelSet)` method to -return an interface that supports the `Add()`, `Set()`, or `Record()` -method of the instrument in question. +repeatedly used with the same set of labels, the developer may elect to use the +_bound instrument_ calling convention as an optimization. For bound +instruments to be a benefit, it requires that a specific instrument will be +re-used with specific labels. If an instrument will be used with the same +labels more than once, obtaining a bound instrument corresponding to the labels +ensures the highest performance available. + +To bind an instrument, use the `Bind(labels)` method to return an interface +that supports the `Add()`, `Set()`, or `Record()` method of the instrument in +question. Bound instruments may consume SDK resources indefinitely. 
```golang func (s *server) processStream(ctx context.Context) { - streamLabels := s.meter.Labels( - labelA.String("..."), - labelB.String("..."), - ) // The result of Bind() is a bound instrument // (e.g., a BoundInt64Counter). - counter2 := s.instruments.counter2.Bind(streamLabels) + counter2 := s.instruments.counter2.Bind( + key.String("labelA", "..."), + key.String("labelB", "..."), + ) for _, item := <-s.channel { // ... other work @@ -254,10 +250,10 @@ func (s *server) processStream(ctx context.Context) { #### Direct instrument calling convention -When convenience is more important than performance, or there is no -re-use to potentially optimize with bound instruments, users may -elect to operate directly on metric instruments, supplying a label set -at the call site. +When convenience is more important than performance, or there is no re-use to +potentially optimize with bound instruments, users may elect to operate +directly on metric instruments, supplying labels at the call site. This method +offers the greatest convenience possible. For example, to update a single counter: @@ -265,24 +261,16 @@ For example, to update a single counter: func (s *server) method(ctx context.Context) { // ... other work - s.instruments.counter1.Add(ctx, 1, s.meter.Labels(...)) + s.instruments.counter1.Add(ctx, 1, ...) } ``` -This method offers the greatest convenience possible. If performance -becomes a problem, one option is to use bound instruments as described above. -Another performance option, in some cases, is to just re-use the -labels. In the example here, `meter.Labels(...)` constructs a -re-usable label set which may be an important performance -optimization. - #### RecordBatch calling convention -There is one final API for entering measurements, which is like the -direct access calling convention but supports multiple simultaneous -measurements. 
The use of a RecordBatch API supports entering multiple -measurements, implying a semantically atomic update to several -instruments. +There is one final API for entering measurements, which is like the direct +access calling convention but supports multiple simultaneous measurements. The +use of a RecordBatch API supports entering multiple measurements, implying a +semantically atomic update to several instruments. For example: @@ -290,11 +278,7 @@ For example: func (s *server) method(ctx context.Context) { // ... other work - labelSet := s.meter.Labels(...) - - // ... more work - - s.meter.RecordBatch(ctx, labelSet, + s.meter.RecordBatch(ctx, labels, s.instruments.counter1.Measurement(1), s.instruments.gauge1.Measurement(10), s.instruments.measure2.Measurement(123.45), @@ -310,92 +294,23 @@ exporter's point of view. Calls to `RecordBatch` may potentially reduce costs because the SDK can enqueue a single bulk update, or take a lock only once, for example. -#### Label set re-use is encouraged - -A significant factor in the cost of metrics export is that labels, -which arrive as an unordered list of keys and values, must be -canonicalized in some way before they can be used for lookup. -Canonicalizing labels can be an expensive operation as it may require -sorting or de-duplicating by some other means, possibly even -serializing, the set of labels to produce a valid map key. - -The operation of converting an unordered set of labels into a -canonicalized set of labels, useful for pre-aggregation, is expensive -enough that we give it first-class treatment in the API. The -`meter.Labels(...)` API canonicalizes labels, returning an opaque -`LabelSet` object, another form of pre-computation available to the -user. - -Re-usable `LabelSet` objects provide a potential optimization for -scenarios where bound instruments might not be effective. For example, if the -label set will be re-used but only used once per metric, bound instruments do -not offer any optimization. 
It may be best to pre-compute a -canonicalized `LabelSet` once and re-use it with the direct calling -convention. - -Constructing a bound instrument is considered the higher-performance -option, when the bound instrument will be used more than once. Still, consider -re-using the result of `Meter.Labels(...)` when constructing more than -one bound instrument. - -```golang -func (s *server) method(ctx context.Context) { - // ... other work - - labelSet := s.meter.Labels(...) - - s.instruments.counter1.Add(ctx, 1, labelSet) - - // ... more work - - s.instruments.gauge1.Set(ctx, 10, labelSet) - - // ... more work - - s.instruments.measure1.Record(ctx, 100, labelSet) -} -``` - ##### Missing label keys -When the SDK interprets a `LabelSet` in the context of grouping -aggregated values for an exporter, and where there are keys that are -missing, the SDK is required to consider these values _explicitly -unspecified_, a distinct value type of the exported data model. - -##### Option: Convenience method to bypass `meter.Labels(...)` - -As a language-optional feature, the direct and bound instrument calling -convention APIs may support alternate convenience methods to pass raw -labels at the call site. These may be offered as overloaded methods -for `Add()`, `Set()`, and `Record()` (direct calling convention) or -`Bind()` (bound instrument calling convention), in both cases bypassing a -call to `meter.Labels(...)`. For example: +When the SDK interprets labels in the context of grouping aggregated values for +an exporter, and where there are keys that are missing, the SDK is required to +consider these values _explicitly unspecified_, a distinct value type of the +exported data model. -```java - public void method() { - // pass raw labels, no explicit `LabelSet` - s.instruments.counter1.add(1, labelA.value(...), labelB.value(...)) +##### Option: Ordered labels - // ... 
or - - // pass raw labels, no explicit `LabelSet` - BoundIntCounter counter = s.instruments.gauge1.bind(labelA, ..., labelB, ...) - for (...) { - counter.add(1) - } - } -``` - -##### Option: Ordered LabelSet construction - -As a language-level decision, APIs may support _ordered_ LabelSet -construction, in which a pre-defined set of ordered label keys is -defined such that values can be supplied in order. For example, +As a language-level decision, APIs may support label key ordering. In this +case, the user may specify an ordered sequence of label keys, which is used to +create an unordered set of labels from a sequence of similarly ordered label +values. For example: ```golang -var rpcLabelKeys = meter.OrderedLabelKeys("a", "b", "c") +var rpcLabelKeys = OrderedLabelKeys("a", "b", "c") for _, input := range stream { labels := rpcLabelKeys.Values(1, 2, 3) // a=1, b=2, c=3 @@ -404,11 +319,10 @@ for _, input := range stream { } ``` -This is specified as a language-optional feature because its safety, -and therefore its value as an input for monitoring, depends on the -availability of type-checking in the source language. Passing -unordered labels (i.e., a list of bound keys and values) to the -`Meter.Labels(...)` constructor is considered the safer alternative. +This is specified as a language-optional feature because its safety, and +therefore its value as an input for monitoring, depends on the availability of +type-checking in the source language. Passing unordered labels (i.e., a +mapping from keys to values) is considered the safer alternative. ## Detailed specification @@ -443,10 +357,10 @@ are usually selected by the developer for exhibiting low cardinality, importance for monitoring purposes, and _an intention to provide these variables locally_. -SDKs should consider grouping exported metric data by the recommended -label keys of each instrument, unless superceded by another form of -configuration. 
Recommended keys that are missing will be considered -explicitly unspecified, as for missing `LabelSet` keys in general. +SDKs should consider grouping exported metric data by the recommended label +keys of each instrument, unless superseded by another form of configuration. +Recommended keys that are missing will be considered explicitly unspecified, as +for missing labels in general. #### Instrument options @@ -467,11 +381,11 @@ information about the kind-specific monotonic and absolute options. ### Bound instrument API -Counter, gauge, and measure instruments each support allocating -bound instruments for the high-performance calling convention. The -`Instrument.Bind(LabelSet)` method returns an interface which -implements the `Add()`, `Set()` or `Record()` method, respectively, -for counter, gauge, and measure instruments. +Counter, gauge, and measure instruments each support allocating bound +instruments for the high-performance calling convention. The +`Instrument.Bind(labels)` method returns an interface which implements the +`Add()`, `Set()` or `Record()` method, respectively, for counter, gauge, and +measure instruments. ### Direct instrument API @@ -481,25 +395,23 @@ metric events. ### Interaction with distributed correlation context -The `LabelSet` type introduced above applies strictly to "local" -labels, meaning provided in a call to `meter.Labels(...)`. The -application explicitly declares these labels, whereas distributed -correlation context labels are implicitly associated with the event. +As described above, labels are strictly "local". That is, the application +explicitly declares these labels, whereas distributed correlation context +labels are implicitly associated with the event. -There is a clear intention to pre-aggregate metrics within the SDK, -using the contents of a `LabelSet` to derive grouping keys. 
There are -two available options for users to apply distributed correlation -context to the local grouping function used for metrics -pre-aggregation: +There is a clear intention to pre-aggregate metrics within the SDK, using +labels to derive grouping keys. There are two available options for users to +apply distributed correlation context to the local grouping function used for +metrics pre-aggregation: 1. The distributed context, whether implicit or explicit, is associated with every metric event. The SDK could _automatically_ project selected label keys from the distributed correlation into the - metric event. This would require some manner of dynamic mapping from - `LabelSet` to grouping key during aggregation. + metric event. 2. The user can explicitly perform the same projection of distributed - correlation into a `LabelSet` by extracting from the correlation - context and including it in the call to `metric.Labels(...)`. + correlation into labels by extracting labels from the correlation + context and including them in the call to create the metric or bound + instrument. An example of an explicit projection follows. 
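The explicit projection in option 2 can be pictured with a minimal, self-contained sketch. It is not the specification's own example and does not use the OpenTelemetry API: `Label`, `CorrelationContext`, `Project`, and the toy `Counter` are hypothetical stand-ins, assumed here only to show labels being extracted from a correlation context and passed alongside local labels at the call site.

```golang
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Label is a hypothetical stand-in for a key:value pair; it is not the
// OpenTelemetry API type.
type Label struct {
	Key, Value string
}

// CorrelationContext is a hypothetical stand-in for distributed correlation
// context, modeled here as a plain map.
type CorrelationContext map[string]string

// Project copies the selected keys out of the correlation context and
// returns them as labels. Keys absent from the context are skipped, so they
// stay unspecified in the result.
func Project(cc CorrelationContext, keys ...string) []Label {
	labels := make([]Label, 0, len(keys))
	for _, k := range keys {
		if v, ok := cc[k]; ok {
			labels = append(labels, Label{Key: k, Value: v})
		}
	}
	return labels
}

// Counter is a toy counter that pre-aggregates sums per set of labels.
type Counter struct {
	sums map[string]int64
}

// NewCounter returns an empty toy counter.
func NewCounter() *Counter { return &Counter{sums: map[string]int64{}} }

// groupKey sorts a copy of the labels by key and joins them, so the order of
// labels at the call site does not affect grouping. The encoding is naive
// (no escaping) and only suitable for illustration.
func groupKey(labels []Label) string {
	sorted := append([]Label(nil), labels...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Key < sorted[j].Key })
	parts := make([]string, len(sorted))
	for i, l := range sorted {
		parts[i] = l.Key + "=" + l.Value
	}
	return strings.Join(parts, ",")
}

// Add accumulates value into the sum for the given set of labels.
func (c *Counter) Add(value int64, labels ...Label) {
	c.sums[groupKey(labels)] += value
}

// Sum returns the accumulated total for the given set of labels.
func (c *Counter) Sum(labels ...Label) int64 { return c.sums[groupKey(labels)] }

func main() {
	cc := CorrelationContext{"tenant": "acme", "trace_sampled": "true"}
	counter := NewCounter()

	// One local label plus one label explicitly projected from the context.
	labels := append([]Label{{Key: "protocol", Value: "http"}}, Project(cc, "tenant")...)
	counter.Add(1, labels...)
	counter.Add(2, labels...)

	fmt.Println(groupKey(labels), counter.Sum(labels...))
}
```

In this sketch the projection happens before the metric call, so the toy counter never needs to see the correlation context; that mirrors the trade-off between option 1 (the SDK projects automatically) and option 2 (the caller projects explicitly).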
diff --git a/specification/metrics/api.md b/specification/metrics/api.md index d37ce09d901..e1e776d93fb 100644 --- a/specification/metrics/api.md +++ b/specification/metrics/api.md @@ -4,7 +4,7 @@ - [Overview](#overview) * [Metric Instruments](#metric-instruments) - * [Label sets](#label-sets) + * [Labels](#labels) * [Meter Interface](#meter-interface) * [Aggregations](#aggregations) * [Time](#time) @@ -16,7 +16,6 @@ * [Observer](#observer) - [Interpretation](#interpretation) * [Standard implementation](#standard-implementation) - * [Option: Dedicated Measure instrument for timing measurements](#option-dedicated-measure-instrument-for-timing-measurements) * [Future Work: Option Support](#future-work-option-support) * [Future Work: Configurable Aggregations / View API](#future-work-configurable-aggregations--view-api) - [Metric instrument selection](#metric-instrument-selection) @@ -99,27 +98,20 @@ events that it produces. Details about calling conventions for each kind of instrument are covered in the [user-level API specification](api-user.md). -### Label sets +### Labels -_Label_ is the term used to refer to a key-value attribute associated -with a metric event. Although they are fundamentally similar to [Span -attributes](../trace/api.md#span) in the tracing API, a label set is -given its own type in the Metrics API (generally: `LabelSet`). Label -sets are a feature of the API meant to facilitate re-use and thereby -to lower the cost of processing metric events. Users are encouraged -to re-use label sets whenever possible, as they may contain a -previously encoded representation of the labels. +A _Label_ is the term used to refer to a key-value attribute associated with a +metric event, similar to a [Span attribute](../trace/api.md#span) in the +tracing API. -Users obtain label sets by calling a `Meter` API function. Each of -the instrument calling conventions detailed in the [user-level API -specification](api-user.md) accepts a label set. 
+Each of the instrument calling conventions detailed in the [user-level API +specification](api-user.md) accepts a set of labels as an argument. ### Meter Interface -To produce measurements using an instrument, you need an SDK that -implements the `Meter` API. This interface consists of a set of -instrument constructors, functionality related to label sets, and a -facility for capturing batches of measurements in a semantically atomic +To produce measurements using an instrument, you need an SDK that implements +the `Meter` API. This interface consists of a set of instrument constructors +and a facility for capturing batches of measurements in a semantically atomic way. There is a global `Meter` instance available for use that facilitates @@ -193,12 +185,12 @@ measurements. Metric events from Counter and Measure instruments are captured at the moment they happen, when the SDK receives the corresponding function call. -The Observer instrument supports an asynchronous API, allowing the SDK -to collect metric data on demand, once per collection interval. A -single Observer instrument callback can capture multiple metric events -associated with different label sets. Semantically, by definition, -these observations are captured at a single instant in time, the -instant that they became the current set of last-measured values. +The Observer instrument supports an asynchronous API, allowing the SDK to +collect metric data on demand, once per collection interval. A single Observer +instrument callback can capture multiple metric events associated with +different sets of labels. Semantically, by definition, these observations are +captured at a single instant in time, the instant that they became the current +set of last-measured values. Because metric events are implicitly timestamped, we could refer to a series of metric events as a _time series_. 
However, we reserve the @@ -214,14 +206,13 @@ perform this task, the SDK must aggregate metric events over the collection interval: (1) across time, (2) across key dimensions in _label space_. -When aggregating across spatial dimensions, metric events for -different label sets are combined into an aggregated value for each -distinct "group" of values for the key dimensions. It means that -measurements are combined for all metric events having the same values -for selected keys, explicitly disregarding any additional labels with -keys not in the set of aggregation keys. Some exporters are known to -require pre-specifying the label keys used for aggregation (e.g., -Prometheus). +When aggregating across spatial dimensions, metric events for different sets of +labels are combined into an aggregated value for each distinct "group" of +values for the key dimensions. It means that measurements are combined for all +metric events having the same values for selected keys, explicitly disregarding +any additional labels with keys not in the set of aggregation keys. Some +exporters are known to require pre-specifying the label keys used for +aggregation (e.g., Prometheus). For example, if `[ak1, ak2]` are the aggregation keys and `[ik1, ik2]` are the ignored keys, then a metric event having labels @@ -248,7 +239,7 @@ events produced through an instrument consist of: - [Context](../context/context.md) (Span context, Correlation context) - timestamp (implicit to the SDK) - instrument definition (name, kind, and semantic options) -- label set (associated key-values) +- associated label keys and values - value (a signed integer or floating point number) This format is the result of separating the API from the SDK--a common @@ -314,10 +305,9 @@ Context, by definition. This means, for example, it is not possible to associate Observer instrument events with Correlation or Span context. 
-Observer instruments capture not only current values, but also -effectively _which label sets are current_ at the moment of -collection. These instruments can be used to compute probabilities -and ratios, because values are part of a set. +Observer instruments capture not only current values, but also effectively +_which labels are current_ at the moment of collection. These instruments can +be used to compute probabilities and ratios, because values are part of a set. Unlike Counter and Measure instruments, Observer instruments are synchronized with collection. There is no aggregation across time for @@ -352,11 +342,11 @@ meaning of these actions. The standard implementation for the three instruments is defined as follows: -1. Counter. The `Add()` function accumulates a total for each distinct label set. When aggregating over distinct label sets for a Counter, combine using arithmetic addition and export as a sum. Depending on the exposition format, sums are exported either as pairs of label set and cumulative _delta_ or as pairs of label set and cumulative _total_. +1. Counter. The `Add()` function accumulates a total for each distinct set of labels. When aggregating over labels for a Counter, combine using arithmetic addition and export as a sum. Depending on the exposition format, sums are exported either as pairs of labels and cumulative _delta_ or as pairs of labels and cumulative _total_. -2. Measure. Use the `Record()` function to report events for which the SDK will compute summary statistics about the distribution of values, for each distinct label set. The summary statistics to use are determined by the aggregation, but they usually include at least the sum of values, the count of measurements, and the minimum and maximum values. When aggregating distinct Measure events, report summary statistics of the combined value distribution. 
Exposition formats for summary statistics vary widely, but typically include pairs of label set and (sum, count, minimum and maximum value). +2. Measure. Use the `Record()` function to report events for which the SDK will compute summary statistics about the distribution of values, for each distinct set of labels. The summary statistics to use are determined by the aggregation, but they usually include at least the sum of values, the count of measurements, and the minimum and maximum values. When aggregating distinct Measure events, report summary statistics of the combined value distribution. Exposition formats for summary statistics vary widely, but typically include pairs of labels and (sum, count, minimum and maximum value). -3. Observer. Current values are provided by the Observer callback at the end of each Metric collection period. When aggregating values _for the same label set_, combine using the most-recent value. When aggregating values _for different label sets_, combine the value distribution as for Measure instruments. Export as pairs of label set and (sum, count, minimum and maximum value). +3. Observer. Current values are provided by the Observer callback at the end of each Metric collection period. When aggregating values _for the same set of labels_, combine using the most-recent value. When aggregating values _for different sets of labels_, combine the value distribution as for Measure instruments. Export as pairs of labels and (sum, count, minimum and maximum value). We believe that the standard behavior of one of these three instruments covers nearly all use-cases for users of OpenTelemetry in @@ -442,15 +432,13 @@ server that supports several protocols. The number of bytes read should be labeled with the protocol name and aggregated in the process. -This is a typical application for the Counter instrument. Use one -Counter for capturing the number bytes read. 
When handling a request, -compute a LabelSet containing the name of the protocol and potentially -other useful labels, then call `Add()` with the same label set and the -number of bytes read. +This is a typical application for the Counter instrument. Use one Counter for +capturing the number of bytes read. When handling a request, compute a set of labels +containing the name of the protocol and potentially other useful labels, then +call `Add()` with those labels and the number of bytes read. -To lower the cost of this reporting, you can `Bind()` the -instrument with each of the supported protocols ahead of time and -avoid computing the label set for each request. +To lower the cost of this reporting, you can `Bind()` the instrument with each +of the supported protocols ahead of time. ### Reporting total bytes read and bytes per request @@ -518,7 +506,7 @@ CPU usage is something that we naturally sum, which raises several questions. - Why not use a Counter instrument? In order to use a Counter instrument, we would need to convert total usage figures into deltas. Calculating deltas from the previous measurement is easy to do, but Counter instruments are not meant to be used from callbacks. -- Why not report deltas in the Observer callback? Observer instruments are meant to be used to observe current values. Nothing prevents reporting deltas with an Observer, but the standard aggregation for Observer instruments is to sum the current value across distinct label sets. The standard behavior is useful for determining the current rate of CPU usage, but special configuration would be required for an Observer instrument to use Counter aggregation. +- Why not report deltas in the Observer callback? Observer instruments are meant to be used to observe current values. Nothing prevents reporting deltas with an Observer, but the standard aggregation for Observer instruments is to sum the current value across distinct labels. 
The standard behavior is useful for determining the current rate of CPU usage, but special configuration would be required for an Observer instrument to use Counter aggregation. ### Reporting per-shard memory holdings @@ -526,11 +514,10 @@ Suppose you have a widely-used library that acts as a client to a sharded service. For each shard it maintains some client-side state, holding a variable amount of memory per shard. -Observe the current allocation per shard using an Observer instrument -with a shard label. These can be aggregated across hosts to compute -cluster-wide memory holdings by shard, for example, using the standard -aggregation for Observers, which sums the current value across -distinct label sets. +Observe the current allocation per shard using an Observer instrument with a +shard label. These can be aggregated across hosts to compute cluster-wide +memory holdings by shard, for example, using the standard aggregation for +Observers, which sums the current value across distinct labels. ### Reporting number of active requests