diff --git a/docs/scraping-receivers.md b/docs/scraping-receivers.md
new file mode 100644
index 000000000000..3c4533bb400a
--- /dev/null
+++ b/docs/scraping-receivers.md
@@ -0,0 +1,144 @@
+# Scraping metrics receivers
+
+Scraping metrics receivers collect a predefined set of metrics from external sources. Typically, the external source
+is a monitored entity providing data about itself in some arbitrary format; scraping receivers parse the data and
+translate it into OpenTelemetry Pipeline Data. Examples of scraping metrics receivers:
+
+- [Host Metrics Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/hostmetricsreceiver)
+- [Redis Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/redisreceiver)
+
+## Defining emitted metrics
+
+Each scraping receiver has a `metadata.yaml` file that MUST define all the metrics emitted by the receiver. The
+file is used to generate an API for metrics recording, user settings to customize the emitted metrics, and user
+documentation. The file schema is defined in
+https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/cmd/mdatagen/metric-metadata.yaml.
+Defining a metric in `metadata.yaml` DOES NOT guarantee that the metric will always be produced by the receiver. In
+some cases it may be impossible to fetch particular metrics from a system in a particular state. If it's known that
+the metric is reported only under particular conditions (e.g. a specific version) of the external source, the
+condition MUST be mentioned in the metric's description.
+
+There are two main categories of metrics emitted by scraping receivers:
+
+- **Default metrics**: emitted by default, but can be disabled in user settings.
+
+- **Optional metrics**: not emitted by default, but can be enabled in user settings.
+
+### How to identify if a new metric should be default or optional?
+
+There is no strict rule to differentiate default metrics from optional ones. As a rule of thumb, default metrics SHOULD be
+treated as metrics essential to determining the health of a particular monitored entity. Optional metrics, on the
+other hand, SHOULD be treated as a source of additional information about the monitored entity.
+
+If two metrics are built based on the same data but have different representations, at least one of them MUST be
+optional. For example:
+
+- `system.cpu.time` (CPU time reported as cumulative sum in seconds) - Default metric.
+- `system.cpu.utilization` (fraction of CPU time spent in different states reported as gauge) - Optional metric.
+
+In this example, system CPU usage time can be emitted as two different metrics: a gauge as a fraction of 1 or
+a cumulative sum in seconds. Each of them can potentially be derived from the other, which is why only one of
+them is enabled by default.
+
+## Stability levels of scraping receivers
+
+All the requirements defined for components in [the Collector README](../README.md) are applicable to scraping
+receivers as well. In addition, the following rules apply specifically to scraping metrics receivers:
+
+### In development
+
+None of the metrics emitted by the receiver are finalized, and they can change in any way. The receiver is not ready for
+production use.
+
+### Alpha
+
+The receiver is ready for limited non-critical workloads. The list of emitted default metrics SHOULD be
+considered complete, but changes to `metadata.yaml` MAY still be applied.
+
+### Beta
+
+The receiver is ready for non-critical production workloads. The list of emitted default metrics MUST be
+considered complete. Breaking changes to the emitted metrics SHOULD be applied following the deprecation
+process spanning several releases. All the metric and resource attribute names SHOULD be released in the
+[OpenTelemetry Specification](https://github.com/open-telemetry/opentelemetry-specification).
+
+### Stable
+
+The receiver is ready for production workloads. Breaking changes to the emitted metrics SHOULD be avoided. If
+a change is required, it MUST be applied following the deprecation process spanning several releases. All the
+metric and resource attribute names MUST be released in the [OpenTelemetry
+Specification](https://github.com/open-telemetry/opentelemetry-specification).
+
+## Changing the emitted metrics
+
+Some changes are not considered breaking and can be applied to metrics emitted by scraping receivers of any
+stability level:
+
+- Adding a new optional metric.
+- Adding a new resource attribute to existing metrics.
+
+Most other changes to metrics emitted by scraping receivers are considered breaking and MUST be handled
+according to the stability level of the receiver. Each type of breaking change defines a set of steps that MUST (or
+SHOULD) be applied across several releases for Stable (or Beta) components. At least 2 versions SHOULD be kept
+between the steps to give users time to prepare, e.g. if step one is released in v0.62.0, step two should be
+released in v0.65.0. Any warnings SHOULD include the version in which the next step will take effect. If a
+breaking change is more complicated and involves many metrics, feature gates SHOULD be used instead.
+
+### Removing an optional metric
+
+Steps to remove an optional metric:
+
+1. Mark the metric as deprecated in `metadata.yaml` by adding "[DEPRECATED]" to its description. Show a warning that
+   the metric will be removed if the `enabled` option is set explicitly to `true` in user settings.
+2. Remove the metric.
+
+### Removing a default metric
+
+Steps to remove a default metric:
+
+1. Mark the metric as deprecated in `metadata.yaml` by adding "[DEPRECATED]" to its description. Show a warning that
+   the metric will be removed if the `enabled` option is not set explicitly to `false` in user settings.
+2. Make the deprecated metric optional. Show a warning that the metric will be removed if the `enabled` option is set
+   explicitly to `true` in user settings.
+3. Remove the metric.
+
+### Replacing an optional metric
+
+Steps to replace an optional metric with another optional metric:
+
+1. Mark the metric as deprecated in `metadata.yaml` by adding "[DEPRECATED]" to its description. Add a warning that
+   the metric will be replaced if the `enabled` option is set explicitly to `true` in user settings. Add the new
+   optional metric in `metadata.yaml`.
+2. Remove the old metric.
+
+### Replacing a default metric
+
+Steps to replace a default metric with another default metric:
+
+1. Mark the metric as deprecated in `metadata.yaml` by adding "[DEPRECATED]" to its description. Add a warning that
+   the metric will be replaced if the `enabled` option is not set explicitly to `false` in user settings. Add the
+   new metric as optional for now. The warning should be hidden only if the `enabled` option is set explicitly to
+   any value for the new metric.
+2. Make the old metric optional. Make the new metric default. Show the warning if the user keeps the
+   old metric enabled explicitly.
+3. Remove the old metric.
+
+### Making a default metric optional
+
+Steps to change a metric from default to optional:
+
+1. Add a warning that the metric will be made optional if the `enabled` option is not set explicitly to any value in
+   user settings. Warning example: "WARNING: Metric `foo.bar` will be disabled by default in v0.65.0. If you want to
+   keep it, please enable it explicitly in the receiver settings."
+2. Remove the warning and update `metadata.yaml` to make the metric optional.
+
+### Adding a new default metric or turning an existing optional metric into default
+
+Adding a new default metric is a breaking change for a scraping receiver because it introduces unexpected output for users
+and additional load on metric backends. Steps to apply such a change:
+
+1. If the metric doesn't exist yet, add it as an optional metric. Add a warning that the metric will be made
+   default if the `enabled` option is not set explicitly to any value in user settings. A warning example: "WARNING:
+   Metric `foo.bar` will be enabled by default in v0.65.0. If you don't want that metric to be emitted, please
+   disable it in the receiver settings."
+2. Remove the warning and update `metadata.yaml` to make the metric default.
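
The warning conditions in the step lists above depend on whether the user set `enabled` explicitly. The following Python is a minimal sketch of that decision logic, covering "Removing a default metric" and the two default/optional switches. It is an illustration only and not the Collector's implementation: the function names, the `user_enabled` parameter (`None` meaning the user did not set `enabled` at all) and the exact wording of the removal message are assumptions made for this example.

```python
from typing import Optional


def removal_warning(name: str, step: int, user_enabled: Optional[bool],
                    next_step_version: str) -> Optional[str]:
    """Warning for a default metric that is being removed (hypothetical helper).

    step 1: the metric is still emitted by default, so everyone who has not
            explicitly set `enabled: false` is affected.
    step 2: the metric is now optional, so only users who explicitly set
            `enabled: true` are affected.
    """
    if step == 1:
        affected = user_enabled is not False
    elif step == 2:
        affected = user_enabled is True
    else:
        raise ValueError("step must be 1 or 2; step 3 removes the metric")
    if not affected:
        return None
    return (f"WARNING: Metric `{name}` is deprecated and will be removed. "
            f"The next step takes effect in {next_step_version}.")


def default_change_warning(name: str, user_enabled: Optional[bool],
                           enabled_by_default_after: bool,
                           version: str) -> Optional[str]:
    """Warning for switching a metric between default and optional.

    Shown only when `enabled` is not set explicitly to any value, because an
    explicit setting keeps its effect after the default changes.
    """
    if user_enabled is not None:
        return None
    state = "enabled" if enabled_by_default_after else "disabled"
    return f"WARNING: Metric `{name}` will be {state} by default in {version}."
```

For example, `default_change_warning("foo.bar", None, False, "v0.65.0")` returns the first sentence of the warning quoted under "Making a default metric optional", while the same call with `user_enabled=True` returns `None`.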