Skip to content

Commit

Permalink
Add non-cumulative histogram (influxdata#7071)
Browse files Browse the repository at this point in the history
  • Loading branch information
fxedel authored and Mathieu Lecarme committed Apr 17, 2020
1 parent ba14073 commit 4321a28
Show file tree
Hide file tree
Showing 3 changed files with 223 additions and 110 deletions.
71 changes: 47 additions & 24 deletions plugins/aggregators/histogram/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,9 @@
The histogram aggregator plugin creates histograms containing the counts of
field values within a range.

Values added to a bucket are also added to the larger buckets in the
distribution. This creates a [cumulative histogram](https://en.wikipedia.org/wiki/Histogram#/media/File:Cumulative_vs_normal_histogram.svg).
If `cumulative` is set to true, values added to a bucket are also added to the
larger buckets in the distribution. This creates a [cumulative histogram](https://en.wikipedia.org/wiki/Histogram#/media/File:Cumulative_vs_normal_histogram.svg).
Otherwise, values are added to only one bucket, which creates an [ordinary histogram](https://en.wikipedia.org/wiki/Histogram#/media/File:Cumulative_vs_normal_histogram.svg)

Like other Telegraf aggregators, the metric is emitted every `period` seconds.
By default bucket counts are not reset between periods and will be non-strictly
Expand All @@ -16,7 +17,7 @@ increasing while Telegraf is running. This behavior can be changed by setting th
Each metric is passed to the aggregator and this aggregator searches
histogram buckets for those fields, which have been specified in the
config. If buckets are found, the aggregator will increment +1 to the appropriate
bucket otherwise it will be added to the `+Inf` bucket. Every `period`
bucket. Otherwise, it will be added to the `+Inf` bucket. Every `period`
seconds this data will be forwarded to the outputs.

The algorithm of hit counting to buckets was implemented on the base
Expand All @@ -39,16 +40,20 @@ of the algorithm which is implemented in the Prometheus
## of accumulating the results.
reset = false

## Whether bucket values should be accumulated. If set to false, "gt" tag will be added.
## Defaults to true.
cumulative = true

## Example config that aggregates all fields of the metric.
# [[aggregators.histogram.config]]
# ## The set of buckets.
# ## Right borders of buckets (with +Inf implicitly added).
# buckets = [0.0, 15.6, 34.5, 49.1, 71.5, 80.5, 94.5, 100.0]
# ## The name of metric.
# measurement_name = "cpu"

## Example config that aggregates only specific fields of the metric.
# [[aggregators.histogram.config]]
# ## The set of buckets.
# ## Right borders of buckets (with +Inf implicitly added).
# buckets = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
# ## The name of metric.
# measurement_name = "diskio"
Expand All @@ -64,8 +69,9 @@ option. Optionally, if `fields` is set only the fields listed will be
aggregated. If `fields` is not set all fields are aggregated.

The `buckets` option contains a list of floats which specify the bucket
boundaries. Each float value defines the inclusive upper bound of the bucket.
boundaries. Each float value defines the inclusive upper (right) bound of the bucket.
The `+Inf` bucket is added automatically and does not need to be defined.
(For left boundaries, these specified bucket borders and `-Inf` will be used).

### Measurements & Fields:

Expand All @@ -77,26 +83,43 @@ The postfix `bucket` will be added to each field key.

### Tags:

All measurements are given the tag `le`. This tag has the border value of
bucket. It means that the metric value is less than or equal to the value of
this tag. For example, let assume that we have the metric value 10 and the
following buckets: [5, 10, 30, 70, 100]. Then the tag `le` will have the value
10, because the metrics value is passed into bucket with right border value
`10`.
* `cumulative = true` (default):
* `le`: Right bucket border. It means that the metric value is less than or
equal to the value of this tag. If a metric value is sorted into a bucket,
it is also sorted into all larger buckets. As a result, the value of
`<field>_bucket` is rising with rising `le` value. When `le` is `+Inf`,
the bucket value is the count of all metrics, because all metric values are
less than or equal to positive infinity.
* `cumulative = false`:
* `gt`: Left bucket border. It means that the metric value is greater than
(and not equal to) the value of this tag.
* `le`: Right bucket border. It means that the metric value is less than or
equal to the value of this tag.
* As both `gt` and `le` are present, each metric is sorted in only exactly
one bucket.


### Example Output:

Let assume we have the buckets [0, 10, 50, 100] and the following field values
for `usage_idle`: [50, 7, 99, 12]

With `cumulative = true`:

```
cpu,cpu=cpu1,host=localhost,le=0.0 usage_idle_bucket=0i 1486998330000000000 # none
cpu,cpu=cpu1,host=localhost,le=10.0 usage_idle_bucket=1i 1486998330000000000 # 7
cpu,cpu=cpu1,host=localhost,le=50.0 usage_idle_bucket=2i 1486998330000000000 # 7, 12
cpu,cpu=cpu1,host=localhost,le=100.0 usage_idle_bucket=4i 1486998330000000000 # 7, 12, 50, 99
cpu,cpu=cpu1,host=localhost,le=+Inf usage_idle_bucket=4i 1486998330000000000 # 7, 12, 50, 99
```

With `cumulative = false`:

```
cpu,cpu=cpu1,host=localhost,le=0.0 usage_idle_bucket=0i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=10.0 usage_idle_bucket=0i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=20.0 usage_idle_bucket=1i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=30.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=40.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=50.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=60.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=70.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=80.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=90.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=100.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=+Inf usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,gt=-Inf,le=0.0 usage_idle_bucket=0i 1486998330000000000 # none
cpu,cpu=cpu1,host=localhost,gt=0.0,le=10.0 usage_idle_bucket=1i 1486998330000000000 # 7
cpu,cpu=cpu1,host=localhost,gt=10.0,le=50.0 usage_idle_bucket=1i 1486998330000000000 # 12
cpu,cpu=cpu1,host=localhost,gt=50.0,le=100.0 usage_idle_bucket=2i 1486998330000000000 # 50, 99
cpu,cpu=cpu1,host=localhost,gt=100.0,le=+Inf usage_idle_bucket=0i 1486998330000000000 # none
```
56 changes: 39 additions & 17 deletions plugins/aggregators/histogram/histogram.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,16 +8,23 @@ import (
"github.com/influxdata/telegraf/plugins/aggregators"
)

// bucketTag is the tag, which contains right bucket border
const bucketTag = "le"
// bucketRightTag is the tag, which contains right bucket border
const bucketRightTag = "le"

// bucketInf is the right bucket border for infinite values
const bucketInf = "+Inf"
// bucketPosInf is the right bucket border for infinite values
const bucketPosInf = "+Inf"

// bucketLeftTag is the tag, which contains left bucket border (exclusive)
const bucketLeftTag = "gt"

// bucketNegInf is the left bucket border for infinite values
const bucketNegInf = "-Inf"

// HistogramAggregator is aggregator with histogram configs and particular histograms for defined metrics
type HistogramAggregator struct {
Configs []config `toml:"config"`
ResetBuckets bool `toml:"reset"`
Cumulative bool `toml:"cumulative"`

buckets bucketsByMetrics
cache map[uint64]metricHistogramCollection
Expand Down Expand Up @@ -57,8 +64,10 @@ type groupedByCountFields struct {
}

// NewHistogramAggregator creates new histogram aggregator
func NewHistogramAggregator() telegraf.Aggregator {
h := &HistogramAggregator{}
func NewHistogramAggregator() *HistogramAggregator {
h := &HistogramAggregator{
Cumulative: true,
}
h.buckets = make(bucketsByMetrics)
h.resetCache()

Expand All @@ -77,16 +86,20 @@ var sampleConfig = `
## of accumulating the results.
reset = false
## Whether bucket values should be accumulated. If set to false, "gt" tag will be added.
## Defaults to true.
cumulative = true
## Example config that aggregates all fields of the metric.
# [[aggregators.histogram.config]]
# ## The set of buckets.
# ## Right borders of buckets (with +Inf implicitly added).
# buckets = [0.0, 15.6, 34.5, 49.1, 71.5, 80.5, 94.5, 100.0]
# ## The name of metric.
# measurement_name = "cpu"
## Example config that aggregates only specific fields of the metric.
# [[aggregators.histogram.config]]
# ## The set of buckets.
# ## Right borders of buckets (with +Inf implicitly added).
# buckets = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
# ## The name of metric.
# measurement_name = "diskio"
Expand Down Expand Up @@ -167,18 +180,27 @@ func (h *HistogramAggregator) groupFieldsByBuckets(
tags map[string]string,
counts []int64,
) {
count := int64(0)
for index, bucket := range h.getBuckets(name, field) {
count += counts[index]
sum := int64(0)
buckets := h.getBuckets(name, field) // note that len(buckets) + 1 == len(counts)

tags[bucketTag] = strconv.FormatFloat(bucket, 'f', -1, 64)
h.groupField(metricsWithGroupedFields, name, field, count, copyTags(tags))
}
for index, count := range counts {
if !h.Cumulative {
sum = 0 // reset sum -> don't store cumulative counts

count += counts[len(counts)-1]
tags[bucketTag] = bucketInf
tags[bucketLeftTag] = bucketNegInf
if index > 0 {
tags[bucketLeftTag] = strconv.FormatFloat(buckets[index-1], 'f', -1, 64)
}
}

h.groupField(metricsWithGroupedFields, name, field, count, tags)
tags[bucketRightTag] = bucketPosInf
if index < len(buckets) {
tags[bucketRightTag] = strconv.FormatFloat(buckets[index], 'f', -1, 64)
}

sum += count
h.groupField(metricsWithGroupedFields, name, field, sum, copyTags(tags))
}
}

// groupField groups field by count value
Expand Down
Loading

0 comments on commit 4321a28

Please sign in to comment.