Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Tdigest Aggregation Plugin #6887

Closed
wants to merge 18 commits into from
Closed

Add Tdigest Aggregation Plugin #6887

wants to merge 18 commits into from

Conversation

PhoenixRion
Copy link
Contributor

@PhoenixRion PhoenixRion commented Jan 9, 2020

closes #6440

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

Copy link
Contributor

@danielnelson danielnelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here an initial review. Ultimately my goal is to shrink down the code as much as we can, as this goes a long way on understanding and maintaining the plugin.

@@ -3,3 +3,4 @@
/telegraf.exe
/telegraf.gz
/vendor
.idea/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert this file, place in your global core.excludesfile https://git-scm.com/docs/gitignore/2.22.1

@@ -308,3 +308,7 @@
[[override]]
name = "github.com/satori/go.uuid"
revision = "b2ce2384e17bbe0c6d34077efa39dbab3e09123b"

[[constraint]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We updated master to use Go modules after this PR was opened, kind of surprised this didn't trigger a conflict.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what I would need to change because of this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to either merge or rebase against Telegraf master branch, then these files will be deleted. To add the dependency you just need to run:

go get github.com/influxdata/[email protected]

This should add it to the new go.mod/go.sum files.


```
# Keep the histogram of each metric passing through.
[[aggregators.tdigestagg]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to tdigest, move files to directory with same name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This what not initially possible because it conflicted with the tdigest library but that is not longer an issue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also adjust an imports name if needed:

import (
    foo github.com/influxdata/tdigest
)
var tdigest = foo.TDigest{}

@@ -0,0 +1,172 @@
//Copyright (c) 2019, Groupon, Inc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove per file copyright notices

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am checking with Legal as the text per file is a current guideline.

ExcludeTags []string `toml:"exclude_tags"`
SourceTagKey string `toml:"source_tag_key"`
AtomReplacementTagKey string `toml:"atom_replacement_tag_key"`
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the contents of this file into tdigest.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would I move this out of the bucketing package when it is only referenced within it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this referenced from the main plugin struct?

emitting the aggregations and the histogram every `period` seconds.

### Aggregation Concepts
* Tag key ```source``` has special meaning as the primary key of developers destination TSDB
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The measurement aka metric name should be this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Source is very different from measurement

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the some of the logic with source and atom is specific to your deployment. For inclusion into the main Telegraf tree I suggest we move these out of this plugin and into separate processor(s) than can check for tags and enforce rules as needed. If the current processors can't handle your case, then we can add new ones to complete the needed functionality.

I'm not exactly sure how you are using the source tag, but keep in mind that it is a somewhat special tag that Telegraf often uses to report the hostname of the system the metrics describe.


### Aggregation Concepts
* Tag key ```source``` has special meaning as the primary key of developers destination TSDB
* Tag key ```atom``` has special meaning to denote the atomic dimension of an aggregation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to denote the atomic dimension of an aggregation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the units (meters, bytes, etc), the aggregation type (min, max, etc) or something entirely different? Can you add some examples?

@@ -0,0 +1,37 @@
//Copyright (c) 2019, Groupon, Inc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On all files, rename to use snake_case.go

_time time.Time
}

type ClamAggregation struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to keep this format in, I'd like to do two things:

  1. Build everything as a single main type, and then do a conversion to you CLAM format at the last moment.
  2. Remove documentation of this format.

If you don't think this will be possible, then we should remove the format. There isn't any way we will be able to maintain the format if it unless it is very simple and self contained.

Comment on lines +57 to +61
_histogram *tdigest.TDigest
_tags map[string]string
_basicName string
_sum float64
_time time.Time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't very idiomatic for Go, name these without the leading underscore and export the function names in the interface.

@alexrocco
Copy link

Any news about this PR? IMO this will be very useful for the project, especially for on-line percentiles.

@sjwang90
Copy link
Contributor

@PhoenixRion Thanks for the PR but closing this in preference for #8594 with less code. If there's anything that this PR contains that is missing from #8594 feel free to comment.

@sjwang90 sjwang90 closed this Jan 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Need a more flexable histogram / tdigest
4 participants