Metrics framework #926
Merged: cody-littley merged 20 commits into Layr-Labs:master from cody-littley:metrics-framework on Nov 26, 2024
Changes from 8 commits
07f0cee Created new metrics framework in common. (cody-littley)
e444849 Got latency metrics working. (cody-littley)
538470c Added counter. (cody-littley)
1083c94 All metric types working. (cody-littley)
31c4a43 Added auto-gauge. (cody-littley)
7d826a8 Add mock metrics. (cody-littley)
49b11e1 Auto-generate metrics docs. (cody-littley)
2d6c116 Pass unit as part of promethious metadata. (cody-littley)
fbc7bc1 Use ticker instead of sleeping. (cody-littley)
02e710d Made suggested changes. (cody-littley)
0217da7 Improve documentation. (cody-littley)
b0fd072 lint (cody-littley)
72834e3 incremental progress (cody-littley)
8f28ec0 Incremental progress. (cody-littley)
bcdd296 Add labels to counts (cody-littley)
1a36247 Finish new label system. (cody-littley)
b28e075 Improve documentation for labels. (cody-littley)
f9df03e Cleanup. (cody-littley)
04df0d0 Made suggested changes. (cody-littley)
afe3120 Made suggested changes. (cody-littley)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
package metrics | ||
|
||
// Config provides configuration for a Metrics instance. | ||
type Config struct { | ||
// Namespace is the namespace for the metrics. | ||
Namespace string | ||
|
||
// HTTPPort is the port to serve metrics on. | ||
HTTPPort int | ||
|
||
// MetricsBlacklist is a list of metrics to blacklist. To determine the fully qualified metric name | ||
// for this list, use the format "metricName:metricLabel" if the metric has a label, or just "metricName" | ||
// if the metric does not have a label. Any fully qualified metric name that matches exactly with an entry | ||
// in this list will be blacklisted (i.e. it will not be reported). | ||
MetricsBlacklist []string | ||
|
||
// MetricsFuzzyBlacklist is a list of metrics to blacklist. To determine the fully qualified metric name | ||
// for this list, use the format "metricName:metricLabel" if the metric has a label, or just "metricName" | ||
// if the metric does not have a label. Any fully qualified metric name that contains one of these strings | ||
// in any position will be blacklisted (i.e. it will not be reported). | ||
MetricsFuzzyBlacklist []string | ||
} |
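The two blacklist fields describe an exact match and a substring match against a fully qualified metric name. The sketch below shows one way such a check could look. The helpers `qualifiedName` and `isBlacklisted` are hypothetical and do not appear in the PR, and the sketch assumes that an unlabeled metric is identified by its name alone.

```go
package main

import (
	"fmt"
	"strings"
)

// qualifiedName is a hypothetical helper. It builds "metricName:metricLabel"
// when a label is present, and just "metricName" otherwise (an assumption).
func qualifiedName(name string, label string) string {
	if label == "" {
		return name
	}
	return name + ":" + label
}

// isBlacklisted is a hypothetical helper. It reports whether a metric would be
// suppressed by an exact list or by a fuzzy (substring) list.
func isBlacklisted(name string, label string, exact []string, fuzzy []string) bool {
	qualified := qualifiedName(name, label)

	// Exact entries must equal the whole qualified name.
	for _, entry := range exact {
		if qualified == entry {
			return true
		}
	}

	// Fuzzy entries may appear at any position in the qualified name.
	for _, fragment := range fuzzy {
		if strings.Contains(qualified, fragment) {
			return true
		}
	}

	return false
}

func main() {
	exact := []string{"requests:failed"}
	fuzzy := []string{"debug"}

	fmt.Println(isBlacklisted("requests", "failed", exact, fuzzy))   // exact entry
	fmt.Println(isBlacklisted("requests", "ok", exact, fuzzy))       // no entry applies
	fmt.Println(isBlacklisted("debug_cache_size", "", exact, fuzzy)) // fuzzy entry
}
```

Note that with substring matching an empty string in the fuzzy list would match every metric, so a real implementation would probably want to reject empty entries.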
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
package metrics | ||
|
||
import ( | ||
"github.com/prometheus/client_golang/prometheus" | ||
) | ||
|
||
var _ CountMetric = &countMetric{} | ||
|
||
// countMetric is a standard implementation of the CountMetric interface via prometheus. | ||
type countMetric struct { | ||
Metric | ||
|
||
// name is the name of the metric. | ||
name string | ||
|
||
// label is the label of the metric. | ||
label string | ||
|
||
// description is the description of the metric. | ||
description string | ||
|
||
// counter is the prometheus counter used to report this metric. | ||
counter prometheus.Counter | ||
} | ||
|
||
// newCountMetric creates a new CountMetric instance. | ||
func newCountMetric(name string, label string, description string, vec *prometheus.CounterVec) CountMetric { | ||
var counter prometheus.Counter | ||
if vec != nil { | ||
counter = vec.WithLabelValues(label, "count") | ||
} | ||
|
||
return &countMetric{ | ||
name: name, | ||
label: label, | ||
description: description, | ||
counter: counter, | ||
} | ||
} | ||
|
||
func (m *countMetric) Name() string { | ||
return m.name | ||
} | ||
|
||
func (m *countMetric) Label() string { | ||
return m.label | ||
} | ||
|
||
func (m *countMetric) Unit() string { | ||
return "count" | ||
} | ||
|
||
func (m *countMetric) Description() string { | ||
return m.description | ||
} | ||
|
||
func (m *countMetric) Type() string { | ||
return "counter" | ||
} | ||
|
||
func (m *countMetric) Enabled() bool { | ||
return m.counter != nil | ||
} | ||
|
||
func (m *countMetric) Increment() { | ||
if m.counter == nil { | ||
return | ||
} | ||
m.counter.Inc() | ||
} | ||
|
||
func (m *countMetric) Add(value float64) { | ||
if m.counter == nil { | ||
return | ||
} | ||
m.counter.Add(value) | ||
} |
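In the file above, a disabled metric is simply one whose underlying counter is nil, and every call on it becomes a no-op. The sketch below isolates that nil-guard pattern using only the standard library. The `counter` interface and `memCounter` type are hypothetical stand-ins for `prometheus.Counter`, and `safeCount` is an illustrative name, not a type from the PR.

```go
package main

import "fmt"

// counter is a hypothetical stand-in for the part of prometheus.Counter used here.
type counter interface {
	Inc()
	Add(float64)
}

// memCounter is a hypothetical in-memory counter used only for this sketch.
type memCounter struct {
	total float64
}

func (c *memCounter) Inc()              { c.total++ }
func (c *memCounter) Add(value float64) { c.total += value }

// safeCount mirrors the nil-guard used by countMetric: a nil counter means
// the metric is disabled and all updates are silently dropped.
type safeCount struct {
	counter counter
}

// newSafeCount leaves the interface value nil when no backing counter is given.
// Assigning a nil *memCounter directly would produce a non-nil interface, so the
// explicit check matters.
func newSafeCount(backing *memCounter) *safeCount {
	var c counter
	if backing != nil {
		c = backing
	}
	return &safeCount{counter: c}
}

func (m *safeCount) Enabled() bool {
	return m.counter != nil
}

func (m *safeCount) Increment() {
	if m.counter == nil {
		return
	}
	m.counter.Inc()
}

func (m *safeCount) Add(value float64) {
	if m.counter == nil {
		return
	}
	m.counter.Add(value)
}

func main() {
	backing := &memCounter{}
	enabled := newSafeCount(backing)
	enabled.Increment()
	enabled.Add(2)

	disabled := newSafeCount(nil)
	disabled.Increment() // no-op, does not panic

	fmt.Println("enabled:", enabled.Enabled(), "total:", backing.total)
	fmt.Println("disabled:", disabled.Enabled())
}
```

The benefit of this design is that callers never need to check whether a metric is enabled before reporting to it.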
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
package metrics | ||
|
||
import ( | ||
"github.com/prometheus/client_golang/prometheus" | ||
) | ||
|
||
var _ GaugeMetric = &gaugeMetric{} | ||
|
||
// gaugeMetric is a standard implementation of the GaugeMetric interface via prometheus. | ||
type gaugeMetric struct { | ||
Metric | ||
|
||
// name is the name of the metric. | ||
name string | ||
|
||
// label is the label of the metric. | ||
label string | ||
|
||
// unit is the unit of the metric. | ||
unit string | ||
|
||
// description is the description of the metric. | ||
description string | ||
|
||
// gauge is the prometheus gauge used to report this metric. | ||
gauge prometheus.Gauge | ||
} | ||
|
||
// newGaugeMetric creates a new GaugeMetric instance. | ||
func newGaugeMetric( | ||
name string, | ||
label string, | ||
unit string, | ||
description string, | ||
vec *prometheus.GaugeVec) GaugeMetric { | ||
|
||
var gauge prometheus.Gauge | ||
if vec != nil { | ||
gauge = vec.WithLabelValues(label, unit) | ||
} | ||
|
||
return &gaugeMetric{ | ||
name: name, | ||
label: label, | ||
unit: unit, | ||
description: description, | ||
gauge: gauge, | ||
} | ||
} | ||
|
||
func (m *gaugeMetric) Name() string { | ||
return m.name | ||
} | ||
|
||
func (m *gaugeMetric) Label() string { | ||
return m.label | ||
} | ||
|
||
func (m *gaugeMetric) Unit() string { | ||
return m.unit | ||
} | ||
|
||
func (m *gaugeMetric) Description() string { | ||
return m.description | ||
} | ||
|
||
func (m *gaugeMetric) Type() string { | ||
return "gauge" | ||
} | ||
|
||
func (m *gaugeMetric) Enabled() bool { | ||
return m.gauge != nil | ||
} | ||
|
||
func (m *gaugeMetric) Set(value float64) { | ||
if m.gauge == nil { | ||
return | ||
} | ||
m.gauge.Set(value) | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
package metrics | ||
|
||
import ( | ||
"github.com/prometheus/client_golang/prometheus" | ||
"time" | ||
) | ||
|
||
var _ LatencyMetric = &latencyMetric{} | ||
|
||
// latencyMetric is a standard implementation of the LatencyMetric interface via prometheus. | ||
type latencyMetric struct { | ||
Metric | ||
|
||
// name is the name of the metric. | ||
name string | ||
|
||
// label is the label of the metric. | ||
label string | ||
|
||
// description is the description of the metric. | ||
description string | ||
|
||
// observer is the prometheus observer used to report this metric. | ||
observer prometheus.Observer | ||
} | ||
|
||
// newLatencyMetric creates a new LatencyMetric instance. | ||
func newLatencyMetric(name string, label string, description string, vec *prometheus.SummaryVec) LatencyMetric { | ||
var observer prometheus.Observer | ||
if vec != nil { | ||
observer = vec.WithLabelValues(label, "seconds") | ||
} | ||
|
||
return &latencyMetric{ | ||
name: name, | ||
label: label, | ||
description: description, | ||
observer: observer, | ||
} | ||
} | ||
|
||
func (m *latencyMetric) Name() string { | ||
return m.name | ||
} | ||
|
||
func (m *latencyMetric) Label() string { | ||
return m.label | ||
} | ||
|
||
func (m *latencyMetric) Unit() string { | ||
return "seconds" | ||
} | ||
|
||
func (m *latencyMetric) Description() string { | ||
return m.description | ||
} | ||
|
||
func (m *latencyMetric) Type() string { | ||
return "latency" | ||
} | ||
|
||
func (m *latencyMetric) Enabled() bool { | ||
return m.observer != nil | ||
} | ||
|
||
func (m *latencyMetric) ReportLatency(latency time.Duration) { | ||
if m.observer == nil { | ||
// this metric has been disabled | ||
return | ||
} | ||
m.observer.Observe(latency.Seconds()) | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,119 @@ | ||
package metrics | ||
|
||
import "time" | ||
|
||
// Metrics provides a convenient interface for reporting metrics. | ||
type Metrics interface { | ||
// Start starts the metrics server. | ||
Start() error | ||
|
||
// Stop stops the metrics server. | ||
Stop() error | ||
|
||
// GenerateMetricsDocumentation generates documentation for all currently registered metrics. | ||
// Documentation is returned as a string in markdown format. | ||
GenerateMetricsDocumentation() string | ||
|
||
// WriteMetricsDocumentation writes documentation for all currently registered metrics to a file. | ||
// Documentation is written in markdown format. | ||
WriteMetricsDocumentation(fileName string) error | ||
|
||
// NewLatencyMetric creates a new LatencyMetric instance. Useful for reporting the latency of an operation. | ||
// Metric name and label may only contain alphanumeric characters and underscores. | ||
NewLatencyMetric( | ||
name string, | ||
label string, | ||
description string, | ||
quantiles ...*Quantile) (LatencyMetric, error) | ||
|
||
// NewCountMetric creates a new CountMetric instance. Useful for tracking the count of a type of event. | ||
// Metric name and label may only contain alphanumeric characters and underscores. | ||
NewCountMetric( | ||
name string, | ||
label string, | ||
description string) (CountMetric, error) | ||
|
||
// NewGaugeMetric creates a new GaugeMetric instance. Useful for reporting specific values. | ||
// Metric name and label may only contain alphanumeric characters and underscores. | ||
NewGaugeMetric( | ||
name string, | ||
label string, | ||
unit string, | ||
description string) (GaugeMetric, error) | ||
|
||
// NewAutoGauge creates a new GaugeMetric instance that is automatically updated by the given source function. | ||
// The function is polled at the given period. This produces a gauge type metric internally. | ||
// Metric name and label may only contain alphanumeric characters and underscores. | ||
NewAutoGauge( | ||
name string, | ||
label string, | ||
unit string, | ||
description string, | ||
pollPeriod time.Duration, | ||
source func() float64) error | ||
} | ||
|
||
// Metric represents a metric that can be reported. | ||
type Metric interface { | ||
|
||
// Name returns the name of the metric. | ||
Name() string | ||
|
||
// Label returns the label of the metric. Metrics without a label will return an empty string. | ||
Label() string | ||
|
||
// Unit returns the unit of the metric. | ||
Unit() string | ||
|
||
// Description returns the description of the metric. Should be a one or two sentence human-readable description. | ||
Description() string | ||
|
||
// Type returns the type of the metric. | ||
Type() string | ||
|
||
// Enabled returns true if the metric is enabled. | ||
Enabled() bool | ||
} | ||
|
||
// GaugeMetric allows specific values to be reported. | ||
type GaugeMetric interface { | ||
Metric | ||
|
||
// Set sets the value of a gauge metric. | ||
Set(value float64) | ||
} | ||
|
||
// CountMetric allows the count of a type of event to be tracked. | ||
type CountMetric interface { | ||
Metric | ||
|
||
// Increment increments the count by 1. | ||
Increment() | ||
|
||
// Add increments the count by the given value. | ||
Add(value float64) | ||
} | ||
|
||
// Quantile describes a quantile of a latency metric that should be reported. For a description of how | ||
// to interpret a quantile, see the prometheus documentation | ||
// https://github.com/prometheus/client_golang/blob/v1.20.5/prometheus/summary.go#L126 | ||
type Quantile struct { | ||
Quantile float64 | ||
Error float64 | ||
} | ||
|
||
// NewQuantile creates a new Quantile instance. Error is set to 1% of the quantile. | ||
func NewQuantile(quantile float64) *Quantile { | ||
return &Quantile{ | ||
Quantile: quantile, | ||
Error: quantile / 100.0, | ||
} | ||
} | ||
|
||
// LatencyMetric allows the latency of an operation to be tracked. Similar to a gauge metric, but specialized for time. | ||
type LatencyMetric interface { | ||
Metric | ||
|
||
// ReportLatency reports a latency value. | ||
ReportLatency(latency time.Duration) | ||
} |
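`NewQuantile` sets the error to 1% of the quantile, so the 0.5 quantile gets an error of 0.005. The sketch below reproduces `Quantile` and `NewQuantile` from the diff and adds a hypothetical `toObjectives` helper that converts quantiles into a `map[float64]float64`, which is the type of the `Objectives` field in `prometheus.SummaryOpts`. Whether the framework performs the conversion this way is an assumption.

```go
package main

import "fmt"

// Quantile and NewQuantile are copied from the diff above.
type Quantile struct {
	Quantile float64
	Error    float64
}

// NewQuantile creates a new Quantile instance. Error is set to 1% of the quantile.
func NewQuantile(quantile float64) *Quantile {
	return &Quantile{
		Quantile: quantile,
		Error:    quantile / 100.0,
	}
}

// toObjectives is a hypothetical helper. It maps each quantile to its allowed
// absolute error, the shape expected by prometheus.SummaryOpts.Objectives.
func toObjectives(quantiles ...*Quantile) map[float64]float64 {
	objectives := make(map[float64]float64, len(quantiles))
	for _, q := range quantiles {
		objectives[q.Quantile] = q.Error
	}
	return objectives
}

func main() {
	objectives := toObjectives(NewQuantile(0.5), NewQuantile(0.9), NewQuantile(0.99))

	// Iterate over a fixed slice because Go map iteration order is random.
	for _, q := range []float64{0.5, 0.9, 0.99} {
		fmt.Printf("quantile %.2f tolerates an absolute error of %.4f\n", q, objectives[q])
	}
}
```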
Should metrics filtering be something that is taken care of by the application or the component that is collecting the metrics?
When we were preparing for the (since abandoned) traffic generator, this seemed like a useful concept. I had added a large number of metrics, but there was concern that some of them were low bang for the buck (since it costs us $$$ to store them). We were planning on disabling a bunch of metrics with configuration changes, and turning them on in the future if we ever had an issue where they would be useful for debugging.
That being said, if people don't think this is a useful feature, it would be fairly straightforward to remove. What do others think?
Yeah, all I'm saying is that it's probably possible to filter the metrics and logs in a similar manner with the grafana agent.
This is pretty useful now, though, because we haven't figured out how to do that in the grafana agent. Regardless, if we want to save more money we would need to learn how to do it in the grafana agent, because we're not always running applications that allow metrics filtering in this manner.
Makes sense. I have removed the metrics blacklisting feature from this framework.