Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

actually describe PrometheusSink descriptors #129

Merged
merged 2 commits into from
Oct 19, 2021

Conversation

acpana
Copy link
Contributor

@acpana acpana commented Oct 12, 2021

Overview

Stumbled across this as I was trying to instantiate two TestAgents in consul. See hashicorp/consul#11273 for more context around that papercut.

It all comes down to registering more than on PrometheusSink on a registerer.

Today, that's not possible since we don't actually implement .Describe(). send the same descriptor regardless of the sink being used.

This approach runs over all the gauges, summaries and counters and sends .Desc() to the chan.

It also adds a "dummy" value to register "empty" sinks but it relies on a new field: PrometheusSink.name.

This approach introduces a new field, PrometheusSink.name, and creates a descriptor around it in order to avoid "collisions" with other PrometheusSinks.

Testing

  • unit

Related Issues

Copy link

@markan markan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good. My biggest concern is any back compatibility issues, but this looks pretty safe.

prometheus/prometheus.go Outdated Show resolved Hide resolved
Copy link
Contributor

@dnephin dnephin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not all that familiar with these Prometheus interfaces, so I was reading over https://pkg.go.dev/github.com/prometheus/[email protected]/prometheus#Collector, specifically the docs for Describe, and I noticed this:

This method idempotently sends the same descriptors throughout the lifetime of the Collector

I'm not sure if I am interpreting this correctly, but I think that might suggest a problem with this implementation. We will still have some ephemeral metrics that are not registered when the process starts. When those metrics are emitted our Describe method will start returning different descriptions. I'm not sure what problem that will cause, but it seems like the implementation in this PR might violate that requirement.

I have another idea for how we might be able to solve this issue, I'll comment in the linked Consul issue to see if that might work or not.

prometheus/prometheus.go Show resolved Hide resolved
@acpana
Copy link
Contributor Author

acpana commented Oct 13, 2021

We will still have some ephemeral metrics that are not registered when the process starts. When those metrics are emitted our Describe method will start returning different descriptions. I'm not sure what problem that will cause, but it seems like the implementation in this PR might violate that requirement.

Yes, that's spot on -- thanks for the call out!

So funcs like:

...
func (p *PrometheusSink) SetGaugeWithLabels(parts []string, val float32, labels []metrics.Label) {
...
func (p *PrometheusSink) AddSampleWithLabels(parts []string, val float32, labels []metrics.Label) {
...
func (p *PrometheusSink) IncrCounterWithLabels(parts []string, val float32, labels []metrics.Label) {
...

store metrics outside of the respective init* counterparts. If we call .Describe() before any of the funcs above are called (and we do call them over in consul land), the set of the return values will be different than when we call .Describe() after those funcs were called.

I'm interpreting this behavior in two ways:

  • it is consistent with the spec -- as long as the Collector is the same, Describe() idempotently returns a super set of the metrics. Since we modify/ mutate the Collector metrics in the funcs above, for these new mutations, the .Describe() impl still respects the spec

or

  • it is inconsistent with the spec -- if we call at T_0 and T_1, T_1 > T_0, there's a chance that .Describe() may return different super sets;, a very non idempotent behavior, regardless of the fact that the Collector metrics may have changed.

Pragmatic Hat

As far as I can see, we only really care about .Describe() when registering/ unregistering a .Collector, although others may relay more on the func, hence the need to make a change that doesn't break existing behavior.

Maybe then we could go with @markan's suggestion and just rely on the Name:

func (p *PrometheusSink) Describe(c chan<- *prometheus.Desc) {
	prometheus.NewGauge(prometheus.GaugeOpts{Name: p.name, Help: p.name}).Describe(c)
}

or something else that uniquely identifies the Collector and will not change as we continue to store metrics.

@banks
Copy link
Member

banks commented Oct 13, 2021

My read of the Prometheus docs aligns with Daniels. It would seem their expectations around Describe are that different collectors are collecting different sets of metrics and not just the same metrics but for a different instance of a sub-process.

Does the alternate solution posed elsewhere of having a separate Registerer per sub-process work better?

The Registry docs say:

 	// Register registers a new Collector to be included in metrics
	// collection. It returns an error if the descriptors provided by the
	// Collector are invalid or if they — in combination with descriptors of
	// already registered Collectors — do not fulfil the consistency and
	// uniqueness criteria described in the documentation of metric.Desc.

Which to me implies that the library expects all collectors within one registry to be "consistent" that is not emit the same metric from multiple places, while multiple custom Registerer implementations may be used in cases where there is a need for multiple sets of collectors emitting equivalent metrics which seems to be our case when we have multiple agents running in test processes.

Since the Registerer can already be passed in to the Prometheus sink , I think it's already possible to solve this in the calling code. If the boilerplate for doing that is significant and we think it's a pattern other users of this library are likely to need too, it may be worth adding a helper here that makes creating a new instance of the sink with a custom registerer simpler?

@acpana acpana force-pushed the describe-descriptors branch from 06ef1a3 to d847a29 Compare October 14, 2021 22:42
@acpana
Copy link
Contributor Author

acpana commented Oct 14, 2021

hey @dnephin @banks @markan thanks all for the engagement here.

updated the PR to only use the PrometheusSink.Name descriptor to have Describe() be idempotent. Also added a quick note to summarize our chats here in code to make it easier for the next person!

Copy link
Contributor

@dnephin dnephin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, LGTM! Just one small style recommendation

prometheus/prometheus_test.go Outdated Show resolved Hide resolved
prometheus/prometheus_test.go Outdated Show resolved Hide resolved
@banks banks merged commit d1e5690 into hashicorp:master Oct 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

collector collisions
5 participants