Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add exemplars to counter and histogram #706

Merged
merged 2 commits into from
Jan 24, 2020
Merged

Add exemplars to counter and histogram #706

merged 2 commits into from
Jan 24, 2020

Conversation

beorn7
Copy link
Member

@beorn7 beorn7 commented Jan 20, 2020

This still needs tests. I'd like to show it to you already, though, to find out if this is the right approach.

If it works for you, I'll add tests before merging this.

@beorn7 beorn7 requested review from brian-brazil and cstyan January 20, 2020 16:07
Copy link
Contributor

@brian-brazil brian-brazil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a 64 character limit on exemplar label names and values to prevent abuse. That should probably be caught somewhere here, before it causes the scrape to fail.

prometheus/counter.go Outdated Show resolved Hide resolved
prometheus/histogram.go Outdated Show resolved Hide resolved
@beorn7
Copy link
Member Author

beorn7 commented Jan 20, 2020

Thanks, addressed both of the comments, but not yet the 64 character limit.

Is that just the length of the label names and values added up, or does it include the curly braces and quotes and commas and perhaps even escaping? Is it in runes or in bytes?

@brian-brazil
Copy link
Contributor

Is that just the length of the label names and values added up, or does it include the curly braces and quotes and commas and perhaps even escaping?

Just the names and values, before escaping.

Is it in runes or in bytes?

Runes. Everything else in OM is about UTF-8 characters, so it'd seem odd to use bytes here.

@beorn7
Copy link
Member Author

beorn7 commented Jan 21, 2020

Thanks. Working on implementing the limit…

@beorn7
Copy link
Member Author

beorn7 commented Jan 21, 2020

Limit implemented. I assume the general direction is fine and will add tests now.

@beorn7 beorn7 force-pushed the beorn7/exemplars branch 2 times, most recently from 719f3e9 to 502f6a5 Compare January 24, 2020 12:39
@beorn7 beorn7 marked this pull request as ready for review January 24, 2020 12:39
@beorn7
Copy link
Member Author

beorn7 commented Jan 24, 2020

This is now ready for review.

Copy link
Contributor

@brian-brazil brian-brazil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

prometheus/counter.go Outdated Show resolved Hide resolved
// Labels will lead to a valid (label-less) exemplar. But if Labels is
// nil, the current exemplar is left in place. This method panics if the
// value is < 0, if any of the provided labels are invalid, or if the
// provided labels contain more than 64 runes in total. AddWithExemplar
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mightn't hurt to put numbers on this. What you and I consider costly for instrumentation isn't the same as everyone else, given that e.g. sending a udp packet on every event or using client-side quantiles is common.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, those numbers would be quite involved. It's not just runtime, but also that the creation of the Exemplar needs a number of allocations.

Re-reading the current wording of the doc comment, I find it now a bit confusing myself. It would probably best to explain the usual strategy for really hot paths (stochastically do some observations with and most without exemplar or even do some more sophisticated adaptive rate limiting), but then it would get way too involved for a doc comment.

I removed the sentence about the performance impact for now. This is after all still an experiment. (I'll mark it as such in the changelog.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought was mainly to avoid scaring people away from using this for reasons that they don't actually care about. If they really care about a hot path, they'll probably have their own micro-benchmarks anyway.

@@ -78,6 +89,9 @@ type counter struct {
desc *Desc

labelPairs []*dto.LabelPair
exemplar atomic.Value // *dto.Exemplar
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

full stop

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that looks weird as this comment is not in English but in Go.

To resolve the ambiguity here, I changed it to actual English: Always a *dto.Exemplar.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or more precisely: Containing nil or a *dto.Exemplar.

case PanicOnError:
panic(err)
case ContinueOnError:
// Handled later.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning here would make more sense now with the refactoring.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought so, too, at first. But it would be redundant with the return in L191. I had it first there, and also had a default clause to explicitly handle all possibilities (which then doesn't require the final return in the function anymore), but then thought that's also redundant.

On the other hand, to really do the shortest possible code, the ContinueOnError could be completely omitted. I did that now, with a short comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants