Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ttljob: only add labels when ttl_label_metrics is set #77567

Merged
merged 1 commit into from
Mar 14, 2022

Conversation

otan
Copy link
Contributor

@otan otan commented Mar 9, 2022

Follow up to this slack convo.

Release justification: high benefit change to new stuff

Release note (sql change): TTL metrics are labelled by relation name if
SET CLUSTER SETTING server.child_metrics.enabled=true; is set and
the ttl_label_metrics storage parameter is set to true. This is to
prevent a potentially unbounded cardinality on TTL related metrics.

Follow up to [this slack convo](https://cockroachlabs.slack.com/archives/C0168LW5THS/p1645556875126379).

Release justification: high benefit change to new stuff

Release note (sql change): TTL metrics are labelled by relation name if
`SET CLUSTER SETTING server.child_metrics.enabled=true;` is set and
the `ttl_label_metrics` storage parameter is set to true. This is to
prevent a potentially unbounded cardinality on TTL related metrics.
@otan otan requested review from rafiss and a team March 9, 2022 20:30
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@ajwerner
Copy link
Contributor

ajwerner commented Mar 9, 2022

this runs afoul of my general concern that letting users create label-values in a cloud setting is a big-time problem for the stability of the system as a whole. I'm willing to be overridden, but I want whatever party does that to understand the implications.

@otan
Copy link
Contributor Author

otan commented Mar 9, 2022

ah, i may have misread your slack message a little bit and this would implement what you suggested there:

My middle-ground suggestion would be to add some syntax (via one of those storage parameters we’ve embraced) to set the metric label

but yes i do understand the concern, but at the same time these metrics seem pretty important in understanding TTL performance (cc @vy-ton in case you want to weigh in about the acceptability of the tradeoff/implicaitions)

@ajwerner
Copy link
Contributor

ajwerner commented Mar 9, 2022

I did say it, but then I tried to walk it back with:

tl;dr I’d be happy with a constant number of tier labels or something and scared of anything more dynamic and downright opposed to anything that creates labels for users without them going out of their way to do it.

I'm scared of this, but not downright opposed.

My preference would be to create a finite number of labels and let users map tables to them.

@otan
Copy link
Contributor Author

otan commented Mar 10, 2022

My preference would be to create a finite number of labels and let users map tables to them.

i'm not a big fan of this, seems counterintuitive and not user friendly, but i understand why it's suggested :). my preference is the way presented in this PR.

anyone want to be the tiebreaker? but i bet i know what @vy-ton would say ;)

@vy-ton
Copy link
Contributor

vy-ton commented Mar 11, 2022

My preference would be to create a finite number of labels and let users map tables to them.

I actually don't quite understand this option so will schedule time to chat next week.

@otan
Copy link
Contributor Author

otan commented Mar 11, 2022

My preference would be to create a finite number of labels and let users map tables to them.

I assume this means we have some hardcoded set of labels , e.g. ttl_1, ttl_2, ... ttl_10
and if we want per-table metrics for TTL, we'd set something like SET (ttl_metric_label = 'ttl_1') for the metrics to be reported under a specific label.

@ajwerner
Copy link
Contributor

I prefer this to having the labels added automatically. I didn't realize you had already done the automatic thing.

@ajwerner
Copy link
Contributor

Okay, I'm on board. I have modest worries about people going nuts with this, but we can try to deal with that later.

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not a big fan of this, seems counterintuitive and not user friendly, but i understand why it's suggested :). my preference is the way presented in this PR.

fwiw, I agree. I didn't like the respective option, with a predefined list of metric names, in the changefeed case either.

But, I would be curious what options we have in CC for configuring our Prometheus for forget about timeseries that haven't been updated in a while (in particular, forget their names for the purposes of whatever indexes it maintains). I'm thinking about tables whose names keep changing. Oliver, it might worth be looking into it.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)

@otan
Copy link
Contributor Author

otan commented Mar 14, 2022

bors r+

maybe one day we can enjoy the pleasures of high cardinality systems :')

But, I would be curious what options we have in CC for configuring our Prometheus for forget about timeseries that haven't been updated in a while (in particular, forget their names for the purposes of whatever indexes it maintains). I'm thinking about tables whose names keep changing. Oliver, it might worth be looking into it.

i will follow up

@craig
Copy link
Contributor

craig bot commented Mar 14, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 14, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants