Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Discussion] Add 'snmp_' prefix to metrics? #104

Open
seanhoughton opened this issue Jan 6, 2017 · 42 comments
Open

[Discussion] Add 'snmp_' prefix to metrics? #104

seanhoughton opened this issue Jan 6, 2017 · 42 comments

Comments

@seanhoughton
Copy link

seanhoughton commented Jan 6, 2017

The first item in the exporter metric naming best practices list is

A metric name ... should have a (single-word) application prefix relevant to the domain the metric belongs to.

None of the metric names in the current file adhere to this practice, it probably makes sense to just automatically prefix each metric with snmp_

@seanhoughton seanhoughton changed the title Add 'snmp' Add 'snmp' prefix to metrics Jan 6, 2017
@brian-brazil
Copy link
Contributor

brian-brazil commented Jan 6, 2017

In the case of SNMP, all the names have already been well-established over decades and follow another naming scheme. Adding snmp_ would likely cause more harm than good.

@RichiH
Copy link
Member

RichiH commented Jan 7, 2017 via email

@SuperQ
Copy link
Member

SuperQ commented Jan 7, 2017

Correct, the individual metrics have long-established names, but by Prometheus policy these should be grouped into their domain. We shouldn't modify the metric name, simply prepend them.

snmp_ifHCInOctets for example makes it much easier to match when doing things like federation.

@brian-brazil
Copy link
Contributor

It's a guideline not policy, and SNMP is the canonical example of where the balance tips the other way.

You wouldn't be federating names like that, you'd be federating on the aggregation prefix.

@seanhoughton
Copy link
Author

I'll leave it up to you guys to decide, but the curse of "sure it doesn't follow convention and makes things more difficult, but that's the way we've always done it" has caused untold pain in my professional career. All of the other exporters provide a very clear prefix that makes browsing the available metrics for the exporter you're concerned with very easy. The snmp exporter is the only one in the bunch that scatters metrics all around because of the lack of a namespacing prefix. However, the nice thing about open source is that I can make a fork with the prefix added and go about my business :)

@brian-brazil
Copy link
Contributor

All of the other exporters provide a very clear prefix that makes browsing the available metrics for the exporter you're concerned with very easy.

I mentioned this when this previously came up, but this is not true in general. It's normal for there to be quite a bit of overlap in the metrics between all of your applications, so this is not something you can depend on.

@RichiH
Copy link
Member

RichiH commented Jan 26, 2017

For the record, after some more deliberations, I see prefixing snmp_ as a valid request. Having on shared namespace helps avoid confusion down the line.

Either way, this should either be implemented, or the issue closed.

@SuperQ
Copy link
Member

SuperQ commented Jan 26, 2017

I'm thinking about adding an optional prefix config item to the generator. That would solve it for everyone.

@brian-brazil
Copy link
Contributor

This should not be a configurable option, a given metric should only have one name.

No new arguments have been presented since this first came up, so I'm going to close it.

@RichiH
Copy link
Member

RichiH commented Jan 26, 2017 via email

@NotAFile
Copy link

My frustration with this is that it decreases the usefulness of autocomplete. For example, searching for sys does not suggest me sysUpTime, while searching prometheus_sd suggests me prometheus_sd_updates_total.

Because it breaks convention, also causes some inconveniences in some other tools like Grafana:

image

As you can see, Grafana groups metrics by their underscore-seperated names. Any use of the snmp exporter makes this menu useless, as it is now cluttered with hundreds of snmp entries. Now, arguably this is a Grafana issue, but it won't be the first tool that makes assumptions about the format of metric names that becomes less useful when you use the snmp exporter.

@RichiH
Copy link
Member

RichiH commented Jan 19, 2020 via email

@brian-brazil
Copy link
Contributor

There's still no plans to ever do that, and it'd be a major breaking change.

@robotmay
Copy link

I just found this issue and I still do not understand why it isn't a supported feature. Not namespacing the metrics is directly against the Prometheus conventions for naming metrics. It doesn't matter if you're following the SNMP convention because this isn't SNMP, it's a Prometheus exporter.

@robotmay
Copy link

If anyone else finds this issue looking for an actual solution, put the following in your job config in prometheus.yml:

    metric_relabel_configs:
    - source_labels: [__name__]
      regex: '(.*)'
      replacement: 'snmp_${1}'
      target_label: __name__

@RichiH RichiH changed the title Add 'snmp' prefix to metrics [Discussion] Add 'snmp_' prefix to metrics? Feb 25, 2021
@RichiH
Copy link
Member

RichiH commented Feb 25, 2021

We are planning to have a breaking release. This is a once-in-a-decade (seriously...) opportunity to revisit this topic. I am undecided myself as there are strong arguments either way, but we should deliberate it one last time.

Pro:

  • Fits into Prometheus patterns
  • Makes discovery of SNMP values easier

Con:

  • Breaks literally everything. All alerts, all dashboards, all everything
  • Makes grouping and such harder as labels names are e.g. ifDescr and grouping with snmp_ifDescr is painful. This alone might mean we might need to extend PromQL.

@RichiH
Copy link
Member

RichiH commented Feb 25, 2021

Put differently: A networking-focused org would likely tend to prefer no prefixes while a more general org would likely tend to prefer a prefix.

There's no good answer here, only a potentially least-bad one.

If in doubt, not breaking ALL the things over night is the preferable option.

@robotmay
Copy link

Could it just be a configurable option? I guess it's not especially difficult to use the relabel config above, but having something simple like prefix_metrics: true might make it more obvious 🤔

Also I have just reread my comments above and they're grumpier than I intended, so I apologise for that 😅

@SuperQ
Copy link
Member

SuperQ commented Feb 25, 2021

I've been thinking about adding an in-exporter metric relabel config, similar to Prometheus itself.

It's duplication of work and code, but it allows for simpler downstream. Plus we can fix other issues like the recent nulls in labels issues.

@momorientes
Copy link

Have you considered to make the module part of the prefix? IMHO this would be a huge benefit with regards to discovery.
i.e. having a prefix of snmp_apcups_ and snmp_paloalto_fw_ may come in quite handy. Not sure on the implications on prometheus.

@xkilian
Copy link

xkilian commented Feb 25, 2021

That last suggestion will break even more stuff.. I would respectfully suggest against it. ;-)

@xkilian
Copy link

xkilian commented Feb 25, 2021

I have to admit that having an snmp_ prefix would fix the autocomplete stuff. But if ever there was a time, this is it. It would break some dashboards on my end. Not that many, but still. A company that have LOTS of snmp dashboard, ouch. I think there needs to be good feedback to get the community pulse on this.

@NotAFile
Copy link

Have you considered to make the module part of the prefix? IMHO this would be a huge benefit with regards to discovery.
i.e. having a prefix of snmp_apcups_ and snmp_paloalto_fw_ may come in quite handy. Not sure on the implications on prometheus.

I think this is a very bad idea, as it makes it impossible to aggregate metrics across modules. For example, paloalto_fw and if_mib both have ifInOctets metrics, the prefix would prevent any reuse over devices.

@SuperQ
Copy link
Member

SuperQ commented Feb 25, 2021

Again, this is why I think metric_relabel_configs in the generator/snmp config is the right way to go. Users can add their own prefix, suffix, whatever they want.

Something like this:

modules:
  # Default IF-MIB interfaces table with ifIndex.
  if_mib:
    walk: [sysUpTime, interfaces, ifXTable]
    metric_relabel_configs:
    - source_labels: [__name__]
      target_label: __name__
      replacement: 'snmp_$1'

@RichiH
Copy link
Member

RichiH commented Feb 26, 2021

Module-specific prefixes are not a good idea, but the job label can help here.

Relabeling is very powerful, but it's also one of the pain point for users.

For a highly scoped question of snmp_ yes/no, it seems better to have an option if the users should be able to choose.

But: I do not believe we can support both at the same time. Sharing dashboards, alert rules, etc becomes next to impossible even we assume Mixins were magically widely used over night.

All that is a long-winded way of saying I think we will need to pick one or the other and live with it.

As such, it's more about finding the least bad, holding our noses, and doing it forevermore.

@xkilian
Copy link

xkilian commented Feb 27, 2021

I would vote to do it in a consistent way, so add snmp_ to all. Then if someone has way too many dashboards, they can use proposed override regex to fix it until they correct their dashboards. This will address the consistency and moving forward all published dashboards would use the new prefix.
We all agree that snmp_ prefix is better, so I think this would be way to do it. In the doc we include an example to revert the prefix.. All bases covered.

@RichiH
Copy link
Member

RichiH commented Mar 15, 2021

After a lot of thinking, I am leaning towards keeping as-is forevermore. I am reaching out to a few people to get actively challenged on this as my thinking might be too network-centric.

@RichiH
Copy link
Member

RichiH commented Mar 15, 2021

Note to self: If we don't do prefixes, we need to create docs, FAQ, etc.

@LukeSiX
Copy link

LukeSiX commented Mar 15, 2021

Hi, if you started with prometheus recently, you probably would vote for metric prefix, but if you have snmp and prometheus running for many years because you have an 5+ years retention in your TSDB this is a totally no go. This will completely break your dashboards and history for your metrics.

For one metric in the dashboard, you need to add two metric queries, one without the prefix and one with the prefix, because you want your graph to continue if you change the time window. You don't wanna change dashboards because of date X you changed the metric name or you don't wann break the graph line into multiple metrics and labels, because this will also break your calculations for trends/results on longterm. If there is no chance to rename alle your "snmp" metrics in your tsdb within a short time, as a longterm snmp/prometheus user this is just a big mess.

We will have no gain of this cosmetic snmp_ prefix change, really. I'm working for about 5 years now with snmp/prometheus, we have 10 year retention on the TSDB, I have never used this auto grouping of metrics in Grafana or really missed that "snmp_" was missing. That a metric was imported by snmp_exporter is a part of my label and if there are two different exporters who export the same "snmp" metric but via different protocol, I would prefer they have the same metric, so I can still use the same dashboard and use a label to switch between the exporters.

The migration from the old snmp_exporter to the new one with the generator, messed up so many metrics and ruined many longterm statistics.

Anyway, I understand the problem of the missing snmp_ and it should be well documented, especially how to fix it (config flag, or metric relabel) and what it will break if you change this option.
prefix-flag not configured (not in config file) -> no prefix;
prefix-flag configured: displayed in all example configuration files with enabled as default with short warning and link to documentation that it will cause you pain because you will mess up your tsdb history for your metric history.

Did you checked all other exporters in the public internet if they are using this prefix method?

These are just my first thoughts on this topic.

@bastischubert
Copy link
Member

I also don't really see a big gain in prefixing the metrics name with snmp_.

Just like @brian-brazil and @SuperQ stated before - the names have a history for decades and from my point of view these metric names should be in sync with what the MIB files declare.
People tend to use other monitoring systems as well, changing metric names will make it harder to co-operate with these systems as well.

@hairyhenderson
Copy link

I'll post this as a data-point, though I have no strong opinion either way as I haven’t used the snmp_exporter in a while (my last SNMP-capable device died a couple years ago).

The lack of snmp_ prefix was always disorienting and annoying to me as a common pattern for discovery is to just start typing and let auto-complete guide me - I’m probably not unique in that 😉. Once I discovered all the metrics I needed for the dashboards and alerts I wanted to build it wasn't a huge deal. More of an aesthetic annoyance than anything else.

Bigger-picture, changing metric names is obviously not ideal (it is a major breaking change, after all), but isn't without precedent either. I've gotten past these before. IMHO the approach should tend more towards "let's make it easier for people to find uses of deprecated metric names so they can fix it" rather than "let's never change metric names ever because the tooling is lacking."

Again, my opinion on this is not strongly-held, and I just wanted to offer some personal experience.

@NotAFile
Copy link

NotAFile commented Mar 15, 2021

For me personally, it depends on what Grafana and other tools do. If Grafana can be made to handle the snmp metric names better than they currently are without adding a prefix, I don't see any need for this change. That didn't seem very likely to me in the past, but with @RichiH's recent focus on this area, as well as improved metadata APIs, things might be different now.

@PrestonTaylor
Copy link

PrestonTaylor commented Mar 18, 2021

I'm seriously having difficulty understanding why this discussion got past "why not make it configurable?", who cares why people want what they want and if it adheres to business religion or engineer religion, if it's configurable, it's totally irrelevant. This wouldn't be the first exporter to do it nor the last. As to sharing configs and dashboards, I don't think anyone should concern themselves with supporting blind copy pasting in something as technical as this, it's already well beyond impossible for all but the top echelon of most common scenarios.

@xkilian
Copy link

xkilian commented Mar 18, 2021

That was heart felt. I think the debate is more on the side of what should be the new default and document how to configure it to get the alternate way.

I think the new default should be with a prefix and that sample config be provided for those that want to keep it same as before. As you said, configurable.

@julian3xl
Copy link

Hi, I fully agree with @PrestonTaylor so this is my approach #639

@RichiH
Copy link
Member

RichiH commented Apr 8, 2021

@NotAFile I am wondering about a heuristic of job names starting with snmp* in the tooling.

@PrestonTaylor having done user support around this for half a decade, copy and paste is the way to go; and more ready-made configurations are needed for the current wave of adopters.

@sdboyer thoughts on medium term pain assuming mixins for snmp_exporter are widely available?

@sdboyer
Copy link

sdboyer commented Apr 8, 2021

Reading this over for the first time now. (Totally not a networking person). First, i agree with this as the fundamentally correct baseline:

This should not be a configurable option, a given metric should only have one name.

At the level of the exporter, a given metric should have only one name, and should not offer options for conditionally changing the name. Exporters are where names are canonically decided, and across expositions should vary at most in whether a metric is expressed, but never the name it is expressed under.

Mixins technically can "help" with this today - you can write the jsonnet such that the prefix is conditionally present on all dashboards, alerts, etc. But this is a major hack, to the point where it's plausible that a next-gen approach to mixins we're currently working on simply won't allow futzing with metric names in this way. That's because we view it as outside the scope of a mixin's responsibility to define what amounts to an alias. (It also...might end up being indirectly possible through different means. Hard to say rn. But point is, nobody really thinks "decide the metric name in the mixin" is a good idea.)

IMO, @xkilian has hit the best compromise here.

I think the new default should be with a prefix and that sample config be provided for those that want to keep it same as before. As you said, configurable.

This achieves configurability, but as a property of the processing pipeline, rather than the exporter itself. The exporter generates metric names that are consistent with prefixing standards, and a straightforward option is available to those who want backwards compatibility with the historical form. Mixin should then be built on the assumption of the presence of the prefix.

@NotAFile
Copy link

NotAFile commented Apr 8, 2021

@RichiH

I am wondering about a heuristic of job names starting with snmp* in the tooling.

That could definitely work, I had considered that. However, I wonder if that isn't a bit too undiscoverable and magic. There's some other ideas I had (in increasing effort):

  • Apply this generally to all metric names that contain mixedCase instead of underscores. Probably the best, unless there's some edge case that prevents it.
  • Allow grouping by job instead of metric prefix. This has other issues, but is frequently what I want.
  • Add a naming convention hint to the metric metadata. Meh.
  • Expand the metric metadata with an outright category name to aid browsing, like "go gc metrics". That removes the need to round-trip this information through the name at all. This is almost definitely excessive, but does solve the problem of dividing up the metric namespace more generally.

This all goes far beyond the scope of the snmp exporter though.

@PrestonTaylor
Copy link

At the level of the exporter, a given metric should have only one name, and should not offer options for conditionally changing the name. Exporters are where names are canonically decided, and across expositions should vary at most in whether a metric is expressed, but never the name it is expressed under.

Why? Who is comparing metrics between organizations and if they are why are they expecting them to be identical? What practical reason is there for this requirement?

@pphysch
Copy link

pphysch commented May 11, 2023

This should be an optional feature.

Users should not have to bend over backward and customize every SNMP exporter job with metric_relabel_configs to achieve consistency with the Prometheus/OM ecosystem.

Disable by default for backwards-compatibility, if necessary.

SNMP purists should have no trouble extracting verbatim names with their favorite text processing tools, should need arise. Massaging data to/from SNMP is practically a given in my experience, due to the wide variety of SNMP/MIB implementations. I even have to do some post-processing on my generated snmp.yml to fix some buggy OIDs (bad types/indices).

Also, it's possible to use metric_relabel_configs to go backwards and drop the prefixes.

@SuperQ
Copy link
Member

SuperQ commented May 11, 2023

Why? Who is comparing metrics between organizations and if they are why are they expecting them to be identical? What practical reason is there for this requirement?

Lots of people expect things to "just work". There are explicit tools for creating standard dashboards. For example in this repo there are dashboard mixins provided for this exact use case. Many users expect to download dashboards from the Internet and expect them to "just work". If RFC standard MIBs IF-MIB metric ifHCInOctets differ from org to org, those dashboards will break and users will be confused and complain.

That's why the opinion of "only one name" came to be.

However, I think users should be allowed to customize things for their own use. We can add features even if they mean users will get smacked in the face from time to time.

@julian3xl
Copy link

Two years using a fork because this silly discussion…

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests