Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow configuring SNMP version and credentials through parameters #762

Open
daenney opened this issue May 30, 2022 · 22 comments
Open

Allow configuring SNMP version and credentials through parameters #762

daenney opened this issue May 30, 2022 · 22 comments

Comments

@daenney
Copy link
Contributor

daenney commented May 30, 2022

With the current design, the SNMP version to use, the community string and auth credentials (for v3) are part of the module that the generator creates. If my understanding of how this works is correct, it implies that if you want to scrape 2 devices but they have a different SNMP community (or one that has been reconfigured away from the 'public' default), you end up needing to add a new module.

This results in a ton of duplication in snmp.yml, where the only difference is the version field or some keys of the auth dictionary. For example a couple of my D-Link switches are perfectly happy to be queried using if_mib, but only support SNMPv1.

This would add a bunch of parameters on

snmp_exporter/main.go

Lines 72 to 87 in 8678b60

target := query.Get("target")
if len(query["target"]) != 1 || target == "" {
http.Error(w, "'target' parameter must be specified once", 400)
snmpRequestErrors.Inc()
return
}
moduleName := query.Get("module")
if len(query["module"]) > 1 {
http.Error(w, "'module' parameter must only be specified once", 400)
snmpRequestErrors.Inc()
return
}
if moduleName == "" {
moduleName = "if_mib"
}
like version, auth_community etc. if there's a need to override the default for a target.

I'm wondering if:

  • If there's a better way to go about this (i.e is there some way to do this without needing to go through separate modules)?
  • This is a change that the maintainers would be willing to accept?
@SuperQ
Copy link
Member

SuperQ commented May 30, 2022

I think we already have a proposal for splitting the auth and walk/metric mapping in the config. This would allow for less redundancy in the configuration.

We do not want to support passing community/auth params directly via the URL as this is extremely insecure as Prometheus does not protect scrape URLs as secrets.

@daenney
Copy link
Contributor Author

daenney commented May 30, 2022

Ah, I think you're referring to #619?

@SuperQ
Copy link
Member

SuperQ commented May 30, 2022

Yes, that's the one.

@daenney
Copy link
Contributor Author

daenney commented May 31, 2022

That looks to have stalled a bit from what I can gather.

I can completely understand not wanting to expose the SNMPv3 credentials through a URL, but given the version isn't secret and the community string is essentially public, would a PR to add those in the mean time be acceptable?

@RichiH
Copy link
Member

RichiH commented May 31, 2022

I disagree that "the community string is essentially public".

Another reason why this approach is cited as an anti-pattern in other places is that if you control the content sent to targets via URL, not via config access, you can exploit remote vulnerabilities; made even worse by that fact that exporters are usually in privileged networks / allowlisted in ACLs/firewalls, etc. Access to said exporters is commonly allowed from less privileged networks.

Would you be willing to work on a PR implementing #619 to pass auth & change auth that way?

@daenney
Copy link
Contributor Author

daenney commented May 31, 2022

So looking at #619 I'm a bit fuzzy on what would the solution would end up looking like.

The Google doc states the issue about auth being stored in snmp.yml, but no accepted solutions. I'm guessing some discussion happened somewhere, based on your comment on the 15th of March last year, but it's still not entirely clear what we'd actually want. From #85 I got a distinct impression of a lack of desire from maintainers to support multiple configuration files/loading, but it seems that's what #619 is currently proposing? And what's meant by "directory support"?

If someone could spec out an example of what a "new style" configuration would look like and how we envision overriding auth for a module would work, I might get a better sense of we're building towards.

@RichiH
Copy link
Member

RichiH commented May 31, 2022

I know we documented it somewhere, but I am unable to find it off-hand. Re-typing from memory:

Old:

if_mib:
  version: 2
  auth:
    community: SomeCommunityString
  walk:
  - 1.3.6.1.2.1.2
  - 1.3.6.1.2.1.31.1.1
  get:
  - 1.3.6.1.2.1.1.3.0
  metrics:
  - name: sysUpTime

New, after the PR:

auth:
  foobar:
    version: 2
    community: SomeCommunityString
if_mib: 
  walk:
  - 1.3.6.1.2.1.2
  - 1.3.6.1.2.1.31.1.1
  get:
  - 1.3.6.1.2.1.1.3.0
  metrics:
  - name: sysUpTime

As you can see this poses an interesting problem as auth and all modules are on the same level now. So a format which should be supported as well, and would become the newly recommended way with the next release, would be:

auth:
  foobar:
    version: 2
    community: SomeCommunityString
module:
  if_mib: 
    walk:
    - 1.3.6.1.2.1.2
    - 1.3.6.1.2.1.31.1.1
    get:
    - 1.3.6.1.2.1.1.3.0
    metrics:
    - name: sysUpTime

@daenney
Copy link
Contributor Author

daenney commented May 31, 2022

That definitely helps, thanks!

One thing I'm still wondering about, how do we make this reusable? Imagine a scenario where for whatever reason there's 2 sets of devices, both queryable using if_mib but one needs SNMPv1 and the other SNMPv2 (this happens to be my reality because some D-Link devices are terrible).

Based on the new proposed format, though we can lift the auth out from the OID configuration, it would seem we can still only have one, globally that needs to apply to all devices of a single snmp_exporter instance? Right now I can solve this by defining an additional module which is if_mib but with the version overriden, avoiding the need to run multiple SNMP exporters. How does that work in the new case, or is it not intended to support that?

@RichiH
Copy link
Member

RichiH commented Jun 1, 2022

You would use
http://localhost:9116/snmp?module=if_mib&target=1.2.3.4&auth=foo, and
http://localhost:9116/snmp?module=if_mib&target=2.3.4.5&auth=bar.

@daenney
Copy link
Contributor Author

daenney commented Jun 1, 2022

Right, yes. I have no idea why this didn't click for me earlier. This makes sense. I might take a stab at a PR for this.

On the format, you mentioned initially still supporting the format where the modules were at the root, i.e not children of the modules key.
Though I can see some benefit to it, it is a one-line change in the configuration, whereas having the code to support both formats would be more work and we'd potentially need to define what to do if we have a clash (such as if_mib at the root but also modules.if_mib). I'm wondering if it's really necessary to be that cautious, or if we can swap to the new format immediately and forego supporting both styles. The release notes of 0.20.0 already mention that future releases will have breaking configuration format changes, it seems we should take advantage of that?

@SuperQ
Copy link
Member

SuperQ commented Jun 2, 2022

We would make a new version that did not support the old style, only the new style. Users would be required to convert their configs.

@RichiH
Copy link
Member

RichiH commented Jun 2, 2022 via email

@daenney
Copy link
Contributor Author

daenney commented Jun 2, 2022

We seem to have the same problem with WalkParams in general. If we only extract Auth, then we're still keeping timeout, retries etc. inside a module. It feels like that too ought to be configurable by target as some devices can respond much slower than others etc. I don't think they should be part of the auth, since many devices might share the same credential set but need a different timeout value configured for the walk.

That seems to leave us with the need for another top-level key. I can't really come up with a descriptive name beyond (walk)parameters so suggestions are very welcome or if someone has an idea about a better approach for it.

The final format would then be:

auth:
  foo: {}
parmeters:
  foo: {}
modules:
  if_mib: {}

A combination of module, auth configuration and walk parameters can then be selected through URL args.

@daenney
Copy link
Contributor Author

daenney commented Jun 2, 2022

I'm also wondering if we should stick version in auth or in parameters. The version dictates something about auth, but SNMPv3 also introduced other capabilities (even though we don't use them in this case, so from our point of view it's functionally v2 + fancier auth).

@RichiH
Copy link
Member

RichiH commented Jun 2, 2022

Retries and timeout seem linked to the module, to me.

Even if it's the same hardware, some with full tables and some with route reflector clients, this is arguably a different model from the point of view of the operator and thus Prometheus.

@daenney
Copy link
Contributor Author

daenney commented Jun 2, 2022

I don't quite see how. From where I'm standing a module should define what to query, and potentially how to associate data with each other. Why would retries and timeouts be part of that?

@daenney
Copy link
Contributor Author

daenney commented Jun 7, 2022

Thinking more about this, it feels like retry and timeout should be query parameters on the target, instead of a named parameter group. So you'd end up with: http://localhost:9116/snmp?module=if_mib&target=1.2.3.4&auth=foo&timeout=10&retry=5 instead of &params=annoying-noncompliant-devices.

One of the main reasons the current format is frustrating and results in a lot of duplication is that it confounds the what to query and the how to query. Without cleanly splitting that up we'll continue to have tons of duplication in the modules. You could keep a default for timeout and retry in the module I suppose, but you need some kind of way to override it.

@nelsonjchen
Copy link

nelsonjchen commented Jul 27, 2022

Another reason why this approach is cited as an anti-pattern in other places is that if you control the content sent to targets via URL, not via config access, you can exploit remote vulnerabilities; made even worse by that fact that exporters are usually in privileged networks / allowlisted in ACLs/firewalls, etc. Access to said exporters is commonly allowed from less privileged networks.

Weird thought, but what if URLs or certain data could be pre-signed with encrypted credentials and the authentication data in them is also encrypted with those credentials? The authentication content sent to the targets could then not be modified or tampered with without access to some key that is there when the URL is generated.

Admittedly, I haven't really thought out or fleshed out this idea but it's like pre-signed AWS API, S3, and so on URLs/requests in my head.

@node17
Copy link

node17 commented Oct 23, 2022

With the current design, the SNMP version to use, the community string and auth credentials (for v3) are part of the module that the generator creates. If my understanding of how this works is correct, it implies that if you want to scrape 2 devices but they have a different SNMP community (or one that has been reconfigured away from the 'public' default), you end up needing to add a new module.

This results in a ton of duplication in snmp.yml, where the only difference is the version field or some keys of the auth dictionary. For example a couple of my D-Link switches are perfectly happy to be queried using if_mib, but only support SNMPv1.

This would add a bunch of parameters on

snmp_exporter/main.go

Lines 72 to 87 in 8678b60

target := query.Get("target")
if len(query["target"]) != 1 || target == "" {
http.Error(w, "'target' parameter must be specified once", 400)
snmpRequestErrors.Inc()
return
}
moduleName := query.Get("module")
if len(query["module"]) > 1 {
http.Error(w, "'module' parameter must only be specified once", 400)
snmpRequestErrors.Inc()
return
}
if moduleName == "" {
moduleName = "if_mib"
}

like version, auth_community etc. if there's a need to override the default for a target.

I'm wondering if:

* If there's a better way to go about this (i.e is there some way to do this without needing to go through separate modules)?

* This is a change that the maintainers would be willing to accept?

how to add new module ? and how it is works?

@JDA88
Copy link

JDA88 commented Nov 22, 2022

Thinking more about this, it feels like retry and timeout should be query parameters on the target, instead of a named parameter group. So you'd end up with: http://localhost:9116/snmp?module=if_mib&target=1.2.3.4&auth=foo&timeout=10&retry=5 instead of &params=annoying-noncompliant-devices.

Having timeout and retry as parameters would save us so many duplicate! If it's possible please introduce this change :)

daenney added a commit to daenney/snmp_exporter that referenced this issue Nov 23, 2022
This makes retries, timeout and SNMP version configurable per job. It
overrides the module's equivalent WalkParams if they're present as query
attributes.

Partially addresses: prometheus#762
@RichiH
Copy link
Member

RichiH commented Jan 27, 2023

Both timeout and retry might be used for DoS attacks so they shouldn't be part of the URL; but they could be split out into e.g. the auth module, or a new optional connection module.

@SuperQ
Copy link
Member

SuperQ commented Jul 10, 2023

We've now implemented splitting modules and auths in #859

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants