
Scrapers - Capture host and port of monitored endpoint as resource attributes #7081

Closed
djaglowski opened this issue Jan 7, 2022 · 9 comments · Fixed by #8266
Assignees
Labels
comp:prometheus (Prometheus related issues), enhancement (New feature or request), spec:metrics

Comments

@djaglowski
Member

Metric scrapers typically pull data from a known host and port. The metrics produced by a scraper would be more useful if placed in the context of the resource they describe. Therefore, scrapers should capture these values as resource attributes, when possible.

The semantic conventions define the attributes net.host.name and net.host.port here, and I believe these are appropriate for this use case.

Additional Context

Prometheus produces telemetry with essentially the same information via the instance label, although it is a combined field, defined as the <host>:<port> part of the target's URL that was scraped.

The prometheusreceiver splits the instance label and saves the host and port as host.name and port. We likely want to use the same attribute names in both the prometheus and non-prometheus cases, but perhaps some prometheus experts can shed light on whether there is a distinction here I'm missing.

@djaglowski added the enhancement (New feature or request) and spec:metrics labels on Jan 7, 2022
@djaglowski
Member Author

@Aneurysm9, @dashpole, as code owners of the prometheusreceiver, do you have an opinion on this?

@dashpole
Contributor

dashpole commented Jan 7, 2022

Are the net semantic conventions only for tracing, since they are in the trace directory?

host.name and net.host.name seem identical. Are there other scraping receivers in the collector that we can follow?

@djaglowski
Member Author

Are the net semantic conventions only for tracing, since they are in the trace directory?

This is a good point. I don't think we are necessarily beholden to them in the context of metrics. My assumption is that we should align across signal types wherever reasonable though.

host.name and net.host.name seem identical. Are there other scraping receivers in the collector that we can follow?

I'm not aware of any others.

@Aneurysm9
Member

Resource attributes should be common to all signal types, I would think, as they describe the resource that produced the signal. I'm not sure that the host.name resource attribute is correct, since it is an address from the caller's perspective and may not relate to the hostname of the target system at all.

That said, there don't seem to be any resource attributes that are appropriate here and, to the extent that we're deriving them from information available on the scraping side and not having the resource identify itself, we probably will not have any good options.

@djaglowski
Member Author

an address from the caller's perspective and may not relate to the hostname of the target system at all.

That's true, but I think it would often be correct. When it's not, I would think it's still the closest we can get to meaningfully identifying the resource from which the metrics were pulled. To speculate a bit: is this why prometheus uses the more generic term instance, to allow for a degree of uncertainty?

@jsuereth
Contributor

Are the net semantic conventions only for tracing, since they are in the trace directory?

They're actually used for HTTP metrics as well.

In this instance, net.peer.ip and net.peer.port are more representative of what's going on (pulling from a remote process/server).

@dashpole
Contributor

They're actually used for HTTP metrics as well.

Ah, perfect. Then I think those conventions should apply to the prometheus receiver.

In this instance, net.peer.ip and net.peer.port are more representative of what's going on (pulling from a remote process/server).

I would agree if this were a metric about the prometheus receiver itself, but I'm not sure it makes sense for metrics scraped from an application.

If the collector is doing resource detection on a metric produced by an application, I think attributes should be detected from the perspective of the application, rather than the perspective of the collector. The prometheus endpoint is technically the peer of the collector, but I would find it confusing to see a net.peer.ip resource attribute on a go_goroutines metric (my goroutine has a peer?). Additionally, that would mean that changing where resource detection happens (application vs collector) would change the resources detected. That doesn't impact prometheus, as prometheus format doesn't have a notion of resource labels, and thus can't do resource detection in the application, but it would matter for OTLP, for example.

@quentinmit
Contributor

If the canonical attributes are net.host.name and net.host.port, how would you annotate metrics retrieved from a Unix socket? The socket path in net.host.name and no port specified?

@djaglowski
Member Author

If the collector is doing resource detection on a metric produced by an application, I think attributes should be detected from the perspective of the application, rather than the perspective of the collector.

+1. I think this supports the original proposal of using net.host.name and net.host.port?

how would you annotate metrics retrieved from a Unix socket? The socket path in net.host and no port specified?

I would be in favor of setting net.host.name from os.Hostname, and also capturing the socket path in an appropriately specific attribute. Trace semantic conventions suggest using net.peer.name for a socket path (along with setting net.transport to unix).

@dashpole added the comp:prometheus (Prometheus related issues) label on Mar 9, 2022
@dashpole self-assigned this on Mar 9, 2022
animetauren pushed a commit to animetauren/opentelemetry-collector-contrib that referenced this issue Apr 4, 2023
…y#7081)

This change simplifies the generated pdata code so that it does not wrap orig fields in the internal package for structs that are not used by other packages.

The code generator is adjusted to generate wrapped or unwrapped code only for structs that need it, based on the package name. The only exception is the `Slice` struct, which was pulled from the generator because:
- We don't have, and don't expect to have, any new slices that are used by other packages.
- The `Slice` struct has two additional methods, `AsRaw` and `FromRaw`, that are not generated but defined in a separate file, which is a bit confusing.