Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should net.peer.* describe socket-level peer or "logical" peer (for HTTP and in general)? #2607

Closed
Oberon00 opened this issue Jun 3, 2022 · 13 comments · Fixed by #2614
Closed
Assignees
Labels
area:semantic-conventions Related to semantic conventions spec:trace Related to the specification/trace directory

Comments

@Oberon00
Copy link
Member

Oberon00 commented Jun 3, 2022

(from discussion with @lmolkova in #2469 (comment))

In the HTTP protocol, the peer you connect to can be unclear if an HTTP-level proxy is used. In that case, an application opens a TCP connection to the configured proxy host and port, and sends an n HTTP request with a full URL as target to it (GET http://hostname.example.com/some?thing=1 HTTP/1.1).

In this case, it is not clear whether net.peer.name should be the hostname from the requested URL (hostname.example.com) or of the HTTP proxy the application sends the HTTP request to.

One thing that seems clear is that neet.peer.name and net.peer.ip should be consistent which each other. If net.peer.name is set to a logical peer distinct from the socket-level peer, net.peer.ip would have to be left unset and there would currently be no way to store network-level connection information.

There is also an issue with gRPC which (usually?) goes over HTTP/2, but the RPC conventions require at least one of net.peer.ip or net.peer.name. Clearly, in absence of any other attribute to hold it, the destination hostname would be more useful than the hostname of an HTTP proxy.

An open question to me is, whether this problem is specific to HTTP (and HTTP-based protocols), or if this is common also in other protocols.

  • Option 1: Leave net.peer.* ambiguous, make it deprecated. Depending on the semantic convention used, it might be the logical peer, the socket-level peer or (usually) both. Introduce separate net.sock.* and net.app.* namespaces to have unambiguous options.
  • Option 2: Define net.peer.* to be either one or the other, introduce only the other namespace.
  • Option 3: Try to stay as close to the status quo as possible: Define net.peer.* to be the network-level peer. Protocols that have a logical peer canonically identified by a DNS-hostname that might be different from the network-level peer should have a special attribute that defines the role of this peer (http.host which already exists, rpc.endpoint, db.server_name, ...)
  • Option 4: ("Do nothing") Leave net.peer.* ambigous, accept that only one of network-level or logical peer can be set on a span in most semantic conventions (e.g. HTTP server conventions have both net.peer.ip and http.client_ip, while rpc client conventions only have net.peer.ip/name)

Option 1 would be the easiest. Option 3 would be the most complex, also for processing on the backend, but would be future-proof also in case some protocol defines multiple peers with different roles that you might connect to with the same span. That seems unlikely, so I don't think Option 3 is very useful.

Option 2 is similar to Option 1 and would be preferable if we find that net.peer.* is overwhelmingly used for either network-level or logical peer information, instead of being used for both in different instrumentations/languages. Choosing option 2 would need a survey of some sort to find how feasible it is and choose one or the other option.

@Oberon00 Oberon00 added area:semantic-conventions Related to semantic conventions spec:trace Related to the specification/trace directory labels Jun 3, 2022
@Oberon00 Oberon00 changed the title Should net.peer.* describe socket-level peer or "logical" peer (for HTTP and in general) Should net.peer.* describe socket-level peer or "logical" peer (for HTTP and in general)? Jun 3, 2022
@lmolkova
Copy link
Contributor

lmolkova commented Jun 3, 2022

the best known remote endpoint attribute
We need an attribute that would mean "the best known remote endpoint" for client spans. It's a *required *attribute for any semantic convention describing remote calls. This is a bare-minimum observability goal that allows to

  • query: select avg(duration) where the_attribute = 'a-service' (no or classes on Jaeger or Prometheus)
  • use the same attribute for any technology - db, http. rpc, messaging, etc.

If this call came through multiple proxies, service meshes, lbs, and whatnot on the way to the remote endpoint, the caller does not know. I.e. in a generic case, it's not guaranteed to be accurate, so that's why it's the best-known peer name.

socket-level attribute

Let's say another socket-level attribute describes the real destination- instrumentation can know about it, but not necessarily:

  • you can set up a proxy via potentially machine-wide config java , same in .net - does HTTP instrumentation need to parse this and support all variations in which proxy info can come? I don't think so.
  • JMS clients get abstract connection or factory - it does not even know if it's tunneled or proxied.

It's also not clear at all what such attribute means: proxy? DNS server? service-mesh side-car address? first router? Network issues can happen at any of them.

Would this attribute be required? No - in a lot of cases it will be exactly the same as the "best-known remote endpoint" attribute.

Summary

I suggest we do Option 4+:

  • net.peer.name keeps currently defined semantics as the best known remote endpoint". (Remove alternative attribute sets from HTTP semantic conventions #2469 (comment)), we clarify the spec a bit so it's crystal clear.

  • socket-level instrumentation and semantics need clarity before it can be spec-ed out:

    • should be a single attribute or a set of them? If you want to know about the proxy address, you want to know about the protocol too, auth mechanism, and proxy communication is more granular than what HTTP instrumentation can detect - it looks like a whole new convention. Same for DNS.
    • HTTP, DB, and messaging instrumentations sometimes don't have access to socket-level information, it's not the responsibility of such instrumentation to set them - specialized opt-in network-level instrumentation should be responsible for it.

@Oberon00
Copy link
Member Author

Oberon00 commented Jun 4, 2022

We need an attribute that would mean "the best known remote endpoint" for client spans.

By "best" you probably mean the logical destination. And "endpoint" would seem a good name for it. net.endpoint.* for logical stuff, net.peer.* for low-level stuff?

Would this attribute be required? No - in a lot of cases it will be exactly the same as the "best-known remote endpoint" attribute.

I agree with that net.peer.* in that low-level sense should be made non-required everywhere. It is a low-level debugging attribute. But it is more often available than you think, e.g. you might be able to use reflection to access the underlying TCP socket and call getpeername(2) on it https://man7.org/linux/man-pages/man2/getpeername.2.html (that' actually where the "peer" name comes from).

JMS clients get abstract connection or factory - it does not even know if it's tunneled or proxied.

JMS clients often only know the messaging.destination, which may not contain a hostname, e.g. in a messaging setup that is managed by an (EJB) application server. They could set neither a logical nor low-level net.peer.* attribute purely with arguments you would use for connecting.

query: select avg(duration) where the_attribute = 'a-service' (no or classes on Jaeger or Prometheus)

I don't get that, I don't know Jaeger nor Prometheus query syntax, but if you want to know the remote service name (as in service.name), you would need the peer.service attribute, which is unrelated to net.* but cannot usually be set automatically https://github.com/open-telemetry/opentelemetry-specification/blob/v1.3.0/specification/trace/semantic_conventions/span-general.md#general-remote-service-attributes

@reyang
Copy link
Member

reyang commented Jun 6, 2022

I agree with that net.peer.* in that low-level sense should be made non-required everywhere. It is a low-level debugging attribute. But it is more often available than you think, e.g. you might be able to use reflection to access the underlying TCP socket and call getpeername(2) on it https://man7.org/linux/man-pages/man2/getpeername.2.html (that' actually where the "peer" name comes from).

+1

@lmolkova
Copy link
Contributor

lmolkova commented Jun 7, 2022

net.endpoint. vs net.peer

I hear that peer.name is important and common for sockets, but then it should be clearly defined for this case only. Correct me if I'm wrong, but getpeername returns the sockaddr, and then for AF_INET(6) it's an IP address. Now it collides with net.peer.ip. Did I miss something? If not then we should only keep net.peer.ip or net.peer.name for sockets to avoid spreading this confusion further. Perhaps they should become one net.peer.addr and then it'll be precise for different address families?

It seems socket-level instrumentation should also be clarified beyond address (at least the address family should be there to understand how to understand this address). Until it's clarified

Endpoint is not a great choice for remote peer name as it's not limited to the host component. I really suggest keeping using net.peer.name as a logical remote endpoint (as it is already used) and leaving net.peer.addr for socket-level address

I don't get that, I don't know Jaeger nor Prometheus query syntax, but if you want to know the remote service name (as in service.name), you would need the peer.service attribute

I want to be able to search for (e.g.) http client request duration by the remote endpoint name across all languages and instrumentations. For this, I need one required attribute that all instrumentations set. And I want this to be available without a pipeline that normalizes my data (hence Jaeger or Prometheus, which honestly show you what instrumentations put on spans). Ideally, I want to use the same attribute name for my DB calls and other client communication.

Such attribute collected automatically will improve user experience, simplify instrumentation, remove redundancy, and would also optimize most of the pipelines removing the need to support multiple variations.

But it is more often available than you think, e.g. you might be able to use reflection to access the underlying TCP socket and call getpeername(2)

Using reflection should not be the approach we recommend for general-purpose instrumentations.

@Oberon00
Copy link
Member Author

Oberon00 commented Jun 7, 2022

I hear that peer.name is important and common for sockets, but then it should be clearly defined for this case only. Correct me if I'm wrong, but getpeername returns the sockaddr, and then for AF_INET(6) it's an IP address. Now it collides with net.peer.ip. Did I miss something?

That's right, the result of getpeername(2) would be used as net.peer.ip. On the other hand, the result of gethostname(2) (https://man7.org/linux/man-pages/man2/gethostname.2.html) could be used as net.host.name (but usually it should be enough to set it as host.name on the resource).
If you use a named pipe as transport, I think getpeername indeed returns something that contains the named pipe's name. So the mapping is not that trivial.

@Oberon00
Copy link
Member Author

Oberon00 commented Jun 7, 2022

Using reflection should not be the approach we recommend for general-purpose instrumentations.

I'm not sure about that. If the library does not have built-in tracing, then this often will be useful. I think it would be a useful guideline in the context of required attributes, that they need to be available on the API-level, i.e. they should be something that the application passes down to a library that does the requested operation, as opposed to something that the library "calculates" internally (and may or may not expose with public getters).

@Oberon00
Copy link
Member Author

Oberon00 commented Jun 7, 2022

I want to be able to search for (e.g.) http client request duration by the remote endpoint name across all languages and instrumentations. For this, I need one required attribute that all instrumentations set. And I want this to be available without a pipeline that normalizes my data

This is probably a policy decision that should be discussed more broadly. Ease-of-producing vs. ease-of-consuming, reduncancy, etc.

@lmolkova
Copy link
Contributor

lmolkova commented Jun 7, 2022

That's right, the result of getpeername(2) would be used as net.peer.ip. On the other hand, the result of gethostname(2) (https://man7.org/linux/man-pages/man2/gethostname.2.html) could be used as net.host.name (but usually it should be enough to set it as host.name on the resource).
If you use a named pipe as transport, I think getpeername indeed returns something that contains the named pipe's name. So the mapping is not that trivial.

So what you're saying is that there is no clear guidance on when and how net.peer.name would be set and sometimes getpeername maybe will return it, but in most rpc cases it will be in net.peer.ip. and then consumer has no way of knowing what it represents. This sounds like weak guidance. I suggest until we're ready to fix it we either:

  • remove net.peer.name and instead of it use new attribute in all cases where it's used - to represent logical remote endpoint
  • keep using net.peer.name as is

Update:

back to the subj of this issue:

  • net.peer.ip seems like socket-level attribute. It can be made more generic if renamed to net.peer.addr. There should be more socket level attributes (family name to name one)
  • net.peer.name lives in DNS/application layer space and there is nothing like remote endpoint name on socket level.

I'd propose renaming the namespace for the socket level things to sock. Having sock.peer.ip/addr would emphasize what they are for with no confusion.

@lmolkova
Copy link
Contributor

lmolkova commented Jun 7, 2022

I'm not sure about that. If the library does not have built-in tracing, then this often will be useful. I think it would be a useful guideline in the context of required attributes, that they need to be available on the API-level, i.e. they should be something that the application passes down to a library that does the requested operation, as opposed to something that the library "calculates" internally (and may or may not expose with public getters).

so it's either an optional attribute or opt-in specialized instrumentation. I'm happy we're on the same page here.

@sirzooro
Copy link

sirzooro commented Jun 15, 2022

HTTP proxy can report target peer address by adding X-Forwarded-For header to the response. Client could check for it and update peer address later, after response is received. Other protocols may provide similar functionality. This probably should be documented somewhere as a recommendation.

Edit: as you are changing existing spec for this in non-backward compatible way, please consider use cases which I reported recently:

@Oberon00
Copy link
Member Author

Client could check for it and update peer address later, after response is received.

You mean there should also be a net.peer.addr in addition to net.peer.sock.addr and net.peer.name? Could make sense...

@sirzooro
Copy link

Client could check for it and update peer address later, after response is received.

You mean there should also be a net.peer.addr in addition to net.peer.sock.addr and net.peer.name? Could make sense...

Yes, something like this - both proxy and target addresses may be useful.

BTW, I have realized that X-Forwarded-For header may contain multiple addresses when request goes thru multiple proxies:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-For#syntax

@Oberon00
Copy link
Member Author

Oberon00 commented Jun 15, 2022

A full proxy chain is something else.

An app wants to contact some logical host, that's net.peer.name (+ maybe net.peer.addr which is not currently proposed). To do that it connects to some host on the socket level, that's net.peer.sock.name/net.peer.sock.addr. So the app does have some kind of direct relation to these two hosts.

Any hops in-between are of much less relevance. Definitely something that could be added (arrays net.hop.addrs/net.peer.hop.names/net.hop.ports/maybe net.hop.addr_families?) but IMHO out of scope of this issue that mostly aims to clarify/improve how to provide information we already have in some form.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions spec:trace Related to the specification/trace directory
Projects
None yet
5 participants