-
Notifications
You must be signed in to change notification settings - Fork 904
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include k8s.pod.hostname in semantic conventions #1072
Conversation
On the second thought, this got me wondering if maybe |
Thanks for the clarification @dmitryax I have opened a draft update for the collector (which is ready to be converted into a normal PR after this one gets merged). I am not sure how is it impacted by the focus on GA, though then it's a rather minimal change |
What is the difference of that attribute to host.name? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requesting changes because I think this is redundant with host.name (https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/resource/semantic_conventions/host.md) and I think this needs a proper explanation of what the difference would be.
@@ -49,6 +49,7 @@ containers on your cluster. | |||
|---|---|---| | |||
| k8s.pod.uid | The uid of the Pod. | `275ecb36-5aa8-4c2a-9c47-d8bb681b9aff` | | |||
| k8s.pod.name | The name of the Pod. | `opentelemetry-pod-autoconf` | | |||
| k8s.pod.hostname | The hostname of the Pod, if set, or [name, otherwise](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-hostname-and-subdomain-fields). | `opentelemetry-pod-autoconf` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why default it to the pod name which is already stored in pod.name? As a result you can never be sure if you actually got a hostname, or just the pod name (since the two are probably often but not always equal)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this becomes the effective hostname. Which leads to the conclusion that host.name
is probably a better fit for it, as suggested. What do you think @dmitryax ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Oberon00 if I am reading the K8s documentation correctly then that's how K8s behaves, it is not our choice. By default Pod's hostname fetched from K8s will be equal to the Pod's name unless the hostname
field is explicitly specified in the Pod's configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not what the semantic convention says. If I read this, I would have expected to implement it by
- Use gethostname to find my hostname
- If it's empty, use the same value as set for k8s.pod.name
Nowhere does it say that we should ask k8s for that hostname.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I introduced this PR, following the discussion I am thinking that simply using host.name
might work better... The only concern I have is how the hostname value of Pod
maps to compute instance. E.g. when the Kubernetes Node runs on, say, EC2, each of the pods running there would have a different host.name
value. Should the latter refer to EC2 instance hostname or Pod hostname?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. I would say the "inner" hostname. With nested containerization and possibly nested VMs, it is hard to determine a "real" host. But maybe we should duplicate the host resource to a physical_host, which is the app's best guess at the "real" host. Or outer_host which is the app's best guess at the outer-next host. But that is something worth clarifying in the spec in any case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Node's host vs Pod's host is a general problem with how our resources/attributes are defined (not just for "hostname"). We commingle them without specifying which entity the attributes apply to. I deliberately avoided raising this topic close to GA and this is something I plan to address after GA since it may require conceptual changes in how we understand resources in 2.0.
For now the best we can do is to avoid the problem. "Avoiding" here means not to record ambiguous attributes. "hostname" is ambiguous in Kubernetes context.
I would advise to do one of the following:
- Don't record the host name at all in the Collector's k8s processor. This would be my preference unless there are other arguments.
- If we absolutely must record the host name then either accept k8s.pod.hostname+k8s.node.hostname as a semantic convention or just record it in the Collector as an experimental capability without corresponding semantic convention. Time will tell if it needs to stay and become a semantic convention in the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, the k8s.node.name
frequently holds the hostname of the machine where the node resides.
In either case - I wouldn't want this problem to take the focus out of the GA. I fully agree with @tigrannajaryan note. If nobody objects I am going to close this PR and skip adding the capability until after-GA (and with a solid discussion).
Closing this issue per the conversation. Lets go back to this after GA |
Include
k8s.pod.hostname
ink8s
semantic conventions.Since this is a relatively small change (and it follows the naming scheme used right now), I didn't open a separate issue. Will be happy to update the upcoming YAML definitions if this gets accepted.
Related issues #
open-telemetry/opentelemetry-collector-contrib#141