-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
k8sprocessor: add container.name field #1167
Changes from 6 commits
9fc15db
2ea31b3
5a054f5
1f7b843
86dc2cb
7c5748f
dc18129
9502d95
0848767
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,6 +17,7 @@ package kube | |
import ( | ||
"fmt" | ||
"regexp" | ||
"sort" | ||
"strings" | ||
"sync" | ||
"time" | ||
|
@@ -235,6 +236,30 @@ func (c *WatchClient) extractPodAttributes(pod *api_v1.Pod) map[string]string { | |
tags[r.Name] = c.extractField(v, r) | ||
} | ||
} | ||
|
||
if c.Rules.HostName { | ||
// Basing on v1.17 Kubernetes docs, when a hostname is specified, | ||
// it takes precedence over the associated metadata name, see: | ||
// https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-hostname-and-subdomain-fields | ||
hostname := pod.Spec.Hostname | ||
if hostname == "" { | ||
hostname = pod.Name | ||
} | ||
tags[conventions.AttributeHostName] = hostname | ||
} | ||
|
||
if c.Rules.ContainerName { | ||
if len(pod.Spec.Containers) > 0 { | ||
var names []string | ||
for _, container := range pod.Spec.Containers { | ||
names = append(names, container.Name) | ||
} | ||
|
||
sort.Strings(names) | ||
tags[conventions.AttributeK8sContainer] = strings.Join(names, ",") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perhaps There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. k8s.container.name should be about a single container. I assume we can't do that here because we correlate based on IP and the IP is shared across all the containers in the pod. If we make this an array (seems better) I think we should also consider adding a new semantic convention of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I looked around semantic conventions to see if there's any other attribute troubled by a similar property (usually being a single value, but sometimes being a collection). The closest I get is process.command_line, but there are really no other examples I could find. The same problem will apply to Kubernetes Service - there might be several of them pointing to a single pod. I am not sure if this ambiguity could be solved in any way currently (@dashpole, perhaps you might have some suggestions?). This leads to several specification-related questions, such as:
For reference, there's a an ongoing discussion on pluralization guidelines for metrics There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If this is a metric about a pod (e.g. is correlated with the Pod's IP), then IMO it shouldn't have a container attribute. If it is about a container, then it should have a single container name. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @dashpole so essentially when a single container cannot be accurately assigned then the information should be not included, right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, at least in my opinion. From monitoring systems i'm familiar with, I haven't seen a good way to handle lists of items in label (attribute) values. The other thing to consider is that technically containers can be added to a running pod with the debug container feature, so this attribute wouldn't necessarily be constant over the lifetime of a pod. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Even though this seems like a niche use case I agree that it might introduce incorrect data when we'd tag as above.
As for this I was trying to approach tags rewrite to not be of This could make it easier as having a plural tag as key would already indicate there's an array in the value. |
||
} | ||
} | ||
|
||
return tags | ||
} | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -102,13 +102,15 @@ type FieldFilter struct { | |
// ExtractionRules is used to specify the information that needs to be extracted | ||
// from pods and added to the spans as tags. | ||
type ExtractionRules struct { | ||
Deployment bool | ||
Namespace bool | ||
PodName bool | ||
PodUID bool | ||
Node bool | ||
Cluster bool | ||
StartTime bool | ||
Deployment bool | ||
Namespace bool | ||
PodName bool | ||
PodUID bool | ||
Node bool | ||
Cluster bool | ||
StartTime bool | ||
HostName bool | ||
ContainerName bool | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've changed this but taking into account @dmitryax's comment I'm not sure it will fit since the tag There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we can keep it singular here for consistency, even if it may have multiple values There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 👍🏻 done. Singular it is. |
||
|
||
Annotations []FieldExtractionRule | ||
Labels []FieldExtractionRule | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it's safe to reuse
host.name
attribute here. It can be easily confused with hostname of the node where the pod is running on. What is the use case? Do we have to reusehost.name
attribute? if not, we might want to use another attribute with a more specific name likek8s.pod.hostname
. what do you think?cc @tigrannajaryan @jrcamp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmitryax+1 on
k8s.pod.hostname
(I think there are several attributes that could be easily added to semantic conventions)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed @dmitryax.
We'll need to add it to https://github.com/open-telemetry/opentelemetry-collector/blob/master/translator/conventions/opentelemetry.go.
@pmm-sumo mentioned he'd like to contribute that so let's wait for that and use it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have started with opening a PR to specs: open-telemetry/opentelemetry-specification#1072
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a discussion under the PR. I am wondering if maybe
host.hostname
is a better fit here. What do you think?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pmm-sumo the
host.hostname
was recently removed from the conventions open-telemetry/opentelemetry-specification#838There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So to recap, we had a discussion under the specification PR. Since
hostname
is ambiguous in this setting (containers host names vs the machine where it runs host name), it is unfortunately not a simple problem to solve. Additionally, it is present on all containerised environments and need to be addressed in a generic way.To avoid taking focus out of the GA, the suggestion is to skip adding this feature for now and go back to it later, with a solid discussion involving several parties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, most of the time, the node hostname will reside in
k8s.node.name
and pod hostname ink8s.pod.name
unless they are overwritten (which I believe is not a very common scenario).