when the network interfaces come and go metric cardinality grows unbounded #659
Comments
For cases such as Calico, do we want to keep metrics for each individual interface? If not, a viable alternative would be to group these interfaces under a single value of the same dimension. We could use a config model like this:
"net": {
  "interfaceGroupings": [
    {
      "name": "calico",
      "groupRegexp": "^cali[0-9a-f]+$"
    }
  ]
}
In the implementation, we would aggregate all interfaces that match the regex under the same name. Interfaces that don't match any grouping would be reported independently.
/assign mmiranda96
Will the aggregated metric have any value for us? Instead of aggregating, we could probably just disable collection of this metric completely.
I'm not sure how useful this metric is. If we want to keep it, the best long-term solution would be to switch to OpenTelemetry delta metrics and use this exporter: https://github.com/GoogleCloudPlatform/opentelemetry-operations-go/tree/main/exporter/metric. @dashpole, do you think OpenTelemetry is in a good enough state to adopt here for metrics?
The Metrics SDK is currently being (entirely) rewritten in preparation for the first stable release, so I wouldn't recommend it right now. That effort is tracked in https://github.com/orgs/open-telemetry/projects/22. After that is published, we will adapt the GCM exporter and publish a stable release of that as well. After that, I would recommend switching.
This issue has been fixed via #675. This fix has been released in https://github.com/kubernetes/node-problem-detector/releases/tag/v0.8.11. /remove-lifecycle stale
@mmiranda96: Closing this issue.
Calico creates a network interface with a unique name for each pod. Collecting net metrics (like kubernetes.io/internal/node/guest/net/tx_packets) uses the interface name as the dimension interface_name. The OpenCensus Go library places no bound on metric cardinality and does not allow old values to be cleaned up. We need to find the best way to reset label values for network interfaces that no longer exist.
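One ingredient any cleanup approach needs is a way to tell which interface names have disappeared between two collection cycles. The sketch below shows only that bookkeeping step, using hypothetical helpers (toSet, staleInterfaces) and made-up interface names. It does not call OpenCensus and is not necessarily how the eventual fix works.

```go
package main

import (
	"fmt"
	"sort"
)

// toSet builds a set of interface names (hypothetical helper).
func toSet(names ...string) map[string]struct{} {
	set := make(map[string]struct{}, len(names))
	for _, name := range names {
		set[name] = struct{}{}
	}
	return set
}

// staleInterfaces returns, in sorted order, the names that were present in
// the previous collection cycle but are missing from the current one.
func staleInterfaces(previous, current map[string]struct{}) []string {
	var stale []string
	for name := range previous {
		if _, ok := current[name]; !ok {
			stale = append(stale, name)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	previous := toSet("eth0", "cali0a1b2c3d", "calif00dbeef")
	current := toSet("eth0", "calif00dbeef")

	// These are the label values whose time series would need to be reset.
	for _, name := range staleInterfaces(previous, current) {
		fmt.Printf("stale interface_name=%s\n", name)
	}
}
```

How the stale series are then dropped depends on the metrics library in use, which is the open question raised in this issue.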