[receiver/kubeletstats] Add new percent-based cpu and memory metrics #25835

TylerHelmuth · 2023-08-15T22:12:15Z

Description:
This PR adds 4 new metrics to represent pod and container CPU and memory consumption as a percentage of the set limits. The metrics are only emitted for pods/containers that have defined resource limits. The receiver uses the pod metadata to acquire the container limits. A pod limit is computed as the sum of all the container limits; if any container limit is zero or undefined, the pod limit is also considered undefined.
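The pod-limit rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `containerLimits` type and the `podCPULimit` and `usagePercent` helpers are hypothetical names, and the sketch assumes a limit of zero means "no limit set".

```go
package main

import "fmt"

// containerLimits is a hypothetical stand-in for the per-container limit
// read from pod metadata. A zero value is assumed to mean "no limit set".
type containerLimits struct {
	cpuCores float64
}

// podCPULimit sums the container CPU limits. It reports ok=false when the
// pod has no containers or when any container limit is zero or undefined,
// in which case the pod limit is treated as undefined.
func podCPULimit(containers []containerLimits) (limit float64, ok bool) {
	if len(containers) == 0 {
		return 0, false
	}
	for _, c := range containers {
		if c.cpuCores <= 0 {
			return 0, false
		}
		limit += c.cpuCores
	}
	return limit, true
}

// usagePercent expresses usage as a percentage of limit. It reports
// ok=false when the limit is undefined, so no data point should be emitted.
func usagePercent(usage, limit float64) (float64, bool) {
	if limit <= 0 {
		return 0, false
	}
	return usage / limit * 100, true
}

func main() {
	containers := []containerLimits{{cpuCores: 0.5}, {cpuCores: 1.5}}
	if limit, ok := podCPULimit(containers); ok {
		if pct, ok := usagePercent(0.5, limit); ok {
			fmt.Printf("pod cpu usage: %.1f%% of limit\n", pct)
		}
	}
}
```

Dropping the multiplication by 100 in `usagePercent` would yield a [0,1] ratio instead of a percentage.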

Link to tracking Issue:

Closes #24905

Testing:

Added unit tests. Tested locally using the otel demo:

TylerHelmuth · 2023-08-15T22:14:08Z

@dmitryax a pretty significant amount of these changes are generated files and tests. If it is too large, I believe I could break out the kubelet/metadata.go changes into their own PR.

TylerHelmuth · 2023-08-15T22:27:57Z

receiver/kubeletstatsreceiver/metadata.yaml

@@ -185,6 +185,13 @@ metrics:
    gauge:
      value_type: double
    attributes: [ ]
+  k8s.pod.cpu.usagePercent:
+    enabled: false


Can we enable these by default?

New metrics should be added as optional initially. See https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/scraping-receivers.md#changing-the-emitted-metrics

I'm not sure about the naming. I don't believe camelCase is recommended anywhere in sem conv. The spec also recommends using a [0,1] ratio instead. I was thinking of making it configurable generally in the metrics builder though, e.g. via a scale option.

Do you mean updating mdatagen to allow cpu.utilization and memory.usage to switch back and forth between the raw value and the percentage value? I'd prefer multiple metrics, as it is useful to know both the exact value and how it compares to the limit.

I am open to any name changes

Ya, I agree that using feature gates and waiting for the spec finalization is important here. @mx-psi, has the topic of utilization vs usage come up in the system metrics conversations?

@dmitryax since this is not a breaking change for memory, how about I close this PR and open one only to add memory.utilization? Then I'll open another for CPU with feature gates and we can have the hard discussion about the impacts and breaking changes there.

Given that the utilization metrics will require additional k8s API calls, we should keep them optional going forward along with cpu.utilization. So we can probably start by disabling them before applying the feature gate. Changing optional metrics is less disruptive.

If it makes sense to you, the first PR would be to add an if_enabled_not_set warning to the cpu utilization metrics with a message like "This metric will be disabled soon. After that, it'll be changed to report the ratio of usage/limit applied via a feature gate. Use container.cpu.usage metric instead."

Also, please take a look at the dockerstats metrics. We need to be consistent with them.

dmitryax · 2023-08-15T22:40:40Z

Sorry, I missed #24905 when it was submitted. Let me think more about it and reply there.

TylerHelmuth · 2023-08-18T15:36:01Z

Closing this based on our conversation. Will open more PRs to work towards proper names.

github-actions bot added the receiver/kubeletstats label Aug 15, 2023

github-actions bot requested a review from dmitryax August 15, 2023 22:12

TylerHelmuth marked this pull request as ready for review August 16, 2023 20:26

TylerHelmuth requested a review from a team August 16, 2023 20:26

github-actions bot assigned evan-bradley Aug 16, 2023

TylerHelmuth added 7 commits August 17, 2023 08:45

Add new percent-based metrics

1d279d1

add changelog

21b6f4d

Fix lint

fd96582

Switch metrics to scale

6d11819

Switch back to spec.containers

01d1be0

Fix lint

c6f6162

Update descriptions

47e4652

TylerHelmuth force-pushed the kubeletstats-percentage-metrics branch from adae910 to 47e4652 Compare August 17, 2023 14:46

TylerHelmuth added 2 commits August 17, 2023 11:28

Merge branch 'open-telemetry:main' into kubeletstats-percentage-metrics

449e2d5

Merge branch 'main' into kubeletstats-percentage-metrics

cc6baf6

TylerHelmuth closed this Aug 18, 2023
