KEP-2371: Add cgroup metrics + CRI implementation plan #3559
danielye11 commented on Sep 27, 2022
- One-line PR description: Updating KEP with cgroup stats of cadvisor metrics, adding CRI implementation details
- Issue link: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2371-cri-pod-container-stats
- Other comments:
| |N/A |container_memory_max_usage_bytes |N/A |cAdvisor |CRI or N/A | memory.max_usage_in_bytes | memory.max | ||
| |N/A |container_memory_swap |N/A |cAdvisor |CRI or N/A | (memory.stat) swap | memory.swap.current - memory.current | ||
|ProcessStats |ProcessCount |container_processes |Pod |cAdvisor |CRI | Process | ||
|AcceleratorStats |Make |N/A (too lazy to find the mapping) |Container |cAdvisor |cAdvisor or N/A | accelerators/nvidia.go | accelerators/nvidia.go |
AFAIK we still plan on dropping these. do we need to include them in this change?
table update LGTM, one question about a new addition. You also need to sign the CLA, and I would prefer if all of the commits were squashed together |
Force-pushed from 7a39434 to 822dae3.
message ListPodSandboxMetricsResponse {
    repeated PodSandboxMetrics pod_metrics= 1;
    repeated ContainerMetrics container_metrics = 2;
}

message PodSandboxMetrics {
    string pod_sandbox_id = 1;
    repeated Metric metrics = 2;
}

message ContainerMetrics {
    string container_id = 1;
    repeated Metric metrics = 2;
}
Based on the structure of other CRI calls, I think I'd expect this to be more like:
message ListPodSandboxMetricsResponse {
    repeated PodSandboxMetrics pod_metrics = 1;
}
message PodSandboxMetrics {
    string pod_sandbox_id = 1;
    repeated Metric metrics = 2;
    repeated ContainerMetrics container_metrics = 3;
}
Restructured
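The nesting suggested above can be pictured with a small Python sketch. It is only an illustration of the proposed message shape, not the CRI API itself: the dataclasses mirror the field names in the review comment, while the `name` and `value` fields on `Metric` and the `flatten` helper are assumptions introduced here (the quoted proto only shows `timestamp`).

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Metric:
    # `name` and `value` are hypothetical; only `timestamp` appears in the quoted proto.
    name: str
    value: float
    timestamp: int = 0


@dataclass
class ContainerMetrics:
    container_id: str
    metrics: List[Metric] = field(default_factory=list)


@dataclass
class PodSandboxMetrics:
    pod_sandbox_id: str
    metrics: List[Metric] = field(default_factory=list)
    # Container metrics are nested under their pod, as the reviewer suggested.
    container_metrics: List[ContainerMetrics] = field(default_factory=list)


@dataclass
class ListPodSandboxMetricsResponse:
    pod_metrics: List[PodSandboxMetrics] = field(default_factory=list)


def flatten(resp: ListPodSandboxMetricsResponse) -> Iterator[Tuple[str, str, Metric]]:
    """Yield (pod_sandbox_id, container_id, metric); container_id is '' for pod-level metrics."""
    for pod in resp.pod_metrics:
        for metric in pod.metrics:
            yield pod.pod_sandbox_id, "", metric
        for container in pod.container_metrics:
            for metric in container.metrics:
                yield pod.pod_sandbox_id, container.container_id, metric
```

With this shape, a consumer always knows which pod a container's metrics belong to without joining two separate lists by ID.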
message ListPodSandboxMetricsRequest {}

message ListPodSandboxMetricsResponse {
    repeated PodSandboxMetrics pod_metrics= 1;
nit: missing space between s and =
@@ -286,8 +286,7 @@ as cAdvisor is fine tuned to perform in an adequate manner.
### Stats Summary API

#### CRI Implementation
The CRI implementation will need to be extended to support reporting the full set of container-level from the [Summary API](#summary-container-stats-object).

The CRI implementation will need to be extended to support reporting the full set of container-level from the [Summary API](#summary-container-stats-object). A new GRPC call will also be added to the CRI that allows reporting for metrics currently exported by cAdvisor, but are outside the scope of the Summary API. This new GRPC call will return a Prometheus metric based response which Kubelet can export. Additionally, a feature gate will be added to only report Prometheus based metrics from the CRI when calling /stats endpoint. The additional metrics we support will need to be added to the individual container runtimes.
"Additionally, a feature gate will be added to only report Prometheus based metrics from the CRI when calling /stats endpoint."
1: I thought we'd reuse `PodAndContainerStatsFromCRI` for this?
2: isn't it `/metrics/cadvisor`, not `/stats`?
3: nit, I believe the capitalization is gRPC
Good point, updated in commit
/ok-to-test |
linter is saying to run |
/lgtm |
Few small comments. Thanks for updating the KEP, LGTM on the changes proposed. /lgtm |
One more thing, can you please also update https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2371-cri-pod-container-stats/kep.yaml#L20 to update This will address comment in #2371 (comment) Thanks! |
Updated latest-milestone to 1.26 |
Thanks for updating /lgtm |
/cc @dashpole per SIG instrumentation |
@@ -286,8 +287,7 @@ as cAdvisor is fine tuned to perform in an adequate manner.
### Stats Summary API

#### CRI Implementation
The CRI implementation will need to be extended to support reporting the full set of container-level from the [Summary API](#summary-container-stats-object).

The CRI implementation will need to be extended to support reporting the full set of container-level from the [Summary API](#summary-container-stats-object). A new gRPC call will also be added to the CRI that allows reporting for metrics currently exported by cAdvisor, but are outside the scope of the Summary API. This new gRPC call will return a Prometheus metric based response which Kubelet can export. Additionally, `PodAndContainerStatsFromCRI` feature gate support will be added to only report Prometheus based metrics from the CRI when calling `/metrics/cadvisor` endpoint when the feature gate is enabled. The additional metrics we support will need to be added to the individual container runtimes.
From previous release, the PRR Questionnaire section for "Does enabling the feature change any default behavior?" said "Enabling this behavior means some stats endpoints will not be filled: some entries in /metrics/cadvisor"
Is this still accurate with this change? Is there anything about the structure of the metrics or what metrics are available that should be included in that PRR section?
yeah this is still true. accelerator metrics, for instance, are being dropped. The table at the top has a column dedicated to saying whether we're aiming to support them
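To make the behavior change in this thread concrete, here is a minimal Python sketch of how the gate could decide what `/metrics/cadvisor` reports. It is not kubelet code: the function names, the default of `False` for an unset gate, and the example metric sets are all assumptions made for illustration. Only the gate name `PodAndContainerStatsFromCRI` and the fact that some metrics (such as accelerator metrics) are dropped come from the discussion.

```python
from typing import Iterable, List, Mapping, Set

FEATURE_GATE = "PodAndContainerStatsFromCRI"


def cadvisor_endpoint_source(feature_gates: Mapping[str, bool]) -> str:
    """Return which component backs /metrics/cadvisor: 'cri' or 'cadvisor'.

    Treating a missing gate as disabled is an assumption of this sketch.
    """
    return "cri" if feature_gates.get(FEATURE_GATE, False) else "cadvisor"


def exported_metrics(
    feature_gates: Mapping[str, bool],
    cadvisor_metrics: Iterable[str],
    cri_supported: Set[str],
) -> List[str]:
    """List the metric names the endpoint would still report.

    With the gate on, only metrics the CRI implementation supports remain,
    so entries outside `cri_supported` are no longer filled.
    """
    if cadvisor_endpoint_source(feature_gates) == "cri":
        return [name for name in cadvisor_metrics if name in cri_supported]
    return list(cadvisor_metrics)
```

The second function is the part relevant to the PRR question: enabling the gate can shrink the set of reported metrics, which is why the table column describing planned support matters.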
}

message Metric {
    int64 timestamp = 1;
Are container runtimes expected to return cached (with timestamp in the past) metrics?
What if a container runtime wants to return "fresh" metrics each time this is called? Is there a way to omit the timestamp from the metric? I believe it is recommended to omit the timestamp from the Prometheus exposition when that is the case.
I believe container runtimes will return "fresh" metrics whenever the gRPC call `ListPodSandboxMetrics` is called (at least on the containerd side), so I think it makes sense to have the timestamp. I suppose that, depending on the container runtime implementation, there can be cached metrics. We could keep `Metric` and make another type similar to it that does not include the timestamp.
we also could interpret a timestamp of 0 to mean omitted and leave it up to the CRI to say when they were collected (or say it was instantaneous)
If metrics are fresh, there is no need to attach timestamps. The timestamp will not be meaningfully different from the scrape time, and should not be attached to prometheus metrics.
Disk usage metrics in particular can be very expensive to collect, and are likely to be cached.
A timestamp of 0 meaning no timestamp works for me, but should be documented in the API.
I don't think we want to enforce they must be fresh. a CRI impl may want to cache them (cri-o may...). I think timestamp 0 as fresh is a good way to express it
This is a reasonable approach. A few drawbacks that are worth including in the Drawbacks section:
- This doesn't enforce that container runtimes continue to support the full /metrics/cadvisor endpoint. CRI implementations appear to be free to deviate and produce different metrics as they see fit.
- Compared with the container runtime exposing these metrics directly, there is significant complexity (I have to implement a new CRI function using prometheus metrics as input), and some overhead from converting between prometheus, the CRI format, and back.
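The "timestamp of 0 means no timestamp" convention from this thread, and the conversion step mentioned in the second drawback, can be sketched in Python as follows. This is a hedged illustration rather than the kubelet's implementation: the function names, the `id` label in the usage example, and the assumption that the CRI timestamp is in nanoseconds are all introduced here. The Prometheus text exposition format itself takes an optional timestamp in milliseconds after the value.

```python
from typing import Dict

NANOS_PER_MILLI = 1_000_000


def _escape_label_value(value: str) -> str:
    # The text exposition format requires escaping backslash, double quote, and newline.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def exposition_line(
    name: str,
    labels: Dict[str, str],
    value: float,
    timestamp_ns: int = 0,
) -> str:
    """Render one sample in Prometheus text format.

    A timestamp of 0 is treated as "fresh": no timestamp is attached, so the
    scrape time applies. A non-zero timestamp (assumed to be nanoseconds
    here) is converted to the milliseconds the exposition format expects.
    """
    label_str = ",".join(
        f'{key}="{_escape_label_value(val)}"' for key, val in sorted(labels.items())
    )
    series = f"{name}{{{label_str}}}" if label_str else name
    line = f"{series} {value}"
    if timestamp_ns != 0:
        line += f" {timestamp_ns // NANOS_PER_MILLI}"
    return line
```

For example, `exposition_line("container_processes", {"id": "ctr-1"}, 3.0)` yields a line with no trailing timestamp, while passing a non-zero `timestamp_ns` for a cached value (such as an expensive disk usage metric) appends the collection time.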
Update README.md Update README.md Add cpu stat linux cgroups v1 Add additional cgroup stats Add some more v1 stats Add v2 cpu metrics Add spacing Add additional v2 stats Add spacing Add network stats Fix spacing remove column one network stat commit add Add spacing Add spacing Add network stats Add stats Add v2 process usage stats Add cri implementation plan Update KEP with CRI API Refactor CRI implementation Resolve reviewer comments Fix capitalization Add backticks Fix linting Update latest-milestone Clarify fresh metrics
for sig-node /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: danielye11, derekwaynecarr