Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add metrics for insecure backend proxy #97814

Merged
merged 1 commit into from
Jan 21, 2021

Conversation

deads2k
Copy link
Contributor

@deads2k deads2k commented Jan 7, 2021

add metrics to measure usage and frequency of backend tls failures on pod/logs calls.

/priority important-soon
/kind cleanup
@kubernetes/sig-auth-pr-reviews

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/auth Categorizes an issue or PR as relevant to SIG Auth. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 7, 2021
Namespace: namespace,
Subsystem: subsystem,
Name: "pods_logs_insecure_backend",
Help: "Total number of requests for pods/logs sliced by usage type: enforce_tls, skip_tls_allowed, skip_tls_denied",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's talk about metric shape. Do we like this. Cardinality is fairly low and I think it tells me whether people are using or want to use it and aren't allowed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that seems okay to me.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. labels Jan 7, 2021
Namespace: namespace,
Subsystem: subsystem,
Name: "pods_logs_backend_tls_failure",
Help: "Total number of requests for pods/logs that failed due to kubelet server TLS verification",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can only catch these for the cases where a secure connection is requested (the default), but it provides a max number of cases where the call would have been useful.

This being high indicates something is misconfigured in your cluster. Users should not be expected set a skip backend tls verify .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this seems alertable, so lgtm

@deads2k
Copy link
Contributor Author

deads2k commented Jan 7, 2021

/assign @logicalhan @tkashem

@deads2k
Copy link
Contributor Author

deads2k commented Jan 7, 2021

/retest

@fedebongio
Copy link
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 7, 2021
pkg/registry/core/pod/rest/metrics.go Outdated Show resolved Hide resolved
pkg/registry/core/pod/rest/metrics.go Outdated Show resolved Hide resolved
pkg/registry/core/pod/rest/metrics.go Outdated Show resolved Hide resolved
Namespace: namespace,
Subsystem: subsystem,
Name: "pods_logs_backend_tls_failure",
Help: "Total number of requests for pods/logs that failed due to kubelet server TLS verification",
Copy link
Contributor

@tkashem tkashem Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is it pods/log or pods/logs or are both accepted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is it pods/log or pods/logs or are both accepted?

pods/logs

pkg/registry/core/pod/rest/log.go Show resolved Hide resolved
@tkashem
Copy link
Contributor

tkashem commented Jan 13, 2021

@deads2k question: is this something that can expand to other resources in the future? If there is a possibility then perhaps we can rename the metrics to backend_tls_failure with resource and subresource as labels?

@deads2k
Copy link
Contributor Author

deads2k commented Jan 14, 2021

@deads2k question: is this something that can expand to other resources in the future? If there is a possibility then perhaps we can rename the metrics to backend_tls_failure with resource and subresource as labels?

Good question. I think that we're fairly unlikely to support other subresources skipping a backend tls failure because clients send data out. If we do so, then we should tweak for consistency.

@tkashem
Copy link
Contributor

tkashem commented Jan 21, 2021

LGTM

(giving others chance to review)

Copy link
Member

@logicalhan logicalhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from an instrumentation perspective, these look okay to me.
/lgtm

Namespace: namespace,
Subsystem: subsystem,
Name: "pods_logs_insecure_backend",
Help: "Total number of requests for pods/logs sliced by usage type: enforce_tls, skip_tls_allowed, skip_tls_denied",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that seems okay to me.

Namespace: namespace,
Subsystem: subsystem,
Name: "pods_logs_backend_tls_failure",
Help: "Total number of requests for pods/logs that failed due to kubelet server TLS verification",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this seems alertable, so lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 21, 2021
@k8s-ci-robot k8s-ci-robot merged commit 661eae7 into kubernetes:master Jan 21, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Jan 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants