Capture metrics for external volume resizing #37

Closed
gnufied opened this issue Apr 24, 2019 · 9 comments · Fixed by #67
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@gnufied
Contributor

gnufied commented Apr 24, 2019

No description provided.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Jul 23, 2019
@bertinatto
Contributor

/remove-lifecycle stale.

@bertinatto
Contributor

What metrics should the external-resizer emit?

If I read correctly, in-tree CSI resizing already emits metrics for duration, errors, statuses etc.

I suppose the duration there isn't exactly accurate (because the resizing is performed externally), is that correct? If so, would recording the duration of the operation be enough here?

@msau42
Collaborator

msau42 commented Aug 7, 2019

@yuxiangqian is planning to add a metrics library to csi-lib-utils that all the sidecars can use.

@msau42
Collaborator

msau42 commented Aug 7, 2019

The main benefit of sidecar metrics is getting per-operation latency and error metrics, which the in-tree controller cannot provide for CSI plugins.
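
As a rough illustration of what per-operation latency and error metrics could look like inside a sidecar, here is a minimal Python sketch. It is not code from external-resizer or csi-lib-utils. `OperationMetrics`, `observe`, and `expand_volume` are hypothetical names, and plain dictionaries stand in for a real backend such as Prometheus.

```python
import time
from collections import defaultdict


class OperationMetrics:
    """In-memory stand-in for a metrics backend (assumption, not a real API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self.durations = defaultdict(list)  # operation name -> observed seconds
        self.errors = defaultdict(int)      # operation name -> failure count

    def observe(self, operation, func, *args, **kwargs):
        """Run func, recording its latency and whether it raised."""
        start = self._clock()
        try:
            return func(*args, **kwargs)
        except Exception:
            self.errors[operation] += 1
            raise
        finally:
            # Latency is recorded for successful and failed calls alike.
            self.durations[operation].append(self._clock() - start)


def expand_volume(volume_id, new_size_bytes):
    # Hypothetical placeholder for the sidecar's call into the CSI driver.
    if new_size_bytes <= 0:
        raise ValueError("requested size must be positive")
    return new_size_bytes
```

A caller would wrap each driver call, for example `metrics.observe("ControllerExpandVolume", expand_volume, "vol-1", 10)`, so every call contributes one latency sample and failures also increment the error count for that operation name.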

@gnufied
Contributor Author

gnufied commented Aug 7, 2019

@bertinatto we do not emit resizing metrics for CSI drivers from in-tree code. Where did you see the metric being emitted? The in-tree controller simply hands control over to the external controller.

@msau42 Will the library capture just operation metrics or total latency metrics?

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Sep 6, 2019
@msau42
Collaborator

msau42 commented Sep 6, 2019

/lifecycle frozen

@k8s-ci-robot added the lifecycle/frozen label and removed the lifecycle/rotten label on Sep 6, 2019
@yuxiangqian

@gnufied @msau42 I have not had a chance to work on the library yet. I will come back to it after the snapshot-to-beta project is done, but the plan is to have it in csi-lib-utils. My initial thought is to define common utilities there for sharing, such as metric registration with Prometheus and a timestamp cache struct. The actual reporting would likely still happen in the sidecar controllers. Might be worth a KEP?
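
To make the timestamp cache idea concrete, here is one possible shape for it, sketched in Python. This is an assumption about what such a utility might do, not the planned csi-lib-utils design. `TimestampCache`, `start`, and `finish` are hypothetical names. The cache remembers when work on an object was first seen, so a sidecar could report total latency across retries rather than only the latency of a single call.

```python
import time


class TimestampCache:
    """Remembers when each object's operation was first seen (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._started = {}  # object key -> first-seen timestamp

    def start(self, key):
        # Keep the earliest timestamp so retries do not reset the measurement.
        if key not in self._started:
            self._started[key] = self._clock()

    def finish(self, key):
        """Return seconds since the first start(), or None if key is unknown."""
        started = self._started.pop(key, None)
        if started is None:
            return None
        return self._clock() - started
```

A sidecar controller could call `start()` with a namespace/name key when it first sees a resize request and `finish()` once the resize completes, then hand the returned duration to whatever metrics backend it has registered.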

6 participants