Support of Job and CronJob monitoring #987

AndrasSandor · 2024-05-27T07:45:05Z

Currently, kube job metrics such as kube_job_status_failed or kube_job_status_succeeded are not made available for monitoring.
List of metrics:
https://github.com/kubernetes/kube-state-metrics/blob/main/docs/metrics/workload/job-metrics.md

lyanco · 2024-05-28T14:19:41Z

You can manually deploy kube-state-metrics and scrape these metrics. Instructions here: https://cloud.google.com/stackdriver/docs/managed-prometheus/exporters/kube_state_metrics

We had to limit the number of kube-state metrics we collect by default so that costs are minimal. That being said, if there are enough +1s on this, we can definitely add a few more metrics, especially if they are not high-volume metrics.

maxamins · 2024-05-30T16:01:38Z

@AndrasSandor let us know if @lyanco's suggestion works for you. Closing this issue for now.

Future readers feel free to +1 or reopen this thread if there is demand for this feature.

ksoftirqd · 2024-06-27T15:28:46Z

It would be great to add jobs/cronjobs related metrics to the list. The volume would likely be insignificant. However, I assume it would only impact costs once explicitly enabled.

The alternative options are not appealing:

- Deploying self-managed kube-state-metrics instead of the managed one does not offer the ability to “honor” reserved labels, like namespace/pod etc, leading to a confusing set of labels (e.g. namespace/exported_namespace, etc) and the need to modify existing rules.
- Deploying pushgateway and modifying all jobs to push metrics to pushgateway would add complexity and require additional effort.
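
For readers unfamiliar with the namespace/exported_namespace problem mentioned above, here is a minimal sketch of how Prometheus-style scraping resolves a clash between the labels attached by the scraper and the labels exposed by the exporter. The function `merge_labels` and all label values are hypothetical and only model the general `honor_labels` behaviour; the exact mechanism in the managed collector may differ.

```python
def merge_labels(target_labels, scraped_labels, honor_labels=False):
    """Merge labels attached by the scraper with labels exposed by the exporter.

    Simplified model of Prometheus-style conflict handling:
    - honor_labels=False: a conflicting scraped label is renamed to
      "exported_<name>" and the scraper's label is kept.
    - honor_labels=True: the scraped label wins and the scraper's
      conflicting label is dropped.
    """
    merged = dict(target_labels)
    for name, value in scraped_labels.items():
        if name in target_labels and not honor_labels:
            # Conflict: keep the scraper's label, rename the exporter's one.
            merged["exported_" + name] = value
        else:
            # No conflict, or labels are honored: the exporter's value is used.
            merged[name] = value
    return merged


# Hypothetical values, for illustration only.
target = {"namespace": "monitoring", "pod": "kube-state-metrics-0"}
scraped = {"namespace": "batch", "job_name": "nightly-report"}

print(merge_labels(target, scraped))
print(merge_labels(target, scraped, honor_labels=True))
```

In the first call the Job's own namespace ends up under exported_namespace, which is why existing rules that filter on namespace would need to be modified. In the second call the namespace label carries the Job's namespace directly.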

pintohutch · 2024-06-27T18:22:58Z

Hey @ksoftirqd - thanks for reaching out.

> Deploying self-managed kube-state-metrics instead of the managed one does not offer the ability to “honor” reserved labels, like namespace/pod etc, leading to a confusing set of labels (e.g. namespace/exported_namespace, etc) and the need to modify existing rules.

Actually, by specifying a ClusterPodMonitoring like we show in examples/, you should have those labels honored by the kube-state-metrics exporter. Have you tried this?

ksoftirqd · 2024-06-27T20:50:33Z

Hi @pintohutch,

Thank you, this is very helpful! I must have missed this example and it indeed works as expected.

If we cannot add jobs/cronjobs to the list of supported resources, this can be a great alternative to the managed kube-state-metrics exporter.

Still, it would be great to see those resources added, as the current solution seems to support the majority of resources and it's possible to toggle their metrics collection in the cluster config.

github-actions bot assigned maxamins May 27, 2024

maxamins closed this as completed May 30, 2024
