A set of Grafana dashboards and Prometheus alerts for Ceph, intended for use with Prometheus and Alertmanager.
This project started before ceph/ceph-mixins existed.
We have contributed a lot of our alerting configuration to the ceph/ceph-mixins repository, so that repository now provides the core configuration.
However, the maintainers currently do not accept grafana_url and runbook annotations, because these would not look good in the OpenShift Container Platform UI and do not fit their needs. See issue ceph/ceph-mixins#54.
This repository adds opinionated annotations to the alerting rules so that they can be run in Prometheus and Alertmanager.
Our alerting configuration looks like this:
- alert: CephMgrIsMissingReplicas
  expr: |
    sum(up{job="ceph"}) < 3
  for: 5m
  annotations:
    description: Ceph Manager is missing replicas.
    grafana_url: ""
    runbook_url: https://github.com/devopyio/ceph-monitoring-mixin/tree/master/runbook.md#alert-name-cephmgrismissingreplicas
  labels:
    severity: warning
These conventions are loosely based on the description provided in the Alerting Rules blog post.
If you want to use Prometheus alerts with the OpenShift Container Platform UI, please use ceph/ceph-mixins directly.
This mixin is designed to be vendored into the repo with your infrastructure config. To do this, use jsonnet-bundler:
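The vendoring step might look like the following sketch. It assumes that jb (jsonnet-bundler) is already on your PATH and that you run it from the root of your infrastructure repository. The guard around jb init and the MIXIN_REPO variable are conveniences for this example, not something the project prescribes.

```shell
#!/bin/sh
# Sketch: vendor the mixin into the current repository with jsonnet-bundler.
# Assumption: `jb` is installed and this runs from your infrastructure repo root.
MIXIN_REPO="github.com/devopyio/ceph-monitoring-mixin"

if command -v jb >/dev/null 2>&1; then
  # Create jsonnetfile.json only if the repo has not been initialised yet.
  [ -f jsonnetfile.json ] || jb init
  # Download the mixin and its dependencies into ./vendor.
  jb install "$MIXIN_REPO"
else
  echo "jb not found; install jsonnet-bundler first" >&2
fi
```

After a successful run, the mixin should sit under vendor/ and be recorded in jsonnetfile.json.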
Generate the config files and deploy them yourself
You can generate the alert files manually, but first you must install some tools:
$ make setup
Mac:
$ brew install jsonnet
Linux:
$ sudo snap install jsonnet
Then, grab the mixin and its dependencies:
$ git clone https://github.com/devopyio/ceph-monitoring-mixin
$ cd ceph-monitoring-mixin
$ jb install
Finally, build the mixin:
$ make prometheus_alerts.yaml
The prometheus_alerts.yaml file then needs to be passed to your Prometheus server.
The exact details will depend on how you deploy Prometheus.
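As one possible illustration for a plain, file-based Prometheus setup, the sketch below copies the generated rules next to a minimal config fragment that loads them through rule_files. The prometheus-example directory and the CONF_DIR variable are hypothetical names chosen for this example. Deployments managed by an operator or a Helm chart will need a different mechanism.

```shell
#!/bin/sh
# Sketch: stage the generated alert rules for a file-based Prometheus.
# Assumption: CONF_DIR is a hypothetical config directory; adjust to your setup.
set -eu
CONF_DIR="${CONF_DIR:-./prometheus-example}"
mkdir -p "$CONF_DIR/rules"

# Copy the rules file if it has already been built with `make prometheus_alerts.yaml`.
if [ -f prometheus_alerts.yaml ]; then
  cp prometheus_alerts.yaml "$CONF_DIR/rules/"
fi

# Write a config fragment; relative rule paths are resolved against
# the directory that contains prometheus.yml.
cat > "$CONF_DIR/prometheus.yml" <<'EOF'
rule_files:
  - rules/prometheus_alerts.yaml
EOF

# Optionally validate the rules when promtool is available.
if command -v promtool >/dev/null 2>&1 && [ -f "$CONF_DIR/rules/prometheus_alerts.yaml" ]; then
  promtool check rules "$CONF_DIR/rules/prometheus_alerts.yaml"
fi
```

A real prometheus.yml would also carry your scrape and alerting sections; only the rule_files entry is shown here.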