KubeVirt metric names should align with Kubernetes metric names.
KubeVirt users should have the same experience when searching for node, container, pod, and virtual machine metrics.
Naming requirements:
- Check whether a similar Kubernetes metric exists for a node, container, or pod, and align with it where possible.
- A KubeVirt metric for a running VM should have a kubevirt_vmi_ prefix.
For example, see the following Kubernetes network metrics:
- node_network_receive_packets_total
- node_network_transmit_packets_total
- container_network_receive_packets_total
- container_network_transmit_packets_total
The corresponding KubeVirt metrics for a VMI should be:
- kubevirt_vmi_network_receive_packets_total
- kubevirt_vmi_network_transmit_packets_total
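
The mapping above can be sketched as a small helper. This is only an illustration of the naming convention, not KubeVirt code: the function name `to_vmi_metric_name` and the list of recognized prefixes are assumptions made for this example.

```python
# Hypothetical helper (not part of KubeVirt): derives a VMI metric name
# from a Kubernetes node or container metric name.
KUBERNETES_PREFIXES = ("node_", "container_")  # assumed set of source prefixes
VMI_PREFIX = "kubevirt_vmi_"


def to_vmi_metric_name(kubernetes_metric: str) -> str:
    """Swap the Kubernetes resource prefix for the kubevirt_vmi_ prefix."""
    for prefix in KUBERNETES_PREFIXES:
        if kubernetes_metric.startswith(prefix):
            return VMI_PREFIX + kubernetes_metric[len(prefix):]
    raise ValueError(f"unrecognized Kubernetes metric prefix: {kubernetes_metric}")


if __name__ == "__main__":
    print(to_vmi_metric_name("node_network_receive_packets_total"))
    print(to_vmi_metric_name("container_network_transmit_packets_total"))
```

Both the node and the container variants of a metric map to the same VMI name, which is what lets users search for VM metrics the same way they search for node or container metrics.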
The Prometheus recording rules appear in Prometheus as metrics.
To make KubeVirt recording rules easy to identify, they should have a kubevirt_ prefix.
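
As a rough sketch of how this prefix requirement could be checked, the snippet below scans a list of rules for recording rules whose names lack the prefix. The `record` and `expr` keys follow the Prometheus rule format; the function name and the sample rule names are hypothetical and exist only for illustration.

```python
RECORDING_RULE_PREFIX = "kubevirt_"


def misnamed_recording_rules(rules):
    """Return the names of recording rules that lack the kubevirt_ prefix."""
    return [
        rule["record"]
        for rule in rules
        # Only recording rules carry a "record" key; other rules are skipped.
        if "record" in rule and not rule["record"].startswith(RECORDING_RULE_PREFIX)
    ]


# Hypothetical rules, for illustration only.
example_rules = [
    {
        "record": "kubevirt_vmi_example_receive_packets_rate",
        "expr": "rate(kubevirt_vmi_network_receive_packets_total[5m])",
    },
    {
        "record": "vmi_example_receive_packets_rate",
        "expr": "rate(kubevirt_vmi_network_receive_packets_total[5m])",
    },
]

if __name__ == "__main__":
    print(misnamed_recording_rules(example_rules))
```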
When creating a KubeVirt alert rule, follow these steps:
1. Use recording rules when doing calculations.
2. Create an alert runbook at KubeVirt runbooks.
3. Alert rule must include runbook_url with the link to your runbook from step 2.
4. Alert rule must include severity. One of: critical, warning, info.
   NOTE:
   - Critical alerts - When the service is down and you lose critical functionality. Immediate action is required.
   - Warning alerts - When an alert requires user intervention. A more serious issue may develop if this is not resolved soon.
   - Info alerts - When a minor problem has been detected. It should be resolved relatively soon and not ignored.
5. Alert message must be verbose, since it is propagated to the observability/metrics.md file when running make-generate.
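
The alert rule requirements above can be sketched as a simple check. This is a minimal illustration under stated assumptions, not KubeVirt's own validation: it assumes severity is stored under the rule's labels and that runbook_url and message are stored under its annotations, and the function name, alert name, and runbook URL are hypothetical.

```python
ALLOWED_SEVERITIES = {"critical", "warning", "info"}


def alert_rule_problems(rule):
    """Return a list of guideline violations for one alert rule dict."""
    problems = []
    # Assumption: severity is a label; runbook_url and message are annotations.
    labels = rule.get("labels", {})
    annotations = rule.get("annotations", {})

    if not annotations.get("runbook_url"):
        problems.append("missing runbook_url")

    severity = labels.get("severity")
    if severity not in ALLOWED_SEVERITIES:
        problems.append(f"invalid severity: {severity!r}")

    if not annotations.get("message"):
        problems.append("missing message")

    return problems


# Hypothetical alert rule, for illustration only.
example_alert = {
    "alert": "ExampleVMIAlert",
    "expr": "kubevirt_vmi_example_receive_packets_rate == 0",
    "labels": {"severity": "warning"},
    "annotations": {
        "runbook_url": "https://example.com/runbooks/ExampleVMIAlert",
        "message": "The example VMI has received no packets for the evaluated period.",
    },
}

if __name__ == "__main__":
    print(alert_rule_problems(example_alert))
```

A check like this only covers the mechanical requirements. Whether the message is verbose enough, and whether the chosen severity matches the definitions in the note, still needs human review.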