Merge pull request #253 from arnongilboa/improve_cdi_alert_runbooks
Improve CDI alert runbooks
sradco authored Jun 27, 2024
2 parents ad1f0e5 + a524a88 commit 7adb2c1
Showing 4 changed files with 55 additions and 51 deletions.
52 changes: 24 additions & 28 deletions docs/runbooks/CDIDataImportCronOutdated.md
@@ -15,6 +15,10 @@ For golden images, _latest_ refers to the latest operating system of the
distribution. For other disk images, _latest_ refers to the latest hash of the
image that is available.

If there is no default (Kubernetes or virtualization) storage class and the
`DataImportCron` import PVC is `Pending` for that reason, this alert is
suppressed, because the root cause is already reported by the
`CDINoDefaultStorageClass` alert.

## Impact

VMs might be created from outdated disk images.
@@ -23,61 +27,53 @@ VMs might fail to start because no boot source is available for cloning.

## Diagnosis

1. Check the cluster for a default storage class:
1. Check the cluster for a default Kubernetes storage class:
```bash
$ kubectl get sc -o jsonpath='{.items[?(.metadata.annotations.storageclass\.kubernetes\.io\/is-default-class=="true")].metadata.name}'
```

Check the cluster for a default virtualization storage class:
```bash
$ kubectl get sc
$ kubectl get sc -o jsonpath='{.items[?(.metadata.annotations.storageclass\.kubevirt\.io\/is-default-virt-class=="true")].metadata.name}'
```

The output displays the storage classes with `(default)` beside the name of
the default storage class. You must set a default storage class, either on
the cluster or in the `DataImportCron` specification, in order for the
`DataImportCron` to poll and import golden images. If no storage class is
defined, the DataVolume controller fails to create PVCs and the following
event is displayed: `DataVolume.storage spec is missing accessMode and no
storageClass to choose profile`.
The output displays the default (Kubernetes and/or virtualization) storage
class. You must either set a default storage class on the cluster, or ask for
a specific storage class in the `DataImportCron` specification, in order for
the `DataImportCron` to poll and import golden images. If the default
storage class does not exist, the import DataVolume and PVC that are created
remain in the `Pending` phase.

2. Obtain the `DataImportCron` namespace and name:
2. List the `DataImportCron` objects that are not up to date:

```bash
$ kubectl get dataimportcron -A -o json | jq -r '.items[] | select(.status.conditions[] | select(.type == "UpToDate" and .status == "False")) | .metadata.namespace + "/" + .metadata.name'
$ kubectl get dataimportcron -A -o jsonpath='{range .items[*]}{.status.conditions[?(@.type=="UpToDate")].status}{"\t"}{.metadata.namespace}{"/"}{.metadata.name}{"\n"}{end}' | grep False
```

3. If a default storage class is not defined on the cluster, check the
`DataImportCron` specification for a default storage class:
`DataImportCron` specification for an optional storage class:

```bash
$ kubectl get dataimportcron <dataimportcron> -o yaml | grep -B 5 storageClassName
```

Example output:

```yaml
url: docker://.../cdi-func-test-tinycore
storage:
resources:
requests:
storage: 5Gi
storageClassName: rook-ceph-block
$ kubectl -n <namespace> get dataimportcron <dataimportcron> -o jsonpath='{.spec.template.spec.storage.storageClassName}{"\n"}'
```

4. Obtain the name of the `DataVolume` associated with the `DataImportCron`
object:

```bash
$ kubectl -n <namespace> get dataimportcron <dataimportcron> -o json | jq .status.lastImportedPVC.name
$ kubectl -n <namespace> get dataimportcron <dataimportcron> -o jsonpath='{.status.lastImportedPVC.name}{"\n"}'
```

5. Check the `DataVolume` log for error messages:
5. Check the `DataVolume` status:

```bash
$ kubectl -n <namespace> get dv <datavolume> -o yaml
$ kubectl -n <namespace> get dv <datavolume> -o jsonpath-as-json='{.status}'
```

6. Set the `CDI_NAMESPACE` environment variable:

```bash
$ export CDI_NAMESPACE="$(kubectl get deployment -A | grep cdi-operator | awk '{print $1}')"
$ export CDI_NAMESPACE="$(kubectl get deployment -A -o jsonpath='{.items[?(.metadata.name=="cdi-operator")].metadata.namespace}')"
```

7. Check the `cdi-deployment` log for error messages:
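
A note on step 2 of the diagnosis above: `grep False` matches the string anywhere on the line, so a `DataImportCron` whose namespace or name happens to contain `False` would also be listed. The following is a minimal sketch of a stricter filter, assuming the tab-separated `<status>\t<namespace>/<name>` layout that the jsonpath command in step 2 prints. The function name `outdated_crons` is hypothetical and not part of the runbook.

```bash
# Hypothetical helper: reads "<status>\t<namespace>/<name>" lines on stdin
# and prints only the namespace/name of entries whose UpToDate status
# (the first tab-separated field) is exactly "False".
outdated_crons() {
  awk -F '\t' '$1 == "False" { print $2 }'
}
```

It could take the place of the `grep False` stage, with the `kubectl get dataimportcron` command from step 2 left unchanged and piped into `outdated_crons`. Entries that have no `UpToDate` condition at all produce an empty first field and are not printed.
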
27 changes: 16 additions & 11 deletions docs/runbooks/CDIDefaultStorageClassDegraded.md
@@ -2,12 +2,16 @@

## Meaning

This alert fires when there is no default storage class that supports smart
cloning (CSI or snapshot-based) or the ReadWriteMany access mode.
This alert fires when one or more default (Kubernetes or virtualization)
storage classes exist, but none of them supports both smart cloning (CSI or
snapshot-based) and the ReadWriteMany access mode.

A default virtualization storage class has precedence over a default Kubernetes
storage class for creating a VirtualMachine disk image.

<!--DS: In case of single-node OpenShift, the alert is suppressed if there is a default
storage class that supports smart cloning, but not ReadWriteMany.-->

## Impact

If the default storage class does not support smart cloning, the default cloning
@@ -18,31 +22,32 @@ If the default storage class does not support ReadWriteMany, virtual machines

## Diagnosis

1. Get the default KubeVirt storage class by running the following command:
1. Get the default virtualization storage class by running the following
command:

```bash
$ export CDI_DEFAULT_VIRT_SC="$(kubectl get sc -o json | jq -r '.items[].metadata|select(.annotations."storageclass.kubevirt.io/is-default-virt-class"=="true")|.name')"
$ export CDI_DEFAULT_VIRT_SC="$(kubectl get sc -o jsonpath='{.items[?(.metadata.annotations.storageclass\.kubevirt\.io\/is-default-virt-class=="true")].metadata.name}')"
```

2. If a default KubeVirt storage class exists, check that it supports
2. If a default virtualization storage class exists, check that it supports
ReadWriteMany by running the following command:

```bash
$ kubectl get storageprofile $CDI_DEFAULT_VIRT_SC -o json | jq '.status.claimPropertySets'| grep ReadWriteMany
$ kubectl get storageprofile $CDI_DEFAULT_VIRT_SC -o jsonpath='{.status.claimPropertySets}' | grep ReadWriteMany
```

3. If there is no default KubeVirt storage class, get the default Kubernetes
storage class by running the following command:
3. If there is no default virtualization storage class, get the default
Kubernetes storage class by running the following command:

```bash
$ export CDI_DEFAULT_K8S_SC="$(kubectl get sc -o json | jq -r '.items[].metadata|select(.annotations."storageclass.kubernetes.io/is-default-class"=="true")|.name')"
$ export CDI_DEFAULT_K8S_SC="$(kubectl get sc -o jsonpath='{.items[?(.metadata.annotations.storageclass\.kubernetes\.io\/is-default-class=="true")].metadata.name}')"
```

4. If a default Kubernetes storage class exists, check that it supports
ReadWriteMany by running the following command:

```bash
$ kubectl get storageprofile $CDI_DEFAULT_K8S_SC -o json | jq '.status.claimPropertySets'| grep ReadWriteMany
$ kubectl get storageprofile $CDI_DEFAULT_K8S_SC -o jsonpath='{.status.claimPropertySets}' | grep ReadWriteMany
```
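
The steps above follow the precedence rule from the Meaning section: a default virtualization storage class wins over a default Kubernetes one. The following is a minimal sketch of that selection in shell, assuming `CDI_DEFAULT_VIRT_SC` and `CDI_DEFAULT_K8S_SC` were exported as in steps 1 and 3 and that each holds at most one name. The function name `effective_default_sc` is hypothetical.

```bash
# Hypothetical helper: print the storage class that takes precedence.
# Falls back to the Kubernetes default only when no virtualization
# default is set; prints an empty line when neither is set.
effective_default_sc() {
  printf '%s\n' "${CDI_DEFAULT_VIRT_SC:-${CDI_DEFAULT_K8S_SC:-}}"
}
```

The result could then be passed to the `storageprofile` check, for example `kubectl get storageprofile "$(effective_default_sc)" -o jsonpath='{.status.claimPropertySets}' | grep ReadWriteMany`. An empty result would mean that no default storage class of either kind was found.
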

<!--USstart-->
@@ -52,7 +57,7 @@ for details about smart clone prerequisites.

## Mitigation

Ensure that you have a default storage class, either Kubernetes or KubeVirt, and
Ensure that you have a default (Kubernetes or virtualization) storage class, and
that the default storage class supports smart cloning and ReadWriteMany.

<!--USstart-->
2 changes: 1 addition & 1 deletion docs/runbooks/CDIMultipleDefaultVirtStorageClasses.md
@@ -19,7 +19,7 @@ Obtain a list of default virtualization storage classes by running the following
command:

```bash
$ kubectl get sc -o json | jq '.items[].metadata|select(.annotations."storageclass.kubevirt.io/is-default-virt-class"=="true")|.name'
$ kubectl get sc -o jsonpath='{.items[?(.metadata.annotations.storageclass\.kubevirt\.io\/is-default-virt-class=="true")].metadata.name}'
```
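
The jsonpath command prints the matching storage class names on one line, separated by spaces, so more than one name means that several default virtualization storage classes are set. The following is a minimal sketch that counts them, assuming that output format. The function name `count_storage_classes` is hypothetical.

```bash
# Hypothetical helper: count whitespace-separated names read from stdin.
# tr strips the padding that some wc implementations add around the number.
count_storage_classes() {
  wc -w | tr -d '[:space:]'
}
```

Piping the command above into `count_storage_classes` and getting a value greater than 1 would indicate the condition that this runbook covers.
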

## Mitigation
25 changes: 14 additions & 11 deletions docs/runbooks/CDINoDefaultStorageClass.md
@@ -2,37 +2,39 @@

## Meaning

This alert fires when there is no default (Kubernetes or KubeVirt) storage
class, and a data volume is pending for one.
This alert fires when there is no default (Kubernetes or virtualization) storage
class, and a data volume is `Pending` for one.

A default KubeVirt storage class has precedence over a default Kubernetes
A default virtualization storage class has precedence over a default Kubernetes
storage class for creating a VirtualMachine disk image.

## Impact

If there is no default Kubernetes or KubeVirt storage class, a data volume that
does not have a specified storage class remains in a "pending" state.
If there is no default (Kubernetes or virtualization) storage class, a data
volume that does not have a specified storage class remains in the `Pending`
phase.

## Diagnosis

1. Check for a default Kubernetes storage class by running the following
command:

```bash
$ kubectl get sc -o json | jq '.items[].metadata|select(.annotations."storageclass.kubernetes.io/is-default-class"=="true")|.name'
$ kubectl get sc -o jsonpath='{.items[?(.metadata.annotations.storageclass\.kubernetes\.io\/is-default-class=="true")].metadata.name}'
```

2. Check for a default KubeVirt storage class by running the following command:
2. Check for a default virtualization storage class by running the following
command:

```bash
$ kubectl get sc -o json | jq '.items[].metadata|select(.annotations."storageclass.kubevirt.io/is-default-virt-class"=="true")|.name'
$ kubectl get sc -o jsonpath='{.items[?(.metadata.annotations.storageclass\.kubevirt\.io\/is-default-virt-class=="true")].metadata.name}'
```

## Mitigation

Create a default storage class for either Kubernetes or KubeVirt or for both.
Create a default (Kubernetes and/or virtualization) storage class.

A default KubeVirt storage class has precedence over a default Kubernetes
A default virtualization storage class has precedence over a default Kubernetes
storage class for creating a virtual machine disk image.

* Create a default Kubernetes storage class by running the following command:
@@ -41,7 +43,8 @@ storage class for creating a virtual machine disk image.
$ kubectl patch storageclass <storage-class-name> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```

* Create a default KubeVirt storage class by running the following command:
* Create a default virtualization storage class by running the following
command:

```bash
$ kubectl patch storageclass <storage-class-name> -p '{"metadata": {"annotations":{"storageclass.kubevirt.io/is-default-virt-class":"true"}}}'