## Deploying LVMCluster CR

This guide assumes that you have followed the steps in the [Readme][repo-readme] and
that the LVM operator (hereafter LVMO) is running in your cluster.

Below are the available disks on our test Kubernetes cluster node; there are
no existing LVM physical volumes, volume groups, or logical volumes.

``` console
sh-4.4# lsblk
NAME      MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sdb         8:16   0 893.8G  0 disk
|-sdb1      8:17   0     1M  0 part
|-sdb2      8:18   0   127M  0 part
|-sdb3      8:19   0   384M  0 part /boot
`-sdb4      8:20   0 893.3G  0 part /sysroot
sr0        11:0    1   987M  0 rom
nvme0n1   259:0    0   1.5T  0 disk
nvme1n1   259:1    0   1.5T  0 disk
nvme2n1   259:2    0   1.5T  0 disk
sh-4.4# pvs
sh-4.4# vgs
sh-4.4# lvs
```

Here LVMO is installed in the `lvm-operator-system` namespace via the `make deploy`
target; the operations below are the same if LVMO is installed in any other namespace.

``` console
kubectl get pods
NAME                                 READY   STATUS    RESTARTS   AGE
controller-manager-8bf864c85-8zjlp   3/3     Running   0          96s
```

After all containers in the above listing are in the `READY` state, we can proceed with
deploying the LVMCluster CR.

``` yaml
cat <<EOF | kubectl apply -f -
apiVersion: lvm.topolvm.io/v1alpha1
kind: LVMCluster
metadata:
  name: lvmcluster-sample
  # set the namespace to the operator namespace
  namespace: lvm-operator-system
spec:
  deviceClasses:
  - name: vg1
EOF
```
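
A quick way to confirm that the CR was accepted (a hedged example; it assumes your
current namespace is the operator namespace and the printed columns may differ
across versions):

``` console
kubectl get lvmclusters.lvm.topolvm.io
NAME                AGE
lvmcluster-sample   5s
```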

- The ability to respect node selectors, tolerations, and device selectors in the
  above CR is coming soon.
- After the above CR is deployed, all free disks are used to create a volume
  group, the TopoLVM CSI driver is deployed, and all the corresponding resources are
  created, including a storage class for ready consumption of TopoLVM-backed
  Persistent Volumes.

- Wait for all the pods to be in the `READY` state
- `topolvm-controller` and `topolvm-node` will be in the `init` stage for a while,
  awaiting the creation of certificates and the creation of volume groups by
  `vg-manager`, respectively

``` console
kubectl get pods
NAME                                  READY   STATUS    RESTARTS   AGE
controller-manager-8bf864c85-8zjlp    3/3     Running   0          31m
topolvm-controller-694bc98b87-x6dxn   4/4     Running   0          10m
topolvm-node-pbcth                    4/4     Running   0          10m
vg-manager-f979f                      1/1     Running   0          10m
```
When all pods are ready, the LVM physical volumes and volume groups should be
created on the host.

``` console
sh-4.4# pvs
PV           VG  Fmt  Attr PSize  PFree
/dev/nvme0n1 vg1 lvm2 a--  <1.46t <1.46t
/dev/nvme1n1 vg1 lvm2 a--  <1.46t <1.46t
/dev/nvme2n1 vg1 lvm2 a--  <1.46t <1.46t
sh-4.4# vgs
VG  #PV #LV #SN Attr   VSize  VFree
vg1   3   0   0 wz--n- <4.37t <4.37t
```
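
No logical volumes exist at this point; TopoLVM creates them on demand when PVCs
are provisioned later. A quick check on the node (a hedged example; empty output
expected):

``` console
sh-4.4# lvs
```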

Confirm that the following resources have been created on the cluster:

- When the LVMCluster CR, which contains details of all the volume groups that
  need to be created across multiple nodes, is deployed, LVMO creates two
  supporting Custom Resources:
- LVMVolumeGroup: a CR created and managed by LVMO which tracks individual volume
  groups across multiple nodes
``` yaml
# kubectl get lvmvolumegroup vg1 -oyaml
apiVersion: lvm.topolvm.io/v1alpha1
kind: LVMVolumeGroup
metadata:
  creationTimestamp: "2022-02-02T05:16:42Z"
  generation: 1
  name: vg1
  namespace: lvm-operator-system
  resourceVersion: "17242461"
  uid: 88e8ad7d-1544-41fb-9a8e-12b1a66ab157
spec: {}
```
- LVMVolumeGroupNodeStatus: a CR created and managed by the VG manager which tracks
  the node status corresponding to multiple volume groups
``` yaml
# kubectl get lvmvolumegroupnodestatuses.lvm.topolvm.io kube-node -oyaml
apiVersion: lvm.topolvm.io/v1alpha1
kind: LVMVolumeGroupNodeStatus
metadata:
  creationTimestamp: "2022-02-02T05:17:59Z"
  generation: 1
  name: kube-node
  namespace: lvm-operator-system
  resourceVersion: "17242882"
  uid: 292de9bb-3a9b-4ee8-946a-9b587986dafd
spec:
  nodeStatus:
  - devices:
    - /dev/nvme0n1
    - /dev/nvme1n1
    - /dev/nvme2n1
    name: vg1
    status: Ready
```
- A storage class called `topolvm-<deviceclassname>` will be created
``` console
# kubectl get storageclass
NAME          PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
topolvm-vg1   topolvm.cybozu.com   Delete          WaitForFirstConsumer   true                   31m
```

Note:
- Reconciling multiple LVMCluster CRs by LVMO is not supported
- Custom resources LVMVolumeGroup and LVMVolumeGroupNodeStatus are managed by
LVMO and users should not edit them.

## Deploying PVC and App Pod

- A successful reconciliation of the LVMCluster CR sets up all the underlying
  resources needed to create a PVC and use it in an app pod. This can be
  verified from the LVMCluster CR status field:
```json
# kubectl get lvmclusters.lvm.topolvm.io -ojsonpath='{.items[*].status.deviceClassStatuses[*]}' | python3 -mjson.tool
{
    "name": "vg1",
    "nodeStatus": [
        {
            "devices": [
                "/dev/nvme0n1",
                "/dev/nvme1n1",
                "/dev/nvme2n1"
            ],
            "node": "kube-node",
            "status": "Ready"
        }
    ]
}
```
- Create a PVC using the StorageClass created for the deviceClass. The PVC will
  not be bound until a pod claims the storage, because the volume binding mode is set
  to `WaitForFirstConsumer`.
```yaml
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: lvm-file-pvc
spec:
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: topolvm-vg1
EOF
```
``` console
kubectl get pvc
NAME           STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
lvm-file-pvc   Pending                                      topolvm-vg1    12s
```
- In a multi-node setup, the `WaitForFirstConsumer` volume binding mode defers
  provisioning until the pod is scheduled, and scheduling usually prefers the node
  with the highest amount of available storage space
- Deploy an app pod that uses the PVC created earlier.
``` yaml
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: app-file
spec:
  containers:
  - name: app-file
    image: bash
    imagePullPolicy: IfNotPresent
    command: ["/usr/local/bin/bash", "-c", "/usr/bin/tail -f /dev/null"]
    volumeMounts:
    - mountPath: "/mnt/file"
      name: lvm-file-pvc
  volumes:
  - name: lvm-file-pvc
    persistentVolumeClaim:
      claimName: lvm-file-pvc
EOF
```
- The app pod should claim the PVC and reach the `READY` state, and the PVC gets
  bound to this app pod
``` console
kubectl get pod | grep -E '^(app|NAME)'
NAME       READY   STATUS    RESTARTS   AGE
app-file   1/1     Running   0          2m6s
kubectl get pvc
NAME           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
lvm-file-pvc   Bound    pvc-04a4f5f7-8665-4008-818a-490503c859f5   5Gi        RWO            topolvm-vg1    5m55s
```
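- Optionally, verify that the volume is writable from inside the pod (a hedged
  check; it assumes a shell is available in the `bash` image used above)
``` console
kubectl exec app-file -- bash -c 'echo hello > /mnt/file/test && cat /mnt/file/test'
hello
```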
- To create a PVC with `volumeMode: Block`, do the following:
``` yaml
cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: lvm-block-pvc
spec:
  volumeMode: Block
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: topolvm-vg1
---
apiVersion: v1
kind: Pod
metadata:
  name: app-block
spec:
  containers:
  - name: app-block
    image: bash
    imagePullPolicy: IfNotPresent
    command: ["/usr/local/bin/bash", "-c", "/usr/bin/tail -f /dev/null"]
    volumeDevices:
    - devicePath: "/dev/xvda"
      name: lvm-block-pvc
  volumes:
  - name: lvm-block-pvc
    persistentVolumeClaim:
      claimName: lvm-block-pvc
EOF
```
- Check the status of both PVCs and App pods:
``` console
kubectl get pod | grep -E '^(app|NAME)'
NAME        READY   STATUS    RESTARTS   AGE
app-block   1/1     Running   0          99s
app-file    1/1     Running   0          9m34s
kubectl get pvc
NAME            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
lvm-block-pvc   Bound    pvc-56fe959a-1184-4fa6-9d66-b69c06d43cf4   5Gi        RWO            topolvm-vg1    104s
lvm-file-pvc    Bound    pvc-04a4f5f7-8665-4008-818a-490503c859f5   5Gi        RWO            topolvm-vg1    13m
```
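- Optionally, confirm that the raw block device is visible inside the pod (a hedged
  check; the listing should show a device of type `b` at the configured `devicePath`)
``` console
kubectl exec app-block -- ls -l /dev/xvda
```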

### Monitoring and Metrics

- LVMO currently surfaces only TopoLVM metrics, which can be viewed either from
  the UI or by port-forwarding the service
- On OpenShift clusters, set the label `openshift.io/cluster-monitoring` on the
  namespace the LVMO is running in.
``` console
kubectl patch namespace/lvm-operator-system -p '{"metadata": {"labels": {"openshift.io/cluster-monitoring": "true"}}}'
```
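
- To confirm that the label was applied (a hedged example):
``` console
kubectl get namespace lvm-operator-system --show-labels
```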
``` console
# port-forward service in one terminal
kubectl port-forward svc/topolvm-node-metrics :8080
Forwarding from 127.0.0.1:41685 -> 8080
Forwarding from [::1]:41685 -> 8080
...
...
# in another terminal view the metrics at above port in localhost
curl -s 127.0.0.1:41685/metrics | grep -Ei 'topolvm_volumegroup_.*?_bytes\{'
topolvm_volumegroup_available_bytes{device_class="vg1",node="kube-node"} 4.790222323712e+12
topolvm_volumegroup_size_bytes{device_class="vg1",node="kube-node"} 4.800959741952e+12
```
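
These figures line up with the earlier `vgs` output: 4.800959741952e12 bytes is
roughly 4.37 TiB of total volume group size, of which about 4.36 TiB
(4.790222323712e12 bytes) is still available.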

## Cleanup

- A feature to automatically clean up volume groups after removing the LVMCluster CR
  is coming soon
- Until then, perform the following steps to clean up the resources created by the
  operator

### Steps

1. Remove all the apps that are using PVCs created with TopoLVM
``` console
# delete App pod first
kubectl delete pod app-file app-block
pod "app-file" deleted
pod "app-block" deleted
# delete PVCs which were used by above App pods
kubectl delete pvc lvm-file-pvc lvm-block-pvc
persistentvolumeclaim "lvm-file-pvc" deleted
persistentvolumeclaim "lvm-block-pvc" deleted
```
2. Make sure there are no `logicalvolumes` CRs left that were created by TopoLVM
``` console
kubectl get logicalvolumes.topolvm.cybozu.com
No resources found
```
3. Take a JSON dump of the LVMCluster CR contents. It has the list of VGs and PVs
created on the nodes.
``` console
kubectl get lvmclusters.lvm.topolvm.io -ojson > /tmp/lvmcr.json
```
4. Either parse the contents of the above JSON file via `jq` (a sample invocation
follows the output below) or dump the status of the CR directly
``` console
kubectl get lvmclusters.lvm.topolvm.io -ojsonpath='{.items[*].status.deviceClassStatuses[*]}' | python3 -mjson.tool
```
``` json
{
    "name": "vg1",
    "nodeStatus": [
        {
            "devices": [
                "/dev/nvme0n1",
                "/dev/nvme1n1",
                "/dev/nvme2n1"
            ],
            "node": "kube-node",
            "status": "Ready"
        }
    ]
}
```
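
A possible `jq` invocation against the dump from step 3 (a hedged sketch, assuming
the status layout shown above):

``` console
jq '.items[].status.deviceClassStatuses[].nodeStatus[].devices[]' /tmp/lvmcr.json
"/dev/nvme0n1"
"/dev/nvme1n1"
"/dev/nvme2n1"
```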
5. Remove the LVMCluster CR and wait for the deletion of all resources in the namespace
except the operator (the controller-manager pod)
``` console
kubectl delete lvmclusters.lvm.topolvm.io lvmcluster-sample
lvmcluster.lvm.topolvm.io "lvmcluster-sample" deleted
kubectl get po
NAME                                 READY   STATUS    RESTARTS   AGE
controller-manager-8bf864c85-8zjlp   3/3     Running   0          125m
```
6. Log in to the node and remove the LVM volume groups and physical volumes (a
verification check follows the commands below)
``` console
sh-4.4# vgremove vg1 --nolock
WARNING: File locking is disabled.
Volume group "vg1" successfully removed
sh-4.4# pvremove /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 --nolock
WARNING: File locking is disabled.
Labels on physical volume "/dev/nvme0n1" successfully wiped.
Labels on physical volume "/dev/nvme1n1" successfully wiped.
Labels on physical volume "/dev/nvme2n1" successfully wiped.
```
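
To verify nothing is left behind (a hedged check; empty output expected):

``` console
sh-4.4# pvs
sh-4.4# vgs
sh-4.4# lvs
```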
7. Remove the lvmd.yaml config file from the node
``` console
sh-4.4# rm /etc/topolvm/lvmd.yaml
```
Note:
- Removal of the volume groups, physical volumes, and LVM config file is necessary
  during cleanup; otherwise LVMO fails to deploy TopoLVM in the next iteration.

## Uninstalling LVMO

- LVMO can be removed from the cluster via one of the following methods, based
  on how it was installed
``` console
# if deployed via manifests
make undeploy
# if deployed via olm
make undeploy-with-olm
```

[repo-readme]: ../../README.md
