k3d: Introduce k3d SR-IOV provider
Since kubernetes-sigs/kind#2999 blocks us from updating
to new k8s versions using kind, use k3d instead of kind.

Signed-off-by: Or Shoval <[email protected]>
oshoval committed Feb 28, 2023
1 parent e1cf770 commit ddf34f4
Showing 24 changed files with 5,338 additions and 1 deletion.
10 changes: 10 additions & 0 deletions cluster-up/cluster/k3d-1.25-sriov/OWNERS
@@ -0,0 +1,10 @@
filters:
".*":
reviewers:
- qinqon
- oshoval
- phoracek
- ormergi
approvers:
- qinqon
- phoracek
74 changes: 74 additions & 0 deletions cluster-up/cluster/k3d-1.25-sriov/README.md
@@ -0,0 +1,74 @@
# K8s 1.25.x with SR-IOV in a K3d cluster

Provides a pre-deployed containerized k8s cluster with version 1.25.x that runs
using [K3d](https://github.com/k3d-io/k3d).
The cluster is completely ephemeral and is recreated on every cluster restart. The KubeVirt containers are built on the
local machine and are then pushed to a registry which is exposed at
`127.0.0.1:5000`.

This version requires SR-IOV-enabled NICs (SR-IOV Physical Functions) on the current host, and will move
physical interfaces into the `K3d` cluster's agent node(s) (an agent node is a worker node in k3d terminology)
so that they can be used through multus and the SR-IOV
components.
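
A quick way to check whether the current host exposes SR-IOV capable PFs (a sketch, not part of the provider scripts):

```bash
# sriov_totalvfs exists only for SR-IOV capable PFs; grep -H prints each PF's path next to its max VF count
grep -H . /sys/class/net/*/device/sriov_totalvfs 2>/dev/null
```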

This provider also deploys [multus](https://github.com/k8snetworkplumbingwg/multus-cni)
, [sriov-cni](https://github.com/k8snetworkplumbingwg/sriov-cni)
and [sriov-device-plugin](https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin).

## Bringing the cluster up

```bash
export KUBEVIRT_PROVIDER=k3d-1.25-sriov
export KUBECONFIG=$(realpath _ci-configs/k3d-1.25-sriov/.kubeconfig)
make cluster-up
```
```
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k3d-sriov-server-0 Ready control-plane,master 67m v1.25.6+k3s1
k3d-sriov-agent-0 Ready worker 67m v1.25.6+k3s1
k3d-sriov-agent-1 Ready worker 67m v1.25.6+k3s1
$ kubectl get pods -n kube-system -l app=multus
NAME READY STATUS RESTARTS AGE
kube-multus-ds-z9hvs 1/1 Running 0 66m
kube-multus-ds-7shgv 1/1 Running 0 66m
kube-multus-ds-l49xj 1/1 Running 0 66m
$ kubectl get pods -n sriov -l app=sriov-cni
NAME READY STATUS RESTARTS AGE
kube-sriov-cni-ds-amd64-4pndd 1/1 Running 0 66m
kube-sriov-cni-ds-amd64-68nhh 1/1 Running 0 65m
$ kubectl get pods -n sriov -l app=sriovdp
NAME READY STATUS RESTARTS AGE
kube-sriov-device-plugin-amd64-qk66v 1/1 Running 0 66m
kube-sriov-device-plugin-amd64-d5r5b 1/1 Running 0 65m
```

### Connecting to a node
```bash
export KUBEVIRT_PROVIDER=k3d-1.25-sriov
./cluster-up/ssh.sh <node_name> /bin/sh
```

## Bringing the cluster down

```bash
export KUBEVIRT_PROVIDER=k3d-1.25-sriov
make cluster-down
```

This destroys the whole cluster and gracefully moves the SR-IOV NICs back to the root network namespace.

Note: killing the containers / cluster without first gracefully moving the NICs back to the root network namespace
might result in the NICs being unreachable for a few minutes.
`find /sys/class/net/*/device/sriov_numvfs` can be used to check when the NICs are reachable again.
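
For example, a small polling sketch (plain bash; the 5-second interval is arbitrary):

```bash
# Wait until at least one PF exposes sriov_numvfs again in the root network namespace
until find /sys/class/net/*/device/sriov_numvfs 2>/dev/null | grep -q .; do
    sleep 5
done
```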

### Bumping calico
Fetch new calico yaml and:
1. Enable `allow_ip_forwarding` (See https://k3d.io/v5.0.1/usage/advanced/calico)
2. Prefix the images in the yaml with `quay.io/`
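
For step 2, something along these lines can work (an illustrative sketch; it assumes the fetched manifest references images as `docker.io/calico/...` or bare `calico/...`):

```bash
# Prefix the calico images with quay.io/ (adjust the pattern to the actual image references in the yaml)
sed -i -E 's#image: (docker\.io/)?calico/#image: quay.io/calico/#g' calico.yaml
```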

Note: For the initial k3d provider, the yaml from the link above was used, with step 2 applied
on top of it.
58 changes: 58 additions & 0 deletions cluster-up/cluster/k3d-1.25-sriov/TROUBLESHOOTING.md
@@ -0,0 +1,58 @@
# How to troubleshoot a failing k3d job

If logging and output artifacts are not enough, there is a way to connect to a running CI pod and troubleshoot directly from there.

## Pre-requisites

- A working (enabled) account on the [CI cluster](shift.ovirt.org), specifically enabled for the `kubevirt-prow-jobs` project.
- The [mkpj tool](https://github.com/kubernetes/test-infra/tree/master/prow/cmd/mkpj) installed

## Launching a custom job

Through the `mkpj` tool, it's possible to craft a custom Prow Job that can be executed on the CI cluster.

Just `go get` it by running `go get k8s.io/test-infra/prow/cmd/mkpj`
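
Note that on Go 1.17+ `go get` no longer installs binaries; `go install` with an explicit version is the usual replacement (assuming the tool still lives at the same module path):

```bash
go install k8s.io/test-infra/prow/cmd/mkpj@latest
```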

Then run the following command from a checkout of the [project-infra repo](https://github.com/kubevirt/project-infra):

```bash
mkpj --pull-number $KUBEVIRT_PR_NUMBER -job pull-kubevirt-e2e-k3d-1.25-sriov -job-config-path github/ci/prow/files/jobs/kubevirt/kubevirt-presubmits.yaml --config-path github/ci/prow/files/config.yaml > debugkind.yaml
```

You will end up having a ProwJob manifest in the `debugkind.yaml` file.

It's strongly recommended to replace the job's name by setting `metadata.name` to something more recognizable, as that will make it easier to find and debug the related pod.

The `$KUBEVIRT_PR_NUMBER` can be an actual PR on the [kubevirt repo](https://github.com/kubevirt/kubevirt).

In case we just want to debug the cluster provided by the CI, it's recommended to override the entry point, either in the test PR we are instrumenting (a good sample can be found [here](https://github.com/kubevirt/kubevirt/pull/3022)) or directly in the prow job's manifest.

Remember that we want the cluster to be long-lived, so a long sleep must be provided as part of the entry point.
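
For example, a hedged sketch of such an entry point (the sleep duration is arbitrary):

```bash
#!/bin/bash
# Bring the SR-IOV cluster up, then keep the pod alive for interactive debugging
KUBEVIRT_PROVIDER=k3d-1.25-sriov make cluster-up
sleep 9h
```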

Make sure you switch to the `kubevirt-prow-jobs` project, and apply the manifest:

```bash
kubectl apply -f debugkind.yaml
```

You will end up with a ProwJob object, and a pod with the same name you gave to the ProwJob.

Once the pod is up & running, connect to it via bash:

```bash
kubectl exec -it debugprowjobpod bash
```

### Logistics

Once you are in the pod, you'll be able to troubleshoot what's happening in the environment where CI runs its tests.

Run the following to bring up a [k3d](https://github.com/k3d-io/k3d) cluster with SR-IOV installed.

```bash
KUBEVIRT_PROVIDER=k3d-1.25-sriov make cluster-up
```

Use `k3d kubeconfig print sriov` to extract the kubeconfig file.
The `kubectl` binary is already on board and in `$PATH`.
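
A minimal sketch of putting the two together (the kubeconfig path is illustrative):

```bash
# Extract the kubeconfig for the "sriov" cluster and point kubectl at it
k3d kubeconfig print sriov > /tmp/sriov-kubeconfig
export KUBECONFIG=/tmp/sriov-kubeconfig
kubectl get nodes
```
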
See `README.md` for more info.
74 changes: 74 additions & 0 deletions cluster-up/cluster/k3d-1.25-sriov/config_sriov_cluster.sh
@@ -0,0 +1,74 @@
#!/bin/bash

[ $(id -u) -ne 0 ] && echo "FATAL: this script requires sudo privileges" >&2 && exit 1

set -xe

PF_COUNT_PER_NODE=${PF_COUNT_PER_NODE:-1}
[ $PF_COUNT_PER_NODE -le 0 ] && echo "FATAL: PF_COUNT_PER_NODE must be a positive integer" >&2 && exit 1
[ $PF_COUNT_PER_NODE != 1 ] && echo "FATAL: only 1 PF per node is supported for now" >&2 && exit 1

SCRIPT_PATH=$(dirname "$(realpath "$0")")

source ${SCRIPT_PATH}/sriov-node/node.sh
source ${SCRIPT_PATH}/sriov-components/sriov_components.sh

CONFIGURE_VFS_SCRIPT_PATH="$SCRIPT_PATH/sriov-node/configure_vfs.sh"

SRIOV_COMPONENTS_NAMESPACE="sriov"
SRIOV_NODE_LABEL_KEY="sriov_capable"
SRIOV_NODE_LABEL_VALUE="true"
SRIOV_NODE_LABEL="$SRIOV_NODE_LABEL_KEY=$SRIOV_NODE_LABEL_VALUE"
SRIOVDP_RESOURCE_PREFIX="kubevirt.io"
SRIOVDP_RESOURCE_NAME="sriov_net"
VFS_DRIVER="vfio-pci"
VFS_DRIVER_KMODULE="vfio_pci"
VFS_COUNT="6"

function validate_nodes_sriov_allocatable_resource() {
local -r resource_name="$SRIOVDP_RESOURCE_PREFIX/$SRIOVDP_RESOURCE_NAME"
local -r sriov_nodes=$(_kubectl get nodes -l $SRIOV_NODE_LABEL -o custom-columns=:.metadata.name --no-headers)

local num_vfs
for sriov_node in $sriov_nodes; do
num_vfs=$(node::total_vfs_count "$sriov_node")
sriov_components::wait_allocatable_resource "$sriov_node" "$resource_name" "$num_vfs"
done
}

worker_nodes=($(_kubectl get nodes -l node-role.kubernetes.io/worker -o custom-columns=:.metadata.name --no-headers))
worker_nodes_count=${#worker_nodes[@]}
[ "$worker_nodes_count" -eq 0 ] && echo "FATAL: no worker nodes found" >&2 && exit 1

pfs_names=($(node::discover_host_pfs))
pf_count="${#pfs_names[@]}"
[ "$pf_count" -eq 0 ] && echo "FATAL: Could not find available sriov PF's" >&2 && exit 1

total_pf_required=$((worker_nodes_count*PF_COUNT_PER_NODE))
[ "$pf_count" -lt "$total_pf_required" ] && \
echo "FATAL: there are not enough PF's on the host, try to reduce PF_COUNT_PER_NODE
Worker nodes count: $worker_nodes_count
PF per node count: $PF_COUNT_PER_NODE
Total PF count required: $total_pf_required" >&2 && exit 1

## Move SR-IOV Physical Functions to worker nodes
PFS_IN_USE=""
node::configure_sriov_pfs "${worker_nodes[*]}" "${pfs_names[*]}" "$PF_COUNT_PER_NODE" "PFS_IN_USE"

## Create VFs and configure their drivers on each SR-IOV node
node::configure_sriov_vfs "${worker_nodes[*]}" "$VFS_DRIVER" "$VFS_DRIVER_KMODULE" "$VFS_COUNT"

## Deploy Multus and SRIOV components
sriov_components::deploy_multus
sriov_components::deploy \
"$PFS_IN_USE" \
"$VFS_DRIVER" \
"$SRIOVDP_RESOURCE_PREFIX" "$SRIOVDP_RESOURCE_NAME" \
"$SRIOV_NODE_LABEL_KEY" "$SRIOV_NODE_LABEL_VALUE"

# Verify that each sriov capable node has sriov VFs allocatable resource
validate_nodes_sriov_allocatable_resource
sriov_components::wait_pods_ready

_kubectl get nodes
_kubectl get pods -n $SRIOV_COMPONENTS_NAMESPACE
47 changes: 47 additions & 0 deletions cluster-up/cluster/k3d-1.25-sriov/conformance.json
@@ -0,0 +1,47 @@
{
"Description": "DEFAULT",
"UUID": "",
"Version": "v0.56.9",
"ResultsDir": "/tmp/sonobuoy/results",
"Resources": null,
"Filters": {
"Namespaces": ".*",
"LabelSelector": ""
},
"Limits": {
"PodLogs": {
"Namespaces": "kube-system",
"SonobuoyNamespace": true,
"FieldSelectors": [],
"LabelSelector": "",
"Previous": false,
"SinceSeconds": null,
"SinceTime": null,
"Timestamps": false,
"TailLines": null,
"LimitBytes": null
}
},
"QPS": 30,
"Burst": 50,
"Server": {
"bindaddress": "0.0.0.0",
"bindport": 8080,
"advertiseaddress": "",
"timeoutseconds": 21600
},
"Plugins": null,
"PluginSearchPath": [
"./plugins.d",
"/etc/sonobuoy/plugins.d",
"~/sonobuoy/plugins.d"
],
"Namespace": "sonobuoy",
"WorkerImage": "sonobuoy/sonobuoy:v0.56.9",
"ImagePullPolicy": "IfNotPresent",
"ImagePullSecrets": "",
"AggregatorPermissions": "clusterAdmin",
"ServiceAccountName": "sonobuoy-serviceaccount",
"ProgressUpdatesPort": "8099",
"SecurityContextMode": "nonroot"
}
39 changes: 39 additions & 0 deletions cluster-up/cluster/k3d-1.25-sriov/provider.sh
@@ -0,0 +1,39 @@
#!/usr/bin/env bash

set -e

export CLUSTER_NAME="sriov"
export HOST_PORT=5000

function print_sriov_data() {
nodes=$(_get_agent_nodes)
echo "STEP: Print SR-IOV data"
for node in $nodes; do
echo "Node: $node"
echo "VFs:"
${CRI_BIN} exec $node /bin/sh -c "ls -l /sys/class/net/*/device/virtfn*"
echo "PFs PCI Addresses:"
${CRI_BIN} exec $node /bin/sh -c "grep PCI_SLOT_NAME /sys/class/net/*/device/uevent"
done
echo
}

function print_sriov_info() {
echo 'STEP: Available NICs'
# print hardware info for easier debugging based on logs
${CRI_BIN} run --rm --cap-add=SYS_RAWIO quay.io/phoracek/lspci@sha256:0f3cacf7098202ef284308c64e3fc0ba441871a846022bb87d65ff130c79adb1 sh -c "lspci | egrep -i 'network|ethernet'"
echo
}

function up() {
print_sriov_info
k3d_up

${KUBEVIRTCI_PATH}/cluster/$KUBEVIRT_PROVIDER/config_sriov_cluster.sh

print_sriov_data
version=$(_kubectl get node k3d-sriov-server-0 -o=custom-columns=VERSION:.status.nodeInfo.kubeletVersion --no-headers)
echo "$KUBEVIRT_PROVIDER cluster '$CLUSTER_NAME' is ready ($version)"
}

source ${KUBEVIRTCI_PATH}/cluster/k3d/common.sh
@@ -0,0 +1,27 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: sriov
resources:
- sriov-ns.yaml
- sriov-cni-daemonset.yaml
- sriovdp-daemonset.yaml
- sriovdp-config.yaml
patchesJson6902:
- target:
group: apps
version: v1
kind: DaemonSet
name: kube-sriov-cni-ds-amd64
path: patch-node-selector.yaml
- target:
group: apps
version: v1
kind: DaemonSet
name: kube-sriov-device-plugin-amd64
path: patch-node-selector.yaml
- target:
group: apps
version: v1
kind: DaemonSet
name: kube-sriov-device-plugin-amd64
path: patch-sriovdp-resource-prefix.yaml
@@ -0,0 +1,14 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- multus.yaml
images:
- name: ghcr.io/k8snetworkplumbingwg/multus-cni
newTag: v3.8
patchesJson6902:
- path: patch-args.yaml
target:
group: apps
version: v1
kind: DaemonSet
name: kube-multus-ds