Testing ci #1

Open
wants to merge 1 commit into
base: master
41 changes: 41 additions & 0 deletions ci/README.md
@@ -0,0 +1,41 @@
## CI configurations and examples
This folder holds vendor CI configurations and examples. The configurations control the behavior of each vendor's CI, and the examples serve as a reference for other vendors setting up a CI of their own.

### Admin list
The admin list contains the GitHub users and organizations that have permission to trigger the vendor CI. Only trusted users who have merge permissions should be on the list. Each vendor is responsible for how the admin list on its CI is updated, but to keep things organized, a vendor CI should at least update its admin list in response to a PR comment containing the phrase `/update-admins`.
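
As an illustration of how a vendor CI might consult the list, here is a minimal bash sketch. It assumes the layout of `ci/admin-list.yaml` in this folder (an `admin-list:` key followed by `- <user>` entries). The helper name `is_admin` is hypothetical, and the line-based matching is a simplification rather than a full YAML parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if USER appears under `admin-list:` in FILE.
# This is a line-based sketch, not a full YAML parser.
is_admin() {
  local user="$1" file="$2" line in_list=0
  local item_re='^[[:space:]]*-[[:space:]]*([^[:space:]#]+)'
  local key_re='^[^[:space:]#-]'
  while IFS= read -r line || [[ -n "$line" ]]; do
    if [[ "$line" == admin-list:* ]]; then
      in_list=1
    elif [[ "$line" =~ $key_re ]]; then
      in_list=0   # any other top-level key ends the admin list
    elif (( in_list )) && [[ "$line" =~ $item_re ]]; then
      [[ "${BASH_REMATCH[1]}" == "$user" ]] && return 0
    fi
  done < "$file"
  return 1
}

# Example (commented out): gate a trigger on the comment author.
# is_admin "$comment_author" ci/admin-list.yaml || exit 0
```

Commented-out entries such as `# - <org>` are ignored, so only active entries grant permission.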

### CI Examples
The examples folder contains configuration examples for vendor CI. Vendors can use it as a reference when setting up their CI. For more information on an example, refer to the README in that example's folder.

### Vendor CI trigger convention
Vendor CI trigger phrases should follow this convention:

```
/test-<test-type>-<vendor>-<sub-test>
/skip-<test-type>-<vendor>-<sub-test>
```

where:
1. `test-type`: The type of test to run on the vendor's setups, for example `e2e`. It can be replaced with `all` to run all test types for all vendors. Currently the following values are supported:
    * `e2e`: Runs the project's e2e make rule on the specific vendor setup.
    * `all`: Runs all test types and all their tests for all vendors.

2. `vendor`: The vendor that implemented the CI. Currently the following values are supported:
    * `nvidia`: Runs tests implemented by NVIDIA. The following test types are supported: e2e.
    * `all`: Runs the specified test type and all its sub-tests for all vendors.

3. `sub-test`: When a `test-type` has many tests, this field specifies which sub-test to run. Currently the following values are supported:
    * `all`: Runs all sub-tests of the specified vendor.

Note that *all fields are required* unless `all` is used in the phrase, in which case no field may follow the `all`. This means that the following phrases are supported when `all` is used:
1. `/test-all`
2. `/test-<test-type>-all`
3. `/test-<test-type>-<vendor>-all`

But the following are not:
1. `/test-all-<vendor>`
2. `/test-all-<vendor>-<sub-test>`
3. `/test-<test-type>-all-<sub-test>`

A `/skip-...` phrase reports a pass without running the test. Skip triggers follow the same convention as test triggers.
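
The rules above can be checked mechanically. The following bash sketch assumes only the values currently listed in this README (`e2e`, `nvidia`, and `all`). The function name `validate_trigger` is hypothetical and is not part of any vendor CI.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "valid" or "invalid" for a trigger phrase.
# Accepted forms (for both /test- and /skip-):
#   /test-all
#   /test-<test-type>-all
#   /test-<test-type>-<vendor>-all
validate_trigger() {
  local phrase="$1"
  local test_types='e2e'   # supported test types, separated by |
  local vendors='nvidia'   # supported vendors, separated by |
  local re="^/(test|skip)-(all|(${test_types})-(all|(${vendors})-all))\$"
  if [[ "$phrase" =~ $re ]]; then
    echo "valid"
  else
    echo "invalid"
  fi
}

# Example (commented out):
# validate_trigger "/test-e2e-nvidia-all"
# validate_trigger "/test-all-nvidia"
```

Because the pattern is anchored at both ends, a phrase with any field after `all` is rejected, which matches the unsupported examples listed above.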

7 changes: 7 additions & 0 deletions ci/admin-list.yaml
@@ -0,0 +1,7 @@
admin-list:
- adrianchiris
- pliurh
- zshi-redhat
#org-list:
# - <org>

5 changes: 5 additions & 0 deletions ci/examples/jenkins/README.md
@@ -0,0 +1,5 @@
## Jenkins CI examples
This folder holds examples for Jenkins CI.

### sriov-network-operator-ci.yaml
This file holds an example jenkins-job-builder configuration. The job is triggered on PRs by users on the admin list and runs the `hack/run-e2e-test-kind.sh` script with the `system-service` device netns switcher.
92 changes: 92 additions & 0 deletions ci/examples/jenkins/sriov-network-operator-ci.yaml
@@ -0,0 +1,92 @@
- project:
    name: sriov-network-operator-github-ci
    jobs:
      - 'sriov-network-operator-ci':
          project: sriov-network-operator
          disabled_var: false
          concurrent: false
          node: <node label>
          git-site: https://github.com
          git-root: k8snetworkplumbingwg
          git-project: sriov-network-operator

- job-template:
    name: 'sriov-network-operator-ci'
    node: '{node}'
    builders:
      - inject:
          properties-content: |
            KUBECONFIG=/etc/kubernetes/admin.conf
            INTERFACES_SWITCHER=system-service
      - run-e2e-test
    concurrent: false
    description: <!-- Managed by Jenkins Job Builder -->
    disabled: false
    project-type: freestyle
    properties:
      - build-discarder:
          artifact-days-to-keep: 60
          artifact-num-to-keep: 100
          days-to-keep: 60
          num-to-keep: 100
      - github:
          url: '{git-site}/{git-root}/{git-project}'
    scm:
      - git:
          branches: ["${{sha1}}"]
          credentials-id: '{credentials-id}'
          name: '{git-project}'
          refspec: +refs/pull/*:refs/remotes/origin/pr/*
          url: '{git-site}/{git-root}/{git-project}'
          wipe-workspace: true
    triggers:
      - github-pull-request:
          admin-list:
            - mellanox-ci
          allow-whitelist-orgs-as-admins: true
          org-list:
            - Mellanox
          auth-id: '{auth-id}'
          auto-close-on-fail: false
          build-desc-template: null
          cron: H/5 * * * *
          github-hooks: false
          only-trigger-phrase: true
          cancel-builds-on-update: true
          permit-all: false
          status-url: --none--
          success-status: "Build Passed"
          failure-status: "Build Failed, comment `/test-e2e-all`, `/test-e2e-nvidia-all`, or `/test-all` to retrigger"
          error-status: "Build Failed, comment `/test-e2e-all`, `/test-e2e-nvidia-all`, or `/test-all` to retrigger"
          status-context: '{project} CI'
          trigger-phrase: ".*/test-(all|e2e-all|e2e-nvidia-all(,| |$)).*"
          white-list:
            - '*'
          white-list-target-branches:
            - master
            - github
    wrappers:
      - timeout:
          timeout: 120
          fail: true
      - timestamps

- builder:
    name: run-e2e-test
    builders:
      - shell: |
          #!/bin/bash
          status=0
          ./hack/teardown-e2e-kind-cluster.sh
          sleep 5
          # This line is vendor specific, it should be changed according to hardware.
          mlnx_pci=$(lspci | grep Mellanox | grep -Ev 'MT27500|MT27520|Virt' | head -n 1 | awk '{print $1}')
          ./hack/run-e2e-test-kind.sh 0000:${mlnx_pci}
          let status=$status+$?
          ./hack/teardown-e2e-kind-cluster.sh
          sleep 5
          exit $status
50 changes: 47 additions & 3 deletions doc/testing-kind.md
@@ -1,14 +1,51 @@
## E2E test with KinD
Kubernetes IN Docker (KinD) is a tool for deploying Kubernetes inside Docker containers. It is used to test multi-node scenarios on a single bare-metal node.
To run the E2E tests inside a KinD cluster, use `./hack/run-e2e-test-kind.sh`. The script performs the following operations:

* Deploys a 2-node KinD cluster (master and worker)
* Moves the specified SR-IOV capable PCI net device to the KinD worker's network namespace
* Deploys the operator
* Runs E2E tests

There are two modes for moving the specified SR-IOV capable PCI net device to the KinD worker's network namespace:

* `test-suite` (default): The E2E test suite handles switching the PF and its VFs to the test namespace.
* `system-service`: A dedicated system service switches the PF and its VFs to the test namespace.

The mode can be selected using the `INTERFACES_SWITCHER` environment variable, or by passing the mode to the `./hack/run-e2e-test-kind.sh` script using the `--device-netns-switcher` flag.
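
The selection order can be sketched as follows: the mode defaults to `test-suite`, the `INTERFACES_SWITCHER` environment variable overrides the default, and the `--device-netns-switcher` flag overrides both. The function name `resolve_switcher_mode` is hypothetical. It mirrors the argument handling in `hack/run-e2e-test-kind.sh` but is not a copy of it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the switcher mode that would take effect for
# the given script arguments and the current environment.
resolve_switcher_mode() {
  # Environment variable first, falling back to the default mode.
  local mode="${INTERFACES_SWITCHER:-test-suite}"
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --device-netns-switcher)
        mode="${2:-}"   # the flag value wins over the environment
        shift
        [[ $# -gt 0 ]] && shift
        ;;
      *)
        shift
        ;;
    esac
  done
  echo "$mode"
}

# Example (commented out):
# INTERFACES_SWITCHER=test-suite resolve_switcher_mode --device-netns-switcher system-service
```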

### How to test
To execute E2E tests, a SR-IOV Physical Function device is required and will be added to a KinD workers network namespace.
To execute E2E tests, an SR-IOV Physical Function device is required and will be added to a KinD worker's network namespace. Depending on the device netns switcher mode, the testing steps can differ.

Note: Test device will remain in KinD worker node until cluster is terminated.

#### Device netns switcher mode `test-suite`
```
$ git clone https://github.com/k8snetworkplumbingwg/sriov-network-operator.git
$ cd sriov-network-operator/
$ source hack/get-e2e-kind-tools.sh
$ export TEST_PCI_DEVICE=0000:02:00.0
$ sudo ./hack/run-e2e-test-kind.sh $TEST_PCI_DEVICE
$ sudo ./hack/run-e2e-test-kind.sh --pf-pci-address $TEST_PCI_DEVICE
```

#### Device netns switcher mode `system-service`
The `system-service` mode uses a Linux system service to handle the interface switching. To prepare the service, run the following as root:
```
cp ./hack/vf-netns-switcher.sh /usr/bin/
cp ./hack/vf-switcher.service /etc/systemd/system/
systemctl daemon-reload
```
The service requires the `jq` tool to work properly.
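
A quick pre-flight check can confirm that the required tools are present before starting a test run. This is a minimal sketch. The helper name `require_cmds` is hypothetical and does not exist in the repository.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: report every missing command on stderr and
# return non-zero if at least one is missing.
require_cmds() {
  local cmd missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" > /dev/null 2>&1; then
      echo "missing required command: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (commented out): tools used by the system-service mode.
# require_cmds jq systemctl || exit 1
```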

To run the E2E tests do:
```
$ git clone https://github.com/k8snetworkplumbingwg/sriov-network-operator.git
$ cd sriov-network-operator/
$ source hack/get-e2e-kind-tools.sh
$ export KUBECONFIG=/etc/kubernetes/admin.conf
$ export INTERFACES_SWITCHER=system-service
$ ./hack/run-e2e-test-kind.sh --pf-pci-address <interface pci>
```
Note: Test device will remain in KinD worker node until cluster is terminated.

### How to repeat test using existing KinD cluster
Export test PCI device used to set up KinD cluster and export KinD worker network namespace path:
@@ -25,5 +62,12 @@ $ sudo make test-e2e-k8s
$ ./hack/teardown-e2e-kind-cluster.sh
```

#### Cleaning up the `system-service` service files
```
$ sudo rm -f /etc/systemd/system/vf-switcher.service
$ sudo rm -f /usr/bin/vf-netns-switcher.sh
$ sudo systemctl daemon-reload
```

### Known limitations / issues
* Webhooks are disabled by default when testing
67 changes: 57 additions & 10 deletions hack/run-e2e-test-kind.sh
@@ -4,11 +4,41 @@ here="$(dirname "$(readlink --canonicalize "${BASH_SOURCE[0]}")")"
root="$(readlink --canonicalize "$here/..")"
export SRIOV_NETWORK_OPERATOR_IMAGE="${SRIOV_NETWORK_OPERATOR_IMAGE:-sriov-network-operator:latest}"
export SRIOV_NETWORK_CONFIG_DAEMON_IMAGE="${SRIOV_NETWORK_CONFIG_DAEMON_IMAGE:-origin-sriov-network-config-daemon:latest}"
export KUBECONFIG="${KUBECONFIG:-${HOME}/.kube/config}"
INTERFACES_SWITCHER="${INTERFACES_SWITCHER:-"test-suite"}"
SUPPORTED_INTERFACE_SWTICHER_MODES=("test-suite" "system-service")
RETRY_MAX=10
INTERVAL=10
TIMEOUT=300
MULTUS_CNI_DS="https://raw.githubusercontent.com/intel/multus-cni/master/images/multus-daemonset.yml"
test_pf_pci_addr="$1"
MULTUS_CNI_DS="https://raw.githubusercontent.com/intel/multus-cni/master/deployments/multus-daemonset.yml"

while test $# -gt 0; do
  case "$1" in
    --device-netns-switcher)
      INTERFACES_SWITCHER="$2"
      if [[ ! "${SUPPORTED_INTERFACE_SWTICHER_MODES[@]}" =~ "${INTERFACES_SWITCHER}" ]]; then
        echo "Error: unsupported interface switching mode: ${INTERFACES_SWITCHER}!"
        echo "Supported modes are: ${SUPPORTED_INTERFACE_SWTICHER_MODES[@]}"
        exit 1
      fi
      shift
      shift
      ;;

    --pf-pci-address)
      test_pf_pci_addr="$2"
      shift
      shift
      ;;

    *)
      if [[ -z "$test_pf_pci_addr" ]];then
        test_pf_pci_addr=$1
      fi
      shift
      ;;
  esac
done

check_requirements() {
for cmd in docker kind kubectl ip; do
@@ -50,13 +80,14 @@ retry() {
echo "## checking requirements"
check_requirements
echo "## delete any existing cluster, deploy control & data plane cluster with KinD"
retry kind delete cluster && cat <<EOF | kind create cluster --config=-
retry kind delete cluster && cat <<EOF | kind create cluster --kubeconfig=${KUBECONFIG} --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
EOF
sudo chmod 644 ${KUBECONFIG}
echo "## build operator image"
retry docker build -t "${SRIOV_NETWORK_OPERATOR_IMAGE}" -f "${root}/Dockerfile" "${root}"
echo "## load operator image into KinD"
@@ -67,8 +98,6 @@ echo "## load daemon image into KinD"
kind load docker-image "${SRIOV_NETWORK_CONFIG_DAEMON_IMAGE}"
echo "## export kube config for utilising locally"
kind export kubeconfig
echo "## exporting KUBECONFIG environment variable to access KinD K8 API server"
export KUBECONFIG="${HOME}/.kube/config"
echo "## wait for coredns"
retry kubectl -n kube-system wait --for=condition=available deploy/coredns --timeout=${TIMEOUT}s
echo "## install multus"
@@ -85,14 +114,32 @@ echo "## label KinD's control-plane-node as sriov capable"
kubectl label node kind-worker feature.node.kubernetes.io/network-sriov.capable=true --overwrite
echo "## label KinD worker as worker"
kubectl label node kind-worker node-role.kubernetes.io/worker= --overwrite
echo "## retrieving netns path from container"
netns_path="$(docker inspect --format '{{ .NetworkSettings.SandboxKey }}' "${kind_container}")"
echo "## exporting test device '${test_pf_pci_addr}' and test netns path '${netns_path}'"
export TEST_PCI_DEVICE="${test_pf_pci_addr}"
export TEST_NETNS_PATH="${netns_path}"
if [[ "${INTERFACES_SWITCHER}" == "system-service" ]];then
  pf="$(ls /sys/bus/pci/devices/${test_pf_pci_addr}/net)"
  mkdir -p /etc/vf-switcher
  cat <<EOF > /etc/vf-switcher/vf-switcher.yaml
[
  {
    "netns": "${kind_container}",
    "pfs": [
      "${pf}"
    ]
  }
]
EOF
  sudo systemctl restart vf-switcher.service
else
  echo "## retrieving netns path from container"
  netns_path="$(sudo docker inspect --format '{{ .NetworkSettings.SandboxKey }}' "${kind_container}")"
  echo "## exporting test device '${test_pf_pci_addr}' and test netns path '${netns_path}'"
  export TEST_PCI_DEVICE="${test_pf_pci_addr}"
  export TEST_NETNS_PATH="${netns_path}"
fi
echo "## disabling webhooks"
export ENABLE_ADMISSION_CONTROLLER=false
echo "## deploying SRIOV Network Operator"
make --directory "${root}" deploy-setup-k8s
echo "## wait for sriov-network-config-daemon to be ready"
retry kubectl -n sriov-network-operator wait --for=condition=ready -l app=sriov-network-config-daemon pod --timeout=${TIMEOUT}s
echo "## Executing E2E tests"
make --directory "${root}" test-e2e-k8s
5 changes: 5 additions & 0 deletions hack/teardown-e2e-kind-cluster.sh
@@ -6,4 +6,9 @@ if ! command -v kind &> /dev/null; then
exit 1
fi

if systemctl is-active vf-switcher.service -q;then
  sudo systemctl stop vf-switcher.service
fi
sudo rm -f /etc/vf-switcher/vf-switcher.yaml
kind delete cluster
