Support Jenkins CI #163

Merged: 1 commit, Dec 1, 2021


abdallahyas
Contributor

This patch adds a Jenkinsfile example. It also modifies the E2E
kind script to support a new mode of operation: the bash service.
The modification was done to support running the scripts in a
limited-privilege shell.

@abdallahyas
Contributor Author

Another Jenkins job is configured to update the admin list. It needs to be run from a PR and can be triggered by commenting /update-admins.

@abdallahyas
Contributor Author

abdallahyas commented Jul 15, 2021

Currently the jobs are disabled; I tested them on my fork.

@abdallahyas
Contributor Author

@martinkennelly @adrianchiris can you guys PTAL, thanks.

@adrianchiris
Collaborator

Thanks @AbdYsn ! will review this soon

Collaborator

@adrianchiris left a comment

Thanks for submitting this !

I still need to give vf-netns-switcher.sh a deeper look. I've provided some comments/questions/nits so we can get the ball rolling :)

I'm thinking that after this we should split the kind cluster setup from the tests; this would allow reusing it for development environment setup as well.

hack/e2e.md: 5 resolved threads (outdated)
hack/vf-netns-switcher.sh: 1 resolved thread
hack/vf-netns-switcher.sh: 1 resolved thread (outdated)
done
sleep $TIMEOUT
done
return $status
Collaborator

will we ever reach this point ?

Contributor Author

no, removed it.

fi

variables_check
let status=$status+$?
Collaborator

Unless I missed something, status is first defined here. Why not use $? directly?

Contributor Author

you are correct, done out of habit, fixed it.
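
A minimal sketch of the simplification discussed above, assuming `variables_check` signals failure through its exit code. The check body and the `REQUIRED_SETTING` variable are placeholders, not taken from the script under review. The status is captured directly from `$?` instead of being accumulated with `let`:

```shell
#!/usr/bin/env bash

# Placeholder check: fails when REQUIRED_SETTING (a hypothetical name) is empty.
variables_check() {
    [ -n "${REQUIRED_SETTING:-}" ]
}

# Run the check and print its exit status, captured straight from $?.
check_status() {
    local status=0
    variables_check || status=$?
    echo "$status"
}
```

`check_status` prints 0 when the variable is set and 1 otherwise. A side benefit of avoiding `let`: it returns a non-zero exit code whenever its expression evaluates to 0, which can surprise scripts running under `set -e`.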

hack/vf-netns-switcher.sh: 1 resolved thread

There are two modes of moving the specified SR-IOV capable PCI net device to the KIND worker namespace:

* `test-suit` (default): In this mode, the E2E test suit handle the PF and its VFs switching to the test namespace.
Collaborator

Contributor Author

Done

ci/README.md Outdated
@@ -0,0 +1,5 @@
## Jenkins CI configurations and examples
This folder holds jenkins CI configurations and examples. The configurations currently are only limited to the admin-list. The admin list contains the list of github users and organizations that have permision to trigger the Jenkins CI, it is updated on the PR by commenting `/update-admins` on it by a current admin.
Collaborator

The configurations currently are only limited to the admin-list

What does this mean? Is this information on the examples below?

Can you structure this README so we have:

  1. General information about this folder
  2. Information on the admin list: what it is, who should use it, how it's updated
  3. Information on the examples
  4. Information on the format of the build triggers for e2e tests (as presented in the NPWG resource mgmt meeting)

Contributor Author

Done, can you have a look?

### How to test
#### `test-suit` SRIOV device switcher
Collaborator

nit: test-suite

Collaborator

Suggested reword : Device netns switcher mode test-suite

Contributor Author

Done

cp ./hack/vf-switcher.service /etc/systemd/system/
systemctl daemon-reload
```
For the service to work probably the `jq` tool is needed.
Collaborator

nit: probably -> properly

Contributor Author

Done
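
Since the service depends on `jq`, a setup script could fail early when it is missing. A small sketch, modeled on the `command -v kind` check used elsewhere in this PR; the `require_tool` helper name is an assumption for illustration:

```shell
#!/usr/bin/env bash

# Return non-zero with a message on stderr when a tool is not in PATH.
require_tool() {
    local tool="$1"
    if ! command -v "$tool" > /dev/null 2>&1; then
        echo "$tool is required but was not found in PATH" >&2
        return 1
    fi
}
```

A script would call it as `require_tool jq || exit 1` before installing the service unit.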

doc/testing-kind.md: 1 resolved thread

get_pci_from_net_name(){
local interface_name=$1
local worker_netns="${2:-$netns}"
Collaborator

why not just $2 ?

Contributor Author

Done
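
For readers less familiar with the expansion in question, here is a sketch of the difference between the two forms. The function names and the `default-netns` value are hypothetical. The reviewer suggested plain `$2`; the `:?` form below is a stricter illustration of "no fallback", not necessarily what was committed:

```shell
#!/usr/bin/env bash

netns="default-netns"   # hypothetical global used as the fallback

# Form from the diff: use the global when $2 is missing or empty.
netns_with_fallback() {
    local worker_netns="${2:-$netns}"
    echo "$worker_netns"
}

# No fallback: abort with a message when the caller omits $2.
netns_required() {
    local worker_netns="${2:?worker netns argument is required}"
    echo "$worker_netns"
}
```

With the fallback form, a forgotten argument silently picks up whatever the global holds; without it, the caller is forced to be explicit.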

hack/vf-netns-switcher.sh: 4 resolved threads (outdated)
@abdallahyas force-pushed the testing-ci branch 2 times, most recently from 6aada80 to ca56ec9, on October 26, 2021 06:57
export TEST_NETNS_PATH="${netns_path}"
if [[ "${INTERFACES_SWITCHER}" == "system-service" ]];then
pf="$(ls /sys/bus/pci/devices/${test_pf_pci_addr}/net)"
cat <<EOF > /etc/vf-switcher/vf-switcher.yaml
Collaborator

@adrianchiris Nov 3, 2021

need to make sure directory exists

Getting the following:

    ## make KinD's sysfs writable (required to create VFs)
    ## label KinD's control-plane-node as sriov capable
    node/kind-worker labeled
    ## label KinD worker as worker
    node/kind-worker labeled
    ./hack/run-e2e-test-kind.sh: line 112: /etc/vf-switcher/vf-switcher.yaml: No such file or directory

Contributor Author

Done
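
The failure above comes from redirecting a heredoc into a directory that does not exist yet. A minimal sketch of the usual remedy, creating the parent directory first. The `write_switcher_config` helper and the placeholder file body are assumptions; the real `vf-switcher.yaml` contents are not shown in this thread:

```shell
#!/usr/bin/env bash

# Write a config file, creating its parent directory first.
write_switcher_config() {
    local config_path="$1" pf="$2"
    mkdir -p "$(dirname "$config_path")"
    cat <<EOF > "$config_path"
# placeholder content for PF ${pf}; the real file format is not shown here
EOF
}
```

`mkdir -p` succeeds whether or not the directory already exists, so the call is safe to repeat on every run.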

#### Cleaning up the `system-service` service files
```
$ sudo rm -f /etc/systemd/system/vf-switcher.service
$ rm -f /etc/vf-switcher/vf-switcher.yaml
Collaborator

This is deleted in teardown-e2e-kind-cluster.sh, so I don't think you need it here.

Contributor Author

removed

@@ -4,11 +4,34 @@ here="$(dirname "$(readlink --canonicalize "${BASH_SOURCE[0]}")")"
root="$(readlink --canonicalize "$here/..")"
export SRIOV_NETWORK_OPERATOR_IMAGE="${SRIOV_NETWORK_OPERATOR_IMAGE:-sriov-network-operator:latest}"
export SRIOV_NETWORK_CONFIG_DAEMON_IMAGE="${SRIOV_NETWORK_CONFIG_DAEMON_IMAGE:-origin-sriov-network-config-daemon:latest}"
export KUBECONFIG="${KUBECONFIG:-${HOME}/.kube/config}"
INTERFACES_SWITCHER="${INTERFACES_SWITCHER:-"test-suit"}"
Collaborator

suit -> suite

Contributor Author

fixed

@@ -4,11 +4,34 @@ here="$(dirname "$(readlink --canonicalize "${BASH_SOURCE[0]}")")"
root="$(readlink --canonicalize "$here/..")"
export SRIOV_NETWORK_OPERATOR_IMAGE="${SRIOV_NETWORK_OPERATOR_IMAGE:-sriov-network-operator:latest}"
export SRIOV_NETWORK_CONFIG_DAEMON_IMAGE="${SRIOV_NETWORK_CONFIG_DAEMON_IMAGE:-origin-sriov-network-config-daemon:latest}"
export KUBECONFIG="${KUBECONFIG:-${HOME}/.kube/config}"
INTERFACES_SWITCHER="${INTERFACES_SWITCHER:-"test-suit"}"
SUPPORTED_INTERFACE_SWTICHER_MODES=("test-suit", "system-service")
Collaborator

suit -> suite

Contributor Author

fixed
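
A side note on the array literal in the diff above, beyond the spelling: Bash separates array elements by whitespace only, so in `("test-suit", "system-service")` the comma becomes part of the first element. A sketch of a mode check with a comma-free array; `SUPPORTED_MODES` and `is_supported_mode` are illustrative names, not the script's:

```shell
#!/usr/bin/env bash

# Elements are separated by whitespace only; no commas.
SUPPORTED_MODES=("test-suite" "system-service")

# Succeed when the given mode is one of the supported ones.
is_supported_mode() {
    local candidate="$1" mode
    for mode in "${SUPPORTED_MODES[@]}"; do
        [[ "$mode" == "$candidate" ]] && return 0
    done
    return 1
}
```

A script could then reject unknown values with `is_supported_mode "$INTERFACES_SWITCHER" || exit 1`.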

@abdallahyas force-pushed the testing-ci branch 3 times, most recently from ef2664b to a9a625a, on November 4, 2021 08:03
ci/README.md Outdated
@@ -0,0 +1,23 @@
## CI configurations and examples
This folder holds vendors CIes configurations and examples. Configurations are used to control the vendors CIes behaviors, and examples are used as a reference for other vendors to be able to setup a CI of their own.
Collaborator

first time seeing "CIes", can you just use CI throughout this file (or CIs)?

Contributor Author

done, changed it to Vendors CI

@abdallahyas force-pushed the testing-ci branch 2 times, most recently from b60185a to 10974b1, on November 4, 2021 12:03
* `sub-test`: In case there are many tests for the `test-type`, this field specify what sub-test to run.

In addition to the convention, all vendors CI should be triggered on the general phrase `/test-all`

Collaborator

@adrianchiris Nov 4, 2021

there are a couple of things that are unclear to me.

can you elaborate on the expected CI behaviour when <sub-test> is not provided in test/skip and when both <vendor> and <sub-test> are not provided in test/skip ?

also what is the expected behaviour with skip in general ?

Contributor Author

Nothing would happen. Essentially, a job is triggered by a specific trigger phrase; if you replace one section of the phrase with `all`, all the tests for that section are triggered.
To elaborate, consider a test called e2e-nvidia-cx5. To run it, you would comment one of the following:

  • /test-e2e-nvidia-cx5
  • /test-e2e-nvidia-all
  • /test-e2e-all
  • /test-all

Skip is handled by a separate job that is triggered by /skip phrases following the same convention as the /test phrases above. The job just exits 0, so the PR gets a green report under the same status-context as the test job.
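
The matching rule described here can be sketched as a small shell helper. This is only an illustration of the convention, assuming a job named `<test-type>-<vendor>-<sub-test>`; the function names are hypothetical and this is not the Jenkins trigger configuration itself:

```shell
#!/usr/bin/env bash

# Print every phrase that should trigger the job, most specific first.
# Arguments: action (test or skip), test-type, vendor, sub-test.
trigger_phrases() {
    local action="$1" test_type="$2" vendor="$3" sub_test="$4"
    printf '/%s-%s-%s-%s\n' "$action" "$test_type" "$vendor" "$sub_test"
    printf '/%s-%s-%s-all\n' "$action" "$test_type" "$vendor"
    printf '/%s-%s-all\n' "$action" "$test_type"
    printf '/%s-all\n' "$action"
}

# Succeed when a PR comment is exactly one of those phrases.
# Phrases contain no whitespace, so word splitting on the list is safe.
comment_triggers_job() {
    local comment="$1" phrase
    shift
    for phrase in $(trigger_phrases "$@"); do
        [ "$phrase" = "$comment" ] && return 0
    done
    return 1
}
```

For example, `comment_triggers_job "/test-e2e-all" test e2e nvidia cx5` succeeds, while a phrase naming a different vendor does not. The skip job would use the same matching with `skip` as the action and simply exit 0.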

Collaborator

Please see if my understanding is correct for the /test and /skip usage:

  1. comment /test-e2e-nvidia-all, it runs on all sub-tests (e.g. cx-5 and cx-6).
  2. assume cx-5 sub-test fails, the PR is blocked due to CI failure
  3. then comment /skip-e2e-nvidia-cx5, it updates the CI failure to green (success)
  4. the PR is unblocked due to the above skip comment

Collaborator

The sub-test is used to specify the device model, right?

Contributor Author

Not quite. The patch should be analyzed to determine whether it needs to be tested against the vendor CI. If it does, /test is used to trigger the job, and the merge is blocked until the CI is green (either by retesting or by uploading a new patchset). Ideally the CI would only fail because of a change in the patch, and logs should be provided so the failure can be debugged.
Only when it is determined that a test does not need to run against this patch should /skip be used to pass it.

Well, it is not just the device; it can be other things too. For example, openshift, kind, and baremetal could also be added as a sub-test. Note that another GitHub workflow will be added after this patch is merged to comment the available CI triggers on PR creation.

Collaborator

I see, thanks for the explanation!
I think we don't define openshift, kind or baremetal yet, right? They are just possible examples, and could well be a different type (platform, for example) in the test- convention.

Contributor Author

I think we don't define openshift, kind or baremetal yet, right?

right

could well be a different type (platform, for example) in the test- convention

yes

@@ -7,3 +7,8 @@ if ! command -v kind &> /dev/null; then
fi

kind delete cluster
if [[ "${INTERFACES_SWITCHER}" == "system-service" ]];then
Collaborator

This will not work if you use --device-netns-switcher

Contributor Author

done, can you have a look.
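
One way to address this kind of mismatch is to resolve the mode in a single place that honors both the `INTERFACES_SWITCHER` variable and the `--device-netns-switcher` flag. The sketch below is an assumption about how that could look, not the fix that was pushed; in particular, the accepted flag syntax (separate value or `=value`) and the `resolve_switcher_mode` name are guesses:

```shell
#!/usr/bin/env bash

# Print the switcher mode: the flag wins, then the environment, then the default.
resolve_switcher_mode() {
    local mode="${INTERFACES_SWITCHER:-test-suite}"
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --device-netns-switcher)
                # Guard against a missing value so shift 2 cannot fail.
                [ "$#" -ge 2 ] || { echo "$1 needs a value" >&2; return 1; }
                mode="$2"
                shift 2
                ;;
            --device-netns-switcher=*)
                mode="${1#*=}"
                shift
                ;;
            *)
                shift   # ignore arguments this sketch does not know about
                ;;
        esac
    done
    echo "$mode"
}
```

Both the setup and teardown scripts could then call `resolve_switcher_mode "$@"` and compare the result against `system-service`.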

@abdallahyas force-pushed the testing-ci branch 2 times, most recently from b22f5b4 to a3568f4, on November 11, 2021 07:59
@adrianchiris
Collaborator

testing mellanox CI:

/test-all

@adrianchiris
Collaborator

/test-all

@adrianchiris
Collaborator

/test-all

@adrianchiris
Collaborator

/skip-all

Collaborator

@adrianchiris left a comment

CI triggers work, and the Nvidia CI is running.

@zshi-redhat
Collaborator

/test-e2e-nvidia-cx5

@zshi-redhat
Collaborator

/test-e2e-nvidia-cx5

It seems the CI is not triggered, maybe I need to wait until this PR is merged so that admin list takes effect?

@zshi-redhat
Collaborator

Is there a way to check the jenkins log? I didn't see the Details link in the Nvidia e2e test CI (maybe it is because the test is marked as skipped now)?

@Eoghan1232
Collaborator

/test-all

@Eoghan1232
Collaborator

/test-all

I assume as I am not on Admin list, I cannot run Jenkins job?
@AbdYsn

@abdallahyas
Contributor Author

/test-e2e-nvidia-cx5

It seems the CI is not triggered, maybe I need to wait until this PR is merged so that admin list takes effect?

That is actually because there is no such trigger. Currently there is only a single test and I did not want to tie it to CX5 devices, so I did not add a sub-test to the job trigger; it can be triggered using /test-e2e-nvidia-all. Also, since the NVIDIA Jenkins instance cannot be accessed from outside the company, jobs are triggered by polling rather than by webhook, so there is a delay of at most 5 minutes before a trigger takes effect.

@abdallahyas
Contributor Author

Is there a way to check the jenkins log? I didn't see the Details link in the Nvidia e2e test CI (maybe it is because the test is marked as skipped now)?

my mistake, forgot to enable them in the CI configuration, will do that shortly

@abdallahyas
Contributor Author

abdallahyas commented Nov 30, 2021

/test-all

I assume as I am not on Admin list, I cannot run Jenkins job? @AbdYsn

Correct. If you want to be added to the admin-list, a patch should be proposed that adds you to it; for the change to take effect, a current admin should comment /update-admins on the proposed patch and then merge the commit. Note that a daily job reads the main branch admin-list and updates the job admin list accordingly, so the admin list is overridden once a day to align it with the main branch. The other way to update it is to contact the maintainer of the vendor CI and ask them to update it manually.

@Eoghan1232
Collaborator

Eoghan1232 commented Nov 30, 2021

/test-all

I assume as I am not on Admin list, I cannot run Jenkins job? @AbdYsn

Correct. If you want to be added to the admin-list, a patch should be proposed that adds you to it; for the change to take effect, a current admin should comment /update-admins on the proposed patch and then merge the commit. Note that a daily job reads the main branch admin-list and updates the job admin list accordingly, so the admin list is overridden once a day to align it with the main branch. The other way to update it is to contact the maintainer of the vendor CI and ask them to update it manually.

Ah, I see.
Just to clarify.
Steps:
PR -> adding user to the admins list. (admin-list.yaml)
Admin comments: '/update-admins' before it is merged.
Then merge after.

Collaborator

@Eoghan1232 left a comment

LGTM.
Thank you for working on this!

@zshi-redhat
Collaborator

/test-e2e-nvidia-all

@abdallahyas
Contributor Author

/test-e2e-nvidia-all

@abdallahyas
Contributor Author

/test-all

@abdallahyas
Contributor Author

/test-all

@abdallahyas
Contributor Author

/test-e2e-nvidia-all

@abdallahyas
Contributor Author

/test-e2e-nvidia-all

@abdallahyas
Contributor Author

/test-e2e-nvidia-all

@abdallahyas
Contributor Author

@zshi-redhat can you try triggering it now with any of the supported phrases?

@zshi-redhat
Collaborator

/test-e2e-nvidia-all

@abdallahyas
Contributor Author

abdallahyas commented Dec 1, 2021

/test-all

I assume as I am not on Admin list, I cannot run Jenkins job? @AbdYsn

Correct. If you want to be added to the admin-list, a patch should be proposed that adds you to it; for the change to take effect, a current admin should comment /update-admins on the proposed patch and then merge the commit. Note that a daily job reads the main branch admin-list and updates the job admin list accordingly, so the admin list is overridden once a day to align it with the main branch. The other way to update it is to contact the maintainer of the vendor CI and ask them to update it manually.

Ah, I see. Just to clarify. Steps: PR -> adding user to the admins list. (admin-list.yaml) Admin comments: '/update-admins' before it is merged. Then merge after.

@Eoghan1232 Yes, if you want to have the admin list updated immediately.

@abdallahyas
Contributor Author

Checking the logs, /test-all

@adrianchiris merged commit 5802d2f into k8snetworkplumbingwg:master on Dec 1, 2021
@adrianchiris
Collaborator

@AbdYsn BIG thanks for working on it ! 🚀

5 participants