Check for dummy kernel module #7348

Merged
merged 1 commit into from
Mar 9, 2021

Conversation

Contributor

@lentzi90 lentzi90 commented Mar 5, 2021

What type of PR is this?

/kind feature

What this PR does / why we need it:
The dummy kernel module seems to be required, at least for node-local-dns, but nothing checks whether it is missing. In fact, I found that kubespray itself runs just fine without it, but the resulting cluster has weird network problems. For example, node-local-dns will crash-loop because it cannot create any dummy interfaces.

This adds a simple test to abort the installation if the module is not present.
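
For illustration only, here is a minimal Python sketch of what "checking for the module" can mean. It is not the task added by this PR (which lives in the kubespray Ansible roles and is not shown in this thread). It assumes the usual `/proc/modules` layout, where the first field of each line is the module name, and that `modinfo` from kmod exits non-zero for an unknown module. The function names and the sample text are made up for the example.

```python
from __future__ import annotations

import subprocess


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Return the module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def module_available(name: str) -> bool:
    """Return True if `modinfo NAME` succeeds (assumption: kmod is installed)."""
    try:
        result = subprocess.run(
            ["modinfo", name],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        # modinfo itself is missing; treat the module as not available.
        return False
    return result.returncode == 0


# Hypothetical sample of /proc/modules content, for demonstration only.
SAMPLE = """\
dummy 16384 0 - Live 0xffffffffc0a0b000
ip_tables 32768 2 iptable_filter,iptable_nat, Live 0xffffffffc09f0000
"""

if __name__ == "__main__":
    print("dummy" in loaded_modules(SAMPLE))
```

Note that a module can be available without being loaded, so a pre-flight check would more likely rely on something like `module_available("dummy")` than on the loaded list alone.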

Which issue(s) this PR fixes:

Fixes #7307

Special notes for your reviewer:
I'm not that well versed in kernel modules and networking, so there could well be better ways to do this that I just don't know about. Please let me know in that case and I'll try to address it.

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 5, 2021
@k8s-ci-robot k8s-ci-robot requested review from bozzo and EppO March 5, 2021 06:06
@k8s-ci-robot
Contributor

Hi @lentzi90. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 5, 2021
@lentzi90
Contributor Author

lentzi90 commented Mar 5, 2021

I see that the CI failed on the test I added. Does anyone know if the cluster that is set up in that test "actually works" or if it is just checking that "kubespray works"?

@oomichi
Contributor

oomichi commented Mar 5, 2021

I see that the CI failed on the test I added. Does anyone know if the cluster that is set up in that test "actually works" or if it is just checking that "kubespray works"?

On another PR, the job packet_ubuntu20-calico-aio passes (https://gitlab.com/kargo-ci/kubernetes-sigs-kubespray/-/jobs/1075009454). That job checks network connectivity between 2 pods with the ping command; see https://github.com/kubernetes-sigs/kubespray/blob/master/tests/testcases/030_check-network.yml#L134

/cc @oomichi

@k8s-ci-robot k8s-ci-robot requested a review from oomichi March 5, 2021 21:30
@floryut
Member

floryut commented Mar 8, 2021

The Ubuntu 20 kvm kernel doesn't have the dummy module, afaik:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1833375

@lentzi90
Contributor Author

lentzi90 commented Mar 8, 2021

Thanks for the comments!

OK, so the ping tests work without the dummy module. To me, the most obviously broken thing was node-local-dns. Is this used in the pipeline? I saw that there is a test for making sure pods are running, so if node-local-dns is used, I guess that test should catch it when it is broken. But I didn't manage to figure out which group_vars are used in the pipelines. 🤔
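
As a rough illustration of how a "pods are running" check could catch a crash-looping node-local-dns, here is a small Python sketch. It is not the kubespray test case itself. It assumes the JSON shape returned by `kubectl get pods -A -o json` (a pod list whose `status.containerStatuses[].state.waiting.reason` is `CrashLoopBackOff` for a crash-looping container). The pod names in the sample data are invented.

```python
import json


def crash_looping_pods(pod_list_json):
    """Return 'namespace/name' for each pod with a container in CrashLoopBackOff."""
    failing = []
    for pod in json.loads(pod_list_json).get("items", []):
        meta = pod.get("metadata", {})
        statuses = pod.get("status", {}).get("containerStatuses") or []
        for status in statuses:
            waiting = status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                failing.append(f"{meta.get('namespace')}/{meta.get('name')}")
                break  # one failing container is enough to report the pod
    return failing


# Invented sample data in the shape of a Kubernetes pod list.
SAMPLE_PODS = json.dumps({
    "items": [
        {
            "metadata": {"namespace": "kube-system", "name": "nodelocaldns-abcde"},
            "status": {"containerStatuses": [
                {"state": {"waiting": {"reason": "CrashLoopBackOff"}}}
            ]},
        },
        {
            "metadata": {"namespace": "kube-system", "name": "coredns-fghij"},
            "status": {"containerStatuses": [{"state": {"running": {}}}]},
        },
    ]
})

if __name__ == "__main__":
    print(crash_looping_pods(SAMPLE_PODS))  # ['kube-system/nodelocaldns-abcde']
```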

@floryut
Member

floryut commented Mar 8, 2021

@lentzi90
Contributor Author

lentzi90 commented Mar 8, 2021

Great thanks! Do you think it would make sense to check for the dummy module if nodelocaldns is used?

@champtar
Contributor

champtar commented Mar 8, 2021

Great thanks! Do you think it would make sense to check for the dummy module if nodelocaldns is used?

Yes
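
A sketch of the conditional logic agreed on here, in Python rather than Ansible and purely for illustration: the check only fails when nodelocaldns is enabled and the dummy module is unavailable. The setting name `enable_nodelocaldns` is assumed to match the kubespray variable; the function name, the error text, and the way available modules are passed in are hypothetical.

```python
def preflight_errors(settings, available_modules):
    """Return a list of pre-flight error messages (empty if everything is fine).

    settings: dict of deployment variables (assumed key: 'enable_nodelocaldns').
    available_modules: set of kernel module names known to be available.
    """
    errors = []
    needs_dummy = bool(settings.get("enable_nodelocaldns", False))
    if needs_dummy and "dummy" not in available_modules:
        errors.append(
            "kernel module 'dummy' is required when nodelocaldns is enabled"
        )
    return errors


if __name__ == "__main__":
    # With nodelocaldns disabled, a missing dummy module is not an error.
    print(preflight_errors({"enable_nodelocaldns": False}, set()))
    # With nodelocaldns enabled, it is.
    print(preflight_errors({"enable_nodelocaldns": True}, {"ip_tables"}))
```

Gating the check this way keeps installations working on kernels that ship without the dummy module, as long as they do not use nodelocaldns.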

The dummy module is needed for nodelocaldns.
@lentzi90 lentzi90 force-pushed the check-dummy-module branch from 9c0bb77 to 60f43aa Compare March 9, 2021 06:54
Member

@floryut floryut left a comment

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 9, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: floryut, lentzi90

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 9, 2021
Contributor

@oomichi oomichi left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 9, 2021
@k8s-ci-robot k8s-ci-robot merged commit 5a54db2 into kubernetes-sigs:master Mar 9, 2021
champtar pushed a commit to champtar/kubespray that referenced this pull request Mar 11, 2021
The dummy module is needed for nodelocaldns.

(cherry picked from commit 5a54db2)
k8s-ci-robot pushed a commit that referenced this pull request Mar 15, 2021
The dummy module is needed for nodelocaldns.

(cherry picked from commit 5a54db2)
LuckySB pushed a commit to southbridgeio/kubespray that referenced this pull request Apr 6, 2021
The dummy module is needed for nodelocaldns.
r3dsm0k3 added a commit to reynencourt/kubespray that referenced this pull request May 3, 2021
* Only use stat get_checksum: yes when needed (kubernetes-sigs#7270)

By default, the Ansible stat module computes the checksum, lists extended attributes and finds the mime type.
To find all stat invocations that really use one of those:
git grep -F stat. | grep -vE 'stat.(islnk|exists|lnk_source|writeable)'

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit de1d9df)

Conflicts:
	roles/etcd/tasks/check_certs.yml

* Add kube-ipvs0/nodelocaldns to NetworkManager unmanaged-devices (kubernetes-sigs#7315)

On CentOS 8 they seem to be ignored by default, but better to be extra safe.
This also makes it easy to exclude other network plugin interfaces

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit e442b1d)

* Stop using kubeadm to update server in kubeconfigs (kubernetes-sigs#7338)

Using `kubeadm init phase kubeconfig all` breaks kubelet client certificate rotation
as we are missing `kubeadm init phase kubelet-finalize all` to point to `kubelet-client-current.pem`

kubeconfig format is stable so let's just use lineinfile,
this will avoid other future breakage

This reverts to the logic before 6fe2248

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit c9c0c01)

* kubeadm-config.v1beta2.yaml.j2: etcd log level arg (kubernetes-sigs#7339)

According to [etcd's docs](https://etcd.io/docs/v3.4.0/op-guide/configuration/#--log-package-levels), argument 'log-package-levels' should not contain underscores.

(cherry picked from commit b7c2265)

* Remove pre kubeadm cert migration tasks

apiserver.pem is not used since ddffdb6

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit fedd671)

Conflicts:
	roles/kubernetes/master/tasks/kubeadm-cleanup-old-certs.yml
	roles/kubernetes/master/tasks/kubeadm-migrate-certs.yml

* Remove useless call to 'kubeadm version'

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit a6e1f5e)

* Remove admin.conf removal

kubeadm has been the default for a long time now,
and admin.conf is created by it, so let kubeadm handle it

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit 280036f)

* Remove rotate_tokens logic

kubeadm never rotates sa.key/sa.pub, so there is no need to delete tokens/restart pods

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit 8800b5c)

* Always backup both certs and kubeconfig

There is no reason not to back up during an upgrade

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit 53e5ef6)

Conflicts:
	roles/kubernetes/master/tasks/kubeadm-backup.yml
	roles/kubernetes/master/tasks/kubeadm-certificate.yml

* Delete misnamed kubeadm-version.yml

The important action in kubeadm-version.yml is the templating of the configuration,
not finding / setting the version

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit a9c97e5)

Conflicts:
	roles/kubernetes/master/tasks/kubeadm-version.yml

* Add privileged_without_host_devices support (kubernetes-sigs#7343)

When privileged is enabled for a container, all the `/dev/*` block
devices from the host are mounted into the guest. The
`privileged_without_host_devices` flag prevents host devices from
being passed to privileged containers.

More information:
* containerd/cri#1225
* cri-o/cri-o@1d0f681

(cherry picked from commit dc5df57)

* ansible and jinja2 updates (kubernetes-sigs#7357)

* Update ansible to v2.9.18

Signed-off-by: Maciej Wereski <[email protected]>

* Update jinja2 to v2.11.3

Signed-off-by: Maciej Wereski <[email protected]>
(cherry picked from commit b07c596)

* Fixup kubelet.conf to point to kubelet-client-current.pem (kubernetes-sigs#7347)

c9c0c01 only fixes the problem for new clusters

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit 14b63ed)

Conflicts:
	roles/kubernetes/master/tasks/kubelet-fix-client-cert-rotation.yml

* Check for dummy kernel module (kubernetes-sigs#7348)

The dummy module is needed for nodelocaldns.

(cherry picked from commit 5a54db2)

* Fixup one more missing kubespray-defaults (kubernetes-sigs#7375)

"The error was: 'proxy_disable_env' is undefined\n\nThe error appears to
be in '<censored>scale.yml': line 72, column 7"

Fixes 067db68

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit 057e8b4)

* Upgrade openSUSE Leap to 15.2 (kubernetes-sigs#7331)

15.1 reached EOL on 2021-02-02.

Signed-off-by: Maciej Wereski <[email protected]>
(cherry picked from commit 69d11da)

* Update kube-ovn to 1.6.0 (kubernetes-sigs#7240)

(cherry picked from commit edc4bb4)

* Minor update to cilium and calico

(cherry picked from commit de46f86)

* Update nodelocaldns to 1.17.1

(cherry picked from commit 5f2c8ac)

* Download Calico KDD CRDs (kubernetes-sigs#7372)

* Download Calico KDD CRDs

* Replace kustomize with lineinfile and use ansible assemble module

* Replace find+lineinfile by sed in shell module to avoid nested loop

* add condition on sed

* use block for kdd tasks + remove supernumerary kdd manifest apply in start "Start Calico resources"

(cherry picked from commit 1c62af0)

Conflicts:
        roles/network_plugin/calico/tasks/install.yml

* Update CNI (calico, kubeovn, multus) and Helm

(cherry picked from commit 05f132c)

* Fix calico crds missing 3.16.9 (kubernetes-sigs#7386)

(cherry picked from commit ead8a4e)

* Update hashes for 1.20.5/1.19.9/1.18.17

(cherry picked from commit 6d3dbb4)

* Set K8S default to v1.19.9

Signed-off-by: Etienne Champetier <[email protected]>

* Auto renew control plane certificates (kubernetes-sigs#7358)

While at it remove force_certificate_regeneration
This boolean only forced the renewal of the apiserver certs
Either manually use k8s-certs-renew.sh or set auto_renew_certificates

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit efa1803)

Conflicts:
	roles/kubernetes/master/templates/k8s-certs-renew.service.j2
	roles/kubernetes/master/templates/k8s-certs-renew.sh.j2
	roles/kubernetes/master/templates/k8s-certs-renew.timer.j2

* Add cryptography installation (kubernetes-sigs#7404)

To avoid ModuleNotFoundError due to no module named 'setuptools_rust',
this adds cryptography installation to requirements.txt.

Created by jfc-evs originally as kubernetes-sigs#7264

(cherry picked from commit 49abf60)

* Allow connecting to bastion via non-standard SSH port (kubernetes-sigs#7396)

* Allow connecting to bastion via non-standard port

* Fix bastion connection when ansible_port is not provided

(cherry picked from commit 6fa3565)

* Correct Jinja Syntax for etcd-unsupported-arch (kubernetes-sigs#6919)

`-%` causes `etcd-unsupported-arch: arm64` to print on COL 1 instead of
COL 6.

Signed-off-by: anthr76 <[email protected]>
(cherry picked from commit edfa3e9)

* Fix k8s-certs-renew for k8s < 1.20 (kubernetes-sigs#7410)

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit 2d1597b)

* Remove ignore_errors from drain tasks and enable retries (kubernetes-sigs#7151)

* Remove ignore_errors from drain tasks and enable retries

* Fix lint error by checking if stdout length is not 0, ie string is not empty.

(cherry picked from commit ccd3aee)

* Fix remove-node by removing jq usage (kubernetes-sigs#7405)

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit 36a3a78)

* Remove left over nodes_to_drain

Signed-off-by: Etienne Champetier <[email protected]>

* remove local lb privileged (kubernetes-sigs#7437) (kubernetes-sigs#7454)

Co-authored-by: Samuel Liu <[email protected]>

* Add new kubernetes hashes (1.19.10, 1.20.6)

* Default to latest kubernetes patch version (1.19.10)

* Update k8s-certs-renew.sh.j2 (kubernetes-sigs#7422)

fix undefinedElse

(cherry picked from commit cce9d31)

* reset roles need flush iptables:raw (kubernetes-sigs#7426)

(cherry picked from commit 7f52c1d)

* Remove calico-rr from local inventory hosts file (kubernetes-sigs#7439)

(cherry picked from commit 596d028)

Conflicts:
	inventory/local/hosts.ini

* Replace deprecated 'with_dict' with 'loop' (kubernetes-sigs#7442)

(cherry picked from commit 6479e26)

* local provisioner 'useNodeNameOnly' option can be configured (kubernetes-sigs#7421)

(cherry picked from commit 7e75d48)

* fix scale (kubernetes-sigs#7449)

(cherry picked from commit 7340a16)

* remove-node roles: fix kubectl absolute path (kubernetes-sigs#7469)

* kubelet absolute path

* kubelet absolute path

(cherry picked from commit e2a7f3e)

* add CI test for auto_renew_certificates (kubernetes-sigs#7472)

* add CI test for auto_renew_certificates

* change timer value

fix typo error in rotate cert script

(cherry picked from commit cce0940)

Conflicts:
	roles/kubernetes/master/templates/k8s-certs-renew.timer.j2

* Remove dead code from kubeadm-etcd (kubernetes-sigs#7470)

(cherry picked from commit aa086e5)

* format ansible output (kubernetes-sigs#7482)

(cherry picked from commit 90c643f)

* Regenerate apiserver.crt on all control-plane nodes (kubernetes-sigs#7463)

We were regenerating only the cert of the first node
While at it speed up the check step

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit e444b3c)

Conflicts:
	roles/kubernetes/master/tasks/kubeadm-setup.yml

* Add auto_renew_certificates_systemd_calendar (kubernetes-sigs#7490)

This allows configuring when the K8S certificate renewal runs

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit bf6a39e)

Conflicts:
        inventory/sample/group_vars/k8s-cluster/k8s-cluster.yml
        roles/kubernetes/master/defaults/main/main.yml
        roles/kubernetes/master/templates/k8s-certs-renew.timer.j2

* Check if python netaddr and recent enough jinja are installed (kubernetes-sigs#7486)

CentOS 7 provides an up-to-date Ansible with a really old jinja version

Signed-off-by: Etienne Champetier <[email protected]>
(cherry picked from commit 332cc1c)

* Add missing proxy environment in crio_repo.yml (kubernetes-sigs#7492)

(cherry picked from commit 2a2fb68)

Co-authored-by: Etienne Champetier <[email protected]>
Co-authored-by: Du9L.com <[email protected]>
Co-authored-by: Victor Morales <[email protected]>
Co-authored-by: Maciej <[email protected]>
Co-authored-by: Lennart Jern <[email protected]>
Co-authored-by: Florian Ruynat <[email protected]>
Co-authored-by: Erwan Miran <[email protected]>
Co-authored-by: Kenichi Omichi <[email protected]>
Co-authored-by: Kaleb Elwert <[email protected]>
Co-authored-by: Anthony Rabbito <[email protected]>
Co-authored-by: David Louks <[email protected]>
Co-authored-by: bleech1 <[email protected]>
Co-authored-by: Samuel Liu <[email protected]>
Co-authored-by: Fredrik Liv <[email protected]>
Co-authored-by: Helmut Januschka <[email protected]>
Co-authored-by: Maxime Lavandier <[email protected]>
Co-authored-by: orange-llajeanne <[email protected]>
Co-authored-by: Sergey <[email protected]>
Co-authored-by: Krystian Młynek <[email protected]>
@floryut floryut mentioned this pull request May 11, 2021
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Check for dummy kernel module
5 participants