General fixes #625

Merged
merged 1 commit into from
May 17, 2023


@r-tierney r-tierney commented Apr 27, 2023

This was tested on Debian bookworm with Kubernetes versions 1.26 and 1.27 and Calico v3.25.

UdpIdleTimeout has been deprecated:

Kubernetes has moved the registry to:

Calico requires a v before the version number; without it you get a 404. Example:

The kubelet flag --container-runtime remote has been deprecated, as the only possible value was remote; the kubelet now fails to start when the flag is passed:

Apr 27 15:04:22 kubec01-mel kubelet[3790]: Flag --container-runtime-endpoint has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-co>
Apr 27 15:04:22 kubec01-mel kubelet[3790]: Flag --pod-infra-container-image has been deprecated, will be removed in a future release. Image garbage collector will get sandbox image information from CRI.
Apr 27 15:04:22 kubec01-mel kubelet[3790]: E0427 15:04:22.062236    3790 run.go:74] "command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime"

The discovery token from kubetool didn't work (I found that I needed to change rsa to pkey). As the output from two different clusters shows, the command with rsa gives the same result on both:

kubec01-rya.ops:/home/users/ryant# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
kubec01-rya.ops:/home/users/ryant# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl pkey -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
adcc248bb5ab39eb750a85f941ccfc6bfd1eef133a5aca57989ccda0eacedbdd
kubec01-rya.ops:/home/users/ryant#
kubec01-san:/home/users/ryant# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
kubec01-san:/home/users/ryant# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl pkey -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
84dcf2e02d0c93f1cfb2fc7af1f7757dcd204c26a6ca286a620a2bd8dc6796c2
kubec01-san:/home/users/ryant#
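A possible reading of the identical e3b0c442... value on both clusters: it is the SHA-256 digest of empty input, which is what the pipeline hashes if the openssl rsa step writes nothing to stdout (its error output is discarded by 2>/dev/null). Why the rsa step produced no output on these hosts is not shown in the PR, so treat that part as an assumption; the sketch below only demonstrates the empty-input digest. The function name empty_sha256 is hypothetical.

```shell
# Hypothetical helper: prints the SHA-256 digest of zero bytes of input.
# If this matches the hash from the "openssl rsa" pipeline, that pipeline
# most likely hashed nothing at all rather than the CA public key.
empty_sha256() {
  printf '' | sha256sum | cut -d' ' -f1
}

empty_sha256
# prints e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
```

Any discovery hash equal to this value can be treated as a failed computation rather than a real CA fingerprint.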

I found that on Debian the kubelet would constantly crash, because the kubelet's default cgroupDriver on Debian is set to systemd:

ryant@kubec01-san:~$ cat /var/lib/kubelet/config.yaml | grep -i cgroup
cgroupDriver: systemd
ryant@kubec01-san:~$

whereas this module's default sets containerd's cgroup_driver to cgroupfs if it's not running on Red Hat (found in init.pp).
The fix for Debian (should this just be the default for both Debian and Red Hat now?), as recommended by the Kubernetes reference:

class { '::kubernetes':
    cgroup_driver => 'systemd',
}

The above change sets the following in containerd's config, which lets kubelet and containerd work together on Debian:

ryant@kubec01-san:~$ cat /etc/containerd/config.toml | grep -i 'plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options' -A 1
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true
ryant@kubec01-san:~$
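The two greps above can be combined into one check that the kubelet and containerd agree on the cgroup driver. This is a minimal sketch, not part of the module: the function names are hypothetical, and it assumes that a containerd config without SystemdCgroup = true means cgroupfs.

```shell
# Hypothetical helpers comparing the kubelet and containerd cgroup drivers.

# Prints the value of the top-level cgroupDriver key in a kubelet config.yaml.
kubelet_driver() {
  awk -F': *' '$1 == "cgroupDriver" { print $2; exit }' "$1"
}

# Prints "systemd" if the containerd config sets SystemdCgroup = true,
# otherwise "cgroupfs" (assumed default when the key is absent or false).
containerd_driver() {
  if grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$1"; then
    echo systemd
  else
    echo cgroupfs
  fi
}

# Reports whether both files name the same driver; returns 1 on mismatch.
check_cgroup_drivers() {
  k=$(kubelet_driver "$1")
  c=$(containerd_driver "$2")
  if [ "$k" = "$c" ]; then
    echo "ok: both use $k"
  else
    echo "mismatch: kubelet=$k containerd=$c"
    return 1
  fi
}
```

On a node it would be called as check_cgroup_drivers /var/lib/kubelet/config.yaml /etc/containerd/config.toml, using the two paths shown above.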

And lastly, Calico required the mount to be shared.
Error reported:

Apr 28 17:00:56 kubec01-san kubelet[11687]: E0428 17:00:56.926812   11687 remote_runtime.go:302] "CreateContainer in sandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to generate container \"866d1ce88b3fc72e4a455ec485dfee0e3ef7cdeb84bc8146dfb949d7c87ce267\" spec: failed to generate spec: path \"/sys/fs/\" is mounted on \"/sys\" but it is not a shared mount" podSandboxID="85df3c3dfa95f1878a4b328f93cbeb021a1df366fdc8214440cd64d44451a364"

The fix (this solution requires the Puppet module mount_core):

  # Calico requires the root mount to be shared
  mount { '/':
    ensure => mounted, name => '/', atboot => true,
    options => 'rshared', remounts => true, pass => 0,
  }
  ~> exec { '/usr/bin/mount --make-rshared /': refreshonly => true }
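To confirm that the remount took effect, the propagation flag can be read from /proc/self/mountinfo, where a shared mount carries a shared:N tag in the optional fields before the "-" separator. The helper below is a hypothetical sketch, not part of the module; its second argument (an alternative mountinfo file) exists only to make it easy to test.

```shell
# Hypothetical helper: succeeds (exit 0) if the given mount point is a shared mount.
# Usage: is_shared_mount MOUNTPOINT [MOUNTINFO_FILE]
is_shared_mount() {
  awk -v mp="$1" '
    # Field 5 is the mount point; optional fields start at field 7 and end at "-".
    $5 == mp {
      shared = 0
      for (i = 7; i <= NF && $i != "-"; i++)
        if ($i ~ /^shared:/) shared = 1
    }
    # The last matching entry wins, since later mounts sit on top of earlier ones.
    END { exit shared ? 0 : 1 }
  ' "${2:-/proc/self/mountinfo}"
}
```

For example, is_shared_mount / && echo shared should print shared once the fix is in place. findmnt -o TARGET,PROPAGATION / reports the same information on systems with a recent util-linux.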

Fixes #584

@r-tierney r-tierney requested a review from a team as a code owner April 27, 2023 06:16
@puppet-community-rangefinder

kubernetes is a class

Breaking changes to this file MAY impact these 5 modules (near match):

This module is declared in 0 of 580 indexed public Puppetfiles.


These results were generated with Rangefinder, a tool that helps predict the downstream impact of breaking changes to elements used in Puppet modules. You can run this on the command line to get a full report.

Exact matches are those that we can positively identify via namespace and the declaring modules' metadata. Non-namespaced items, such as Puppet 3.x functions, will always be reported as near matches only.

@CLAassistant

CLAassistant commented Apr 27, 2023

CLA assistant check
All committers have signed the CLA.

@r-tierney
Contributor Author

r-tierney commented May 15, 2023

Discovery token fixed in:
1871b5c

Using pkey instead of rsa

@jordanbreen28
Contributor

@r-tierney this is brilliant - can you rebase the PR?

@r-tierney
Contributor Author

Thanks @jordanbreen28, I've updated this branch from main

@jordanbreen28
Contributor

@r-tierney apologies... I missed the notification.
Can you rebase once again and clean up the merge commits? Then we can get this progressed.
Thanks

@r-tierney
Contributor Author

r-tierney commented May 17, 2023

@jordanbreen28 Sure thing, updating now

@r-tierney
Contributor Author

rebase complete

@jordanbreen28
Contributor

Nice one @r-tierney - I'll merge in once green!
thanks again for this massive effort.

@r-tierney
Contributor Author

r-tierney commented May 17, 2023

Not a problem at all, glad to help.

The issue that took the longest to troubleshoot was actually this module's default for cgroup_driver, on line 741 of init.pp. It was set to cgroupfs instead of systemd, which conflicts with the kubelet, whose default on Debian is systemd.

With the pods and the kubelet crashlooping, it took some time to work out that this was the issue: without the kubelet running, it's hard to run kubectl describe etc. to figure out why a pod is crashlooping.

I understand that changing the default for a setting like this may break those not running systemd, so I left it out of this pull request. I thought I'd mention it anyway and leave it to your team to decide whether to change it or add a note in the docs.

@jordanbreen28
Contributor

Yeah, removing cgroupfs as the default driver would need to be part of a major release, given the high possibility that it breaks things; we would also need to document it.
Systemd is now the recommended driver for both Debian- and RHEL-based distros, so this should probably be progressed in the next major release.

If you want to go ahead and create a separate PR for that, I will try to ensure it's included in the next major release (which should be in the next week or two due to Puppet 8).

Anyways, happy to merge this! 🥇

Development

Successfully merging this pull request may close these issues.

Calico install doesn't work