3.27.2 mount-bpffs init container fails to load libpcap on Rockylinux9.3/Arm64 #8542
Comments
When you try 3.27.0, does the eBPF dataplane come up correctly, or is it only that the init containers do not fail? There was definitely a regression in 3.27.0 where eBPF was not built correctly for Arm. That was fixed in #8470, but the fix may not completely bring it back. That said, 3.27.0 may just be a false positive. Could you share calico-node logs from 3.27.0 for verification? Would you be able to provide more logs from the failed 3.27.2 init container?
Okay, I've collected some of the logs, so hopefully that will be helpful.
I tried v3.27.1 later and got the same results as v3.27.2.
First of all, you did not enable the BPF dataplane, right? The logs show that BPFEnabled is false. But it seems like 3.27.0 is not quite healthy either:
Is Let me check why 3.27.2 is trying to run mount-bpffs when BPF is disabled 🤔
Yes, I didn't enable the BPF dataplane because I didn't find the relevant configuration item in my previous deployment method, but the cluster's network is able to forward data traffic normally.
Pod status:
[root@k8s-master ~]# kubectl get svc
Traffic test:
E.g.:
[root@k8s-master ~]# curl http://$(hostname -i):$(kubectl get svc nginx -o jsonpath={.spec.ports..nodePort})
E.g.:
But I'm sure the /var/run/calico directory is readable and writable. This is the result of viewing it with the command:
[root@k8s-master ~]# ls /var/run/calico/
[root@k8s-master ~]# ls /var/run/calico/ -l
And SELinux is disabled:
Could you provide us with logs for the failing
Sorry, I found it.
[root@k8s-master docker.io]# crictl logs 5a1
The libpcap issue is related to #8541.
Okay, it looks like it's a problem when building the image, and this is the library used by RockyLinux 9.3:
[root@k8s-slave1 lib64]# ls | grep libpca
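To compare what the host ships against what the container expects, it can help to list every libpcap shared object together with its symlink target. The following is only a sketch: `list_libpcap` is a hypothetical helper name, and the default directory `/usr/lib64` is an assumption taken from the `lib64` prompt above.

```shell
# Hypothetical helper (not part of Calico): list libpcap shared objects in a
# directory and show where each symlink points.
# $1: directory to inspect; defaults to /usr/lib64 (an assumption).
list_libpcap() {
    dir="${1:-/usr/lib64}"
    for f in "$dir"/libpcap.so*; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -e "$f" ] || [ -L "$f" ] || continue
        if [ -L "$f" ]; then
            printf '%s -> %s\n' "${f##*/}" "$(readlink "$f")"
        else
            printf '%s\n' "${f##*/}"
        fi
    done
}
```

Running `list_libpcap` with no argument inspects the assumed default directory; pass another path if the libraries live elsewhere on your distribution.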
Guys, I found a temporary workaround: you can install an old version of the libpcap package via yum and then modify calico.yaml to get it running.
Note that this step replaces the libpcap.so.1.9.1 library on the system:
yum install -y https://dl.rockylinux.org/pub/rocky/8/Devel/aarch64/os/Packages/l/libpcap-1.9.1-5.el8.aarch64.rpm
Modify calico.yaml
Finally, it works.
Maybe it's the best solution.
It also runs on v3.27.1.
@NorthSkybk thank you for sharing your workaround and reporting this issue. We will try to come up with a proper fix for it.
When I install the Calico network plugin after initializing the Kubernetes cluster, the following occurs:
[root@k8s-master docker.io]# kubectl create -f calico.yaml
......
[root@k8s-master docker.io]# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-68cdf756d9-r75jj 0/1 Pending 0 3s
kube-system calico-node-25ctj 0/1 Init:Error 0 3s
kube-system calico-node-7t8qv 0/1 Init:2/3 0 3s
kube-system calico-node-qrnsr 0/1 Init:1/3 0 3s
kube-system coredns-857d9ff4c9-j6jmj 0/1 Pending 0 105s
kube-system coredns-857d9ff4c9-nh8tf 0/1 Pending 0 98s
......
You can see that the pod state switches to Init:Error within a very short time.
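On a busy cluster the pods stuck in an init phase can be picked out of the listing with a small filter. This is a sketch only: `init_stuck_pods` is a hypothetical helper, and it assumes the six-column layout that `kubectl get pods -A` prints above, where STATUS is the fourth column.

```shell
# Hypothetical helper: read `kubectl get pods -A` output on stdin and print
# "namespace/name status" for every pod whose STATUS starts with "Init:".
# Assumes the column order NAMESPACE NAME READY STATUS RESTARTS AGE.
init_stuck_pods() {
    awk '$4 ~ /^Init:/ { print $1 "/" $2, $4 }'
}

# Example use against a live cluster (commented out so the block has no
# side effects):
# kubectl get pods -A | init_stuck_pods
```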
Describing the pod, I found the key event:
[root@k8s-master docker.io]# kubectl describe -n kube-system pods calico-node-25ctj
Events:
Type Reason Age From Message
Warning BackOff 4m59s (x24 over 9m58s) kubelet Back-off restarting failed container mount-bpffs in pod calico-node-25ctj_kube-system(ec997881-48b9-4bc0-9203-d25ef3171052)
......
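The event names the failing init container, mount-bpffs, so its own output can be requested directly with the `-c` flag of `kubectl logs`. The wrapper below is a sketch: `fetch_container_logs` and the `KUBECTL` override variable are hypothetical names introduced here for illustration, and the pod name is the one from the event above.

```shell
# Hypothetical wrapper: fetch logs for one container of a pod.
# $1 namespace, $2 pod, $3 container; any further arguments are passed
# through to `kubectl logs` (for example --previous).
# KUBECTL is an assumed override for the kubectl binary, useful for dry runs.
fetch_container_logs() {
    ns="$1"; pod="$2"; container="$3"
    shift 3
    "${KUBECTL:-kubectl}" logs -n "$ns" "$pod" -c "$container" "$@"
}

# Example use (commented out so the block has no side effects):
# fetch_container_logs kube-system calico-node-25ctj mount-bpffs --previous
```

Adding `--previous` asks for the last terminated instance of the container, which may be the more useful one while the container is in back-off.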
When I looked at the system logs, I found one error that appeared frequently:
[root@k8s-master docker.io]# cat /var/log/messages | grep "qrnsr"
......
Feb 22 01:48:00 localhost kubelet[15596]: E0222 01:48:00.536276 15596 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to "StartContainer" for "mount-bpffs" with CrashLoopBackOff: "back-off 5m0s restarting failed container=mount-bpffs pod=calico-node-qrnsr_kube-system(5b0c855d-482c-4cc0-98f7-da9ae03070c1)"" pod="kube-system/calico-node-qrnsr" podUID="5b0c855d-482c-4cc0-98f7-da9ae03070c1"
......
I used the mount -l command to confirm that my system has bpffs mounted.
[root@k8s-master docker.io]# mount -l | grep bpf
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
......
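The same check can be made scriptable, for instance to run it on every node. This is a sketch under stated assumptions: `bpffs_mounted_at` is a hypothetical helper, and it assumes the `<source> on <target> type <fstype> (<options>)` line format shown in the mount output above.

```shell
# Hypothetical helper: read `mount -l` style output on stdin and succeed only
# if a filesystem of type bpf is mounted on the given path.
# $1: mount point to look for; defaults to /sys/fs/bpf.
bpffs_mounted_at() {
    awk -v target="${1:-/sys/fs/bpf}" '
        $2 == "on" && $3 == target && $4 == "type" && $5 == "bpf" { found = 1 }
        END { exit !found }
    '
}

# Example use (commented out so the block has no side effects):
# mount -l | bpffs_mounted_at && echo "bpffs is mounted"
```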
I've tried many things, but nothing works.
Then I tried switching the calico version to v3.27.0 and it worked!
[root@k8s-master docker.io]# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5fc7d6cf67-lcczd 0/1 Pending 0 53s
kube-system calico-node-ptnz8 1/1 Running 0 53s
kube-system calico-node-sthqf 1/1 Running 0 53s
kube-system calico-node-vhwxn 1/1 Running 0 53s
[root@k8s-master docker.io]# cat calico.yaml | grep image
image: docker.io/calico/cni:v3.27.0
image: docker.io/calico/node:v3.27.0
image: docker.io/calico/kube-controllers:v3.27.0
......
I'm puzzled by this. Is it an operating system problem, or is it the kernel version? I hope the maintainers can answer my question.
Possible Solution
Rolling back Calico to v3.27.0
My Environment