[Docs] Calico fails to start in latest k3s versions #1375

Open
alpeb opened this issue Nov 14, 2023 · 3 comments
Labels
docs Documentation
Milestone

Comments

@alpeb

alpeb commented Nov 14, 2023

What did you do

  • How was the cluster created?
k3d cluster create --k3s-arg '--disable=local-storage,metrics-server@server:0' --no-lb \
  --k3s-arg --write-kubeconfig-mode=644 --k3s-arg --flannel-backend=none \
  --k3s-arg --cluster-cidr=192.168.0.0/16 --k3s-arg '--disable=servicelb,traefik@server:0' \
  --image +v1.27
  • What did you do afterwards?
    I applied the Calico manifest as instructed in the k3d docs:
kubectl apply -f https://k3d.io/v5.6.0/usage/advanced/calico.yaml

What did you expect to happen

I expected the Calico workloads to come up without errors.

Screenshots or terminal output

Instead, after about a minute the calico-node pod starts failing. Its log fills with the following entries, repeated over and over:

2023-11-14 22:38:15.309 [INFO][2337] felix/ipsets.go 356: Finished resync family="inet" numInconsistenciesFound=0 resyncDuration=792.897µs
2023-11-14 22:38:15.309 [WARNING][2337] felix/ipsets.go 309: Failed to resync with dataplane error=exit status 1 family="inet"
2023-11-14 22:38:15.565 [INFO][2337] felix/ipsets.go 301: Retrying after an ipsets update failure... family="inet"
2023-11-14 22:38:15.565 [INFO][2337] felix/ipsets.go 306: Resyncing ipsets with dataplane. family="inet"
2023-11-14 22:38:15.566 [ERROR][2337] felix/ipsets.go 561: Bad return code from 'ipset list'. error=exit status 1 family="inet" stderr="ipset v7.1: Kernel and userspace incompatible: settype hash:ip with revision 5 not supported by userspa
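
The "Kernel and userspace incompatible" message in the last log line is the distinctive symptom here. A minimal sketch for checking whether a given log contains it follows; the helper name `has_ipset_mismatch` is hypothetical and not part of k3d or Calico.

```shell
# Hypothetical helper: reads log text on stdin and exits 0 when it contains
# the ipset kernel/userspace incompatibility message shown in the log above.
has_ipset_mismatch() {
  grep -q 'Kernel and userspace incompatible'
}
```

It could be fed from the pod logs, for example `kubectl logs -n kube-system -l k8s-app=calico-node --tail=200 | has_ipset_mismatch && echo "ipset mismatch detected"`. The namespace and label selector are assumptions based on the usual Calico manifest layout and may differ in your install.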

Which OS & Architecture

$ k3d runtime-info
arch: x86_64
cgroupdriver: systemd
cgroupversion: "2"
endpoint: /var/run/docker.sock
filesystem: extfs
infoname: riemann
name: docker
os: NixOS 23.05 (Stoat)
ostype: linux
version: 24.0.5

Which version of k3d

k3d version v5.6.0
k3s version v1.27.4-k3s1 (default)

Which version of docker

$ docker version
Client:
 Version:           24.0.5
 API version:       1.43
 Go version:        go1.20.8
 Git commit:        v24.0.5
 Built:             Thu Jan  1 00:00:00 1970
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Engine:
  Version:          24.0.5
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.8
  Git commit:       v24.0.5
  Built:            Tue Jan  1 00:00:00 1980
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.7.7
  GitCommit:        v1.7.7
 runc:
  Version:          1.1.8
  GitCommit:
 docker-init:
  Version:          0.19.0
  GitCommit:

$ docker info
Client:
 Version:    24.0.5
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.2
    Path:     /nix/store/alx8f3z9mm870ak397j1wyrh2m9smj6b-docker-plugins/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  2.21.0
    Path:     /nix/store/alx8f3z9mm870ak397j1wyrh2m9smj6b-docker-plugins/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 5
  Running: 1
  Paused: 0
  Stopped: 4
 Images: 840
 Server Version: 24.0.5
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: journald
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: v1.7.7
 runc version:
 init version:
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.1.58
 Operating System: NixOS 23.05 (Stoat)
 OSType: linux
 Architecture: x86_64
 CPUs: 24
 Total Memory: 15.31GiB
 Name: riemann
 ID: 75fe5c20-9c24-49c3-8cec-db7a65167d45
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: true
alpeb added the "bug" (Something isn't working) label Nov 14, 2023
alpeb changed the title from "Calico fails to start in latest k3d versions" to "Calico fails to start in latest k3s versions" Nov 14, 2023
@alpeb
Author

alpeb commented Nov 15, 2023

Further testing revealed this started to happen with k8s v1.27.7-k3s1.

alpeb added a commit to linkerd/linkerd2 that referenced this issue Nov 15, 2023
Fixes #11567

The trick is to run the test under k8s `v1.27.6-k3s1` as the following
versions break Calico in k3s (see k3d-io/k3d#1375).

Also removed the `continue-on-error: true` directive in the integration
workflow because it was hiding this problem.
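
The workaround described in this commit message amounts to pinning the k3s image when creating the cluster. A rough sketch is below, assuming `rancher/k3s` is the image repository k3d pulls from and reusing a reduced set of the flags from the original report. The function name `print_pinned_create` is hypothetical; it only prints the command so it can be reviewed before running.

```shell
# Assumption: rancher/k3s is the image repository; the tag is the last
# version reported as working in this thread.
K3S_IMAGE="rancher/k3s:v1.27.6-k3s1"

# Print the pinned cluster-creation command instead of executing it.
print_pinned_create() {
  printf '%s\n' "k3d cluster create --no-lb --k3s-arg --flannel-backend=none --k3s-arg --cluster-cidr=192.168.0.0/16 --image $K3S_IMAGE"
}

print_pinned_create
```

Once the printed command looks right, it can be run with `print_pinned_create | sh`.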
@iwilltry42
Member

Hi @alpeb, thanks for opening this issue and following up!
It seems the underlying problem is not new in Calico: it looks like an incompatibility between the K3s image version and the Calico version.
Please note that the docs link to a fairly old Calico manifest, v3.15.0 (Calico is at v3.26.3 right now), so we should probably remove the direct manifest link.
I quickly tried installing the latest Calico release, but that hits issue projectcalico/calico#8025.

iwilltry42 changed the title from "Calico fails to start in latest k3s versions" to "[Docs] Calico fails to start in latest k3s versions" Nov 16, 2023
iwilltry42 added the "docs" (Documentation) label and removed the "bug" (Something isn't working) label Nov 16, 2023
iwilltry42 added this to the v5.7.0 milestone Nov 16, 2023
alpeb added a commit to linkerd/linkerd2 that referenced this issue Nov 17, 2023
Fixes #11567

The trick is to run the test under k8s `v1.27.6-k3s1` as the following
versions break Calico in k3s (see k3d-io/k3d#1375).

Also removed the `continue-on-error: true` directive in the integration
workflow because it was hiding this problem.
alpeb added a commit to linkerd/linkerd2 that referenced this issue Nov 20, 2023
* Reenable cni-calico-deep integration test

Fixes #11567

The trick is to run the test under k8s `v1.27.6-k3s1` as the following
versions break Calico in k3s (see k3d-io/k3d#1375).

Also removed the `continue-on-error: true` directive in the integration
workflow because it was hiding this problem.
alpeb added a commit to linkerd/linkerd2-proxy-init that referenced this issue Apr 4, 2024
Fixes linkerd/linkerd2#11597

When the cni plugin is triggered, it validates that the proxy has been
injected into the pod before setting up the iptables rules. It does so
by looking for the "linkerd-proxy" container. However, when the proxy is
injected as a native sidecar, it gets added as an _init_ container, so
it was being disregarded here.

We don't have integration tests for validating native sidecars when
using linkerd-cni because [Calico doesn't work in k3s since k8s
1.27](k3d-io/k3d#1375), and we require k8s
1.29 for using native sidecars.
I did nevertheless successfully test this fix in an AKS cluster.
mateiidavid pushed a commit to linkerd/linkerd2-proxy-init that referenced this issue Apr 16, 2024
Fixes linkerd/linkerd2#11597

When the cni plugin is triggered, it validates that the proxy has been
injected into the pod before setting up the iptables rules. It does so
by looking for the "linkerd-proxy" container. However, when the proxy is
injected as a native sidecar, it gets added as an _init_ container, so
it was being disregarded here.

We don't have integration tests for validating native sidecars when
using linkerd-cni because [Calico doesn't work in k3s since k8s
1.27](k3d-io/k3d#1375), and we require k8s
1.29 for using native sidecars.
I did nevertheless successfully test this fix in an AKS cluster.
iwilltry42 changed the milestone from v5.7.0 to v5.8.0 Jul 10, 2024
@frozenprocess
Contributor

@iwilltry42 for native nftables support you must run Calico v3.29+ or set the Felix backend to nftables.
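
As a rough sketch of the Felix setting mentioned above, the following assumes that Calico's `FelixConfiguration` resource exposes an `nftablesMode` field accepting `Enabled`. Treat that field name and value as assumptions to verify against the documentation for your Calico version. The helper name `felix_nftables_patch` is hypothetical.

```shell
# Hypothetical helper: prints a JSON merge patch that would switch Felix
# to nftables mode. The field name and value are assumptions; check the
# Calico docs for your version before applying.
felix_nftables_patch() {
  printf '%s\n' '{"spec":{"nftablesMode":"Enabled"}}'
}

felix_nftables_patch
```

The patch could then be applied with something like `kubectl patch felixconfiguration default --type=merge -p "$(felix_nftables_patch)"`, assuming the default FelixConfiguration resource exists in the cluster.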
