Not able to communicate from pods on node-1 to pods on node-2 #9601
Comments
OK, it looks like some bug in the network playbook! Please fix the network bug in the playbook; currently it is useless, at least on the latest Debian 11. |
Thank you for submitting this issue. |
@oomichi See, I have just now pulled the latest changes from the git repo and installed a fresh new Kubernetes cluster with 3 VMs: 1 master and 2 worker nodes. Here is the output on the freshly installed master: root@master0:~# kubectl run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools Routing table on node-1: Here is the routing from the cluster where I installed Calico manually: node-1 On the nodes with Ansible-installed Calico, the routing table is wrong! On the working nodes without Ansible Calico: You can see the problem now, right? The Ansible playbook is doing something incorrectly. |
@kerryeon I see you also worked on the Calico Ansible task. Can you check the above issue? |
Hello, could you attach your |
@kerryeon attached, please check. I'm surprised that no one else has noticed this bug. |
I've got the same issue on Ubuntu 22.04. Do we have any workarounds? |
@marekk1717 my workaround is to select cni instead of calico as the network plugin and then install Calico manually from the YAML manifest: https://raw.githubusercontent.com/projectcalico/calico/v3.24.5/manifests/calico.yaml |
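A minimal sketch of that workaround for anyone following along. The `kube_network_plugin` variable is a standard kubespray option; the exact sequence of steps is an assumption, not the commenter's verbatim procedure:

```sh
# Assumption: sketch of the workaround above, not the commenter's exact steps.
# 1. In inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml, tell
#    kubespray to skip installing a network plugin:
#      kube_network_plugin: cni
# 2. Deploy the cluster without a CNI:
ansible-playbook -i inventory/mycluster/inventory.ini --become cluster.yml
# 3. Install Calico manually from the upstream manifest:
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.24.5/manifests/calico.yaml
```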
Thx rufy2022. |
I tried the same and also tried replacing it with Flannel, but the networking behaved the same way. The Ansible playbook is doing some additional network setup, which is why it has the wrong route. |
It must be something wrong with Debian/Ubuntu. I switched to Rocky Linux 8.7 and it works with the same config as on Ubuntu. |
fyi. @kerryeon @oomichi please fix it ;) |
FYI
```diff
 [all]
-k8s-master-1.example.com ansible_host=10.0.10.21
-k8s-master-2.example.com ansible_host=10.0.10.22
-k8s-master-3.example.com ansible_host=10.0.10.23
-k8s-node-1.example.com ansible_host=10.0.10.31
-k8s-node-2.example.com ansible_host=10.0.10.32
+k8s-master-1 ansible_host=10.0.10.21
+k8s-master-2 ansible_host=10.0.10.22
+k8s-master-3 ansible_host=10.0.10.23
+k8s-node-1 ansible_host=10.0.10.31
+k8s-node-2 ansible_host=10.0.10.32
...
```
|
I spoke too soon. |
This looks like an issue in the vxlan component of Debian ... try redeploying Kubernetes via kubespray and setting the Calico variables: these variables are set in the inventory group_vars/k8s_cluster/k8s-net-calico.yml file. |
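The specific variable values were dropped from the comment above. As a hedged illustration, the commonly cited change in group_vars/k8s_cluster/k8s-net-calico.yml for this VXLAN problem is switching Calico from VXLAN to IP-in-IP encapsulation; the variable names are real kubespray options, but treating these exact values as the commenter's fix is an assumption:

```yaml
# inventory/mycluster/group_vars/k8s_cluster/k8s-net-calico.yml
# Assumption: switch encapsulation from VXLAN to IP-in-IP to sidestep the
# Debian/Ubuntu vxlan checksum-offload issue described in this thread.
calico_network_backend: bird   # use the BGP/bird backend instead of vxlan
calico_ipip_mode: Always       # encapsulate pod traffic with IP-in-IP
calico_vxlan_mode: Never       # disable VXLAN encapsulation
```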
Thank you! I've been dealing with DNS resolution issues for several days and this resolved it (all my nodes are on Ubuntu 22.04) |
This looks like #10436, which was recently fixed. |
@VannTen: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
The issue still remains when deploying on Ubuntu 22.04 nodes. The solution provided by @jonny-ervine does the trick, but the deployment does not work fully with the defaults (and it personally took me some time to figure that out). |
You can also change it in a working Kubernetes cluster; you don't have to redeploy. Change it in calico_node in the ConfigMap, and change the default IPPool. |
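The exact objects to edit were garbled in the comment above. A hedged sketch of applying the equivalent change on a live cluster; the ConfigMap and IPPool names follow common kubespray defaults and are assumptions, not the commenter's verbatim commands:

```sh
# Assumption: illustrative commands, not the commenter's exact ones.
# Switch the Calico backend from vxlan to bird in the calico-config ConfigMap:
kubectl -n kube-system edit configmap calico-config   # set calico_backend: "bird"
# Switch the default IPPool (named "default-pool" in kubespray deployments)
# from VXLAN to IP-in-IP encapsulation:
kubectl patch ippool default-pool --type=merge \
  -p '{"spec":{"ipipMode":"Always","vxlanMode":"Never"}}'
# Restart calico-node so the change takes effect:
kubectl -n kube-system rollout restart daemonset calico-node
```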
@mrzysztof you can propose a PR or open an issue to discuss the defaults |
We already have way too much distro-specific stuff. Let's identify the exact problem instead; then we'll see if there is a workaround (or a fix, if the issue is not in Calico/Debian). |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
Environment:
- Cloud provider or hardware configuration: Debian 11 VM on esxi
- OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
  Linux 5.10.0-20-amd64 x86_64
  PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
  NAME="Debian GNU/Linux"
  VERSION_ID="11"
  VERSION="11 (bullseye)"
  VERSION_CODENAME=bullseye
  ID=debian
  HOME_URL="https://www.debian.org/"
  SUPPORT_URL="https://www.debian.org/support"
  BUG_REPORT_URL="https://bugs.debian.org/"
- Version of Ansible (ansible --version): ansible [core 2.12.5]
- Version of Python (python --version): 3.9.2
- Kubespray version (commit) (git rev-parse --short HEAD): 491e260
- Network plugin used: calico
- Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):
Command used to invoke ansible:
ansible-playbook -i inventory/mycluster/inventory.ini --become --user=root --become-user=root cluster.yml
Output of ansible run:
Anything else do we need to know:
```ini
[all]
master0 ansible_host=192.168.50.120 ip=192.168.50.120
node1 ansible_host=192.168.50.121 ip=192.168.50.121
node2 ansible_host=192.168.50.122 ip=192.168.50.122

[kube_control_plane]
master0

[etcd]
master0

[kube_node]
node1
node2

[calico_rr]

[k8s_cluster:children]
kube_control_plane
kube_node
calico_rr
```
Hello,
I have noticed that on a freshly installed Kubernetes cluster, I am not able to communicate from the pods on node-1 to the pods on node-2 and node-xxx.
Due to this problem, DNS resolution sometimes works and sometimes does not.
I have just copied the sample inventory, adjusted my IPs, and disabled nodelocaldns; it did not work even with nodelocaldns enabled.
It looks like the routing is not working properly.
Is this a known issue, or am I missing something?
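For readers reproducing this, a minimal sketch of how to check cross-node pod connectivity and the routes Calico programmed. The commands are illustrative assumptions (only the dnstools image is taken from the report), not the reporter's exact diagnostics:

```sh
# Assumption: generic diagnostics, not the reporter's verbatim commands.
# Start a throwaway debug pod (same image the reporter used):
kubectl run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
# Inside the pod, try to reach a pod scheduled on another node:
#   ping <pod-IP-on-node-2>
#   nslookup kubernetes.default
# On each node, inspect the routes Calico programmed:
ip route | grep -E 'cali|tunl|vxlan'
```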