
not able to communicate to pods from node-1 to the pods on node-2 #9601

Open
rufy2022 opened this issue Dec 20, 2022 · 24 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@rufy2022

rufy2022 commented Dec 20, 2022

Environment:

  • Cloud provider or hardware configuration:
    Debian 11 VM on esxi
  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
    Linux 5.10.0-20-amd64 x86_64
    PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
    NAME="Debian GNU/Linux"
    VERSION_ID="11"
    VERSION="11 (bullseye)"
    VERSION_CODENAME=bullseye
    ID=debian
    HOME_URL="https://www.debian.org/"
    SUPPORT_URL="https://www.debian.org/support"
    BUG_REPORT_URL="https://bugs.debian.org/"
  • Version of Ansible (ansible --version):
    ansible [core 2.12.5]
  • Version of Python (python --version):
    3.9.2

Kubespray version (commit) (git rev-parse --short HEAD):
491e260

Network plugin used:
calico

Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):

Command used to invoke ansible:
ansible-playbook -i inventory/mycluster/inventory.ini --become --user=root --become-user=root cluster.yml

Output of ansible run:

Anything else do we need to know:

[all]
master0 ansible_host=192.168.50.120 ip=192.168.50.120
node1 ansible_host=192.168.50.121 ip=192.168.50.121
node2 ansible_host=192.168.50.122 ip=192.168.50.122

[kube_control_plane]
master0

[etcd]
master0

[kube_node]
node1
node2

[calico_rr]

[k8s_cluster:children]
kube_control_plane
kube_node
calico_rr

Hello,

I have noticed that on a freshly installed Kubernetes cluster, pods on node-1 cannot communicate with pods on node-2 (or any other node).
Because of this, DNS resolution only works some of the time.
I just copied the sample inventory, adjusted my IPs, and disabled nodelocaldns; it did not work with nodelocaldns enabled either.
It looks like the routing is not set up properly.
Is this a known issue, or am I missing something?

@rufy2022 rufy2022 added the kind/bug Categorizes issue or PR as related to a bug. label Dec 20, 2022
@rufy2022
Author

OK, it looks like a bug in the network playbook!
I changed kube_network_plugin from calico to cni and then installed the latest Calico v3.24.5 manually.
Now the routing table on all nodes looks as expected and everything works fine.

Please fix the network bug in the playbook; as it stands it is unusable, at least on the latest Debian 11.

@oomichi
Contributor

oomichi commented Dec 22, 2022

OK, it looks like a bug in the network playbook! I changed kube_network_plugin from calico to cni and then installed the latest Calico v3.24.5 manually. Now the routing table on all nodes looks as expected and everything works fine.

Please fix the network bug in the playbook; as it stands it is unusable, at least on the latest Debian 11.

Thank you for submitting this issue.
Could you provide more information about it?
The latest Kubespray installs Calico v3.24.5, the same version you installed manually,
so based on the current information I am not sure why installing it yourself solved the issue.

@rufy2022
Author

@oomichi I have just pulled the latest changes from the git repo and installed a fresh Kubernetes cluster on 3 VMs: 1 master and 2 worker nodes.
The routing table is still wrong; a simple ping to google.com or to the kubernetes service does not work at all.
Cross-node communication from a pod on one node to a pod on another node still does not work.
So the Calico Ansible playbook is doing something wrong. As I said before, when I select cni and install Calico manually afterwards, everything works perfectly.

Here is the output on the freshly installed master:
root@master0: route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.50.1 0.0.0.0 UG 0 0 0 ens192
10.233.75.0 10.233.75.0 255.255.255.192 UG 0 0 0 vxlan.calico
10.233.102.128 10.233.102.128 255.255.255.192 UG 0 0 0 vxlan.calico
10.233.105.64 0.0.0.0 255.255.255.192 U 0 0 0 *
10.233.105.65 0.0.0.0 255.255.255.255 UH 0 0 0 cali2208d19a336
10.233.105.66 0.0.0.0 255.255.255.255 UH 0 0 0 calif9130857f3d
192.168.50.0 0.0.0.0 255.255.255.0 U 0 0 0 ens192

root@master0:~# kubectl run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
If you don't see a command prompt, try pressing enter.
dnstools# ping google.com
^C
dnstools# ping google.com
^C
dnstools# ping google.com
^C
dnstools# ping kubernetes
^C
dnstools#
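(A more direct cross-node check, with the pod name and target IP below being illustrative, is to ping the IP of a pod known to be running on another node from inside the dnstools pod:)

kubectl get pods -A -o wide                                # note the IP of a pod scheduled on node2
kubectl exec -it dnstools -- ping -c 3 <pod-ip-on-node2>   # replace with the actual pod IP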

Routing table node-1:
root@node1: route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.50.1 0.0.0.0 UG 0 0 0 ens192
10.233.75.0 10.233.75.0 255.255.255.192 UG 0 0 0 vxlan.calico
10.233.102.128 0.0.0.0 255.255.255.192 U 0 0 0 *
10.233.102.129 0.0.0.0 255.255.255.255 UH 0 0 0 cali965a9efe442
10.233.102.130 0.0.0.0 255.255.255.255 UH 0 0 0 cali259c5d799e6
10.233.105.64 10.233.105.64 255.255.255.192 UG 0 0 0 vxlan.calico
192.168.50.0 0.0.0.0 255.255.255.0 U 0 0 0 ens192

Here is the routing table from the cluster where I installed Calico manually:
root@kubernetes-dev-master01:~# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.80.81.1 0.0.0.0 UG 0 0 0 ens192
10.80.81.0 0.0.0.0 255.255.255.0 U 0 0 0 ens192
10.233.100.192 0.0.0.0 255.255.255.192 U 0 0 0 *
10.233.100.193 0.0.0.0 255.255.255.255 UH 0 0 0 cali36f9a373011
10.233.104.64 10.80.81.52 255.255.255.192 UG 0 0 0 tunl0
10.233.105.128 10.80.81.51 255.255.255.192 UG 0 0 0 tunl0

node-1
root@kubernetes-dev-node01: route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.80.81.1 0.0.0.0 UG 0 0 0 ens192
10.80.81.0 0.0.0.0 255.255.255.0 U 0 0 0 ens192
10.233.100.192 10.80.81.50 255.255.255.192 UG 0 0 0 tunl0
10.233.104.64 10.80.81.52 255.255.255.192 UG 0 0 0 tunl0
10.233.105.128 0.0.0.0 255.255.255.192 U 0 0 0 *
10.233.105.129 0.0.0.0 255.255.255.255 UH 0 0 0 caliad3d97130af
10.233.105.130 0.0.0.0 255.255.255.255 UH 0 0 0 cali952936de682
10.233.105.132 0.0.0.0 255.255.255.255 UH 0 0 0 cali46926efb26c
10.233.105.133 0.0.0.0 255.255.255.255 UH 0 0 0 cali82dc20a0498
Do you see the difference?

On the nodes deployed with the Kubespray Calico role, the routing table is wrong:
master -> e.g. 10.233.75.0 10.233.75.0 255.255.255.192 UG 0 0 0 vxlan.calico
node-1 -> e.g. 10.233.105.64 10.233.105.64 255.255.255.192 UG 0 0 0 vxlan.calico

On the working nodes where Calico was installed manually:
master -> e.g. 10.233.104.64 10.80.81.52 255.255.255.192 UG 0 0 0 tunl0
node-1 -> e.g. 10.233.100.192 10.80.81.50 255.255.255.192 UG 0 0 0 tunl0

You can see the problem now, right? The playbook is doing something incorrectly.
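(The interface names reflect the pool encapsulation: vxlan.calico routes mean the IPPool was created with VXLAN enabled, while tunl0 routes mean IPIP. If calicoctl is available on a control-plane node, which Kubespray normally installs, the active modes can be checked with:)

calicoctl get ippool -o wide               # shows IPIPMODE and VXLANMODE per pool
calicoctl get ippool default-pool -o yaml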

@rufy2022
Author

@kerryeon I see you also worked on the Calico Ansible tasks. Can you check the issue above?

@HoKim98
Contributor

HoKim98 commented Dec 29, 2022

Hello, could you attach your full inventory with variables? I can see your problem, but the information provided so far is not enough.

@rufy2022
Author

@kerryeon attached, please check. I wonder why no one else has noticed this bug.
master0 | SUCCESS => {
"hostvars[inventory_hostname]": {
"ansible_check_mode": false,
"ansible_config_file": "/root/kubespray/ansible.cfg",
"ansible_diff_mode": false,
"ansible_facts": {},
"ansible_forks": 5,
"ansible_host": "192.168.50.120",
"ansible_inventory_sources": [
"/root/kubespray/inventory/mycluster/inventory.ini"
],
"ansible_playbook_python": "/usr/bin/python3",
"ansible_verbosity": 0,
"ansible_version": {
"full": "2.12.5",
"major": 2,
"minor": 12,
"revision": 5,
"string": "2.12.5"
},
"argocd_enabled": false,
"auto_renew_certificates": false,
"bin_dir": "/usr/local/bin",
"calico_cni_name": "k8s-pod-network",
"calico_pool_blocksize": 26,
"cephfs_provisioner_enabled": false,
"cert_manager_enabled": false,
"cluster_name": "cluster.local",
"container_manager": "containerd",
"coredns_k8s_external_zone": "k8s_external.local",
"credentials_dir": "/root/kubespray/inventory/mycluster/credentials",
"default_kubelet_config_dir": "/etc/kubernetes/dynamic_kubelet_dir",
"deploy_netchecker": false,
"dns_domain": "cluster.local",
"dns_mode": "coredns",
"docker_bin_dir": "/usr/bin",
"docker_container_storage_setup": false,
"docker_daemon_graph": "/var/lib/docker",
"docker_dns_servers_strict": false,
"docker_iptables_enabled": "false",
"docker_log_opts": "--log-opt max-size=50m --log-opt max-file=5",
"docker_rpm_keepcache": 1,
"enable_coredns_k8s_endpoint_pod_names": false,
"enable_coredns_k8s_external": false,
"enable_dual_stack_networks": false,
"enable_nat_default_gateway": true,
"enable_nodelocaldns": false,
"enable_nodelocaldns_secondary": false,
"etcd_data_dir": "/var/lib/etcd",
"etcd_deployment_type": "host",
"event_ttl_duration": "1h0m0s",
"group_names": [
"etcd",
"k8s_cluster",
"kube_control_plane"
],
"groups": {
"all": [
"master0",
"node1",
"node2"
],
"calico_rr": [],
"etcd": [
"master0"
],
"k8s_cluster": [
"master0",
"node1",
"node2"
],
"kube_control_plane": [
"master0"
],
"kube_node": [
"node1",
"node2"
],
"ungrouped": []
},
"helm_enabled": false,
"ingress_alb_enabled": false,
"ingress_nginx_enabled": false,
"ingress_publish_status_address": "",
"inventory_dir": "/root/kubespray/inventory/mycluster",
"inventory_file": "/root/kubespray/inventory/mycluster/inventory.ini",
"inventory_hostname": "master0",
"inventory_hostname_short": "master0",
"ip": "192.168.50.120",
"k8s_image_pull_policy": "IfNotPresent",
"kata_containers_enabled": false,
"krew_enabled": false,
"krew_root_dir": "/usr/local/krew",
"kube_api_anonymous_auth": true,
"kube_apiserver_ip": "10.233.0.1",
"kube_apiserver_port": 6443,
"kube_cert_dir": "/etc/kubernetes/ssl",
"kube_cert_group": "kube-cert",
"kube_config_dir": "/etc/kubernetes",
"kube_encrypt_secret_data": false,
"kube_log_level": 2,
"kube_manifest_dir": "/etc/kubernetes/manifests",
"kube_network_node_prefix": 24,
"kube_network_node_prefix_ipv6": 120,
"kube_network_plugin": "calico",
"kube_network_plugin_multus": false,
"kube_ovn_default_gateway_check": true,
"kube_ovn_default_logical_gateway": false,
"kube_ovn_default_vlan_id": 100,
"kube_ovn_dpdk_enabled": false,
"kube_ovn_enable_external_vpc": true,
"kube_ovn_enable_lb": true,
"kube_ovn_enable_np": true,
"kube_ovn_enable_ssl": false,
"kube_ovn_encap_checksum": true,
"kube_ovn_external_address": "8.8.8.8",
"kube_ovn_external_address_ipv6": "2400:3200::1",
"kube_ovn_external_dns": "alauda.cn",
"kube_ovn_hw_offload": false,
"kube_ovn_network_type": "geneve",
"kube_ovn_node_switch_cidr": "100.64.0.0/16",
"kube_ovn_node_switch_cidr_ipv6": "fd00:100:64::/64",
"kube_ovn_pod_nic_type": "veth_pair",
"kube_ovn_traffic_mirror": false,
"kube_ovn_tunnel_type": "geneve",
"kube_ovn_vlan_name": "product",
"kube_owner": "kube",
"kube_pods_subnet": "10.233.64.0/18",
"kube_pods_subnet_ipv6": "fd85:ee78:d8a6:8607::1:0000/112",
"kube_proxy_mode": "iptables",
"kube_proxy_nodeport_addresses": [],
"kube_proxy_strict_arp": true,
"kube_script_dir": "/usr/local/bin/kubernetes-scripts",
"kube_service_addresses": "10.233.0.0/18",
"kube_service_addresses_ipv6": "fd85:ee78:d8a6:8607::1000/116",
"kube_token_dir": "/etc/kubernetes/tokens",
"kube_version": "v1.25.5",
"kube_webhook_token_auth": false,
"kube_webhook_token_auth_url_skip_tls_verify": false,
"kubeadm_certificate_key": "aafcdd1748c9accc1aaee3b4cf0aebdb4e0f052760f5d017ec581df2b9635c7d",
"kubeadm_patches": {
"dest_dir": "/etc/kubernetes/patches",
"enabled": false,
"source_dir": "/root/kubespray/inventory/mycluster/patches"
},
"kubernetes_audit": false,
"loadbalancer_apiserver_healthcheck_port": 8081,
"loadbalancer_apiserver_port": 6443,
"local_path_provisioner_enabled": false,
"local_release_dir": "/tmp/releases",
"local_volume_provisioner_enabled": false,
"macvlan_interface": "eth1",
"metallb_enabled": false,
"metallb_speaker_enabled": false,
"metrics_server_enabled": true,
"ndots": 2,
"no_proxy_exclude_workers": false,
"nodelocaldns_bind_metrics_host_ip": false,
"nodelocaldns_health_port": 9254,
"nodelocaldns_ip": "169.254.25.10",
"nodelocaldns_second_health_port": 9256,
"nodelocaldns_secondary_skew_seconds": 5,
"ntp_enabled": false,
"ntp_manage_config": false,
"ntp_servers": [
"0.pool.ntp.org iburst",
"1.pool.ntp.org iburst",
"2.pool.ntp.org iburst",
"3.pool.ntp.org iburst"
],
"omit": "__omit_place_holder__ed2aedc52f45dc99038ab78567807d04d1b029c6",
"persistent_volumes_enabled": false,
"playbook_dir": "/root/kubespray",
"podsecuritypolicy_enabled": false,
"rbd_provisioner_enabled": false,
"registry_enabled": false,
"resolvconf_mode": "host_resolvconf",
"retry_stagger": 5,
"skydns_server": "10.233.0.3",
"skydns_server_secondary": "10.233.0.4",
"unsafe_show_logs": false,
"volume_cross_zone_attachment": false
}
}
node1 | SUCCESS => {
"hostvars[inventory_hostname]": {
"ansible_check_mode": false,
"ansible_config_file": "/root/kubespray/ansible.cfg",
"ansible_diff_mode": false,
"ansible_facts": {},
"ansible_forks": 5,
"ansible_host": "192.168.50.121",
"ansible_inventory_sources": [
"/root/kubespray/inventory/mycluster/inventory.ini"
],
"ansible_playbook_python": "/usr/bin/python3",
"ansible_verbosity": 0,
"ansible_version": {
"full": "2.12.5",
"major": 2,
"minor": 12,
"revision": 5,
"string": "2.12.5"
},
"argocd_enabled": false,
"auto_renew_certificates": false,
"bin_dir": "/usr/local/bin",
"calico_cni_name": "k8s-pod-network",
"calico_pool_blocksize": 26,
"cephfs_provisioner_enabled": false,
"cert_manager_enabled": false,
"cluster_name": "cluster.local",
"container_manager": "containerd",
"coredns_k8s_external_zone": "k8s_external.local",
"credentials_dir": "/root/kubespray/inventory/mycluster/credentials",
"default_kubelet_config_dir": "/etc/kubernetes/dynamic_kubelet_dir",
"deploy_netchecker": false,
"dns_domain": "cluster.local",
"dns_mode": "coredns",
"docker_bin_dir": "/usr/bin",
"docker_container_storage_setup": false,
"docker_daemon_graph": "/var/lib/docker",
"docker_dns_servers_strict": false,
"docker_iptables_enabled": "false",
"docker_log_opts": "--log-opt max-size=50m --log-opt max-file=5",
"docker_rpm_keepcache": 1,
"enable_coredns_k8s_endpoint_pod_names": false,
"enable_coredns_k8s_external": false,
"enable_dual_stack_networks": false,
"enable_nat_default_gateway": true,
"enable_nodelocaldns": false,
"enable_nodelocaldns_secondary": false,
"etcd_data_dir": "/var/lib/etcd",
"etcd_deployment_type": "host",
"event_ttl_duration": "1h0m0s",
"group_names": [
"k8s_cluster",
"kube_node"
],
"groups": {
"all": [
"master0",
"node1",
"node2"
],
"calico_rr": [],
"etcd": [
"master0"
],
"k8s_cluster": [
"master0",
"node1",
"node2"
],
"kube_control_plane": [
"master0"
],
"kube_node": [
"node1",
"node2"
],
"ungrouped": []
},
"helm_enabled": false,
"ingress_alb_enabled": false,
"ingress_nginx_enabled": false,
"ingress_publish_status_address": "",
"inventory_dir": "/root/kubespray/inventory/mycluster",
"inventory_file": "/root/kubespray/inventory/mycluster/inventory.ini",
"inventory_hostname": "node1",
"inventory_hostname_short": "node1",
"ip": "192.168.50.121",
"k8s_image_pull_policy": "IfNotPresent",
"kata_containers_enabled": false,
"krew_enabled": false,
"krew_root_dir": "/usr/local/krew",
"kube_api_anonymous_auth": true,
"kube_apiserver_ip": "10.233.0.1",
"kube_apiserver_port": 6443,
"kube_cert_dir": "/etc/kubernetes/ssl",
"kube_cert_group": "kube-cert",
"kube_config_dir": "/etc/kubernetes",
"kube_encrypt_secret_data": false,
"kube_log_level": 2,
"kube_manifest_dir": "/etc/kubernetes/manifests",
"kube_network_node_prefix": 24,
"kube_network_node_prefix_ipv6": 120,
"kube_network_plugin": "calico",
"kube_network_plugin_multus": false,
"kube_ovn_default_gateway_check": true,
"kube_ovn_default_logical_gateway": false,
"kube_ovn_default_vlan_id": 100,
"kube_ovn_dpdk_enabled": false,
"kube_ovn_enable_external_vpc": true,
"kube_ovn_enable_lb": true,
"kube_ovn_enable_np": true,
"kube_ovn_enable_ssl": false,
"kube_ovn_encap_checksum": true,
"kube_ovn_external_address": "8.8.8.8",
"kube_ovn_external_address_ipv6": "2400:3200::1",
"kube_ovn_external_dns": "alauda.cn",
"kube_ovn_hw_offload": false,
"kube_ovn_network_type": "geneve",
"kube_ovn_node_switch_cidr": "100.64.0.0/16",
"kube_ovn_node_switch_cidr_ipv6": "fd00:100:64::/64",
"kube_ovn_pod_nic_type": "veth_pair",
"kube_ovn_traffic_mirror": false,
"kube_ovn_tunnel_type": "geneve",
"kube_ovn_vlan_name": "product",
"kube_owner": "kube",
"kube_pods_subnet": "10.233.64.0/18",
"kube_pods_subnet_ipv6": "fd85:ee78:d8a6:8607::1:0000/112",
"kube_proxy_mode": "iptables",
"kube_proxy_nodeport_addresses": [],
"kube_proxy_strict_arp": true,
"kube_script_dir": "/usr/local/bin/kubernetes-scripts",
"kube_service_addresses": "10.233.0.0/18",
"kube_service_addresses_ipv6": "fd85:ee78:d8a6:8607::1000/116",
"kube_token_dir": "/etc/kubernetes/tokens",
"kube_version": "v1.25.5",
"kube_webhook_token_auth": false,
"kube_webhook_token_auth_url_skip_tls_verify": false,
"kubeadm_certificate_key": "aafcdd1748c9accc1aaee3b4cf0aebdb4e0f052760f5d017ec581df2b9635c7d",
"kubeadm_patches": {
"dest_dir": "/etc/kubernetes/patches",
"enabled": false,
"source_dir": "/root/kubespray/inventory/mycluster/patches"
},
"kubernetes_audit": false,
"loadbalancer_apiserver_healthcheck_port": 8081,
"loadbalancer_apiserver_port": 6443,
"local_path_provisioner_enabled": false,
"local_release_dir": "/tmp/releases",
"local_volume_provisioner_enabled": false,
"macvlan_interface": "eth1",
"metallb_enabled": false,
"metallb_speaker_enabled": false,
"metrics_server_enabled": true,
"ndots": 2,
"no_proxy_exclude_workers": false,
"nodelocaldns_bind_metrics_host_ip": false,
"nodelocaldns_health_port": 9254,
"nodelocaldns_ip": "169.254.25.10",
"nodelocaldns_second_health_port": 9256,
"nodelocaldns_secondary_skew_seconds": 5,
"ntp_enabled": false,
"ntp_manage_config": false,
"ntp_servers": [
"0.pool.ntp.org iburst",
"1.pool.ntp.org iburst",
"2.pool.ntp.org iburst",
"3.pool.ntp.org iburst"
],
"omit": "__omit_place_holder__ed2aedc52f45dc99038ab78567807d04d1b029c6",
"persistent_volumes_enabled": false,
"playbook_dir": "/root/kubespray",
"podsecuritypolicy_enabled": false,
"rbd_provisioner_enabled": false,
"registry_enabled": false,
"resolvconf_mode": "host_resolvconf",
"retry_stagger": 5,
"skydns_server": "10.233.0.3",
"skydns_server_secondary": "10.233.0.4",
"unsafe_show_logs": false,
"volume_cross_zone_attachment": false
}
}
node2 | SUCCESS => {
"hostvars[inventory_hostname]": {
"ansible_check_mode": false,
"ansible_config_file": "/root/kubespray/ansible.cfg",
"ansible_diff_mode": false,
"ansible_facts": {},
"ansible_forks": 5,
"ansible_host": "192.168.50.122",
"ansible_inventory_sources": [
"/root/kubespray/inventory/mycluster/inventory.ini"
],
"ansible_playbook_python": "/usr/bin/python3",
"ansible_verbosity": 0,
"ansible_version": {
"full": "2.12.5",
"major": 2,
"minor": 12,
"revision": 5,
"string": "2.12.5"
},
"argocd_enabled": false,
"auto_renew_certificates": false,
"bin_dir": "/usr/local/bin",
"calico_cni_name": "k8s-pod-network",
"calico_pool_blocksize": 26,
"cephfs_provisioner_enabled": false,
"cert_manager_enabled": false,
"cluster_name": "cluster.local",
"container_manager": "containerd",
"coredns_k8s_external_zone": "k8s_external.local",
"credentials_dir": "/root/kubespray/inventory/mycluster/credentials",
"default_kubelet_config_dir": "/etc/kubernetes/dynamic_kubelet_dir",
"deploy_netchecker": false,
"dns_domain": "cluster.local",
"dns_mode": "coredns",
"docker_bin_dir": "/usr/bin",
"docker_container_storage_setup": false,
"docker_daemon_graph": "/var/lib/docker",
"docker_dns_servers_strict": false,
"docker_iptables_enabled": "false",
"docker_log_opts": "--log-opt max-size=50m --log-opt max-file=5",
"docker_rpm_keepcache": 1,
"enable_coredns_k8s_endpoint_pod_names": false,
"enable_coredns_k8s_external": false,
"enable_dual_stack_networks": false,
"enable_nat_default_gateway": true,
"enable_nodelocaldns": false,
"enable_nodelocaldns_secondary": false,
"etcd_data_dir": "/var/lib/etcd",
"etcd_deployment_type": "host",
"event_ttl_duration": "1h0m0s",
"group_names": [
"k8s_cluster",
"kube_node"
],
"groups": {
"all": [
"master0",
"node1",
"node2"
],
"calico_rr": [],
"etcd": [
"master0"
],
"k8s_cluster": [
"master0",
"node1",
"node2"
],
"kube_control_plane": [
"master0"
],
"kube_node": [
"node1",
"node2"
],
"ungrouped": []
},
"helm_enabled": false,
"ingress_alb_enabled": false,
"ingress_nginx_enabled": false,
"ingress_publish_status_address": "",
"inventory_dir": "/root/kubespray/inventory/mycluster",
"inventory_file": "/root/kubespray/inventory/mycluster/inventory.ini",
"inventory_hostname": "node2",
"inventory_hostname_short": "node2",
"ip": "192.168.50.122",
"k8s_image_pull_policy": "IfNotPresent",
"kata_containers_enabled": false,
"krew_enabled": false,
"krew_root_dir": "/usr/local/krew",
"kube_api_anonymous_auth": true,
"kube_apiserver_ip": "10.233.0.1",
"kube_apiserver_port": 6443,
"kube_cert_dir": "/etc/kubernetes/ssl",
"kube_cert_group": "kube-cert",
"kube_config_dir": "/etc/kubernetes",
"kube_encrypt_secret_data": false,
"kube_log_level": 2,
"kube_manifest_dir": "/etc/kubernetes/manifests",
"kube_network_node_prefix": 24,
"kube_network_node_prefix_ipv6": 120,
"kube_network_plugin": "calico",
"kube_network_plugin_multus": false,
"kube_ovn_default_gateway_check": true,
"kube_ovn_default_logical_gateway": false,
"kube_ovn_default_vlan_id": 100,
"kube_ovn_dpdk_enabled": false,
"kube_ovn_enable_external_vpc": true,
"kube_ovn_enable_lb": true,
"kube_ovn_enable_np": true,
"kube_ovn_enable_ssl": false,
"kube_ovn_encap_checksum": true,
"kube_ovn_external_address": "8.8.8.8",
"kube_ovn_external_address_ipv6": "2400:3200::1",
"kube_ovn_external_dns": "alauda.cn",
"kube_ovn_hw_offload": false,
"kube_ovn_network_type": "geneve",
"kube_ovn_node_switch_cidr": "100.64.0.0/16",
"kube_ovn_node_switch_cidr_ipv6": "fd00:100:64::/64",
"kube_ovn_pod_nic_type": "veth_pair",
"kube_ovn_traffic_mirror": false,
"kube_ovn_tunnel_type": "geneve",
"kube_ovn_vlan_name": "product",
"kube_owner": "kube",
"kube_pods_subnet": "10.233.64.0/18",
"kube_pods_subnet_ipv6": "fd85:ee78:d8a6:8607::1:0000/112",
"kube_proxy_mode": "iptables",
"kube_proxy_nodeport_addresses": [],
"kube_proxy_strict_arp": true,
"kube_script_dir": "/usr/local/bin/kubernetes-scripts",
"kube_service_addresses": "10.233.0.0/18",
"kube_service_addresses_ipv6": "fd85:ee78:d8a6:8607::1000/116",
"kube_token_dir": "/etc/kubernetes/tokens",
"kube_version": "v1.25.5",
"kube_webhook_token_auth": false,
"kube_webhook_token_auth_url_skip_tls_verify": false,
"kubeadm_certificate_key": "aafcdd1748c9accc1aaee3b4cf0aebdb4e0f052760f5d017ec581df2b9635c7d",
"kubeadm_patches": {
"dest_dir": "/etc/kubernetes/patches",
"enabled": false,
"source_dir": "/root/kubespray/inventory/mycluster/patches"
},
"kubernetes_audit": false,
"loadbalancer_apiserver_healthcheck_port": 8081,
"loadbalancer_apiserver_port": 6443,
"local_path_provisioner_enabled": false,
"local_release_dir": "/tmp/releases",
"local_volume_provisioner_enabled": false,
"macvlan_interface": "eth1",
"metallb_enabled": false,
"metallb_speaker_enabled": false,
"metrics_server_enabled": true,
"ndots": 2,
"no_proxy_exclude_workers": false,
"nodelocaldns_bind_metrics_host_ip": false,
"nodelocaldns_health_port": 9254,
"nodelocaldns_ip": "169.254.25.10",
"nodelocaldns_second_health_port": 9256,
"nodelocaldns_secondary_skew_seconds": 5,
"ntp_enabled": false,
"ntp_manage_config": false,
"ntp_servers": [
"0.pool.ntp.org iburst",
"1.pool.ntp.org iburst",
"2.pool.ntp.org iburst",
"3.pool.ntp.org iburst"
],
"omit": "__omit_place_holder__ed2aedc52f45dc99038ab78567807d04d1b029c6",
"persistent_volumes_enabled": false,
"playbook_dir": "/root/kubespray",
"podsecuritypolicy_enabled": false,
"rbd_provisioner_enabled": false,
"registry_enabled": false,
"resolvconf_mode": "host_resolvconf",
"retry_stagger": 5,
"skydns_server": "10.233.0.3",
"skydns_server_secondary": "10.233.0.4",
"unsafe_show_logs": false,
"volume_cross_zone_attachment": false
}
}

@marekk1717

marekk1717 commented Jan 10, 2023

I've got the same issue on Ubuntu 22.04. Do we have any workarounds?

@rufy2022
Author

@marekk1717 my workaround is to select cni instead of calico and then install Calico manually from the upstream manifest.

In the file group_vars/k8s_cluster/k8s-cluster.yml:
kube_network_plugin: cni

https://raw.githubusercontent.com/projectcalico/calico/v3.24.5/manifests/calico.yaml
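A rough sketch of that workaround (paths assume the sample inventory layout; the pod CIDR in the manifest may need to match kube_pods_subnet):

# 1. In inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml set:
#      kube_network_plugin: cni
# 2. Deploy the cluster without a Kubespray-managed CNI:
ansible-playbook -i inventory/mycluster/inventory.ini --become --user=root --become-user=root cluster.yml
# 3. Install Calico from the upstream manifest (set CALICO_IPV4POOL_CIDR in the manifest
#    to 10.233.64.0/18 if it should match kube_pods_subnet):
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.24.5/manifests/calico.yaml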

@marekk1717

Thanks rufy2022.
Is there any way to remove Calico from a working cluster and add it back manually?
What do you think about Weave instead of Calico? I only need this on my test cluster on VMware, non-production.

@rufy2022
Author

I tried the same thing and also tried replacing it with Flannel, but the networking behaved the same way. The Ansible playbook does some additional network setup, which is why the routes end up wrong.
I would recommend a fresh installation with cni and installing Calico afterwards.

@marekk1717

There must be something wrong with Debian/Ubuntu. I switched to Rocky Linux 8.7 and it works with the same config as on Ubuntu.

@rufy2022
Author

FYI @kerryeon @oomichi, please fix it ;)

@aussielunix

aussielunix commented Jan 29, 2023

FYI
Ubuntu 22.04 with Kubespray branch release-2.21

I am hitting this too if I put FQDNs in the inventory.
But pods can ping across nodes if the inventory only uses hostnames.

[all]
-k8s-master-1.example.com ansible_host=10.0.10.21
-k8s-master-2.example.com ansible_host=10.0.10.22
-k8s-master-3.example.com ansible_host=10.0.10.23
-k8s-node-1.example.com ansible_host=10.0.10.31
-k8s-node-2.example.com ansible_host=10.0.10.32
+k8s-master-1 ansible_host=10.0.10.21
+k8s-master-2 ansible_host=10.0.10.22
+k8s-master-3 ansible_host=10.0.10.23
+k8s-node-1 ansible_host=10.0.10.31
+k8s-node-2 ansible_host=10.0.10.32
...
...
...

@aussielunix

I spoke too soon.
I have deleted and rebuilt the cluster 4 times since the above, and it fails again.

@jonny-ervine

This looks like an issue with the VXLAN component on Debian. Try redeploying Kubernetes via Kubespray with the following Calico variables:
calico_network_backend: bird
calico_ipip_mode: 'Always'
calico_vxlan_mode: 'Never'

These variables are set in the inventory's group_vars/k8s_cluster/k8s-net-calico.yml file.
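For example (a sketch; paths assume the sample inventory layout, and the values are the ones listed above):

# edit inventory/mycluster/group_vars/k8s_cluster/k8s-net-calico.yml and set:
#   calico_network_backend: bird
#   calico_ipip_mode: 'Always'
#   calico_vxlan_mode: 'Never'
# then redeploy:
ansible-playbook -i inventory/mycluster/inventory.ini --become --user=root --become-user=root cluster.yml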

@gurmsc5

gurmsc5 commented May 10, 2023

This looks like an issue with the VXLAN component on Debian. Try redeploying Kubernetes via Kubespray with the following Calico variables: calico_network_backend: bird, calico_ipip_mode: 'Always', calico_vxlan_mode: 'Never'

These variables are set in the inventory's group_vars/k8s_cluster/k8s-net-calico.yml file.

Thank you! I've been dealing with DNS resolution issues for several days and this resolved it (all my nodes are on Ubuntu 22.04).

@VannTen
Contributor

VannTen commented Dec 19, 2023

This looks like #10436, which was recently fixed.
/close
Feel free to reopen if it is in fact a different bug.

@k8s-ci-robot
Contributor

@VannTen: Closing this issue.

In response to this:

This looks like #10436, which was recently fixed.
/close
Feel free to reopen if it is in fact a different bug.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mrzysztof

mrzysztof commented Jun 15, 2024

The issue still remains when deploying on Ubuntu 22.04 nodes.

The solution provided by @jonny-ervine does the trick, but the deployment does not fully work with the defaults (and it personally took me some time to figure that out).
Perhaps IPIP should be the default for Debian/Ubuntu?

@korallo159

korallo159 commented Jul 2, 2024

This looks like an issue with the VXLAN component on Debian. Try redeploying Kubernetes via Kubespray with the following Calico variables: calico_network_backend: bird, calico_ipip_mode: 'Always', calico_vxlan_mode: 'Never'

These variables are set in the inventory's group_vars/k8s_cluster/k8s-net-calico.yml file.

You can also change this on a running Kubernetes cluster; you don't have to redeploy.

Change the backend in the calico-config ConfigMap:
kubectl edit cm calico-config -n kube-system
calico_network_backend: bird

Change the encapsulation in the default IPPool:
kubectl edit ippool default-pool
calico_ipip_mode: 'Always'
calico_vxlan_mode: 'Never'
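For reference, roughly equivalent non-interactive commands (a sketch assuming the Kubespray defaults, where the ConfigMap key is actually calico_backend and the IPPool spec fields are ipipMode / vxlanMode; calico-node must be restarted for the backend change to take effect):

kubectl -n kube-system patch configmap calico-config --type merge -p '{"data":{"calico_backend":"bird"}}'
kubectl patch ippool default-pool --type merge -p '{"spec":{"ipipMode":"Always","vxlanMode":"Never"}}'
kubectl -n kube-system rollout restart daemonset calico-node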

@ant31
Contributor

ant31 commented Jul 2, 2024

@mrzysztof you can propose a PR or open an issue to discuss the defaults.

@VannTen
Contributor

VannTen commented Aug 27, 2024

The solution provided by @jonny-ervine does the trick, but the deployment does not fully work with the defaults (and it personally took me some time to figure that out). Perhaps IPIP should be the default for Debian/Ubuntu?

We already have way too much distro-specific logic. Let's identify the exact problem instead; then we'll see whether there is a workaround (or a fix, if the issue is not in Calico/Debian).

@VannTen VannTen reopened this Aug 27, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 25, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 25, 2024
@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Dec 25, 2024