NodeLocalDNS Loop detected for zone "." #9948
Comments
The same thing happened to me |
fyi Edit: |
I have the same error. |
I have the same error after restarting the cluster. |
I am also getting a similar problem. |
I also encountered this problem (on kubespray 2.20).
My Fix
I found that setting I suspect you could also set
Explanation
Without So you have a loop: host -> nodelocaldns -> host -> nodelocaldns -> ... This is what nodelocaldns is detecting. The fix is just to break either the 'host -> nodelocaldns' or the 'nodelocaldns -> host' link of the loop. Kubespray also applies
Impact of fix
If you set
Relevant Files
https://github.com/kubernetes-sigs/kubespray/blob/release-2.20/roles/kubernetes-apps/ansible/templates/coredns-config.yml.j2#L55 |
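The loop described above can be pictured as following each resolver's upstream until one of them repeats. This is a minimal sketch under that assumption only: the function name `find_forwarding_loop` and the resolver labels (`host`, `nodelocaldns`, `upstream-dns`) are hypothetical and are not kubespray or CoreDNS identifiers.

```python
def find_forwarding_loop(upstreams, start):
    """Follow upstream links from `start`; return the cycle as a list, or None."""
    path = []   # resolvers visited so far, in order
    seen = {}   # resolver -> index of its first visit in `path`
    node = start
    while node is not None:
        if node in seen:
            # Already visited: everything from the first visit onward is the loop.
            return path[seen[node]:] + [node]
        seen[node] = len(path)
        path.append(node)
        node = upstreams.get(node)  # None means "no further forwarding"
    return None


# Hypothetical configurations, for illustration only.
looping = {"host": "nodelocaldns", "nodelocaldns": "host"}
fixed = {"host": "nodelocaldns", "nodelocaldns": "upstream-dns"}

print(find_forwarding_loop(looping, "host"))
print(find_forwarding_loop(fixed, "host"))
```

Pointing either link at something outside the pair (here the made-up `upstream-dns`) is what "breaking the loop" amounts to in this model.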
Same issue here. It is quite misleading, as the issue appears only after a node restart. Hopefully it will be fixed soon. |
Setting upstream_dns_servers helped for us, works with reanimating clusters which fail to start. |
Try explicitly setting More accurately, you need to have I'll create a PR for it if it works for others too. Note: as others also mentioned, changes to resolved.conf are apparently not read unless you restart the cluster or node. |
It was not 'false', which made some tasks (e.g. those using the systemd-resolved template) effectively remove the default search domains. This caused a DNS loop after rebooting the node or restarting the cluster, so the localdns service didn't run correctly. Fixes kubernetes-sigs#9948
In Jinja, if undefined, it does not match
{% if remove_default_searchdomains is sameas false or (remove_default_searchdomains is sameas true and searchdomains|default([])|length==0)%}
Domains={{ ([ 'default.svc.' + dns_domain, 'svc.' + dns_domain ] + searchdomains|default([])) | join(' ') }}
{% else %}
Domains={{ searchdomains|default([]) | join(' ') }}
{% endif %} |
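To see why an undefined variable falls through to the `else` branch, the template condition can be mirrored in plain Python. This is only a sketch: `UNDEFINED` is a hypothetical stand-in for Jinja's undefined value, `domains_line` is an illustrative name, and `cluster.local` is an assumed value for `dns_domain`. Jinja's `sameas` test is an identity check, which the `is` comparisons below imitate.

```python
UNDEFINED = object()  # hypothetical stand-in for an undefined Jinja variable


def domains_line(dns_domain, remove_default_searchdomains=UNDEFINED, searchdomains=None):
    """Mirror the template branch that renders the Domains= line."""
    searchdomains = searchdomains or []
    # `is sameas false` / `is sameas true` are identity checks, so an
    # undefined value matches neither of them.
    keep_defaults = (
        remove_default_searchdomains is False
        or (remove_default_searchdomains is True and len(searchdomains) == 0)
    )
    if keep_defaults:
        domains = ["default.svc." + dns_domain, "svc." + dns_domain] + searchdomains
    else:
        domains = searchdomains
    return "Domains=" + " ".join(domains)


print(domains_line("cluster.local"))         # variable left undefined
print(domains_line("cluster.local", False))  # variable explicitly false
```

With the variable undefined and no custom search domains, the rendered line carries no domains at all, which matches the behavior the comment describes.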
Hello, did anyone find a solution to this issue? The above setting did not work for me. |
Have you tried deleting the nodelocaldns pod so that it restarts after you update the settings above? |
The following steps worked for me
|
A temporary quickfix for urgent:
|
I have been doing a fair bit of digging on this and I believe this is something that could be fixed in kubespray properly without the need for workarounds (then again, I am not super proficient in ansible or linux stuff in general so I'll let you be the judge). The root-cause seems to be a (more or less) faulty content of the
Since it just reads
which then on reboot is translated to an entry
in Using Is it possible for kubespray to check the nameserver contents and either ignore the 127.0.0.53 entry or resolve the upstream nameservers defined by systemd-resolved when encountering it? Currently we are running cleanup operations after running the playbook on our small cluster manually:
- supersede domain-name-servers 169.254.25.10, 127.0.0.53;
+ supersede domain-name-servers 169.254.25.10;
- nameserver 127.0.0.53
Admittedly this is rather clunky and, if possible, I would love to see this edge case covered in kubespray! I am also willing to contribute a fix (although I would need a pointer or two in the right direction, as I am unfamiliar with the repository and ansible). |
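The manual cleanup shown in the diff above could be scripted along these lines. This is a rough sketch, not something kubespray does: the function names are hypothetical, the functions only transform text passed to them, and which files that text comes from is left to the caller, since the comment does not name them.

```python
import re

STUB = "127.0.0.53"  # the stub resolver address removed in the diff above


def drop_stub_nameservers(resolv_conf_text):
    """Remove `nameserver 127.0.0.53` lines from resolv.conf-style text."""
    kept = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver" and fields[1] == STUB:
            continue  # skip the stub entry, keep everything else untouched
        kept.append(line)
    return "\n".join(kept)


def drop_stub_from_supersede(line):
    """Remove 127.0.0.53 from a `supersede domain-name-servers ...;` line."""
    match = re.match(r"^(\s*supersede domain-name-servers\s+)(.*?);\s*$", line)
    if match is None:
        return line  # not a supersede line; leave it as it is
    servers = [s.strip() for s in match.group(2).split(",")]
    servers = [s for s in servers if s and s != STUB]
    return match.group(1) + ", ".join(servers) + ";"


print(drop_stub_from_supersede("supersede domain-name-servers 169.254.25.10, 127.0.0.53;"))
print(drop_stub_nameservers("nameserver 169.254.25.10\nnameserver 127.0.0.53"))
```

Keeping the functions as pure text transformations makes them easy to check before anything is written back to disk.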
The easiest fix for me was just to use Talos instead of Kubespray. Best decision I made in years 🤣 |
Hello
Sometimes the nodelocaldns pods get into a loop and crash due to memory overload.
The nodelocaldns logs say to troubleshoot it via: https://coredns.io/plugins/loop/#troubleshooting
in short:
I installed the cluster on 3 nodes and did not touch any nodelocaldns configuration (default config) except requests and limits.
Environment:
Cloud provider or hardware configuration:
OS
Linux 5.15.0-67-generic x86_64
VERSION="22.04.2 LTS (Jammy Jellyfish)"
Version of Ansible
ansible [core 2.12.5]
Version of Python
Python 3.10.4
Kubespray version:
0c4f57a
Network plugin used:
calico
hosts.yml:
Command used to invoke ansible:
k8s-cluster.yaml