
NodeLocalDNS Loop detected for zone "." #9948

Closed
kfirfer opened this issue Apr 1, 2023 · 17 comments · Fixed by #10554 or #10533
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@kfirfer

kfirfer commented Apr 1, 2023

Hello

Sometimes the nodelocaldns pods get into a loop and crash due to memory overload.
The nodelocaldns logs say to troubleshoot it via https://coredns.io/plugins/loop/#troubleshooting:

[FATAL] plugin/loop: Loop (169.254.25.10:42096 -> 169.254.25.10:53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 8227980046591666151.5519681328841792818."

In short:
I installed the cluster on 3 nodes and did not touch any nodelocaldns configuration (default config) except requests & limits.
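
To see where the loop comes from, one quick check (a sketch; the configmap name assumes the Kubespray defaults used here) is to compare the nodelocaldns forward target with what the node's resolv.conf points at:

# Corefile used by nodelocaldns; look at the "forward ." target of the "." zone
kubectl -n kube-system get configmap nodelocaldns -o yaml
# On the affected node: if these list 169.254.25.10 (the nodelocaldns IP itself),
# the forward target loops straight back into nodelocaldns
cat /etc/resolv.conf
cat /run/systemd/resolve/resolv.conf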

Environment:

  • Cloud provider or hardware configuration:

  • OS
    Linux 5.15.0-67-generic x86_64
    VERSION="22.04.2 LTS (Jammy Jellyfish)"

  • Version of Ansible
    ansible [core 2.12.5]

  • Version of Python
    Python 3.10.4

Kubespray version:
0c4f57a

Network plugin used:
calico

hosts.yml:

all:
  hosts:
    node1:
      ansible_host: 192.168.200.101
      ip: 192.168.200.101
      access_ip: 192.168.200.101
    node2:
      ansible_host: 192.168.200.102
      ip: 192.168.200.102
      access_ip: 192.168.200.102
    node3:
      ansible_host: 192.168.200.103
      ip: 192.168.200.103
      access_ip: 192.168.200.103
  children:
    kube_control_plane:
      hosts:
        node1:
        node2:
    kube_node:
      hosts:
        node1:
        node2:
        node3:
    etcd:
      hosts:
        node1:
        node2:
        node3:
    k8s_cluster:
      children:
        kube_control_plane:
        kube_node:
    calico_rr:
      hosts: {}

Command used to invoke ansible:

ansible-playbook -i inventory/nucs/hosts.yml cluster.yml -b -v -u root -e upgrade_cluster_setup=true

k8s-cluster.yaml

---
# Kubernetes configuration dirs and system namespace.
# Those are where all the additional config stuff goes
# the kubernetes normally puts in /srv/kubernetes.
# This puts them in a sane location and namespace.
# Editing those values will almost surely break something.
kube_config_dir: /etc/kubernetes
kube_script_dir: "{{ bin_dir }}/kubernetes-scripts"
kube_manifest_dir: "{{ kube_config_dir }}/manifests"

# This is where all the cert scripts and certs will be located
kube_cert_dir: "{{ kube_config_dir }}/ssl"

# This is where all of the bearer tokens will be stored
kube_token_dir: "{{ kube_config_dir }}/tokens"

kube_api_anonymous_auth: true

## Change this to use another Kubernetes version, e.g. a current beta release
kube_version: v1.26.3

# Where the binaries will be downloaded.
# Note: ensure that you've enough disk space (about 1G)
local_release_dir: "/tmp/releases"
# Random shifts for retrying failed ops like pushing/downloading
retry_stagger: 5

# This is the user that owns the cluster installation.
kube_owner: kube

# This is the group that the cert creation scripts chgrp the
# cert files to. Not really changeable...
kube_cert_group: kube-cert

# Cluster Loglevel configuration
kube_log_level: 2

# Directory where credentials will be stored
credentials_dir: "{{ inventory_dir }}/credentials"

## It is possible to activate / deactivate selected authentication methods (oidc, static token auth)
kube_oidc_auth: true
# kube_token_auth: false


## Variables for OpenID Connect Configuration https://kubernetes.io/docs/admin/authentication/
## To use OpenID you have to additionally deploy an OpenID Provider (e.g. Dex, Keycloak, ...)

kube_oidc_url: https://dex.tatzan.com
kube_oidc_client_id: kubernetes-nucs
## Optional settings for OIDC
# kube_oidc_ca_file: "{{ kube_cert_dir }}/ca.pem"
kube_oidc_username_claim: email
# kube_oidc_username_prefix: 'oidc:'
kube_oidc_groups_claim: groups
# kube_oidc_groups_prefix: 'oidc:'

## Variables to control webhook authn/authz
# kube_webhook_token_auth: false
# kube_webhook_token_auth_url: https://...
# kube_webhook_token_auth_url_skip_tls_verify: false

## For webhook authorization, authorization_modes must include Webhook
# kube_webhook_authorization: false
# kube_webhook_authorization_url: https://...
# kube_webhook_authorization_url_skip_tls_verify: false

# Choose network plugin (cilium, calico, kube-ovn, weave or flannel. Use cni for generic cni plugin)
# Can also be set to 'cloud', which lets the cloud provider setup appropriate routing
kube_network_plugin: calico

# Setting multi_networking to true will install Multus: https://github.com/k8snetworkplumbingwg/multus-cni
kube_network_plugin_multus: false

# Kubernetes internal network for services, unused block of space.
kube_service_addresses: 10.233.0.0/18

# internal network. When used, it will assign IP
# addresses from this range to individual pods.
# This network must be unused in your network infrastructure!
kube_pods_subnet: 10.233.64.0/18

# internal network node size allocation (optional). This is the size allocated
# to each node for pod IP address allocation. Note that the number of pods per node is
# also limited by the kubelet_max_pods variable which defaults to 110.
#
# Example:
# Up to 64 nodes and up to 254 or kubelet_max_pods (the lowest of the two) pods per node:
#  - kube_pods_subnet: 10.233.64.0/18
#  - kube_network_node_prefix: 24
#  - kubelet_max_pods: 110
#
# Example:
# Up to 128 nodes and up to 126 or kubelet_max_pods (the lowest of the two) pods per node:
#  - kube_pods_subnet: 10.233.64.0/18
#  - kube_network_node_prefix: 25
#  - kubelet_max_pods: 110
kube_network_node_prefix: 24
kubelet_max_pods: 200
# Configure Dual Stack networking (i.e. both IPv4 and IPv6)
enable_dual_stack_networks: false

# Kubernetes internal network for IPv6 services, unused block of space.
# This is only used if enable_dual_stack_networks is set to true
# This provides 4096 IPv6 IPs
kube_service_addresses_ipv6: fd85:ee78:d8a6:8607::1000/116

# Internal network. When used, it will assign IPv6 addresses from this range to individual pods.
# This network must not already be in your network infrastructure!
# This is only used if enable_dual_stack_networks is set to true.
# This provides room for 256 nodes with 254 pods per node.
kube_pods_subnet_ipv6: fd85:ee78:d8a6:8607::1:0000/112

# IPv6 subnet size allocated to each node for pods.
# This is only used if enable_dual_stack_networks is set to true
# This provides room for 254 pods per node.
kube_network_node_prefix_ipv6: 120

# The port the API Server will be listening on.
kube_apiserver_ip: "{{ kube_service_addresses|ipaddr('net')|ipaddr(1)|ipaddr('address') }}"
kube_apiserver_port: 6443  # (https)
kube_apiserver_request_timeout: 3600s

# Kube-proxy proxyMode configuration.
# Can be ipvs, iptables
kube_proxy_mode: ipvs

# configure arp_ignore and arp_announce to avoid answering ARP queries from kube-ipvs0 interface
# must be set to true for MetalLB, kube-vip(ARP enabled) to work
kube_proxy_strict_arp: false

# A string slice of values which specify the addresses to use for NodePorts.
# Values may be valid IP blocks (e.g. 1.2.3.0/24, 1.2.3.4/32).
# The default empty string slice ([]) means to use all local addresses.
# kube_proxy_nodeport_addresses_cidr is retained for legacy config
kube_proxy_nodeport_addresses: >-
  {%- if kube_proxy_nodeport_addresses_cidr is defined -%}
  [{{ kube_proxy_nodeport_addresses_cidr }}]
  {%- else -%}
  []
  {%- endif -%}

# If non-empty, will use this string as identification instead of the actual hostname
# kube_override_hostname: >-
#   {%- if cloud_provider is defined and cloud_provider in [ 'aws' ] -%}
#   {%- else -%}
#   {{ inventory_hostname }}
#   {%- endif -%}

## Encrypting Secret Data at Rest
kube_encrypt_secret_data: true

# Graceful Node Shutdown (Kubernetes >= 1.21.0), see https://kubernetes.io/blog/2021/04/21/graceful-node-shutdown-beta/
# kubelet_shutdown_grace_period has to be greater than kubelet_shutdown_grace_period_critical_pods to allow
# non-critical pods to also terminate gracefully
# kubelet_shutdown_grace_period: 60s
# kubelet_shutdown_grace_period_critical_pods: 20s

# DNS configuration.
# Kubernetes cluster name, also will be used as DNS domain
cluster_name: cluster.local
# Subdomains of DNS domain to be resolved via /etc/resolv.conf for hostnet pods
ndots: 2
# dns_timeout: 2
# dns_attempts: 2
# Custom search domains to be added in addition to the default cluster search domains
# searchdomains:
#   - svc.{{ cluster_name }}
#   - default.svc.{{ cluster_name  }}
# Remove default cluster search domains (``default.svc.{{ dns_domain }}, svc.{{ dns_domain }}``).
# remove_default_searchdomains: false
# Can be coredns, coredns_dual, manual or none
dns_mode: coredns
# Set manual server if using a custom cluster DNS server
# manual_dns_server: 10.x.x.x
# Enable nodelocal dns cache
enable_nodelocaldns: true
enable_nodelocaldns_secondary: false
nodelocaldns_ip: 169.254.25.10
nodelocaldns_health_port: 9254
nodelocaldns_second_health_port: 9256
nodelocaldns_bind_metrics_host_ip: false
nodelocaldns_secondary_skew_seconds: 5
# nodelocaldns_external_zones:
# - zones:
#   - example.com
#   - example.io:1053
#   nameservers:
#   - 1.1.1.1
#   - 2.2.2.2
#   cache: 5
# - zones:
#   - https://mycompany.local:4453
#   nameservers:
#   - 192.168.0.53
#   cache: 0
# - zones:
#   - mydomain.tld
#   nameservers:
#   - 10.233.0.3
#   cache: 5
#   rewrite:
#   - name website.tld website.namespace.svc.cluster.local
# Enable k8s_external plugin for CoreDNS
enable_coredns_k8s_external: false
coredns_k8s_external_zone: k8s_external.local
# Enable endpoint_pod_names option for kubernetes plugin
enable_coredns_k8s_endpoint_pod_names: false
# Set forward options for upstream DNS servers in coredns (and nodelocaldns) config
# dns_upstream_forward_extra_opts:
#   policy: sequential
# Apply extra options to coredns kubernetes plugin
# coredns_kubernetes_extra_opts:
#   - 'fallthrough example.local'
# Forward extra domains to the coredns kubernetes plugin
# coredns_kubernetes_extra_domains: ''

# Can be docker_dns, host_resolvconf or none
resolvconf_mode: host_resolvconf
# Deploy netchecker app to verify DNS resolve as an HTTP service
deploy_netchecker: false
# Ip address of the kubernetes skydns service
skydns_server: "{{ kube_service_addresses|ipaddr('net')|ipaddr(3)|ipaddr('address') }}"
skydns_server_secondary: "{{ kube_service_addresses|ipaddr('net')|ipaddr(4)|ipaddr('address') }}"
dns_domain: "{{ cluster_name }}"

## Container runtime
## docker for docker, crio for cri-o and containerd for containerd.
## Default: containerd
container_manager: containerd

# Additional container runtimes
kata_containers_enabled: false

kubeadm_certificate_key: "{{ lookup('password', credentials_dir + '/kubeadm_certificate_key.creds length=64 chars=hexdigits') | lower }}"

# K8s image pull policy (imagePullPolicy)
k8s_image_pull_policy: IfNotPresent

# audit log for kubernetes
kubernetes_audit: true
#audit_log_maxage: 30
#audit_log_maxbackups: 5
#audit_log_maxsize: 100
audit_log_path: "-"

# define kubelet config dir for dynamic kubelet
# kubelet_config_dir:
default_kubelet_config_dir: "{{ kube_config_dir }}/dynamic_kubelet_dir"

# pod security policy (RBAC must be enabled either by having 'RBAC' in authorization_modes or kubeadm enabled)
podsecuritypolicy_enabled: false

# Custom PodSecurityPolicySpec for restricted policy
# podsecuritypolicy_restricted_spec: {}

# Custom PodSecurityPolicySpec for privileged policy
# podsecuritypolicy_privileged_spec: {}

# Make a copy of kubeconfig on the host that runs Ansible in {{ inventory_dir }}/artifacts
# kubeconfig_localhost: false
# Use ansible_host as external api ip when copying over kubeconfig.
# kubeconfig_localhost_ansible_host: false
# Download kubectl onto the host that runs Ansible in {{ bin_dir }}
# kubectl_localhost: false

# Make a copy of kubeconfig on the host that runs Ansible in {{ inventory_dir }}/artifacts
kubeconfig_localhost: true

# Download kubectl onto the host that runs Ansible in {{ bin_dir }}
kubectl_localhost: true


# A comma separated list of levels of node allocatable enforcement to be enforced by kubelet.
# Acceptable options are 'pods', 'system-reserved', 'kube-reserved' and ''. Default is "".
# kubelet_enforce_node_allocatable: pods

## Set runtime and kubelet cgroups when using systemd as cgroup driver (default)
# kubelet_runtime_cgroups: "/{{ kube_service_cgroups }}/{{ container_manager }}.service"
# kubelet_kubelet_cgroups: "/{{ kube_service_cgroups }}/kubelet.service"

## Set runtime and kubelet cgroups when using cgroupfs as cgroup driver
# kubelet_runtime_cgroups_cgroupfs: "/system.slice/{{ container_manager }}.service"
# kubelet_kubelet_cgroups_cgroupfs: "/system.slice/kubelet.service"

# Optionally reserve this space for kube daemons.
# kube_reserved: false
## Uncomment to override default values
## The following two items need to be set when kube_reserved is true
# kube_reserved_cgroups_for_service_slice: kube.slice
# kube_reserved_cgroups: "/{{ kube_reserved_cgroups_for_service_slice }}"
# kube_memory_reserved: 256Mi
# kube_cpu_reserved: 100m
# kube_ephemeral_storage_reserved: 2Gi
# kube_pid_reserved: "1000"
# Reservation for master hosts
# kube_master_memory_reserved: 512Mi
# kube_master_cpu_reserved: 200m
# kube_master_ephemeral_storage_reserved: 2Gi
# kube_master_pid_reserved: "1000"

## Optionally reserve resources for OS system daemons.
# system_reserved: true
## Uncomment to override default values
## The following two items need to be set when system_reserved is true
# system_reserved_cgroups_for_service_slice: system.slice
# system_reserved_cgroups: "/{{ system_reserved_cgroups_for_service_slice }}"
# system_memory_reserved: 512Mi
# system_cpu_reserved: 500m
# system_ephemeral_storage_reserved: 2Gi
## Reservation for master hosts
# system_master_memory_reserved: 256Mi
# system_master_cpu_reserved: 250m
# system_master_ephemeral_storage_reserved: 2Gi

## Eviction Thresholds to avoid system OOMs
# https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#eviction-thresholds
# eviction_hard: {}
# eviction_hard_control_plane: {}

# An alternative flexvolume plugin directory
# kubelet_flexvolumes_plugins_dir: /usr/libexec/kubernetes/kubelet-plugins/volume/exec

## Supplementary addresses that can be added in kubernetes ssl keys.
## That can be useful for example to setup a keepalived virtual IP
# supplementary_addresses_in_ssl_keys: [10.0.0.1, 10.0.0.2, 10.0.0.3]

## Supplementary addresses that can be added in kubernetes ssl keys.
## That can be useful for example to setup a keepalived virtual IP
## IMPORTANT FOR LOADBALANCERS TO WORK
supplementary_addresses_in_ssl_keys: [192.168.200.101, 192.168.200.102, 192.168.200.103 ]

## Running on top of openstack vms with cinder enabled may lead to unschedulable pods due to NoVolumeZoneConflict restriction in kube-scheduler.
## See https://github.com/kubernetes-sigs/kubespray/issues/2141
## Set this variable to true to get rid of this issue
volume_cross_zone_attachment: false
## Add Persistent Volumes Storage Class for corresponding cloud provider (supported: in-tree OpenStack, Cinder CSI,
## AWS EBS CSI, Azure Disk CSI, GCP Persistent Disk CSI)
persistent_volumes_enabled: false

## Container Engine Acceleration
## Enable container acceleration feature, for example use gpu acceleration in containers
# nvidia_accelerator_enabled: true
## Nvidia GPU driver install. Install will by done by a (init) pod running as a daemonset.
## Important: if you use Ubuntu then you should set in all.yml 'docker_storage_options: -s overlay2'
## Array with nvidia_gpu_nodes, leave empty or comment if you don't want to install drivers.
## Labels and taints won't be set to nodes if they are not in the array.
# nvidia_gpu_nodes:
#   - kube-gpu-001
# nvidia_driver_version: "384.111"
## flavor can be tesla or gtx
# nvidia_gpu_flavor: gtx
## NVIDIA driver installer images. Change them if you have trouble accessing gcr.io.
# nvidia_driver_install_centos_container: atzedevries/nvidia-centos-driver-installer:2
# nvidia_driver_install_ubuntu_container: gcr.io/google-containers/ubuntu-nvidia-driver-installer@sha256:7df76a0f0a17294e86f691c81de6bbb7c04a1b4b3d4ea4e7e2cccdc42e1f6d63
## NVIDIA GPU device plugin image.
# nvidia_gpu_device_plugin_container: "registry.k8s.io/nvidia-gpu-device-plugin@sha256:0842734032018be107fa2490c98156992911e3e1f2a21e059ff0105b07dd8e9e"

## Support tls min version, Possible values: VersionTLS10, VersionTLS11, VersionTLS12, VersionTLS13.
# tls_min_version: ""

## Support tls cipher suites.
# tls_cipher_suites: {}
#   - TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
#   - TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256
#   - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
#   - TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
#   - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
#   - TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
#   - TLS_ECDHE_ECDSA_WITH_RC4_128_SHA
#   - TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA
#   - TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
#   - TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
#   - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
#   - TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
#   - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
#   - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
#   - TLS_ECDHE_RSA_WITH_RC4_128_SHA
#   - TLS_RSA_WITH_3DES_EDE_CBC_SHA
#   - TLS_RSA_WITH_AES_128_CBC_SHA
#   - TLS_RSA_WITH_AES_128_CBC_SHA256
#   - TLS_RSA_WITH_AES_128_GCM_SHA256
#   - TLS_RSA_WITH_AES_256_CBC_SHA
#   - TLS_RSA_WITH_AES_256_GCM_SHA384
#   - TLS_RSA_WITH_RC4_128_SHA

## Amount of time to retain events. (default 1h0m0s)
event_ttl_duration: "1h0m0s"

## Automatically renew K8S control plane certificates on first Monday of each month
auto_renew_certificates: false
# First Monday of each month
# auto_renew_certificates_systemd_calendar: "Mon *-*-1,2,3,4,5,6,7 03:{{ groups['kube_control_plane'].index(inventory_hostname) }}0:00"

# kubeadm patches path
kubeadm_patches:
  enabled: false
  source_dir: "{{ inventory_dir }}/patches"
  dest_dir: "{{ kube_config_dir }}/patches"

kubelet_custom_flags:
  - "--logging-format=json"
  - "--pod-max-pids=4096"
  - "--node-status-update-frequency=4s"
  - "--v=1"

kube_controller_node_monitor_grace_period: 20s
kube_controller_node_monitor_period: 2s

kube_kubeadm_controller_extra_args:
  pod-eviction-timeout: 30s
  logging-format: json

kube_kubeadm_apiserver_extra_args:
  logging-format: json

kube_apiserver_pod_eviction_not_ready_timeout_seconds: 30
kube_apiserver_pod_eviction_unreachable_timeout_seconds: 30
kube_apiserver_enable_admission_plugins:
  - PodNodeSelector
  - NodeRestriction

audit_policy_custom_rules: |
  - level: None
    omitStages:
      - RequestReceived
    verbs: [ "update" ]
    resources:
      - group: "batch"
        resources: [ "cronjobs" , "cronjobs" ]
      - group: ""
        resources: [ "cronjobs" ]

  - level: None
    verbs: [ "delete" ]
    resources:
      - group: "batch"
        resources: [ "jobs" ]

  - level: None
    verbs: [ "create", "patch" ]
    resources:
      - group: ""
        resources: [ "pods/log", "pods/status", "pods/portforward" ]

  - level: None
    verbs: [ "patch" ]
    resources:
      - group: ""
        resources: [ "events" ]

  - level: None
    verbs: [ "delete", "update" ]
    resources:
      - group: "crd.projectcalico.org"
        resources: [ "ipamhandles" ]

  - level: None
    verbs: [ "update" ]
    resources:
      - group: "crd.projectcalico.org"
        resources: [ "ipamblocks" ]

  - level: None
    verbs: [ "patch" ]
    resources:
      - group: ""
        resources: [ "nodes/status" ]

  - level: None
    verbs: [ "update" ]
    resources:
      - group: "discovery.k8s.io"
        resources: [ "endpointslices" ]

  - level: None
    verbs: [ "delete", "create" ]
    resources:
      - group: "dex.coreos.com"
        resources: [ "authrequests" ]

  - level: None
    verbs: [ "patch" ]
    resources:
      - group: "velero.io"
        resources: [ "backupstoragelocations", backupstoragelocations/status" ]

  - level: None
    verbs: [ "create" ]
    resources:
      - group: ""
        resources: [ "serviceaccounts/token" ]

  - level: None
    verbs: [ "patch" ]
    resources:
      - group: "argoproj.io"
        resources: [ "applications" ]

  - level: None
    verbs: [ "update" ]
    resources:
      - group: ""
        resources: [ "configmaps" ]

  - level: None
    resources:
      - group: "batch"
        resources: [ "cronjobs" , "cronjobs/status", "jobs/status" ]
      - group: ""
        resources: [ "cronjobs" , "cronjobs/status" ]
      - group: "autoscaling"
        resources: [ "horizontalpodautoscalers/status" ]

  # Don't log requests to a configmap called "controller-leader" / "ingress-controller-leader" / "istio-namespace-controller-election" / "istio-leader" / "istio-gateway-leader"
  - level: None
    resources:
      - group: ""
        resources: [ "configmaps" ]
        resourceNames: [ "controller-leader", "ingress-controller-leader", "istio-namespace-controller-election", "istio-leader", "ingress-controller-leader", "istio-gateway-leader" ]

  - level: None
    omitStages:
      - RequestReceived
    resources:
      - group: authentication.k8s.io
        resources:
          - tokenreviews
      - group: authorization.k8s.io
        resources:
          - subjectaccessreviews
          - selfsubjectaccessreviews

  - level: None
    verbs: [ "get", "list", "watch" ]

  - level: None
    verbs: [ "get", "list", "watch" ]
    resources:
      - group: "" # core
      - group: "admissionregistration.k8s.io"
      - group: "apiextensions.k8s.io"
      - group: "apiregistration.k8s.io"
      - group: "apps"
      - group: "authentication.k8s.io"
      - group: "authorization.k8s.io"
      - group: "autoscaling"
      - group: "certificates.k8s.io"
      - group: "extensions"
      - group: "metrics.k8s.io"
      - group: "networking.k8s.io"
      - group: "policy"
      - group: "rbac.authorization.k8s.io"
      - group: "settings.k8s.io"
      - group: "storage.k8s.io"

  - level: None
    resources:
      - group: ""
        resources: [ "configmaps" ]
        resourceNames: [ "cert-manager-cainjector-leader-election-core","cert-manager-cainjector-leader-election", "ingress-controller-leader-nginx" ]

  # Don't log requests to a configmaps in namespace "topolvm-system" & "pvc-autoresizer"
  - level: None
    resources:
      - group: ""
        resources: [ "configmaps" ]
    namespaces: [ "topolvm-system", "pvc-autoresizer" ]

  - level: None
    resources:
      - group: ""
        resources: [ "ippools", "endpoints" ]

  - level: None
    resources:
      - group: "coordination.k8s.io"
        resources: [ "leases" ]

  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: [ "system:kube-proxy" ]
    verbs: [ "watch" ]
    resources:
      - group: "" # core API group
        resources: [ "endpoints", "services" ]

  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: [ "system:authenticated" ]
    nonResourceURLs:
      - "/api*" # Wildcard matching.
      - "/version"

  # Dont log this non-resource URL paths.
  - level: None
    nonResourceURLs:
      - "/healthz*"
      - "/logs"
      - "/metrics"
      - "/swagger*"
      - "/version"

  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
      - group: "" # core API group
        resources: [ "configmaps" ]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: [ "kube-system" ]

  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
      - group: "" # core API group
        resources: [ "secrets", "configmaps" ]

  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
      - group: "" # core API group
      - group: "extensions" # Version of group should NOT be included.

  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"

calico_node_memory_requests: 200M
dns_memory_limit: 2000Mi
dns_cpu_limit: 2000m
nodelocaldns_memory_limit: 500Mi
nodelocaldns_cpu_requests: 50m
calico_loglevel: warning

@kfirfer kfirfer added the kind/bug Categorizes issue or PR as related to a bug. label Apr 1, 2023
@tan-zhuo

tan-zhuo commented Apr 3, 2023

The same thing happened to me

@kfirfer
Author

kfirfer commented Apr 3, 2023

FYI:
when I set nodelocaldns to HA (enable_nodelocaldns_secondary: true), it doesn't reproduce.

Edit:
it still reproduces, but less frequently.

@janghyukjin

I have the same error.

@dimsunv

dimsunv commented May 1, 2023

I have the same error after restarting the cluster.

@alekseyolg
Contributor

I am also getting a similar problem.

@jonathonflorek

I also encountered this problem (on kubespray 2.20).

My Fix

I found that setting resolvconf_mode: none fixed it. I would recommend retrying on a fresh host, though; I don't think the reset playbook cleans up the changes made for resolvconf_mode.

I suspect you could also set upstream_dns_servers to a non-empty list, if you know what you want that to be.
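
For reference, a minimal group_vars sketch of the two options (paths follow Kubespray's sample inventory layout used in this issue; 1.1.1.1/8.8.8.8 are placeholder resolvers):

# inventory/nucs/group_vars/k8s_cluster/k8s-cluster.yml
# Option A: don't point the host at nodelocaldns/coredns at all
resolvconf_mode: none

# inventory/nucs/group_vars/all/all.yml
# Option B: give coredns/nodelocaldns an explicit upstream instead of the host's /etc/resolv.conf
# upstream_dns_servers:
#   - 1.1.1.1
#   - 8.8.8.8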

Explanation

Without upstream_dns_servers set, both coredns and nodelocaldns 'fall back' to the host's DNS setting.
And with the default resolvconf_mode: host_resolvconf, the host is told to use nodelocaldns/coredns.

So you have a loop: host -> nodelocaldns -> host -> nodelocaldns -> ...

This is what nodelocaldns is detecting. The fix is just to break either the 'host -> nodelocaldns' or 'nodelocaldns -> host' link of the loop.
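
Simplified, the two halves of the loop look roughly like this (the real nodelocaldns Corefile is rendered from a template and differs by version):

# nodelocaldns "." zone: with no upstream_dns_servers it falls back to the host's resolv.conf
.:53 {
    forward . /etc/resolv.conf
}

# host /etc/resolv.conf after resolvconf_mode: host_resolvconf
nameserver 169.254.25.10   # nodelocaldns itself -> host -> nodelocaldns -> ...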

Kubespray also applies resolvconf_mode: host_resolvconf after nodelocaldns starts up, which is why it seems to work initially but fails after a cluster restart. The nodelocaldns loop check passed the first time because the loop wasn't configured yet.

Impact of fix

If you set resolvconf_mode: none then you won't be able to access services by their domain names from the hosts. DNS resolution still works fine within the pods, but from the host itself you will have to use a service's IP address. I haven't needed to do DNS resolution of cluster services on the host, though. And if you did, I think upstream_dns_servers would solve the problem just as well and keep host DNS working.

Relevant Files

https://github.com/kubernetes-sigs/kubespray/blob/release-2.20/roles/kubernetes-apps/ansible/templates/coredns-config.yml.j2#L55
https://github.com/kubernetes-sigs/kubespray/blob/release-2.20/roles/kubernetes-apps/ansible/templates/nodelocaldns-config.yml.j2#L83
https://github.com/kubernetes-sigs/kubespray/blob/release-2.20/roles/kubernetes-apps/ansible/tasks/nodelocaldns.yml#L63

@szwede

szwede commented May 7, 2023

Same issue here; it is quite misleading because the issue appears only after a node restart. Hopefully it will be fixed soon.

@perfectra1n

Same exact issue here, happened after a node restart.


@timm1k

timm1k commented Jul 3, 2023

(quoting @jonathonflorek's fix and explanation above: resolvconf_mode: none, or set upstream_dns_servers to a non-empty list)

Setting upstream_dns_servers helped for us; it also works for reviving clusters that fail to start.
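
With upstream_dns_servers set, the rendered "." zone no longer depends on the host's resolv.conf, so the loop cannot form; roughly (placeholder resolvers, exact Corefile depends on the template version):

.:53 {
    forward . 1.1.1.1 8.8.8.8
}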

@hedayat
Contributor

hedayat commented Oct 24, 2023

Try explicitly setting remove_default_searchdomains: false, which is supposed to be the default but it seems it is not (which is a bug IMHO).

More accurately, you need to have Domains=default.svc.cluster.local svc.cluster.local in the systemd-resolved configuration file (/etc/systemd/resolved.conf by default).

I'll create a PR for it if it works for others too.

Note: as others also mentioned, changes to resolved.conf are apparently not picked up unless you restart the cluster or the node.
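
Concretely, that means something like the following (a sketch assuming dns_domain cluster.local as in this issue; restart systemd-resolved or reboot afterwards for it to take effect):

# /etc/systemd/resolved.conf (managed by Kubespray's systemd-resolved template)
[Resolve]
DNS=169.254.25.10
Domains=default.svc.cluster.local svc.cluster.local

systemctl restart systemd-resolved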

hedayat added a commit to hedayat/kubespray that referenced this issue Oct 24, 2023
It was not 'false', which made some tasks (e.g. the systemd-resolved
template) effectively remove the default search domains; this caused a DNS loop
after rebooting the node/restarting the cluster, so the nodelocaldns service didn't
run correctly. Fixes kubernetes-sigs#9948
@yckaolalala
Contributor

In Jinja, if remove_default_searchdomains is undefined, it does not match 'remove_default_searchdomains is sameas false', so the effective default value is true.

{% if remove_default_searchdomains is sameas false or (remove_default_searchdomains is sameas true and searchdomains|default([])|length==0)%}
Domains={{ ([ 'default.svc.' + dns_domain, 'svc.' + dns_domain ] + searchdomains|default([])) | join(' ') }}
{% else %}
Domains={{ searchdomains|default([]) | join(' ') }}
{% endif %}
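
One way to make the undefined case behave like false would be to normalize the variable before testing it, e.g. (an illustrative sketch, not the actual template):

{% set _remove_default_searchdomains = remove_default_searchdomains | default(false) %}
{% if _remove_default_searchdomains is sameas false or (_remove_default_searchdomains is sameas true and searchdomains | default([]) | length == 0) %}
Domains={{ (['default.svc.' + dns_domain, 'svc.' + dns_domain] + searchdomains | default([])) | join(' ') }}
{% else %}
Domains={{ searchdomains | default([]) | join(' ') }}
{% endif %}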

@Himanshu631

Himanshu631 commented Dec 7, 2023

Hello, did anyone find a solution to this issue?
I am also facing the same issue on:
kubespray: v2.23.1
kubernetes: v1.28.3

The above setting did not work for me.

@Saigut

Saigut commented Dec 8, 2023

(quoting @Himanshu631's comment above)

Have you tried deleting the nodelocaldns pods to restart them after updating the settings above?
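
For example (labels/names assume the Kubespray defaults; adjust if your manifests differ):

# roll all nodelocaldns pods after changing group_vars or the configmap
kubectl -n kube-system rollout restart daemonset/nodelocaldns
# or delete the crashing pods directly so they are recreated immediately
kubectl -n kube-system delete pod -l k8s-app=nodelocaldns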

@thegamer1907

(quoting @jonathonflorek's fix and explanation above)

The following steps worked for me (a command sketch follows the list):

  1. Remove the k8s cluster using the reset Ansible playbook
  2. Fix DNS for all the VMs. I referred to this link
  3. Change the settings in the k8s config file for Kubespray. I changed 2 things: resolvconf_mode: none and remove_default_searchdomains: false
  4. Redeploy the cluster on all VMs
  5. Everything seems to be working fine now. Even after a restart, nodelocaldns comes up healthy.
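
Roughly, with the inventory from this issue (the group_vars edit in step 3 is the two variables already shown in earlier comments):

# 1. tear the cluster down
ansible-playbook -i inventory/nucs/hosts.yml reset.yml -b -u root
# 3. in inventory/nucs/group_vars/k8s_cluster/k8s-cluster.yml:
#      resolvconf_mode: none
#      remove_default_searchdomains: false
# 4. redeploy
ansible-playbook -i inventory/nucs/hosts.yml cluster.yml -b -u root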

@liyuntao
Copy link

A temporary quick fix for urgent cases (sketched below):

  1. Determine your external DNS IP on the nodes, e.g. 1.1.1.1
  2. Change the configmap for nodelocaldns: in the .53 zone, modify the forward . /etc/resolv.conf line to forward . 1.1.1.1
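
For example (a sketch; the configmap name and pod label assume Kubespray defaults, 1.1.1.1 is the placeholder from step 1):

kubectl -n kube-system edit configmap nodelocaldns
# in the ".:53" block change:
#   forward . /etc/resolv.conf
# to:
#   forward . 1.1.1.1
# then recreate the nodelocaldns pods so they pick up the new Corefile
kubectl -n kube-system delete pod -l k8s-app=nodelocaldns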

@mvarendorff2

mvarendorff2 commented Nov 13, 2024

I have been doing a fair bit of digging on this and I believe this is something that could be fixed in kubespray properly without the need for workarounds (then again, I am not super proficient in ansible or linux stuff in general so I'll let you be the judge).

The root cause seems to be (more or less) faulty content of the configured_nameservers variable here:

{{ (([nodelocaldns_ip] if enable_nodelocaldns else []) + (coredns_server | d([]) if not enable_nodelocaldns else []) + nameservers | d([]) + cloud_resolver | d([]) + (configured_nameservers | d([]) if not disable_host_nameservers | d() | bool else [])) | unique }}

Since it just reads /etc/resolv.conf, on most Ubuntu systems (among other distros) this will contain 127.0.0.53, which (according to ansible/ansible#56772) is just a stub for systemd-resolved. That IP is then written into /etc/dhcp/dhclient.conf, which will contain:

supersede domain-name-servers 169.254.25.10, 127.0.0.53;

which then on reboot is translated to an entry

nameserver 127.0.0.53

in /run/systemd/resolve/resolv.conf, which is then referenced by the cluster (as configured in /etc/kubernetes/kubelet-config.yaml), creating the DNS loop.


Using resolvconf_mode: none serves as a workaround because in that case no entries are written to /etc/dhcp/dhclient.conf and thus nothing propagates to the resolv.conf used by the cluster, but it feels like a workaround that shouldn't be needed.

Is it possible for kubespray to check the nameserver contents and either ignore the 127.0.0.53 entry or resolve the upstream nameservers defined by systemd-resolved when encountering it?
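
One possible shape for such a check (purely illustrative, not Kubespray's actual code) would be to filter loopback addresses out of configured_nameservers before building the list:

{{ (([nodelocaldns_ip] if enable_nodelocaldns else [])
    + (coredns_server | d([]) if not enable_nodelocaldns else [])
    + nameservers | d([]) + cloud_resolver | d([])
    + (configured_nameservers | d([]) | reject('match', '^127\.') | list
       if not disable_host_nameservers | d() | bool else [])) | unique }}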


Currently we manually run these cleanup operations on our small cluster after running the playbook (sketched after the list):

  1. Connect to the node with the faulty nodelocaldns pod
  2. Remove the loopback entry from /etc/dhcp/dhclient.conf:
- supersede domain-name-servers 169.254.25.10, 127.0.0.53;
+ supersede domain-name-servers 169.254.25.10;
  3. Remove the loopback nameserver from /run/systemd/resolve/resolv.conf (you are not supposed to edit it, but this avoids having to reboot the node again):
- nameserver 127.0.0.53
  4. Delete the failing nodelocaldns pod to force an immediate restart (skipping backoff timers)
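
Scripted, the cleanup looks roughly like this (run on the affected node; the pod label assumes Kubespray defaults and the node name is assumed to match the hostname):

# drop the loopback resolver from the dhclient config and the runtime resolv.conf
sudo sed -i 's/ 169.254.25.10, 127.0.0.53;/ 169.254.25.10;/' /etc/dhcp/dhclient.conf
sudo sed -i '/^nameserver 127\.0\.0\.53$/d' /run/systemd/resolve/resolv.conf
# recreate the failing nodelocaldns pod on this node (skips crash backoff)
kubectl -n kube-system delete pod -l k8s-app=nodelocaldns --field-selector spec.nodeName=$(hostname)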

Admittedly this is rather clunky, and if possible I would love to see this edge case covered in kubespray! I am also willing to contribute a fix (although I would need a pointer or two in the right direction, as I am unfamiliar with the repository and Ansible).

@perfectra1n

The easiest fix for me was just to use Talos instead of Kubespray. Best decision I made in years 🤣
