DNS errors cause global-rate-limit-memcached-connect-timeout to not be respected. #11337
This issue is currently awaiting triage. If Ingress contributors determine this is a relevant issue, they will accept it by applying the `triage/accepted` label and provide further guidance.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/remove-kind bug

> dns should hit the service and not the pod ips so when they're replaced dns doesn't just timeout. (I haven't figured this one out yet.)
The pod doesn't have to crash; any dropped DNS queries to resolve the memcached host will cause the configured timeout not to be followed (assuming the memcached timeout is less than the DNS retry interval). As an example, take the coredns pod IP.
Delete the pod, or rollout restart the deployment, and the new pod will get a new IP.
By DNS, I mean the DNS query that ingress-nginx uses to resolve global-rate-limit-memcached-host to an IP address. By service, I mean the coredns service, in this case 10.65.0.10. (10.65/16 is this example cluster's service range; 10.64/16 is the pod CIDR range, with /24s given to each node to be assigned to pods.)
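For context, the resolver address a pod is handed can be checked directly; in this example cluster it should be the coredns service IP, not a pod IP (a minimal sketch, assuming the names from this setup):

```sh
# kubelet points pods at the coredns *service* IP (10.65.0.10 here),
# which stays stable when the coredns pods themselves are replaced.
kubectl exec -n ingress-nginx deploy/ingress-nginx-controller -- cat /etc/resolv.conf
```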
Now connections are kept alive to memcached, but if the memcached deployment also gets an update or its pods get rescheduled, ingress-nginx will try to reconnect and query coredns for the IP. When this happens (only for queries to ingress objects using global-rate-limit), this shows up in the Hubble/Cilium logs.
(Notice how this goes to a coredns pod IP and not to the coredns service address.) Then, after the DNS query to resolve global-rate-limit-memcached-host times out, the request to the backend pod proceeds as normal, so every request is delayed by an extra 10s. I haven't figured out why this happens; querying the memcached hostname from within the nginx pod resolves correctly.
This suggests to me that something within ingress-nginx/openresty/lua is doing something odd with DNS, but I haven't figured that part out yet. (It could also be some weird issue with cilium/ebpf.) None of these specifics really matter: regardless of the underlying issue, it shouldn't cause a 10s delay to requests when the timeout is configured to be 50ms.
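One way to see where the lua resolver is actually sending queries is to capture DNS traffic from the controller pod (a sketch; it assumes tcpdump is available in the image, otherwise capture from the node or use Hubble):

```sh
# Queries going to a coredns *pod* IP instead of the service IP
# are the ones that start timing out after a coredns rollout.
kubectl exec -n ingress-nginx deploy/ingress-nginx-controller -- tcpdump -ni any udp port 53
```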
Thanks for explaining. But none of your data is actual live status from the output of commands, so I personally don't see much relevance pointing to a problem in the controller, like the suggestion that you configured a 50ms timeout for something somewhere in the controller and the controller did not honor it.
But these are my opinions. Wait for comments from experts who agree that a user breaking coredns showcases a bug in the ingress-controller image you are running, which may not be an image released by this project.
Also, if the coredns pods crashed and were recreated, is there a hint here that the IP address of the memcached pod changed? I wish you get to the bottom of this issue and solve future occurrences of this break in rate-limiting.
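If it helps, whether the memcached pod IP changed across the rollout can be read off the endpoints (a sketch; the `memcached` service name and its namespace are assumptions):

```sh
# A changed endpoint IP here is what forces the lua memcached client
# to re-resolve the hostname on its next reconnect.
kubectl get endpoints memcached -o wide
kubectl get endpointslices -l kubernetes.io/service-name=memcached
```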
This is using the ingress-nginx helm chart; you can see the output of `nginx-ingress-controller --version` in the description.
You can also see the helm chart values I used to reproduce this issue in the description. The only thing configured is the memcached host.
The 50ms timeout is the default, as shown here:
DNS caching is implementation specific; if you have more information on the lua resolver's DNS cache implementation, that would be helpful. As indicated by my nslookup command from within the nginx controller pod, the pod can resolve the host correctly, and uses the coredns service address to do so, but based on the traffic captured, the lua code for the global-rate-limit does not. This does not require blocking traffic to coredns, or coredns crashing. Coredns was updated, and updates work by creating a new pod, waiting for it to become ready, updating the service/endpoint slices, and then removing the old pods.
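For completeness, the in-pod lookup referenced above is just the following (a sketch; the memcached hostname is a placeholder for the configured global-rate-limit-memcached-host):

```sh
# Resolves via the coredns service address from /etc/resolv.conf,
# so it keeps working after the coredns pods are replaced.
kubectl exec -n ingress-nginx deploy/ingress-nginx-controller -- nslookup memcached.default.svc.cluster.local
```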
Thanks for your comments. If I were to summarize the problem, it would look like below. You are reporting that after a crash/recreate of the coredns pods in a cluster, it takes 10s for ingress-nginx-controller to resolve the memcached workload service to a memcached pod IP address, and this 10s is a blind spot for the rate-limit annotation because it can NOT get the counters in those 10 seconds. You are expecting that it should take only 50ms, after the crash/recreate of coredns pods in a cluster, for ingress-nginx to resolve & query the memcached workload and get the rate-limit counters for the rate-limit annotation. Is this correct?
2 issues, where issue 1 causes issue 2:

1. DNS resolution from within the global-rate-limit code path does not use the service IP, and does not reach coredns if the pod IPs change. This does not apply to nslookup from within the ingress-nginx pod, or to resolving the backend pods for an ingress. (I believe that uses the k8s API instead of DNS, but regardless.)
2. Any DNS lookup failure (i.e. a timeout, not an immediate NXDOMAIN) will cause traffic through ingress-nginx to be delayed by 10s, even when the timeout configured for the global-rate-limit is set to 50ms.

Here's an easier way you can reproduce issue 2. This shows both the input (commands I run) and the output; I'll put just the commands further down.

```
user@computer ~ % kind create cluster
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.29.2) 🖼
✓ Preparing nodes 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:
kubectl cluster-info --context kind-kind
Not sure what to do next? 😅 Check out https://kind.sigs.k8s.io/docs/user/quick-start/
user@computer ~ % helm upgrade --install ingress-nginx ingress-nginx \
--repo https://kubernetes.github.io/ingress-nginx \
--namespace ingress-nginx --create-namespace \
--set controller.config.global-rate-limit-memcached-host=asdf.blackhole.alphabet5.dev
Release "ingress-nginx" does not exist. Installing it now.
NAME: ingress-nginx
LAST DEPLOYED: Fri May 3 12:21:37 2024
NAMESPACE: ingress-nginx
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
It may take a few minutes for the load balancer IP to be available.
You can watch the status by running 'kubectl get service --namespace ingress-nginx ingress-nginx-controller --output wide --watch'
An example Ingress that makes use of the controller:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: example
namespace: foo
spec:
ingressClassName: nginx
rules:
- host: www.example.com
http:
paths:
- pathType: Prefix
backend:
service:
name: exampleService
port:
number: 80
path: /
# This section is only required if TLS is to be enabled for the Ingress
tls:
- hosts:
- www.example.com
secretName: example-tls
If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided:
apiVersion: v1
kind: Secret
metadata:
name: example-tls
namespace: foo
data:
tls.crt: <base64 encoded cert>
tls.key: <base64 encoded key>
type: kubernetes.io/tls
user@computer ~ % cat <<EOF > manifests.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: dummy
spec:
selector:
matchLabels:
app: dummy
replicas: 1
template:
metadata:
labels:
app: dummy
spec:
containers:
- name: podinfo
image: stefanprodan/podinfo
ports:
- containerPort: 9898
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: dummy
annotations:
nginx.ingress.kubernetes.io/global-rate-limit: "1000000"
nginx.ingress.kubernetes.io/global-rate-limit-window: "1m"
spec:
ingressClassName: nginx
rules:
- host: "localhost"
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: dummy
port:
number: 9898
---
apiVersion: v1
kind: Service
metadata:
name: dummy
spec:
selector:
app: dummy
ports:
- protocol: TCP
port: 9898
targetPort: 9898
EOF
user@computer ~ % kubectl apply -f manifests.yaml
deployment.apps/dummy created
ingress.networking.k8s.io/dummy created
service/dummy created
user@computer ~ % sudo kubectl port-forward -n ingress-nginx svc/ingress-nginx-controller 80:80
user@computer ~ % Forwarding from 127.0.0.1:80 -> 80
Forwarding from [::1]:80 -> 80
user@computer ~ % time curl http://localhost
Handling connection for 80
user@computer ~ % time curl http://localhost
Handling connection for 80
{
"hostname": "dummy-749c8b54cd-js75j",
"version": "6.6.2",
"revision": "8d010c498e79f499d1b37480507ca1ffb81a3bf7",
"color": "#34577c",
"logo": "https://raw.githubusercontent.com/stefanprodan/podinfo/gh-pages/cuddle_clap.gif",
"message": "greetings from podinfo v6.6.2",
"goos": "linux",
"goarch": "arm64",
"runtime": "go1.22.2",
"num_goroutine": "6",
"num_cpu": "4"
}
curl http://localhost  0.00s user 0.00s system 0% cpu 6.028 total
```
Commands only:

```sh
kind create cluster
helm upgrade --install ingress-nginx ingress-nginx \
--repo https://kubernetes.github.io/ingress-nginx \
--namespace ingress-nginx --create-namespace \
--set controller.config.global-rate-limit-memcached-host=asdf.blackhole.alphabet5.dev
cat <<EOF > manifests.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: dummy
spec:
selector:
matchLabels:
app: dummy
replicas: 1
template:
metadata:
labels:
app: dummy
spec:
containers:
- name: podinfo
image: stefanprodan/podinfo
ports:
- containerPort: 9898
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: dummy
annotations:
nginx.ingress.kubernetes.io/global-rate-limit: "1000000"
nginx.ingress.kubernetes.io/global-rate-limit-window: "1m"
spec:
ingressClassName: nginx
rules:
- host: "localhost"
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: dummy
port:
number: 9898
---
apiVersion: v1
kind: Service
metadata:
name: dummy
spec:
selector:
app: dummy
ports:
- protocol: TCP
port: 9898
targetPort: 9898
EOF
sudo kubectl port-forward -n ingress-nginx svc/ingress-nginx-controller 80:80
time curl http://localhost
```

If you modify the memcached host to something that doesn't exist, the timeout works as expected (i.e. when the failure is not DNS-related but in the socket connect).
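An even simpler probe of the non-DNS failure mode is to point the memcached host at an unused IP literal so the lookup plays no part (a sketch; 192.0.2.1 is a TEST-NET placeholder that should never answer):

```sh
# With an IP that resolves trivially, only the memcached connect can
# fail, and the configured connect timeout is honored as expected.
helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace \
  --set controller.config.global-rate-limit-memcached-host=192.0.2.1
```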
For this last reproduction, the time is not exactly 10s because the SERVFAIL time will vary based on your upstream DNS servers. If lua can't connect to your DNS server, it will always be 10s (because it's configured for 5 retries with a 2s timeout).

```
time nslookup asdf.blackhole.alphabet5.dev 1.1.1.1
Server: 1.1.1.1
Address: 1.1.1.1#53
** server can't find asdf.blackhole.alphabet5.dev: SERVFAIL
real 0m4.079s
user 0m0.005s
sys  0m0.005s

time nslookup asdf.blackhole.alphabet5.dev 8.8.8.8
;; communications error to 8.8.8.8#53: timed out
;; communications error to 8.8.8.8#53: timed out
;; communications error to 8.8.8.8#53: timed out
;; no servers could be reached
real 0m15.131s
user 0m0.004s
sys  0m0.008s
```
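The resolver's worst case can also be approximated from the command line (a sketch; dig's +tries/+timeout are used to mirror the lua defaults, and 203.0.113.1 is an unreachable placeholder address):

```sh
# 5 attempts x 2s each ~= 10s before giving up, matching the
# 5-retry / 2000ms configuration noted above.
time dig @203.0.113.1 asdf.blackhole.alphabet5.dev +tries=5 +timeout=2
```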
Hi @alphabet5, thank you very much for the updated info you provided. Appreciate the time taken to give details. These 2 messages from you, highlighted below, seem to be the key to the issue you are reporting:

> DNS resolution from within the global-rate-limit code path does not use the service ip, and does not reach coredns if the pod IPs change.

and this message:

> This does not apply to nslookup from within the ingress-nginx pod, or to resolving the backend pods for an ingress. (I believe this uses the k8s api instead of dns, but regardless.)

I think it will help if you copy/paste the links to the code snippets you are referring to. Meanwhile I will try to look it up and see if I can understand the details of the 2 messages highlighted above.
This is stale, but we won't close it automatically; just bear in mind the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any question or request to prioritize this, please reach out.
What happened:
We had a unique failure where coredns pods were removed/replaced during an upgrade. ingress-nginx was still sending traffic to the old pod IPs, which was getting dropped by cilium (because they were stale IPs not assigned to anything).
The default DNS timeout is 5 retries x 2000ms, so a 10s delay happens even when global-rate-limit-memcached-connect-timeout is set to 50ms.
ingress-nginx/rootfs/etc/nginx/lua/util/dns.lua, line 99 (commit 300f772)

This is the resulting log message:
What you expected to happen:
1. DNS should hit the service and not the pod IPs, so when they're replaced DNS doesn't just time out. (I haven't figured this one out yet.)
2. The 50ms timeout should be followed even if DNS is down (the relevant setting is sketched below).
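For reference, that timeout is the controller ConfigMap key named in the title; a minimal way to set it explicitly via helm values (a sketch; the value is in milliseconds and 50 is already the default):

```sh
# Even with the connect timeout pinned to 50ms, a DNS timeout in the
# lua resolver still stalls requests for ~10s before the connect runs.
helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace \
  --set controller.config.global-rate-limit-memcached-connect-timeout="50"
```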
For #2, it's easily reproducible:

- Ingress values.yaml
- All the manifests
- block dns for testing

This results in the following set of log messages:
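One crude way to "block dns for testing" is to drop DNS at the node level (an assumption-heavy sketch; only do this in a throwaway cluster, and a NetworkPolicy targeting kube-dns is a cleaner alternative):

```sh
# With DNS dropped, the lua resolver's queries time out instead of
# failing fast, and requests through the controller stall ~10s.
iptables -I OUTPUT -p udp --dport 53 -j DROP
iptables -I OUTPUT -p tcp --dport 53 -j DROP
```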
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):
```
NGINX Ingress controller
  Release:       v1.10.1
  Build:         4fb5aac
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.25.3
```
Kubernetes version (use `kubectl version`): (also on v1.27.8+rke2r1 and v1.28.8+rke2r1)
Environment:

- OS: Debian GNU/Linux 12 (bookworm), Ubuntu 22.04, Ubuntu 20.04
- Kernel (`uname -a`): v5.15? (20.04) and 6.1.0-18-amd64