Reload storm upon startup / potential security hole #147
@woopstar

**(1) Hitting the wrong container**

Let's assume that you have 3 Ingress resources, each with its own domain: Ingress1 - domain1, Ingress2 - domain2, Ingress3 - domain3. This problem can be addressed for http requests if you start the Ingress controller with the … For https requests there is no option for now, but we will definitely add it. I can suggest a quick workaround to add a catch-all server for https requests:

First, add the following lines to `nginx.conf.tmpl` to configure the default server for https requests:

```nginx
listen 443 ssl default_server;
ssl_certificate /etc/nginx/default-ssl/tls.crt;
ssl_certificate_key /etc/nginx/default-ssl/tls.key;
```

Build the image with the modified `nginx.conf.tmpl`.

Second, deploy a secret with an SSL key and certificate for the default server. In our example, it is `cafe-secret`.

Third, deploy the NGINX Ingress controller, adding the following lines to the Replication Controller/Deployment to mount that secret at `/etc/nginx/default-ssl`:

```yaml
volumeMounts:
- name: default-secret
  mountPath: /etc/nginx/default-ssl
  readOnly: true
volumes:
- name: default-secret
  secret:
    secretName: cafe-secret
```

Let me know if this workaround addresses your problem.

**(2) The warm-up period**

Also, it is possible to add a warm-up period to the Pod specification using a readiness probe, as shown below:

```yaml
readinessProbe:
  httpGet:
    path: /nginx-health
    port: 80
  initialDelaySeconds: 20
  periodSeconds: 5
```

The NGINX pod will be considered ready 20 seconds after it starts. This works when you expose NGINX using a Service, in conjunction with a NodePort or a cloud load balancer that utilizes the NodePort. If you are exposing NGINX through another load balancer and not using the NodePort, it is possible to add the initialDelaySeconds behavior to the Controller, so that when the Controller is started with …
So I looked a bit into this issue again. The first part, using a custom nginx config and adding a default ssl server block, would work. I like that option, as I have not found any other solution that actually works better; it was the same option I arrived at. Regarding the readiness probe: you are right that, under the circumstances where you expose it using a Service etc., you could use the probe to make it turn ready. I think an initial delay is so far the best option, while maybe still adding the debounce option. I am not sure yet. If we drain a node, update it, and restart it, we have the problem of the ingress controller being ready "too soon" for our in-front load balancer, causing it to receive requests while the reload storm is on.
In this case it looks like delaying the successful health check response should work, if your front-end load balancer supports HTTP health checks. Is that the case? The debounce option can make sure that the initial configuration of NGINX happens in a single step, as opposed to the current behavior, where Ingress resources are added sequentially, reloading NGINX each time. However, I am not sure about use cases where the current behavior causes problems. Need to think more about it.
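Not part of the thread's code, but as a rough illustration of "delaying the successful health check response": a minimal standalone Go sketch, assuming a hypothetical health endpoint on port 8080 and a 20-second warm-up window (none of this is the controller's actual implementation):

```go
package main

import (
	"net/http"
	"sync/atomic"
	"time"
)

func main() {
	var ready int32 // 0 = warming up, 1 = ready

	// Hypothetical warm-up window: flip to ready after 20 seconds.
	time.AfterFunc(20*time.Second, func() { atomic.StoreInt32(&ready, 1) })

	// Until the window elapses, the health endpoint returns 503, so an
	// HTTP-health-checking load balancer keeps the pod out of rotation
	// while the reload storm is still in progress.
	http.HandleFunc("/nginx-health", func(w http.ResponseWriter, r *http.Request) {
		if atomic.LoadInt32(&ready) == 0 {
			http.Error(w, "warming up", http.StatusServiceUnavailable)
			return
		}
		w.Write([]byte("healthy"))
	})
	http.ListenAndServe(":8080", nil)
}
```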
I had a developer help out, and we added the debounce functionality to the reload function in the forked version of this ingress controller that we use. It is only forked because we have altered the templates used. You can see the changes here: https://github.com/pasientskyhosting/kubernetes-ingress/pull/1/files#diff-b042f5fa65dd46d9ae76cb0616601e95L22
@woopstar thx. I'll take a closer look. Does it solve your issue?
Not totally. We still see another race condition too. We have about 200 Ingresses that use the same certificate. We see a race condition where the certificate is written, nginx is reloaded, and then the controller rewrites the file while nginx is trying to read it. I am thinking the best solution is to alter the name of the certificate file to include the name of the Ingress it belongs to. I do know that would give you 200 "equal" files, but then you won't have the race condition where a file is read while it is being written to multiple times.
Yup. Race condition of reading certificates is gone. If you want, we can create a pull request. I did this change:
@woopstar Also, the change will cause issues if you have multiple secrets referenced from an Ingress resource, because the corresponding pem files will get the same name. Thus, you will have only one pem file instead of multiple, which is a bug.
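For what it's worth, a standard alternative to per-Ingress file names is an atomic write: write the pem to a temporary file and rename it over the target, so NGINX always reads either the old or the new complete file, never a half-written one. A minimal Go sketch with illustrative paths (this is not what the controller does in the thread):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writePemAtomically writes data to a temp file in the same directory and
// then renames it over the target path. On Linux, rename(2) is atomic, so
// a concurrently reloading NGINX never observes a partially written file.
func writePemAtomically(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".pem-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // harmless if the rename below succeeded

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

func main() {
	// Illustrative use with a placeholder path and payload.
	if err := writePemAtomically("/etc/nginx/secrets/default.pem", []byte("...")); err != nil {
		fmt.Println(err)
	}
}
```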
We have about 200 Ingresses in our setup as of now.
When we start an nginx-ingress controller, nginx starts as soon as the pod comes up. But then a reload storm starts while the controller fetches all Ingresses and reloads the config for each of them (creating the conf.d files).
Is there any option to have the nginx controller do some sort of warm-up, where it fetches all Ingresses upon startup and THEN starts nginx?
We see that the controller starts to serve pages well before it is even ready during this reload storm, causing responses to be very funky.
Also, when this happens - maybe an implementation of some sort of debounce on the nginx reload (see the sketch after this paragraph)? (http://reactivex.io/documentation/operators/debounce.html) - I found a Go implementation that should be fairly easy to apply to the reload function: https://github.com/bep/debounce or even this great example https://nathanleclaire.com/blog/2014/08/03/write-a-function-similar-to-underscore-dot-jss-debounce-in-golang/
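For illustration, a minimal debounce sketch in Go in the spirit of the linked examples; the 500 ms window and the reload stub are placeholders, not the controller's actual reload code:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// debounce returns a function that delays calls to fn until the wrapped
// function has not been invoked for the given interval, collapsing a
// burst of reload requests into a single reload.
func debounce(interval time.Duration, fn func()) func() {
	var mu sync.Mutex
	var timer *time.Timer
	return func() {
		mu.Lock()
		defer mu.Unlock()
		if timer != nil {
			timer.Stop() // a newer call supersedes the pending one
		}
		timer = time.AfterFunc(interval, fn)
	}
}

func main() {
	reload := debounce(500*time.Millisecond, func() {
		fmt.Println("reloading nginx once for the whole burst")
	})
	// Simulate 200 Ingress events arriving in quick succession;
	// only one reload fires, half a second after the last event.
	for i := 0; i < 200; i++ {
		reload()
	}
	time.Sleep(time.Second) // give the debounced reload time to fire
}
```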
Here is a sample output while all our nginx controllers are reloading upon an update: first a connection refused, and then suddenly it hits a Jetty container, while the normal container that serves the request is PHP. (Oh boy, can this leak information?)