Reload storm upon startup / potential security hole #147

Closed
woopstar opened this issue Jun 1, 2017 · 8 comments

Comments

@woopstar
Contributor

woopstar commented Jun 1, 2017

We have about 200 Ingresses in our setup as of now.

When we start an nginx-ingress controller, nginx starts as soon as the pod comes up. But then a reload storm begins while the controller fetches all the Ingresses and reloads the config for each of them (creating the conf.d files).

Is there any option to have the nginx controller do some sort of warm-up, where it fetches all the Ingresses upon startup and THEN starts nginx?
We see that the controller starts to serve pages well before it is even ready during this reload storm, causing the responses to be very funky.

Also, when this happens - maybe implement some sort of debounce on the nginx reload? (http://reactivex.io/documentation/operators/debounce.html) - I found a Go implementation that should be fairly easy to apply to the reload function: https://github.com/bep/debounce or even this great example https://nathanleclaire.com/blog/2014/08/03/write-a-function-similar-to-underscore-dot-jss-debounce-in-golang/
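
For illustration, here is a minimal hand-rolled sketch of the kind of debounce wrapper I have in mind (not taken from the linked packages); reloadNginx and the 500ms window are hypothetical stand-ins for the controller's real reload routine and whatever delay makes sense:

    package main

    import (
        "log"
        "sync"
        "time"
    )

    // debounce returns a function that postpones running fn until `wait` has
    // elapsed since the most recent call, so a burst of Ingress updates ends
    // up triggering a single nginx reload instead of one reload per update.
    func debounce(wait time.Duration, fn func()) func() {
        var mu sync.Mutex
        var timer *time.Timer
        return func() {
            mu.Lock()
            defer mu.Unlock()
            if timer != nil {
                timer.Stop()
            }
            timer = time.AfterFunc(wait, fn)
        }
    }

    // reloadNginx is a hypothetical stand-in for the controller's reload routine.
    func reloadNginx() {
        log.Println("reloading nginx")
    }

    func main() {
        requestReload := debounce(500*time.Millisecond, reloadNginx)
        // Simulate a reload storm: 200 Ingresses processed in quick succession.
        for i := 0; i < 200; i++ {
            requestReload()
        }
        time.Sleep(time.Second) // only one reload actually fires
    }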

Here is a sample of the output while all our nginx controllers are reloading upon an update. First a connection refused, and then suddenly it hits a Jetty container, while the container that normally serves the request is PHP. (Oh boy, can this leak information?)

# [20:59:09] ~
$ curl https://domain/internal-api
{"code":0,"message":"No route found for \u0022GET \/internal-api\u0022"}%

# [20:59:10] ~
$ curl https://domain/internal-api
curl: (7) Failed to connect to domain port 443: Connection refused

# [20:59:11] ~
$ curl https://domain/internal-api
{"code":0,"message":"No route found for \u0022GET \/internal-api\u0022"}%

# [20:59:15] ~
$ curl https://domain/internal-api
{"code":0,"message":"No route found for \u0022GET \/internal-api\u0022"}%

# [20:59:16] ~
$ curl https://domain/internal-api
{"code":0,"message":"No route found for \u0022GET \/internal-api\u0022"}%

# [20:59:17] ~
$ curl https://domain/internal-api
{"code":0,"message":"No route found for \u0022GET \/internal-api\u0022"}%

# [20:59:18] ~
$ curl https://domain/internal-api
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 404 </title>
</head>
<body>
<h2>HTTP ERROR: 404</h2>
<p>Problem accessing /internal-api. Reason:
<pre>    Not Found</pre></p>
<hr /><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.3.z-SNAPSHOT</a><hr/>
</body>
</html>

# [20:59:19] ~
$ curl https://domain/internal-api
{"code":0,"message":"No route found for \u0022GET \/internal-api\u0022"}%
@pleshakov
Contributor

@woopstar
I see two issues here:

(1) Hitting the wrong container
This happens because currently there is no catch-all server in NGINX for SSL connections.

Let's assume that you have 3 Ingress resources, each with its own domain: Ingress1 - domain1, Ingress2 - domain2, Ingress3 - domain3.
If you deploy Ingress1 and Ingress2, and then try to make a request to https://domain3, you will hit either the services behind Ingress1 or the services behind Ingress2, depending on the order in which the NGINX configuration for those Ingresses was added. Once you deploy Ingress3, you will hit the services behind Ingress3.

This problem can be addressed for http requests if you start the Ingress controller with the -health-status option, which configures the default catch-all server for http requests.

For https requests, there is no such option for now, but we will definitely add it.

I can suggest a quick workaround to add the catch-all server for https requests:

First, add the following lines to nginx.conf.tmpl to configure the default server for https requests:

        listen 443 ssl default_server;

        ssl_certificate /etc/nginx/default-ssl/tls.crt;
        ssl_certificate_key /etc/nginx/default-ssl/tls.key;

Build the image with the modified nginx.conf.tmpl.

Second, deploy a secret with an SSL key and certificate for the default server. In our example, it is cafe-secret.

Third, deploy the NGINX Ingress controller, adding the following lines to the Replication Controller/Deployment to mount that secret at the /etc/nginx/default-ssl folder:

        volumeMounts:
        - name: default-secret
          mountPath: /etc/nginx/default-ssl
          readOnly: true
      volumes:
      - name: default-secret
        secret:
          secretName: cafe-secret

Let me know if this workaround addresses your problem.

(2) The warm-up period
There is no easy way to add all Ingresses at once in a single step. However, assuming that the "hitting the wrong container" issue is addressed, is this still important?

Also, it is possible to add a warm-up period to the Pod Specification using a Readiness Probe, as shown below:

        readinessProbe:
          httpGet:
            path: /nginx-health
            port: 80
          initialDelaySeconds: 20
          periodSeconds: 5

The NGINX pod will be considered ready 20 seconds after it starts. This works when you expose NGINX using a Service, in conjunction with a NodePort or a cloud load balancer that utilizes the NodePort.

If you are exposing NGINX through another load balancer and not using the NodePort, it is possible to add initialDelaySeconds to the Controller, so that when the Controller is started with -health-status, NGINX returns a successful health check only after the initialDelaySeconds period elapses. What do you think?
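
To illustrate the idea only (this is not how -health-status is actually implemented - the real health check is served by an NGINX location), here is a rough Go sketch of gating a health endpoint on a startup delay; the flag name and port are made up:

    package main

    import (
        "flag"
        "net/http"
        "time"
    )

    func main() {
        // Hypothetical flag mirroring the proposal: report failure for a fixed
        // period after startup so the load balancer keeps traffic away until
        // the initial configuration is in place.
        delay := flag.Duration("health-initial-delay", 20*time.Second,
            "period during which the health check reports failure")
        flag.Parse()

        start := time.Now()
        http.HandleFunc("/nginx-health", func(w http.ResponseWriter, r *http.Request) {
            if time.Since(start) < *delay {
                // Still warming up: not ready to receive traffic yet.
                http.Error(w, "warming up", http.StatusServiceUnavailable)
                return
            }
            w.WriteHeader(http.StatusOK)
            w.Write([]byte("healthy"))
        })
        http.ListenAndServe(":8080", nil)
    }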

@woopstar
Contributor Author

woopstar commented Jun 7, 2017

So I looked a bit into this issue again.

The first part, using a custom nginx config and adding a default ssl server to the health block, would work. I like that option, as I have not found any other solution that actually works better. It is the same option I arrived at.

Regarding the readiness probe: you are right that in the case where you expose it using a Service etc., you could use the probe to make it turn ready.
Unfortunately we do not expose it in that way; it uses the host network, so we cannot use that option.

I think an initial delay is so far the best option, while maybe still adding the debounce option. I am not sure yet. If we drain a node, update it and restart it, we have the problem of the ingress controller being ready "too soon" for our in-front load balancer, causing it to get requests while the reload storm is on.

@pleshakov
Contributor

> I think an initial delay is so far the best option, while maybe still adding the debounce option. I am not sure yet. If we drain a node, update it and restart it, we have the problem of the ingress controller being ready "too soon" for our in-front load balancer, causing it to get requests while the reload storm is on.

In this case it looks like delaying the successful health check response should work, if your front end load balancer supports HTTP health checks. Is that the case?

The debounce option can make sure that the initial configuration of NGINX happens in a single step, as opposed to the current behavior where Ingress resources are added sequentially, reloading NGINX each time. However, I am not sure about the use cases where the current behavior causes problems. I need to think more about it.

@woopstar
Contributor Author

I had a developer help out and add the debounce functionality to the reload function in the forked version of this ingress controller we use. It is only forked because we have altered the templates used.

You can see the changes here: https://github.com/pasientskyhosting/kubernetes-ingress/pull/1/files#diff-b042f5fa65dd46d9ae76cb0616601e95L22

@pleshakov
Contributor

@woopstar thx. I'll take a closer look. Does it solve your issue?

@woopstar
Contributor Author

Not totally. We still see another race condition too.

We have about 200 Ingresses that use the same certificate. We see a race condition where the certificate file is written, nginx is reloaded, and then the controller rewrites the file while nginx tries to read it.

I am thinking the best solution is to alter the name of the certificate file to include the Ingress name it belongs to. I do know that would give you 200 "equal" files, but then you won't have the race condition where a file is read while it is being written to multiple times.

@woopstar
Contributor Author

Yup. The race condition when reading certificates is gone. If you want, we can create a pull request. I made this change:

pasientskyhosting@5796702

@pleshakov
Contributor

@woopstar
Thanks for pointing out the problem. Unfortunately, the problem may still occur even with your change applied. It is better to change how a pem file gets written, so that there is no way the file can be partially written. For that, we need to write the content to a temp file and then rename the temp file over the final one.
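
Roughly like the sketch below - writePemAtomically and the paths are made up, just to show the temp-file-then-rename approach, not the controller's actual code:

    package main

    import (
        "log"
        "os"
        "path/filepath"
    )

    // writePemAtomically writes the content to a temporary file in the same
    // directory and then renames it over the final path. The rename is atomic
    // on the same filesystem, so nginx can never read a partially written pem.
    func writePemAtomically(path string, content []byte) error {
        dir := filepath.Dir(path)
        tmp, err := os.CreateTemp(dir, filepath.Base(path)+".tmp-")
        if err != nil {
            return err
        }
        defer os.Remove(tmp.Name()) // no-op once the rename has succeeded

        if _, err := tmp.Write(content); err != nil {
            tmp.Close()
            return err
        }
        if err := tmp.Close(); err != nil {
            return err
        }
        return os.Rename(tmp.Name(), path)
    }

    func main() {
        // Hypothetical path; the controller would use its real pem location.
        err := writePemAtomically("/etc/nginx/secrets/default.pem",
            []byte("-----BEGIN CERTIFICATE-----\n..."))
        if err != nil {
            log.Fatal(err)
        }
    }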

Also, the change will cause issues if you have multiple secrets referenced from an Ingress resource, because the corresponding pem files will get the same name. Thus, you will have only one pem file instead of multiple, which is a bug.
