
Documentation: setting up ingress on metal with the right, secure CNI configuration #17

Closed
sandys opened this issue Nov 16, 2016 · 16 comments

Comments

@sandys

sandys commented Nov 16, 2016

Based on a conversation with @Beeps on Slack.

The BIGGEST problem with ingress is the common assumption that "you will use Google Cloud or an AWS ELB". There is exactly zero information on how to use ingress on metal. What is a production configuration? Do the ingress controllers sit on another node, etc.? This is EXTREMELY confusing, because if you do L4 routing there is a dependence on configuring CNI/Flannel/Calico the right way, especially with respect to security. Weave gets even more confusing.

I think the very basic configuration I'm looking for is for ingress to act as a pipe and pass everything through to an nginx pod inside. There are challenges in setting up the ingress to do this correctly, and in maintaining information like the actual source IP. We are really blocked on this stuff. Most of the docs around ingress delve into the more complex aspects of how to set up the annotations correctly, etc., but miss out on just creating a pipe.

Since most of us have existing infrastructure that we are porting to K8s... I believe this is the best first example.
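
To make the ask concrete: the "just a pipe" case corresponds to an Ingress with only a default backend and no rules. A minimal sketch, assuming a hypothetical nginx-svc Service and the extensions/v1beta1 API that was current at the time (newer clusters use networking.k8s.io/v1 with spec.defaultBackend instead):

```yaml
# Catch-all Ingress: every request is passed straight through to one Service.
# Whether the client source IP survives depends on the controller and on how
# traffic reaches it (see the later comments about OnlyLocal / externalTrafficPolicy).
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: passthrough
spec:
  backend:
    serviceName: nginx-svc   # hypothetical Service in front of the nginx pods
    servicePort: 80
```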

@sandys
Author

sandys commented Nov 16, 2016

As a follow-up: some people are running ingress as DaemonSets. Not sure if that is one of the recommendations; I would still like to see how the networking, security, etc. configuration needs to be set up.

@nordri

nordri commented Nov 17, 2016

I found this, which works for me: https://medium.com/@rothgar/exposing-services-using-ingress-with-on-prem-kubernetes-clusters-f413d87b6d34#.4ifqsb8ge

@sandys
Author

sandys commented Nov 20, 2016

@nordri there are other questions: should an ingress be scaled with a load balancer in front? Or is the ingress treated as a load balancer itself?

Would you create a DaemonSet of ingresses?

Remember that there is no working load balancer on metal (integrated into k8s, unlike on the cloud providers).

@rothgar
Member

rothgar commented Nov 21, 2016

I'm the author of the post above and I can clarify at least how I run it.

My ingress controller is a DaemonSet on all my worker nodes. I have a hardware load balancer with a VIP that routes traffic to the ingress controllers (:80 and :443). In this setup I have redundant (layer 4) load balancers in front of ingress controllers that act as layer 7 load balancers.
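
For reference, the rough shape of that as a manifest; the image, tag, and labels below are placeholders rather than the exact config from the post, and controller-specific flags (e.g. the default backend the nginx controller requires) are omitted:

```yaml
# Sketch: one ingress controller per worker node, bound to host ports 80/443,
# so an external (hardware) load balancer can simply target every node.
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: nginx-ingress-controller
spec:
  template:
    metadata:
      labels:
        app: nginx-ingress
    spec:
      containers:
      - name: nginx-ingress-controller
        image: gcr.io/google_containers/nginx-ingress-controller:0.8.3  # placeholder image/tag
        ports:
        - containerPort: 80
          hostPort: 80     # external LB VIP forwards :80 here on every node
        - containerPort: 443
          hostPort: 443    # and :443
```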

The problems I have run into with my setup are:

  • The hardware load balancers are not synced to Kubernetes at all and are currently configured manually. Someone showed me an F5 controller for Kubernetes which looks promising.
  • The current ingress controllers I've used (nginx/traefik) have only been layer 7. nginx and haproxy are capable of TCP/UDP routing, but I haven't tried them as controllers.
  • The nginx controller has some weird oddities with routing if you don't have a default (catch-all) service defined and use TLS termination.

I've seen lots of people who use keepalived + haproxy externally for dynamic service load balancing. I've also heard of plenty of people who have automated their hardware load balancers to do similar things. Some companies also use anycast for routing to IPs based on location and for failover. You could technically even have an external nginx controller that routes your traffic for you.

With on-prem there are almost too many ways to document "the way" of doing things. Any company will want to leverage their existing hardware, network configuration, and expertise to accomplish the same goal.

With GCE and AWS you could do many of the same things, but why would you when they already have a better way to do that?

@sandys
Author

sandys commented Nov 21, 2016

@rothgar I think bare metal needs very specific attention, and so do many others. Can you chime in? https://groups.google.com/forum/m/#!topic/kubernetes-dev/ztVnvbrpTK8

Hopefully (if nothing else) we can at least have a Traefik/haproxy setup in k8s.

@bprashanth
Contributor

We should make the nginx controller work well on bare metal: https://github.com/kubernetes/ingress/tree/master/controllers/nginx. We're still transferring/rewriting the docs from contrib, but these steps should basically work: https://github.com/kubernetes/contrib/tree/master/ingress/controllers/nginx#deployment, and it load-balances straight to endpoints. If you try it, please report where it doesn't.

I know people also stick keepalived in front: https://github.com/kubernetes/contrib/tree/master/keepalived-vip, though that's similar to just using clusterIP + a NodePort service.

@sandys
Author

sandys commented Nov 24, 2016 via email

@bprashanth
Contributor

This is a systems problem, not really specific to ingress, though I agree we can document the options better. The tradeoffs are the same as for when to use a DaemonSet vs a Deployment.

Get traffic to the controller; the controller applies routing rules.

If your cluster needs a bank of load balancers, use a Deployment; the clusterIP will send traffic to one of them, which applies the algorithm and routes to an endpoint. I think you're asking how to avoid the extra hop, not how to avoid a "loadbalancer". You can't if you're using the default kube-proxy implementation of clusterIP, but oftentimes the convenience of clusterIP trumps the latency of that hop.
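
As a sketch of the "bank of load balancers" option (names, replica count, and image are illustrative):

```yaml
# A scaled Deployment of ingress controllers behind a plain ClusterIP Service.
# kube-proxy picks one replica per connection; that replica applies the L7 rules.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: ingress-controllers
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: ingress-controller
    spec:
      containers:
      - name: nginx-ingress-controller
        image: gcr.io/google_containers/nginx-ingress-controller:0.8.3  # placeholder
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: ingress-vip        # intra-cluster VIP; this is the extra hop mentioned above
spec:
  selector:
    app: ingress-controller
  ports:
  - port: 80
    targetPort: 80
```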

You can avoid the extra hop for true ingress traffic (not intra-cluster traffic via clusterIP) by deploying a DaemonSet + NodePort service with the OnlyLocal annotation (http://kubernetes.io/docs/user-guide/load-balancer/#annotation-to-modify-the-loadbalancer-behavior-for-preservation-of-source-ip, e.g. kubernetes/kubernetes#35758 (comment)).
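
A sketch of that Service, assuming it selects the DaemonSet pods; at the time of this thread the behavior came from the external-traffic annotation linked above (the exact key changed between releases), and on current clusters it is the externalTrafficPolicy field shown here:

```yaml
# NodePort Service in front of a per-node ingress-controller DaemonSet.
# With Local/OnlyLocal, a node only accepts traffic for pods it actually hosts,
# which removes the extra hop and preserves the client source IP.
apiVersion: v1
kind: Service
metadata:
  name: ingress-nodeport
spec:
  type: NodePort
  externalTrafficPolicy: Local   # modern equivalent of the OnlyLocal annotation
  selector:
    app: nginx-ingress           # assumed label on the ingress-controller DaemonSet pods
  ports:
  - name: http
    port: 80
    targetPort: 80
    nodePort: 30080              # illustrative; omit to let the apiserver pick one
```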

@sandys
Author

sandys commented Nov 24, 2016

Hi @bprashanth,
Actually I did not know there was even a feature called clusterIP that can load-balance across a scaled deployment of ingress controllers (and avoid the need to muck about with keepalived/traefik), and I have discussed this a LOT on Slack.

Is there a place where I can take a look at such a setup?

@aledbf
Member

aledbf commented Nov 24, 2016

> https://github.com/kubernetes/contrib/tree/master/keepalived-vip, though that's similar to just using clusterIP + a nodePort service

Using keepalived you can, for instance, choose the load balancing algorithm or weights.

> what happens at scale? do you have a scaled deployment of ingress or do you do a daemonset?

What do you mean by "at scale"?
As a comment, the commercial version of nginx suggests keepalived (the software keepalived-vip wraps) as the solution for HA on bare metal: https://www.nginx.com/resources/admin-guide/nginx-ha-keepalived/
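
If I remember the keepalived-vip README correctly, its configuration is just a ConfigMap mapping a VIP to a namespace/service, roughly like the sketch below (the IP and service name are illustrative, and the exact ConfigMap name the controller watches may differ):

```yaml
# keepalived-vip style config: expose namespace/service behind a floating VIP.
apiVersion: v1
kind: ConfigMap
metadata:
  name: vip-configmap
data:
  10.0.0.50: default/nginx-ingress   # VIP -> namespace/service (illustrative values)
```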

@sandys
Author

sandys commented Nov 24, 2016

@aledbf well, I'm not second-guessing keepalived; I'm just asking for a recommended way on metal. One way is keepalived, another is hostPort with an ingress DaemonSet... today I learned about clusterIP (I don't know if that is a good alternative to keepalived).

@nordri

nordri commented Nov 24, 2016

I just wonder why @rothgar needs an ingress controller on each node.
I'm using Flannel as the network overlay, so I just need an ingress that routes the traffic to the appropriate node.

@bprashanth
Contributor

bprashanth commented Nov 24, 2016

clusterIP is just the IP each Kubernetes Service gets: http://kubernetes.io/docs/user-guide/services/#choosing-your-own-ip-address; currently it only supports round-robin or IP-based affinity. It's the easiest way to get an intra-cluster VIP in kube. The ingress documentation is not the place to explain how clusterIP works, though there should probably be a section comparing keepalived with clusterIP.
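
To make those two knobs concrete, here is a plain Service spec with a chosen cluster IP and client-IP affinity (names and the IP are illustrative; the IP must fall inside the cluster's service CIDR):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ingress-controller
spec:
  clusterIP: 10.96.0.50        # optional: pick your own IP from the service range
  sessionAffinity: ClientIP    # the "IP-based affinity" option; default is None (round-robin)
  selector:
    app: ingress-controller
  ports:
  - port: 80
    targetPort: 80
```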

Flannel is only going to handle L3/L4 traffic; to apply routing rules you actually need to parse HTTP, buffer the request, match regexes, etc. You will need multiple controllers as QPS goes up. Whether you choose to pre-provision one per node or reactively scale up a Deployment is really an architectural design choice that the admin needs to make. Some people keep it at one per node so it can share node secrets, for example. Some people even run it as a sidecar (one per pod) for the same reason.

@rothgar
Member

rothgar commented Nov 26, 2016

@nordri I use an ingress controller per node because I have a hardware load balancer routing to all of my worker nodes on :80 and :443. This gives me a stable IP address to route all my services to (the same thing you get with keepalived and anyip). If you only have one ingress controller and you lose the node it is running on, you would need to change DNS aliases for the services you're routing. You could do that through keepalived, but it can also be done through a load balancer.

@sandys
Author

sandys commented Nov 26, 2016

@bprashanth thanks for the explanation. Yes, my request would be to add some information about using clusterIP with ingress. That would be really useful; I would like to reiterate that this is the first time I'm even hearing about this possibility, despite discussing this on Slack several times.

The ingress documentation should also cover deployment patterns. With all due respect, perhaps a lot of people really do understand the nuances of a scaled ingress Deployment vs a DaemonSet vs a sidecar, but for a lot of us this is new.
