This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

Tweaks to network config #1407

Closed
cknowles opened this issue Jul 15, 2018 · 4 comments

@cknowles
Contributor

cknowles commented Jul 15, 2018

While looking at some issues with CNI and hostPort connections mentioned in #704 (comment), I found some discrepancies between the kube-aws network configuration and what's expected or perhaps optimal. I've since discovered the cause of my hostPort issues was something unrelated, but the items I found are likely worth including in kube-aws. I'll list them here first and we can always split them up later.

  1. We are not setting the MTU for Calico in the CNI config. This means that for Canal the MTU of the flannel interface is 8951 while all the Calico interfaces are at 1500. It still works, but I believe it's not the optimal value. According to the Calico docs, we should set this to 8951 to match flannel (see the config sketch after this list):

When using flannel for networking, the MTU for the network interfaces should match the MTU of the flannel interface. In the above table the 4th column “Calico MTU with VXLAN” is the expected MTU when using flannel configured with VXLAN.

  2. We should set externalSetMarkChain to KUBE-MARK-MASQ in the CNI config to reuse the existing iptables chain; it's included in the sketch below. It's not in the default YAMLs, but I found it here. It seems our friend redbaron found the same:

externalSetMarkChain - string, default nil. If you already have a Masquerade mark chain (e.g. Kubernetes), specify it here. This will use that instead of creating a separate chain. When this is set, markMasqBit must be unspecified.

  3. We should set the --service-cluster-ip-range flag on the controller manager, as it prevents IPs from being assigned in the service CIDR range should it overlap with the pod CIDR range. This doesn't seem to be documented anywhere I could find, but most bootstrap instructions set it, and I just found that we don't. I dug into the code and found this. A sketch of the flag follows the list.

  4. When using Canal or flannel, we should be defaulting the podCIDR to 10.244.0.0/16; when using Calico it should be 192.168.0.0/16 (see the cluster.yaml sketch below). References: Calico, Canal, kubeadm instructions and code. I'm wondering what the impact of setting these to anything else is, as our default right now is 10.2.0.0/16 and it appears to work for basic functionality. Perhaps the impact is in network policy enforcement?

  5. We should update cniVersion from 0.3.0 to 0.3.1 (reflected in the sketch below).
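To make items 1, 2 and 5 concrete, here's a minimal sketch of what the Canal CNI conflist could end up looking like. It's illustrative only: the plugin layout, the ipam settings and the kubeconfig path are assumptions rather than kube-aws's actual rendered config (JSON doesn't allow comments, so the assumptions are flagged here instead). Note externalSetMarkChain sits in the portmap plugin entry, since that's the plugin handling hostPort:

```json
{
  "name": "canal",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "calico",
      "mtu": 8951,
      "policy": { "type": "k8s" },
      "ipam": { "type": "host-local", "subnet": "usePodCidr" },
      "kubernetes": { "kubeconfig": "/etc/cni/net.d/calico-kubeconfig" }
    },
    {
      "type": "portmap",
      "capabilities": { "portMappings": true },
      "externalSetMarkChain": "KUBE-MARK-MASQ"
    }
  ]
}
```

Per the portmap docs quoted above, when externalSetMarkChain is set, markMasqBit must be left unspecified.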
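For item 3, the flag would land in the controller-manager arguments, roughly as below. The hyperkube invocation and the 10.3.0.0/24 service CIDR are assumptions based on common kube-aws setups; --cluster-cidr and --allocate-node-cidrs are the standard companion flags:

```yaml
# Sketch of kube-controller-manager args (values illustrative, not kube-aws's actual manifest)
command:
  - /hyperkube
  - controller-manager
  - --allocate-node-cidrs=true
  - --cluster-cidr=10.2.0.0/16              # current kube-aws default pod CIDR
  - --service-cluster-ip-range=10.3.0.0/24  # assumed service CIDR; must not overlap the pod CIDR
```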
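And for item 4, in cluster.yaml terms the suggestion would look roughly like this (key names assumed; whether we should actually change the default is the open question):

```yaml
# Illustrative cluster.yaml excerpt; 10.244.0.0/16 per the flannel/Canal references above
podCIDR: "10.244.0.0/16"      # suggested default for flannel/Canal
# podCIDR: "192.168.0.0/16"   # suggested default for plain Calico
serviceCIDR: "10.3.0.0/24"    # assumed value; must not overlap podCIDR
```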

@cknowles cknowles self-assigned this Jul 15, 2018
@cknowles cknowles added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 15, 2018
cknowles pushed a commit to cknowles/kube-aws that referenced this issue Jul 15, 2018
Fixes kubernetes-retired#1407.

For now just document the `podCIDR` aspects but we should probably switch the default based on the networking setup, possibly combine this config with the `selfHosting` config into a networking section.
@davidmccormick
Contributor

davidmccormick commented Jul 16, 2018

Hi

Good catch. I vote for all of the above, except 4: I don't rightly understand the significance of using one CIDR range over another. Can't we decide our own CIDR and then accommodate the defaults and kube manifests around that choice? What is the advantage of using 192.168.0.0/16 with Calico?

We choose our own CIDR ranges, so I don't think we ever keep the default settings.

@cknowles
Contributor Author

@davidmccormick I'm not sure at this stage about 4, or why every set of instructions I found seems to indicate it has to be set to that. At least on the surface our defaults work right now, so I assume there's either some other subtle issue or everyone has just copied the instructions verbatim 😆 (including kubeadm).

@davidmccormick
Contributor

:) I think we should stick with the same default podCIDR unless we find hard evidence to change it.
All the others sound like things we should merge ASAP.

@cknowles
Contributor Author

cknowles commented Jul 17, 2018

Sure, we can at least do that and have the other bit ready.

One thing though: I'm struggling to identify the cause of a full cluster outage, so I asked for the PR not to be merged yet (see the message there; it's either related to this change or to #1397). Nodes started marking themselves as NotReady intermittently, and sometimes NotSchedulable, until gradually all of them fell over. A dev cluster with a few nodes did not experience that issue, which is why we promoted the changes to a larger cluster with more use, and that's where the problems started. Actually, the dev cluster did experience it on one node just after we had successfully restored the downed cluster; after killing that one node, the dev cluster has been fine for another couple of days.

Surprisingly, for the downed cluster, a brand new cluster using a backup/restore via Ark didn't work; the restored pods appeared to make the new cluster fragile in some way. Redeployed everything from source and it was healthy. I suspect it was the CIDR change, although it could be the newer Calico or the other network changes. I did follow the instructions from your recent PR supporting CIDR changes with cluster downtime. I also found a few Kubernetes issues about waiting for CNI plugins to respond which seem to have similar symptoms.

We still have the node problem detector ticket in the kube-aws backlog of course, but that'll only patch over the root cause.

mumoshu pushed a commit that referenced this issue Sep 16, 2018
* Tweaks to network config

Fixes #1407.

For now just document the `podCIDR` aspects but we should probably switch the default based on the networking setup, possibly combine this config with the `selfHosting` config into a networking section.

* Correct pod CIDR notes