Tweaks to network config #1407
Fixes kubernetes-retired#1407. For now this just documents the `podCIDR` aspects, but we should probably switch the default based on the networking setup, and possibly combine this config with the `selfHosting` config into a networking section.
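To make the "networking section" idea concrete, a combined layout could look roughly like the sketch below. The key names and nesting are hypothetical, not the current cluster.yaml schema; the CIDR values are just the defaults discussed in the linked issue.

```yaml
# Hypothetical cluster.yaml layout - the `networking` grouping and key nesting
# are illustrative only, not what kube-aws exposes today.
networking:
  podCIDR: 10.244.0.0/16     # e.g. the Canal/Flannel default; 192.168.0.0/16 for Calico
  serviceCIDR: 10.3.0.0/24   # example service range, must not overlap the pod CIDR
  selfHosting:
    type: canal
```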
Hi, good catch. I vote for all of the above except 4, which I don't rightly understand: what is the significance of using one CIDR range over another? Can't we decide our own CIDR and then accommodate the defaults and kube manifests around that choice? What is the advantage of using 192.168.0.0 with Calico? We choose our own CIDR ranges, so I don't think we ever keep the default settings.
@davidmccormick I'm not sure at this stage about 4, or why every set of instructions I found seems to indicate it has to be set to that. At least on the surface our defaults work right now, so I assume there's some other subtle issue, or everyone has just copied the instructions verbatim 😆 (including kubeadm).
:) I think we should stick with the same default podCIDR unless we find hard evidence to change it.
Sure, we can at least do that and have the other bit ready. One thing though: I'm struggling to identify the cause of a full cluster outage, so I asked for the PR not to be merged yet (see the message there; it's either related to this change or something else). Surprisingly, for the downed cluster, a brand new cluster built from a backup/restore via Ark didn't work; the restored pods appeared to make the new cluster fragile in some way. Redeploying everything from source left it healthy. I suspect it was the CIDR change, although it could be the newer Calico or the other network changes. I did follow the instructions from your recent PR supporting CIDR changes with cluster downtime. I also found a few kube issues about waiting for CNI plugins to respond that seem to have similar symptoms. We still have the node problem detector backlog ticket in kube-aws of course, but that'll only patch over the root cause.
Commits:

* Tweaks to network config
* Correct pod CIDR notes
While looking at some issues with CNI and hostPort connections mentioned in #704 (comment), I found some discrepancies between the kube-aws network configuration and what's expected or perhaps optimal. I've since discovered the cause of my hostPort issues was something unrelated but the items I found are likely worth including in kube-aws. I'm going to list them here first and we can always split them up.
* We should set `externalSetMarkChain` to `KUBE-MARK-MASQ` in the CNI config to reuse existing iptables chains. It's not in the default YAMLs but I found it here. It seems our friend redbaron found the same (see the CNI config sketch after this list).
* We should set the `--service-cluster-ip-range` flag on the controller manager, as it prevents IPs being assigned in the service CIDR range should it overlap with the pod CIDR range. This doesn't seem to be documented anywhere I could find, but most bootstrap instructions set it and I just found we didn't set it. Dug into the code and found this (see the controller-manager sketch after this list).
* When using Canal or Flannel, we should be defaulting the `podCIDR` to `10.244.0.0/16`; when using Calico it should be `192.168.0.0/16`. References: Calico, Canal, kubeadm instructions and code. I'm wondering what the impact of setting these to anything else is, as our default right now is `10.2.0.0/16` and it appears to work for basic functionality. Perhaps the impact is in network policy enforcement?
* We should update `cniVersion` from `0.3.0` to `0.3.1`.
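To make the first and last items concrete, here's a rough sketch of where they would land in the CNI config that the self-hosted Calico/Canal manifests ship via a ConfigMap. The ConfigMap name and the surrounding plugin fields are illustrative and trimmed down; the relevant parts are `cniVersion` set to `0.3.1` and the `portmap` plugin reusing kube-proxy's existing `KUBE-MARK-MASQ` chain via `externalSetMarkChain`.

```yaml
# Illustrative fragment, modelled on the self-hosted Calico/Canal manifests;
# the names and trimmed-down plugin settings are assumptions, not our exact config.
kind: ConfigMap
apiVersion: v1
metadata:
  name: calico-config
  namespace: kube-system
data:
  # CNI conflist template rendered to /etc/cni/net.d/ on each node.
  cni_network_config: |-
    {
      "name": "k8s-pod-network",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "calico",
          "ipam": { "type": "calico-ipam" },
          "policy": { "type": "k8s" },
          "kubernetes": { "kubeconfig": "__KUBECONFIG_FILEPATH__" }
        },
        {
          "type": "portmap",
          "capabilities": { "portMappings": true },
          "snat": true,
          "externalSetMarkChain": "KUBE-MARK-MASQ"
        }
      ]
    }
```

With `externalSetMarkChain` set, the portmap plugin marks hostPort traffic for masquerading using the chain kube-proxy already maintains, instead of creating its own duplicate chain.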
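For the `--service-cluster-ip-range` flag, a hedged sketch of the controller-manager side is below. The image, invocation style, and CIDR values are examples (10.244.0.0/16 being the Canal/Flannel default pod CIDR discussed above), not the exact kube-aws manifest.

```yaml
# Illustrative kube-controller-manager static pod excerpt; the image tag is a
# placeholder, only the three flags are the point here.
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
    - name: kube-controller-manager
      image: quay.io/coreos/hyperkube:v1.9.3_coreos.0   # example image/tag
      command:
        - /hyperkube
        - controller-manager
        - --allocate-node-cidrs=true
        # Pod CIDR carved up per node; 10.244.0.0/16 is the Canal/Flannel default.
        - --cluster-cidr=10.244.0.0/16
        # Tells the node IPAM controller about the service range so per-node
        # pod CIDR allocations never land inside it.
        - --service-cluster-ip-range=10.3.0.0/24
```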