Skip to content
This repository was archived by the owner on Jul 30, 2021. It is now read-only.

CNI: Enable portmap plugin by default #697

Merged
merged 1 commit into from
Sep 19, 2017
Merged

CNI: Enable portmap plugin by default #697

merged 1 commit into from
Sep 19, 2017

Conversation

klausenbusk
Copy link
Contributor

@klausenbusk klausenbusk commented Aug 29, 2017

With this change, hostPort is finally working.

Fix #662


Require: coreos/flannel-cni#5 (due to the file extension change).

cc @aaronlevy

I have tested this on my bootkube cluster.

@coreosbot
Copy link

Can one of the admins verify this patch?

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 29, 2017
@aaronlevy
Copy link
Contributor

Sorry this is lingering so long. Was waiting to see outcome of PR to flannel-cni (also @dghubble is on vacation for a bit)

@dghubble
Copy link
Contributor

Responded in coreos/flannel-cni#5 (comment). Once that merges and a new version of flannel-cni is published, we may want to validate that we haven't broken anything by first bumping the version here, showing it still works, then (later PR) switching to the new CNI config to enable host port.

@dghubble
Copy link
Contributor

Added #705 to show flannel-cni:v0.3.0 is backwards compatible. In this PR, can you update to the new release and use the new default config which has portmap, thanks.

@klausenbusk
Copy link
Contributor Author

Added #705 to show flannel-cni:v0.3.0 is backwards compatible. In this PR, can you update to the new release and use the new default config which has portmap, thanks.

Done..

@dghubble
Copy link
Contributor

dghubble commented Sep 12, 2017

Can you rebase please?

@dghubble
Copy link
Contributor

ok to test

aaronlevy
aaronlevy previously approved these changes Sep 12, 2017
Copy link
Contributor

@aaronlevy aaronlevy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@dghubble
Copy link
Contributor

Passes tests and doesn't regress, but I haven't verified this fixes hostPort as claimed.

@dghubble
Copy link
Contributor

dghubble commented Sep 12, 2017

Wait, @klausenbusk, the CNI_CONF_NAME override was to show that the bump was backwards compatible. When rebasing and updating to the new CNI config, you previously noted the name had to be the conflist variant, so I think you'll now want to remove the name and use the default.

Needs rebase and minor tweak.

@klausenbusk
Copy link
Contributor Author

klausenbusk commented Sep 13, 2017 via email

With this change, hostPort is finally working.

Fix #662
@klausenbusk
Copy link
Contributor Author

Rebase done

@dghubble
Copy link
Contributor

ok to test

dghubble added a commit to poseidon/terraform-render-bootstrap that referenced this pull request Sep 13, 2017
dghubble added a commit to poseidon/terraform-render-bootstrap that referenced this pull request Sep 13, 2017
@dghubble
Copy link
Contributor

dghubble commented Sep 13, 2017

I'm not so confident in this change or these green tests yet. I'm seeing a lot of flakiness and the fact different users of bootkube are using different plugins (hyperkube vs host) I still can't verify this works properly across environments.

Edit: Admittedly there are a bunch of issues with download speeds in the lab.

@aaronlevy
Copy link
Contributor

If we wanted to we could switch all the bootkube examples to use the bin dir provided by the CNI plugin itself (and not use the hyperkube fs copies). Should just be a matter of adding a --cni-bin-dir flag that points to a location that is a hostmount shared with kubelet (via kubelet.service).

@dghubble
Copy link
Contributor

I did get this up using the on-host CNI plugins (which I think we should try to use everywhere, but weren't being used where I was testing), but creating the reproduction case shown in kubernetes/kubernetes#23920, netstat shows no host port binding. @klausenbusk what was your verification?

@dghubble
Copy link
Contributor

@aaronlevy The hack examples are bind mounting it in the default location, it should be in use even without the flag (though I don't usually use the hack examples so not 100% sure). In other environments like Matchbox and Tectonic, the hyperkube copies are being used. I wasn't aware of this until recently, it wasn't my intent in Matchbox at least.

@klausenbusk
Copy link
Contributor Author

klausenbusk commented Sep 14, 2017

@klausenbusk what was your verification?

I'm using hostPort with the nginx ingress controller in production, and it is working fine. The flannel-cni image is slightly older through.

bootkube v0.6.1 and nodes configured with a slightly modified hack/quickstart/init-node.sh

@klausenbusk
Copy link
Contributor Author

netstat shows no host port binding.

Same here, but it works. I'm not sure why we can't see the bind through.

@klausenbusk
Copy link
Contributor Author

I'm not sure why we can't see the bind through.

It seems like it is handled by a firewall rule:

-A CNI-DN-bc1f43f7a441389471188 -p tcp -m tcp --dport 80 -j DNAT --to-destination 10.2.0.3:80

@dghubble
Copy link
Contributor

Makes sense, portmap is described as being a completely iptables based mechanism. It'll be kinda opaque to debug, but its better than the current non-functional state. My test pod with hostPort does seem to be addressable as claimed. 👍

@dghubble
Copy link
Contributor

dghubble commented Sep 14, 2017

Oh, btw, above I described how after this change, projects using bootkube must use the host's CNI plugins or a really new hyperkube. If not you'll get network: incompatible CNI versions; config is \"0.3.1\", plugin supports [\"0.1.0\" \"0.2.0\" \"0.3.0\"]". You might consider just leaving the version at 0.3.0 for now (haven't tested) to be a little more lenient.

Tectonic, Matchbox, and others need to change before updating or use the fact we allowed this change to be backwards compatible (but not get hostPort). cc @squat

@dghubble dghubble self-requested a review September 14, 2017 22:25
Copy link
Contributor

@dghubble dghubble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works alright for me.

cc @aaronlevy @squat for how Tectonic wants to proceed.

@weikinhuang
Copy link
Contributor

weikinhuang commented Sep 14, 2017

Does calico's kube-calico-cfg need to be updated to .conflist too?

@dghubble
Copy link
Contributor

dghubble commented Sep 14, 2017

Deploying the Calico policy-only addon is taking over as the alphabetically first CNI config, so I'd expect with calico, hostPort would continue to not function properly. I've noted the problems with this setup in #698, I'm not quite sure how we keep balancing the two. I personally use Calico networking instead, which I'm hoping to add soon.

@weikinhuang
Copy link
Contributor

weikinhuang commented Sep 14, 2017

I made these changes to get it to work with the experimental calico policy:

diff --git c/bootkube-system/manifests/kube-calico-cfg.yaml i/bootkube-system/manifests/kube-calico-cfg.yaml
index a709907..4a625ac 100644
--- c/bootkube-system/manifests/kube-calico-cfg.yaml
+++ i/bootkube-system/manifests/kube-calico-cfg.yaml
@@ -9,20 +9,30 @@ data:
     {
         "name": "k8s-pod-network",
         "cniVersion": "0.3.0",
-        "type": "calico",
-        "log_level": "debug",
-        "datastore_type": "kubernetes",
-        "nodename": "__KUBERNETES_NODE_NAME__",
-        "ipam": {
-            "type": "host-local",
-            "subnet": "usePodCidr"
-        },
-        "policy": {
-            "type": "k8s",
-            "k8s_auth_token": "__SERVICEACCOUNT_TOKEN__"
-        },
-        "kubernetes": {
-            "k8s_api_root": "https://__KUBERNETES_SERVICE_HOST__:__KUBERNETES_SERVICE_PORT__",
-            "kubeconfig": "__KUBECONFIG_FILEPATH__"
-        }
+        "plugins": [
+          {
+            "type": "calico",
+            "log_level": "debug",
+            "datastore_type": "kubernetes",
+            "nodename": "__KUBERNETES_NODE_NAME__",
+            "ipam": {
+                "type": "host-local",
+                "subnet": "usePodCidr"
+            },
+            "policy": {
+                "type": "k8s",
+                "k8s_auth_token": "__SERVICEACCOUNT_TOKEN__"
+            },
+            "kubernetes": {
+                "k8s_api_root": "https://__KUBERNETES_SERVICE_HOST__:__KUBERNETES_SERVICE_PORT__",
+                "kubeconfig": "__KUBECONFIG_FILEPATH__"
+            }
+          },
+          {
+            "type": "portmap",
+            "capabilities": {
+              "portMappings": true
+            }
+          }
+        ]
     }
diff --git c/bootkube-system/manifests/kube-calico.yaml i/bootkube-system/manifests/kube-calico.yaml
index 64224e5..43473b1 100644
--- c/bootkube-system/manifests/kube-calico.yaml
+++ i/bootkube-system/manifests/kube-calico.yaml
@@ -78,6 +78,8 @@ spec:
           image: quay.io/calico/cni:v1.10.0
           command: ["/install-cni.sh"]
           env:
+            - name: CNI_CONF_NAME
+              value: 10-calico.conflist
             - name: CNI_NETWORK_CONFIG
               valueFrom:
                 configMapKeyRef:

However I had to manually delete the old 10-calico.conf on every node.

@dghubble
Copy link
Contributor

Can you add that as a separate PR? On master, neither flannel nor canal support hostPort, so I don't think this regresses.

@klausenbusk
Copy link
Contributor Author

klausenbusk commented Sep 15, 2017

or a really new hyperkube

v0.6.0 hasn't got into k8s yet, and won't before at least v1.9.0. So people must use the host CNI binaries.
kubernetes/kubernetes#49480
Portmap require this fix (and maybe a few more): containernetworking/plugins#23 which is included in v0.6.0.

You might consider just leaving the version at 0.3.0 for now (haven't tested) to be a little more lenient.

portmap require v0.6.0, it won't work with older version due to some bugs.
Edit: as portmap won't work with older version of CNI it make little sense to change the version to 0.3.0.

@dghubble
Copy link
Contributor

dghubble commented Sep 15, 2017 via email

@klausenbusk
Copy link
Contributor Author

v0.3.0 is referring to the CNI spec version in the config, rather than the release version.

I'm aware of that and I changed my comment afterwards:

Edit: as portmap won't work with older version of CNI it make little sense to change the version to 0.3.0.

In other words: portmap won't work with the older CNI release which only support 0.3.0 (v0.5.0<=v0.5.1), so I see no reason to lower the CNI spec version required. As there is no CNI release which only support 0.3.0 where portmap+flannel is also working.

@dghubble
Copy link
Contributor

Sounds fine then, thanks for the clarifications. There are no loopholes, projects must switch to using on-host CNI plugins. I've changed all Matchbox clusters to do this. Poking Tectonic.

@dghubble
Copy link
Contributor

dghubble commented Sep 19, 2017

Tectonic may opt to keep their current behavior (hostPort not working) and avoid the immediate need to migrate to using the on-host plugin (instead of the hyperkube ones) following my compatability note in coreos/flannel-cni#5 (comment).

cc @yifan-gu @derekparker

@aaronlevy
Copy link
Contributor

@dghubble can you open an issue in the tectonic jira to track this?

@dghubble
Copy link
Contributor

Yep, mentioned you on the issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants