SysctlForbidden on Deployment when controller.hostNetwork = true #3714

Closed
davefinster opened this issue Apr 2, 2023 · 6 comments · Fixed by #3722

Comments

davefinster commented Apr 2, 2023

Describe the bug
Since #3573, it appears that any Deployment with hostNetwork=true (not sure about DaemonSets) now fails with the following error:

forbidden sysctl: "net.ipv4.ip_unprivileged_port_start" not allowed with host net enabled
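
For context, the rejected combination looks roughly like the pod spec fragment below. This is a sketch, not the chart's actual rendered output; the sysctl value `"0"` is an assumption, since the thread does not quote the value set by #3573.

```yaml
# Pod spec (spec.template.spec in a Deployment): the two settings that conflict.
spec:
  hostNetwork: true              # set when controller.hostNetwork is true
  securityContext:
    sysctls:
      # Kubernetes rejects net.* sysctls on pods that use the host network.
      - name: net.ipv4.ip_unprivileged_port_start
        value: "0"               # assumed value, not taken from the chart
```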

To Reproduce
Steps to reproduce the behavior:

  1. Deploy using the Helm chart with controller.hostNetwork set to true
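
A minimal values override matching this step might look like the following. Only the `controller.hostNetwork` key comes from this report; supplying it through a values file rather than `--set` is an assumption.

```yaml
# Helm values override (only the key named in this report)
controller:
  hostNetwork: true
```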

Expected behavior
nginx pods start up successfully

Your environment

  • Version of the Ingress Controller - 3.1.0
  • Version of Kubernetes: 1.24
  • Kubernetes platform (e.g. Mini-kube or GCP): Microk8s
  • Using NGINX or NGINX Plus: nginx

Kube Version Info:

Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.2", GitCommit:"fc04e732bb3e7198d2fa44efa5457c7c6f8c0f5b", GitTreeState:"clean", BuildDate:"2023-02-22T13:39:03Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.12-2+36e803be0d732b", GitCommit:"36e803be0d732b9da15f6568ed23e5c1929f8701", GitTreeState:"clean", BuildDate:"2023-03-17T18:58:08Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/amd64"}

Pinning the Helm chart version to 0.16.2 resolves the problem.

github-actions bot commented Apr 2, 2023

Hi @davefinster thanks for reporting!

Be sure to check out the docs and the Contributing Guidelines while you wait for a human to take a look at this 🙂

Cheers!

@brianehlert
Collaborator

I am curious if the contributor of the change @sigv might have any comments.

Contributor

sigv commented Apr 3, 2023

This is not great.

Before opening that PR, I for some reason believed this would be okay in Kubernetes, even though I had myself concluded that host networking on Docker would not be viable when looking at moby/moby/daemon/oci_linux.go:

This to me implies that with Kubernetes, even if host network is used, the sysctl should be safe to specify.

Looking back at Kubernetes documentation (ref) it is clear that the net.* sysctls are not safe to use in combination with host networking.

This should be patched, as host networking support should be restored, but I am not yet 100% sure of the best compromise.

Contributor

sigv commented Apr 3, 2023

Adding a simple if controller.hostNetwork would not be viable, as the process no longer gets NET_BIND_SERVICE.

The chart should be kept as simple as possible, so mixing the capability with the sysctl feels like asking for trouble down the line. If the sysctl is not viable, then it should be dropped and replaced with the IC process receiving NET_BIND_SERVICE, which is then inherited by the NGINX process.
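
As a rough illustration of the capability-based alternative described above, the container's security context could grant NET_BIND_SERVICE instead of relying on the sysctl. This is a sketch using standard Kubernetes fields, not the patch that was eventually merged; the container name is hypothetical.

```yaml
# Pod spec fragment: no net.* sysctl, so host networking is not rejected.
spec:
  hostNetwork: true
  containers:
    - name: nginx-ingress          # hypothetical container name
      securityContext:
        capabilities:
          drop:
            - ALL
          add:
            - NET_BIND_SERVICE     # permits binding ports below 1024
```

Note that for a process running as a non-root user, adding the capability to the container is generally not sufficient on its own; the binary usually also needs the matching file capability.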

The cap should be dropped by IC after starting Nginx, and by Nginx after binding the ports, but this is not a blocker.

Currently not seeing any other approach to this, except a revert, which brings back the escalation concern.

Member

lucacome commented Apr 3, 2023

> Looking back at Kubernetes documentation (ref) it is clear that the net.* sysctls are not safe to use in combination with host networking.

@sigv Where do you see that? I can only see that net.ipv4.ip_unprivileged_port_start is namespaced and safe. What am I missing?

> It is good practice to consider nodes with special sysctl settings as tainted within a cluster, and only schedule pods onto them which need those sysctl settings. It is suggested to use the Kubernetes taints and toleration feature to implement this.

Doesn't a cluster admin need to enable net.ipv4.ip_unprivileged_port_start on the node when used with host network? And use taints to schedule pods in the right node?

Contributor

sigv commented Apr 3, 2023

> @sigv Where do you see that? I can only see that net.ipv4.ip_unprivileged_port_start is namespaced and safe. What am I missing?

From the "Setting Sysctls for a Pod" section, which is worded a bit strangely: "The parameters under net.* that can be set in container networking namespace."

Will open an upstream PR to have this documented more clearly.

I was also looking at the K8s docs saying "The example net.ipv4.tcp_syncookies is not namespaced on Linux kernel version 4.4 or lower." I checked and found that ip_unprivileged_port_start was added in linux@4548b68, back in 4.11, so this felt safe as well.

> Doesn't a cluster admin need to enable net.ipv4.ip_unprivileged_port_start on the node when used with host network? And use taints to schedule pods in the right node?

Based on the reported error saying the sysctl is "not allowed with host net enabled", I think that K8s simply blocks this scenario, with no workaround.

Will try to get a patch with NET_BIND_SERVICE available for evaluation shortly.
