Pods restart loop with error "[emerg] 23#23: bind() to 0.0.0.0:80 failed (13: Permission denied)" in latest chart/version for daemonset #3932

Closed
brian-provenzano opened this issue May 22, 2023 · 15 comments


@brian-provenzano

brian-provenzano commented May 22, 2023

Describe the bug
Using the latest image and Helm chart, upgrading from v2.4.2, I am getting permission-denied errors in the nginx pods, which causes constant restarts. The issue appears to revolve around the recent securityContext changes in PR 3722 and PR 3573.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy v3.1.1 (chart 0.17.1) in the daemonset configuration using helm template... followed by kubectl apply (see the attached sample values.yaml for our settings).
  2. Pods do not start successfully; they restart continuously.
  3. View the logs on a restarting pod; you will see 2023/05/22 17:08:45 [emerg] 23#23: bind() to 0.0.0.0:80 failed (13: Permission denied)
  4. If I change daemonset.spec.template.spec.containers.securityContext.allowPrivilegeEscalation to true (the chart template currently sets it to false) and restart the DaemonSet, it works fine and the pods start. This appears to be the same setting that was present in v2.4.2, which we currently run without issue.
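
For reference, here is a minimal sketch of the relevant part of the container securityContext as the chart renders it, with the field the workaround flips (values paraphrased from chart 0.17.1; surrounding fields abbreviated):

securityContext:
  allowPrivilegeEscalation: false   # chart default; flipping this to true is the workaround
  runAsUser: 101                    # non-root nginx user
  capabilities:
    drop:
      - ALL
    add:
      - NET_BIND_SERVICE            # intended to let the non-root process bind ports 80/443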

Expected behavior
I expect the pods to start successfully even with the new securityContext in place.

Your environment

  • Version of the Ingress Controller - v3.1.1 with Chart 0.17.1
  • Version of Kubernetes - 1.23
  • Kubernetes platform (e.g. Mini-kube or GCP) - EKS
  • Using NGINX or NGINX Plus : NGINX

Additional context
I can provide more information if needed. I would adjust daemonset.spec.template.spec.containers.securityContext.allowPrivilegeEscalation to true to fix this ourselves (albeit reverting to the less secure setup that was present in v2.4.2), but that parameter is not configurable in the chart.
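
As an illustration only, the field could be flipped out-of-band with a JSON patch against the rendered DaemonSet (the namespace, resource name, and container index here are assumptions about this particular setup, and a re-apply of the rendered manifests would revert it):

kubectl -n nginx-ingress patch daemonset nginx-ingress --type=json \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/securityContext/allowPrivilegeEscalation", "value": true}]'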

v3.1.1 images tried: nginx/nginx-ingress:3.1.1-ubi and public.ecr.aws/nginx/nginx-ingress:3.1.1-ubi (we use the AWS ECR Public image due to Docker Hub throttling).

test-values.yaml.txt

@github-actions

Hi @brian-provenzano thanks for reporting!

Be sure to check out the docs and the Contributing Guidelines while you wait for a human to take a look at this 🙂

Cheers!

@brian-provenzano brian-provenzano changed the title Pods restart loop with error 2023/05/22 17:08:45 [emerg] 23#23: bind() to 0.0.0.0:80 failed (13: Permission denied) in latest chart/version for daemonset Pods restart loop with error "[emerg] 23#23: bind() to 0.0.0.0:80 failed (13: Permission denied)" in latest chart/version for daemonset May 22, 2023
@vepatel (Contributor)

vepatel commented May 23, 2023

Hi @brian-provenzano, I tested this with NGINX Ingress Controller v3.1.1 on k8s 1.27:

/nginx/kubernetes-ingress/deployments/helm-chart|72473392⚡ ⇒  k logs test-release-nginx-ingress-controller-4vkdg | grep Version=
NGINX Ingress Controller Version=3.1.1 Commit=72473392d14cb0971de4b916a8db9bb675a16634 Date=2023-05-04T23:50:20Z DirtyState=false Arch=linux/amd64 Go=go1.20.3

/nginx/kubernetes-ingress/deployments/helm-chart|72473392⚡ ⇒  k get pods
NAME                                          READY   STATUS    RESTARTS   AGE
test-release-nginx-ingress-controller-4vkdg   1/1     Running   0          5m21s
test-release-nginx-ingress-controller-9ckjh   1/1     Running   0          5m21s
test-release-nginx-ingress-controller-lt6mj   1/1     Running   0          5m21s

/nginx/kubernetes-ingress/deployments/helm-chart|72473392⚡ ⇒  k get pods test-release-nginx-ingress-controller-4vkdg -o yaml | grep allowPrivilegeEscalation
      allowPrivilegeEscalation: false

Can you please make sure you're on the correct release tag when running helm install... or kubectl apply...?

Helm command used:

helm install test-release \
  --set controller.kind=daemonset \
  --set controller.nginxplus=false \
  --set controller.image.repository=nginx/nginx-ingress \
  --set controller.image.tag="3.1.1" \
  --set controller.image.pullPolicy=Always .

@brianehlert (Collaborator)

Specifically, there is tuning to the net-bind service in the 3.1.1 patch: https://docs.nginx.com/nginx-ingress-controller/releases/#nginx-ingress-controller-311
The Helm chart / manifests must therefore match the container version.
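
One way to verify the container side is to check the file capability on the nginx binary inside a running pod (a sketch only: getcap is not necessarily present in the image, and the binary path and namespace are assumptions based on the official images):

kubectl -n nginx-ingress exec <controller-pod> -- getcap /usr/sbin/nginx
# expected when the capability is in place:
# /usr/sbin/nginx cap_net_bind_service=ep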

@brian-provenzano (Author)

We run helm template... then kubectl apply (it is actually run through Spinnaker). I used 3.1.1-ubi from Docker Hub and the same image from ECR Public. I have corrected the version in the original post.

I will double-check my work to be sure, though, and get back to you ASAP...

@brian-provenzano (Author)

brian-provenzano commented May 23, 2023

OK - I tried the nginx/nginx-ingress:3.1.1 and nginx/nginx-ingress:3.1.1-ubi images. I have attached a copy of the DaemonSet I tried that uses the nginx/nginx-ingress:3.1.1 image, which still does not work for us (the pods throw the permission error described previously).

Testing process: I edited the DaemonSet on the cluster to use the nginx/nginx-ingress:3.1.1 image (which launched new pods), but I still get the permission error in the pod logs and the pods restart constantly. If I change allowPrivilegeEscalation to true, all is fine.

Could this be an issue with how our nodes are configured (AMI, OS, etc.)? We are using custom Ubuntu CIS AMIs rather than the official AWS EKS-optimized AMIs.

Logs from a pod that successfully starts/runs once I change to allowPrivilegeEscalation: true:

NGINX Ingress Controller Version=3.1.1 Commit=72473392d14cb0971de4b916a8db9bb675a16634 Date=2023-05-04T23:50:20Z DirtyState=false Arch=linux/amd64 Go=go1.20.3
I0523 16:51:05.622911       1 flags.go:294] Starting with flags: ["-nginx-plus=false" "-nginx-reload-timeout=60000" "-enable-app-protect=false" "-enable-app-protect-dos=false" "-nginx-configmaps=nginx-ingress/nginx-config" "-default-server-tls-secret=nginx-ingress/nginx-ingress-secret" "-ingress-class=nginx" "-health-status=false" "-health-status-uri=/nginx-health" "-nginx-debug=false" "-v=1" "-nginx-status=false" "-report-ingress-status" "-external-service=nginx-ingress-external" "-enable-leader-election=true" "-leader-election-lock-name=kdp-core-nginx-ingress-leader-election" "-enable-prometheus-metrics=false" "-prometheus-metrics-listen-port=9113" "-prometheus-tls-secret=" "-enable-service-insight=false" "-service-insight-listen-port=9114" "-service-insight-tls-secret=" "-enable-custom-resources=true" "-enable-snippets=true" "-include-year=false" "-disable-ipv6=false" "-enable-tls-passthrough=false" "-enable-preview-policies=false" "-enable-cert-manager=false" "-enable-oidc=false" "-enable-external-dns=false" "-ready-status=true" "-ready-status-port=8081" "-enable-latency-metrics=false"]
I0523 16:51:05.629088       1 main.go:234] Kubernetes version: 1.23.17
I0523 16:51:05.635203       1 main.go:380] Using nginx version: nginx/1.23.4
I0523 16:51:05.739233       1 main.go:776] Pod label updated: nginx-ingress-q2bvf
2023/05/23 16:51:05 [notice] 18#18: using the "epoll" event method
2023/05/23 16:51:05 [notice] 18#18: nginx/1.23.4
2023/05/23 16:51:05 [notice] 18#18: built by gcc 11.2.1 20220127 (Red Hat 11.2.1-9) (GCC)
2023/05/23 16:51:05 [notice] 18#18: OS: Linux 5.4.0-1100-aws
2023/05/23 16:51:05 [notice] 18#18: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2023/05/23 16:51:05 [notice] 18#18: start worker processes
2023/05/23 16:51:05 [notice] 18#18: start worker process 22
2023/05/23 16:51:05 [notice] 18#18: start worker process 23
2023/05/23 16:51:05 [notice] 18#18: start worker process 24
2023/05/23 16:51:05 [notice] 18#18: start worker process 25
2023/05/23 16:51:05 [notice] 18#18: start worker process 26
2023/05/23 16:51:05 [notice] 18#18: start worker process 27
2023/05/23 16:51:05 [notice] 18#18: start worker process 28
2023/05/23 16:51:05 [notice] 18#18: start worker process 29
2023/05/23 16:51:05 [notice] 18#18: start worker process 30
2023/05/23 16:51:05 [notice] 18#18: start worker process 31
2023/05/23 16:51:05 [notice] 18#18: start worker process 32
2023/05/23 16:51:05 [notice] 18#18: start worker process 33
2023/05/23 16:51:05 [notice] 18#18: start worker process 34
2023/05/23 16:51:05 [notice] 18#18: start worker process 35
2023/05/23 16:51:05 [notice] 18#18: start worker process 36
2023/05/23 16:51:05 [notice] 18#18: start worker process 37
...

Logs from a pod when allowPrivilegeEscalation: false (pod does not start/restarts constantly):

NGINX Ingress Controller Version=3.1.1 Commit=72473392d14cb0971de4b916a8db9bb675a16634 Date=2023-05-04T23:50:20Z DirtyState=false Arch=linux/amd64 Go=go1.20.3
I0523 16:49:08.587514       1 flags.go:294] Starting with flags: ["-nginx-plus=false" "-nginx-reload-timeout=60000" "-enable-app-protect=false" "-enable-app-protect-dos=false" "-nginx-configmaps=nginx-ingress/nginx-config" "-default-server-tls-secret=nginx-ingress/nginx-ingress-secret" "-ingress-class=nginx" "-health-status=false" "-health-status-uri=/nginx-health" "-nginx-debug=false" "-v=1" "-nginx-status=false" "-report-ingress-status" "-external-service=nginx-ingress-external" "-enable-leader-election=true" "-leader-election-lock-name=kdp-core-nginx-ingress-leader-election" "-enable-prometheus-metrics=false" "-prometheus-metrics-listen-port=9113" "-prometheus-tls-secret=" "-enable-service-insight=false" "-service-insight-listen-port=9114" "-service-insight-tls-secret=" "-enable-custom-resources=true" "-enable-snippets=true" "-include-year=false" "-disable-ipv6=false" "-enable-tls-passthrough=false" "-enable-preview-policies=false" "-enable-cert-manager=false" "-enable-oidc=false" "-enable-external-dns=false" "-ready-status=true" "-ready-status-port=8081" "-enable-latency-metrics=false"]
I0523 16:49:08.593176       1 main.go:234] Kubernetes version: 1.23.17
I0523 16:49:08.601693       1 main.go:380] Using nginx version: nginx/1.23.4
I0523 16:49:08.635197       1 main.go:776] Pod label updated: nginx-ingress-dbwbh
2023/05/23 16:49:08 [emerg] 24#24: bind() to 0.0.0.0:80 failed (13: Permission denied)

nginx-ingress-ds.yaml.txt

@brianehlert (Collaborator)

We have had issues with Helm upgrades in the past where changes to rbac.yaml (or scc.yaml on OpenShift) are not processed properly due to how Helm performs the upgrade.
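
If that is a suspect here, one generic way to spot such drift is to diff the freshly rendered manifests against the live cluster state (standard kubectl; the template arguments below are illustrative):

helm template test-release . --set controller.kind=daemonset | kubectl diff -f -
# non-empty output flags objects whose live state differs from the chart render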

I see that you are using a daemonset instead of a deployment...
Do you get a different result if you use a deployment? I am curious.

@brian-provenzano (Author)

OK - I will give that a try and report back - shouldn't take long to test

@brian-provenzano (Author)

Same issue - no change in behavior as a deployment. A copy of the deployment is attached.

pod logs when deployed as a deployment (same as before):

NGINX Ingress Controller Version=3.1.1 Commit=72473392d14cb0971de4b916a8db9bb675a16634 Date=2023-05-04T23:50:20Z DirtyState=false Arch=linux/amd64 Go=go1.20.3
I0523 20:47:37.302872       1 flags.go:294] Starting with flags: ["-nginx-plus=false" "-nginx-reload-timeout=60000" "-enable-app-protect=false" "-enable-app-protect-dos=false" "-nginx-configmaps=nginx-ingress/nginx-config" "-default-server-tls-secret=nginx-ingress/nginx-ingress-secret" "-ingress-class=nginx" "-health-status=false" "-health-status-uri=/nginx-health" "-nginx-debug=false" "-v=1" "-nginx-status=false" "-report-ingress-status" "-external-service=nginx-ingress-external" "-enable-leader-election=true" "-leader-election-lock-name=kdp-core-nginx-ingress-leader-election" "-enable-prometheus-metrics=false" "-prometheus-metrics-listen-port=9113" "-prometheus-tls-secret=" "-enable-service-insight=false" "-service-insight-listen-port=9114" "-service-insight-tls-secret=" "-enable-custom-resources=true" "-enable-snippets=true" "-include-year=false" "-disable-ipv6=false" "-enable-tls-passthrough=false" "-enable-preview-policies=false" "-enable-cert-manager=false" "-enable-oidc=false" "-enable-external-dns=false" "-ready-status=true" "-ready-status-port=8081" "-enable-latency-metrics=false"]
I0523 20:47:37.393542       1 main.go:234] Kubernetes version: 1.23.17
I0523 20:47:37.400536       1 main.go:380] Using nginx version: nginx/1.23.4
I0523 20:47:37.432189       1 main.go:776] Pod label updated: nginx-ingress-77d64565d8-mttlk
2023/05/23 20:47:37 [emerg] 24#24: bind() to 0.0.0.0:80 failed (13: Permission denied)

Again, if I change to allowPrivilegeEscalation: true, it works fine.

NGINX Ingress Controller Version=3.1.1 Commit=72473392d14cb0971de4b916a8db9bb675a16634 Date=2023-05-04T23:50:20Z DirtyState=false Arch=linux/amd64 Go=go1.20.3
I0523 20:55:42.888299       1 flags.go:294] Starting with flags: ["-nginx-plus=false" "-nginx-reload-timeout=60000" "-enable-app-protect=false" "-enable-app-protect-dos=false" "-nginx-configmaps=nginx-ingress/nginx-config" "-default-server-tls-secret=nginx-ingress/nginx-ingress-secret" "-ingress-class=nginx" "-health-status=false" "-health-status-uri=/nginx-health" "-nginx-debug=false" "-v=1" "-nginx-status=false" "-report-ingress-status" "-external-service=nginx-ingress-external" "-enable-leader-election=true" "-leader-election-lock-name=kdp-core-nginx-ingress-leader-election" "-enable-prometheus-metrics=false" "-prometheus-metrics-listen-port=9113" "-prometheus-tls-secret=" "-enable-service-insight=false" "-service-insight-listen-port=9114" "-service-insight-tls-secret=" "-enable-custom-resources=true" "-enable-snippets=true" "-include-year=false" "-disable-ipv6=false" "-enable-tls-passthrough=false" "-enable-preview-policies=false" "-enable-cert-manager=false" "-enable-oidc=false" "-enable-external-dns=false" "-ready-status=true" "-ready-status-port=8081" "-enable-latency-metrics=false"]
I0523 20:55:42.895868       1 main.go:234] Kubernetes version: 1.23.17
I0523 20:55:42.907961       1 main.go:380] Using nginx version: nginx/1.23.4
I0523 20:55:42.935903       1 main.go:776] Pod label updated: nginx-ingress-86bfb79447-4pnh6
2023/05/23 20:55:42 [notice] 25#25: using the "epoll" event method
2023/05/23 20:55:42 [notice] 25#25: nginx/1.23.4
2023/05/23 20:55:42 [notice] 25#25: built by gcc 10.2.1 20210110 (Debian 10.2.1-6)
2023/05/23 20:55:42 [notice] 25#25: OS: Linux 5.4.0-1100-aws
2023/05/23 20:55:42 [notice] 25#25: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2023/05/23 20:55:42 [notice] 25#25: start worker processes
2023/05/23 20:55:42 [notice] 25#25: start worker process 26
2023/05/23 20:55:42 [notice] 25#25: start worker process 27
2023/05/23 20:55:42 [notice] 25#25: start worker process 28
2023/05/23 20:55:42 [notice] 25#25: start worker process 29
2023/05/23 20:55:42 [notice] 25#25: start worker process 30
2023/05/23 20:55:42 [notice] 25#25: start worker process 31
2023/05/23 20:55:42 [notice] 25#25: start worker process 32
2023/05/23 20:55:42 [notice] 25#25: start worker process 33
2023/05/23 20:55:42 [notice] 25#25: start worker process 34
2023/05/23 20:55:42 [notice] 25#25: start worker process 35
2023/05/23 20:55:42 [notice] 25#25: start worker process 36
2023/05/23 20:55:42 [notice] 25#25: start worker process 37
2023/05/23 20:55:42 [notice] 25#25: start worker process 38
2023/05/23 20:55:42 [notice] 25#25: start worker process 39
2023/05/23 20:55:42 [notice] 25#25: start worker process 40
2023/05/23 20:55:42 [notice] 25#25: start worker process 41

nginx-ingress-deployment.yaml.txt

@vepatel (Contributor)

vepatel commented May 24, 2023

Weird; this works for me with default values on both GKE and AKS with Helm chart 0.17.1.
GKE: #3932 (comment)
AKS: in this scenario I performed an upgrade from 2.4.2 to 3.1.1:

/nginx/kubernetes-ingress/deployments/helm-chart|72473392⚡ ⇒  k get pods
NAME                                          READY   STATUS    RESTARTS   AGE
test-release-nginx-ingress-controller-2bn59   1/1     Running   0          12s
test-release-nginx-ingress-controller-5kj6z   1/1     Running   0          12s
test-release-nginx-ingress-controller-w596l   1/1     Running   0          12s

/nginx/kubernetes-ingress/deployments/helm-chart|72473392⚡ ⇒  k describe daemonsets.apps test-release-nginx-ingress-controller 
Name:           test-release-nginx-ingress-controller
Selector:       app.kubernetes.io/instance=test-release,app.kubernetes.io/name=nginx-ingress
Node-Selector:  <none>
Labels:         app.kubernetes.io/instance=test-release
                app.kubernetes.io/managed-by=Helm
                app.kubernetes.io/name=nginx-ingress
                app.kubernetes.io/version=3.1.1
                helm.sh/chart=nginx-ingress-0.17.1
Annotations:    deprecated.daemonset.template.generation: 1
                meta.helm.sh/release-name: test-release
                meta.helm.sh/release-namespace: default

/nginx/kubernetes-ingress/deployments/helm-chart|72473392⚡ ⇒  k get pods test-release-nginx-ingress-controller-jrlm6 -o yaml | grep allowPrivilegeEscalation 
      allowPrivilegeEscalation: false                

I'll try EKS with the official EKS-optimized Amazon Linux 2 instances later.

@brian-provenzano (Author)

brian-provenzano commented May 24, 2023

Alright, I am starting to think it is something unique to our environment.

I did the following:

  • Spun up a new EKS cluster with eksctl running k8s v1.23 on the Amazon EKS AMIs. This is the same k8s version we are using.
  • Ran helm install test-release oci://ghcr.io/nginxinc/charts/nginx-ingress --version 0.17.1 --values values-test-nginx.yaml --create-namespace --namespace nginx-ingress on the new test cluster. The exact values file I used is attached.
  • Checked status; all pods run fine with no errors.

One other possible variable: besides the fact that we are not using the official AWS EKS AMIs, our container runtime is still Docker on 1.23. I think the current EKS AMIs built for 1.23 use containerd...?
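
For what it's worth, the runtime each node reports is easy to confirm with standard kubectl, since the wide output includes a CONTAINER-RUNTIME column:

kubectl get nodes -o wide
# CONTAINER-RUNTIME shows e.g. docker://20.10.x or containerd://1.6.x per node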

Anyway, I am going to try another test on one of our 1.23 clusters created with our IaC (Terraform rather than eksctl; our custom Ubuntu AMI with the Docker runtime), but it appears to be an issue on my end. Sorry about the wild goose chase here :(

I am guessing we can close this for now and I can report back if anything changes...

values-test-nginx.yaml.txt

@brianehlert (Collaborator)

It is fine to leave this open until you resolve it. I think we all learn from these kinds of things.

@vepatel (Contributor)

vepatel commented May 25, 2023

Thanks @brian-provenzano for checking, I'll close this for now 👍🏼

@vepatel vepatel closed this as completed May 25, 2023
@justbert

We're running into the same error on 3.3.2. We're building our own image to include some extra modules/capabilities. When the image is built with Docker, this issue does not happen; when it is built with Kaniko, it does.

@vepatel (Contributor)

vepatel commented Feb 20, 2024

@justbert we'll be adding an option to modify the securityContext via Helm in 3.5.0, so hopefully that will solve your issue.
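
If it ships that way, the override would presumably look something like this in values.yaml (the exact key name is an assumption; check the released 3.5.0 chart's values schema):

controller:
  securityContext:                   # hypothetical key until the 3.5.0 chart confirms it
    allowPrivilegeEscalation: true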

@justbert

Found the issue! (I should have updated my comment.) It seems Kaniko doesn't copy over extended file attributes, whereas Docker does, which means the CAP_NET_BIND_SERVICE file capability was missing from the binary. Xattr handling is not a well-defined part of the COPY command, which (as we can see) causes issues.
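
A quick post-build check for this (a sketch: the binary path is taken from the official images, and getcap must exist in the image):

docker run --rm --entrypoint getcap <your-image> /usr/sbin/nginx
# with the xattr intact:  /usr/sbin/nginx cap_net_bind_service=ep
# with it stripped (as with Kaniko's COPY here): no output, and nginx
# cannot bind ports 80/443 as a non-root user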
