
Deploy EKS on fargate #4888

Closed
wheestermans31 opened this issue Jan 6, 2020 · 42 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.


@wheestermans31

I'm deploying ingress-nginx to an EKS cluster on Fargate (new) using the standard deployment YAML file, but there is an issue with the securityContext:

Warning FailedScheduling fargate-scheduler Pod not supported on Fargate: invalid SecurityContext fields: AllowPrivilegeEscalation

AllowPrivilegeEscalation doesn't seem to be allowed.

@aledbf
Member

aledbf commented Jan 6, 2020

AllowPrivilegeEscalation doesn't seem to be allowed.

Did you check there is no PodSecurityPolicy forbidding this?

@kalinsma

Hi,
I've noticed the same issue, here is my PSP

kubectl describe psp eks.privileged
Name: eks.privileged

Settings:
Allow Privileged: true
Allow Privilege Escalation: 0xc0003fcb78
Default Add Capabilities:
Required Drop Capabilities:
Allowed Capabilities: *
Allowed Volume Types: *
Allow Host Network: true
Allow Host Ports: 0-65535
Allow Host PID: true
Allow Host IPC: true
Read Only Root Filesystem: false
SELinux Context Strategy: RunAsAny
User:
Role:
Type:
Level:
Run As User Strategy: RunAsAny
Ranges:
FSGroup Strategy: RunAsAny
Ranges:
Supplemental Groups Strategy: RunAsAny
Ranges:

@wheestermans31
Author

My psp

uid: d7fc543e-288b-11ea-a11c-0a72aad1a7be
spec:
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - '*'
  fsGroup:
    rule: RunAsAny
  hostIPC: true
  hostNetwork: true
  hostPID: true
  hostPorts:
  - max: 65535
    min: 0
  privileged: true
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  volumes:
  - '*'

---> allowPrivilegeEscalation: true, so on this point it looks fine.

It seems to be blocked at another level. The pod execution policy, perhaps?

@wheestermans31
Author

I think we are hitting an EKS on Fargate restriction here.

There are currently a few limitations that you should be aware of:

https://aws.amazon.com/blogs/aws/amazon-eks-on-aws-fargate-now-generally-available/

You cannot run Daemonsets, Privileged pods, or pods that use HostNetwork or HostPort.

Can we run the controller with the option AllowPrivilegeEscalation set to false?

Can somebody advise on this?

@arunkumarmurugesan90

I have changed allowPrivilegeEscalation to false on versions 0.21.0 through 0.28.0 and I am getting a Permission denied error.

Error:

Error: exit status 1
2020/02/13 11:47:06 [notice] 73#73: ModSecurity-nginx v1.0.0
nginx: the configuration file /tmp/nginx-cfg598726495 syntax is ok
2020/02/13 11:47:06 [emerg] 73#73: bind() to 0.0.0.0:80 failed (13: Permission denied)
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
nginx: configuration file /tmp/nginx-cfg598726495 test failed

Can anyone help me out here?

@mellonbrook

same issue....

@rblaine95

rblaine95 commented Mar 21, 2020

I solved this issue by setting the following values in the Helm chart:

controller:
  extraArgs:
    http-port: 8080
    https-port: 8443

  containerPort:
    http: 8080
    https: 8443

  service:
    ports:
      http: 80
      https: 443
    targetPorts:
      http: 8080
      https: 8443

  image:
    allowPrivilegeEscalation: false
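
If it helps, a minimal sketch of applying overrides like these with Helm; the values file name values-fargate.yaml and the namespace are just illustrative, not from the chart itself:

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
# Install or upgrade the chart with the Fargate-friendly overrides above
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx --create-namespace \
  -f values-fargate.yaml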

@johnchesser

I ran into this today; I was thinking I would have to deploy an EC2 node group in my cluster to test with, instead of using Fargate.

"Warning FailedScheduling fargate-scheduler Pod not supported on Fargate: invalid SecurityContext fields: AllowPrivilegeEscalation"

@nick4fake

Guys, the biggest problem is that the NGINX ingress controller uses a Classic Load Balancer, which is not supported on EKS Fargate.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 13, 2020
@esc-ga

esc-ga commented Jul 24, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 24, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 22, 2020
@KorvinSzanto

KorvinSzanto commented Nov 17, 2020

Is the takeaway from this that ingress-nginx is not supported on EKS fargate and should be avoided?

@TBBle

TBBle commented Nov 18, 2020

According to #4888 (comment) it seems to be workable. (And those would be nice changes to make the default in the chart, to remove the need to run privileged by default.)

You'd also have to use an NLB with it (or, better yet, the newly supported NLB-IP mode) since CLB isn't supported for Fargate, but nginx-ingress works with NLB already.

To be clear, I haven't tried nginx-ingress on Fargate or with NLB-IP myself, it's part of my plan for my next AWS k8s rollout. We do run nginx-ingress with NLB today, but not on Fargate.
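
For reference, a hedged sketch of what requesting an NLB looks like in the Helm values; the annotation key is the standard one, but whether you want nlb or the NLB-IP mode depends on your cluster setup:

controller:
  service:
    annotations:
      # Request an NLB from the in-tree cloud provider (instance targets).
      # NLB-IP mode additionally requires the AWS Load Balancer Controller.
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb"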

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 18, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jainhitesh9998

Error: exit status 1
nginx: the configuration file /tmp/nginx-cfg370196987 syntax is ok
2021/03/31 12:06:40 [emerg] 71#71: bind() to 0.0.0.0:80 failed (13: Permission denied)
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
nginx: configuration file /tmp/nginx-cfg370196987 test failed

I'm getting the logs above from the controller pod after I deployed the NGINX ingress controller on Fargate with allowPrivilegeEscalation set to false. Any inputs on deploying an ingress controller (NGINX or any other) on a private Fargate cluster?

@TBBle

TBBle commented Mar 31, 2021

I'm getting the logs above from the controller pod after I deployed the NGINX ingress controller on Fargate with allowPrivilegeEscalation set to false. Any inputs on deploying an ingress controller (NGINX or any other) on a private Fargate cluster?

As well as setting allowPrivilegeEscalation to false, you need to change the ports the server listens on to be non-privileged, change the container ports to match, and then point the Service at the new ports, e.g. #4888 (comment) if you are using Helm.

I kind of feel this should be the default setup for this Ingress Controller, since AFAIK it only needs the high privileges for the port binding, and that seems like a high-risk security profile to take when there's a simple way to not need it. Edit: #3668 explains that this is needed for the non-HTTP feature.

@sAnti09

sAnti09 commented May 4, 2021

After much trial and error, I finally made it work with AWS EKS Fargate. Here are the changes:

  • ports 80, 443 -> 8080, 8443
  • webhook port from 8443 to 8444 to avoid conflict with https port
  • load balancer type from nlb to nlb-ip

Notes:

  • Make sure the service type is not NodePort but ClusterIP
  • The controller reads Ingress resources from all namespaces unless a restriction is specified. This way, you can create Ingresses in different namespaces and NGINX ingress will still be able to pick up those endpoints.

run kubectl apply -f nginx-ingress.yml

# contents of nginx-ingress.yml

apiVersion: v1
kind: Namespace
metadata:
  name: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx

---
# Source: ingress-nginx/templates/controller-serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx
  namespace: ingress-nginx
automountServiceAccountToken: true
---
# Source: ingress-nginx/templates/controller-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
---
# Source: ingress-nginx/templates/clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
  name: ingress-nginx
rules:
  - apiGroups:
      - ''
    resources:
      - configmaps
      - endpoints
      - nodes
      - pods
      - secrets
    verbs:
      - list
      - watch
  - apiGroups:
      - ''
    resources:
      - nodes
    verbs:
      - get
  - apiGroups:
      - ''
    resources:
      - services
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
      - networking.k8s.io   # k8s 1.14+
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ''
    resources:
      - events
    verbs:
      - create
      - patch
  - apiGroups:
      - extensions
      - networking.k8s.io   # k8s 1.14+
    resources:
      - ingresses/status
    verbs:
      - update
  - apiGroups:
      - networking.k8s.io   # k8s 1.14+
    resources:
      - ingressclasses
    verbs:
      - get
      - list
      - watch
---
# Source: ingress-nginx/templates/clusterrolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
  name: ingress-nginx
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ingress-nginx
subjects:
  - kind: ServiceAccount
    name: ingress-nginx
    namespace: ingress-nginx
---
# Source: ingress-nginx/templates/controller-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx
  namespace: ingress-nginx
rules:
  - apiGroups:
      - ''
    resources:
      - namespaces
    verbs:
      - get
  - apiGroups:
      - ''
    resources:
      - configmaps
      - pods
      - secrets
      - endpoints
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ''
    resources:
      - services
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
      - networking.k8s.io   # k8s 1.14+
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
      - networking.k8s.io   # k8s 1.14+
    resources:
      - ingresses/status
    verbs:
      - update
  - apiGroups:
      - networking.k8s.io   # k8s 1.14+
    resources:
      - ingressclasses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ''
    resources:
      - configmaps
    resourceNames:
      - ingress-controller-leader-nginx
    verbs:
      - get
      - update
  - apiGroups:
      - ''
    resources:
      - configmaps
    verbs:
      - create
  - apiGroups:
      - ''
    resources:
      - events
    verbs:
      - create
      - patch
---
# Source: ingress-nginx/templates/controller-rolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx
  namespace: ingress-nginx
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: ingress-nginx
subjects:
  - kind: ServiceAccount
    name: ingress-nginx
    namespace: ingress-nginx
---
# Source: ingress-nginx/templates/controller-service-webhook.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx-controller-admission
  namespace: ingress-nginx
spec:
  type: ClusterIP
  ports:
    - name: https-webhook
      port: 443
      targetPort: webhook
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/component: controller
---
# Source: ingress-nginx/templates/controller-service.yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: 'true'
    service.beta.kubernetes.io/aws-load-balancer-type: nlb-ip
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: 8080
    - name: https
      port: 443
      protocol: TCP
      targetPort: 8443
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/component: controller
---
# Source: ingress-nginx/templates/controller-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
      app.kubernetes.io/instance: ingress-nginx
      app.kubernetes.io/component: controller
  revisionHistoryLimit: 10
  minReadySeconds: 0
  template:
    metadata:
      labels:
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/component: controller
    spec:
      dnsPolicy: ClusterFirst
      containers:
        - name: controller
          image: k8s.gcr.io/ingress-nginx/controller:v0.46.0@sha256:52f0058bed0a17ab0fb35628ba97e8d52b5d32299fbc03cc0f6c7b9ff036b61a
          imagePullPolicy: IfNotPresent
          lifecycle:
            preStop:
              exec:
                command:
                  - /wait-shutdown
          args:
            - /nginx-ingress-controller
            - --publish-service=$(POD_NAMESPACE)/ingress-nginx-controller
            - --election-id=ingress-controller-leader
            - --ingress-class=nginx
            - --configmap=$(POD_NAMESPACE)/ingress-nginx-controller
            - --validating-webhook=:8444
            - --validating-webhook-certificate=/usr/local/certificates/cert
            - --validating-webhook-key=/usr/local/certificates/key
            - --http-port=8080
            - --https-port=8443
          securityContext:
            capabilities:
              drop:
                - ALL
              add:
                - NET_BIND_SERVICE
            runAsUser: 101
            allowPrivilegeEscalation: false
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: LD_PRELOAD
              value: /usr/local/lib/libmimalloc.so
          livenessProbe:
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 1
            successThreshold: 1
            failureThreshold: 5
          readinessProbe:
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 1
            successThreshold: 1
            failureThreshold: 3
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
            - name: https
              containerPort: 8443
              protocol: TCP
            - name: webhook
              containerPort: 8444
              protocol: TCP
          volumeMounts:
            - name: webhook-cert
              mountPath: /usr/local/certificates/
              readOnly: true
          resources:
            requests:
              cpu: 100m
              memory: 90Mi
      nodeSelector:
        kubernetes.io/os: linux
      serviceAccountName: ingress-nginx
      terminationGracePeriodSeconds: 300
      volumes:
        - name: webhook-cert
          secret:
            secretName: ingress-nginx-admission
---
# Source: ingress-nginx/templates/admission-webhooks/validating-webhook.yaml
# before changing this value, check the required kubernetes version
# https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#prerequisites
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: admission-webhook
  name: ingress-nginx-admission
webhooks:
  - name: validate.nginx.ingress.kubernetes.io
    matchPolicy: Equivalent
    rules:
      - apiGroups:
          - networking.k8s.io
        apiVersions:
          - v1beta1
        operations:
          - CREATE
          - UPDATE
        resources:
          - ingresses
    failurePolicy: Fail
    sideEffects: None
    admissionReviewVersions:
      - v1
      - v1beta1
    clientConfig:
      service:
        namespace: ingress-nginx
        name: ingress-nginx-controller-admission
        path: /networking/v1beta1/ingresses
---
# Source: ingress-nginx/templates/admission-webhooks/job-patch/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ingress-nginx-admission
  annotations:
    helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade
    helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: admission-webhook
  namespace: ingress-nginx
---
# Source: ingress-nginx/templates/admission-webhooks/job-patch/clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ingress-nginx-admission
  annotations:
    helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade
    helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: admission-webhook
rules:
  - apiGroups:
      - admissionregistration.k8s.io
    resources:
      - validatingwebhookconfigurations
    verbs:
      - get
      - update
---
# Source: ingress-nginx/templates/admission-webhooks/job-patch/clusterrolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ingress-nginx-admission
  annotations:
    helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade
    helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: admission-webhook
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ingress-nginx-admission
subjects:
  - kind: ServiceAccount
    name: ingress-nginx-admission
    namespace: ingress-nginx
---
# Source: ingress-nginx/templates/admission-webhooks/job-patch/role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ingress-nginx-admission
  annotations:
    helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade
    helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: admission-webhook
  namespace: ingress-nginx
rules:
  - apiGroups:
      - ''
    resources:
      - secrets
    verbs:
      - get
      - create
---
# Source: ingress-nginx/templates/admission-webhooks/job-patch/rolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ingress-nginx-admission
  annotations:
    helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade
    helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: admission-webhook
  namespace: ingress-nginx
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: ingress-nginx-admission
subjects:
  - kind: ServiceAccount
    name: ingress-nginx-admission
    namespace: ingress-nginx
---
# Source: ingress-nginx/templates/admission-webhooks/job-patch/job-createSecret.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: ingress-nginx-admission-create
  annotations:
    helm.sh/hook: pre-install,pre-upgrade
    helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: admission-webhook
  namespace: ingress-nginx
spec:
  template:
    metadata:
      name: ingress-nginx-admission-create
      labels:
        helm.sh/chart: ingress-nginx-3.30.0
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/version: 0.46.0
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/component: admission-webhook
    spec:
      containers:
        - name: create
          image: docker.io/jettech/kube-webhook-certgen:v1.5.1
          imagePullPolicy: IfNotPresent
          args:
            - create
            - --host=ingress-nginx-controller-admission,ingress-nginx-controller-admission.$(POD_NAMESPACE).svc
            - --namespace=$(POD_NAMESPACE)
            - --secret-name=ingress-nginx-admission
          env:
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
      restartPolicy: OnFailure
      serviceAccountName: ingress-nginx-admission
      securityContext:
        runAsNonRoot: true
        runAsUser: 2000
---
# Source: ingress-nginx/templates/admission-webhooks/job-patch/job-patchWebhook.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: ingress-nginx-admission-patch
  annotations:
    helm.sh/hook: post-install,post-upgrade
    helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: admission-webhook
  namespace: ingress-nginx
spec:
  template:
    metadata:
      name: ingress-nginx-admission-patch
      labels:
        helm.sh/chart: ingress-nginx-3.30.0
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/version: 0.46.0
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/component: admission-webhook
    spec:
      containers:
        - name: patch
          image: docker.io/jettech/kube-webhook-certgen:v1.5.1
          imagePullPolicy: IfNotPresent
          args:
            - patch
            - --webhook-name=ingress-nginx-admission
            - --namespace=$(POD_NAMESPACE)
            - --patch-mutating=false
            - --secret-name=ingress-nginx-admission
            - --patch-failure-policy=Fail
          env:
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
      restartPolicy: OnFailure
      serviceAccountName: ingress-nginx-admission
      securityContext:
        runAsNonRoot: true
        runAsUser: 2000

@iainlane

(sorry for commenting on a closed issue...)

With the YAML above (diff vs. the upstream AWS YAML below), I'm getting this when trying to deploy my Ingress:

 Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post "https://ingress-nginx-controller-admission.ingress-nginx.svc:443/networking/v1/ingresses?timeout=10s": x509: certificate is valid for ingress.local, not ingress-nginx-controller-admission.ingress-nginx.svc

Anyone got any ideas how to resolve that? I feel like maybe I missed something when deploying.

Diff:

--- ingress-nginx-aws-unmodified.yaml-donotuse	2021-09-10 11:38:09.160826297 +0100
+++ ingress-nginx-aws.yaml	2021-09-10 11:38:09.332826596 +0100
@@ -263,7 +263,8 @@
   annotations:
     service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
     service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: 'true'
-    service.beta.kubernetes.io/aws-load-balancer-type: nlb
+    service.beta.kubernetes.io/aws-load-balancer-type: "nlb-ip"
+    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
   labels:
     helm.sh/chart: ingress-nginx-4.0.1
     app.kubernetes.io/name: ingress-nginx
@@ -280,12 +281,12 @@
     - name: http
       port: 80
       protocol: TCP
-      targetPort: http
+      targetPort: 8080
       appProtocol: http
     - name: https
       port: 443
       protocol: TCP
-      targetPort: https
+      targetPort: 8443
       appProtocol: https
   selector:
     app.kubernetes.io/name: ingress-nginx
@@ -336,9 +337,11 @@
             - --election-id=ingress-controller-leader
             - --controller-class=k8s.io/ingress-nginx
             - --configmap=$(POD_NAMESPACE)/ingress-nginx-controller
-            - --validating-webhook=:8443
+            - --validating-webhook=:8444
             - --validating-webhook-certificate=/usr/local/certificates/cert
             - --validating-webhook-key=/usr/local/certificates/key
+            - --http-port=8080
+            - --https-port=8443
           securityContext:
             capabilities:
               drop:
@@ -346,7 +349,7 @@
               add:
                 - NET_BIND_SERVICE
             runAsUser: 101
-            allowPrivilegeEscalation: true
+            allowPrivilegeEscalation: false
           env:
             - name: POD_NAME
               valueFrom:

@TBBle

TBBle commented Sep 10, 2021

My guess is that the validating webhook is hitting your https hosting port, which your diff moved to 8443, instead of the actual validating webhook endpoint, which is now listening on port 8444.

In ingress-nginx/templates/controller-deployment.yaml (where the --validating-webhook=:8444 change is), further down in the container there's a ports section, which isn't changed in your diff.

It should look like this:

         ports:
            - name: http
              containerPort: 8080
              protocol: TCP
            - name: https
              containerPort: 8443
              protocol: TCP
            - name: webhook
              containerPort: 8444
              protocol: TCP

to match the command-line changes in your diff. I suspect you have the defaults:

         ports:
            - name: http
              containerPort: 80
              protocol: TCP
            - name: https
              containerPort: 443
              protocol: TCP
            - name: webhook
              containerPort: 8443
              protocol: TCP

Once you have that change, you don't need the targetPort changes, because the port names (http, https) will be correct again. There are other things in the YAML relying on the port name webhook, which is why you're seeing the webhook hit your port 8443.

I'd also suggest using a different https port, so you don't need to change the webhook port at all, to simplify things.

So all-up, the diff should be something like this (hand-rolled, so it probably won't apply literally if I miscounted):

--- ingress-nginx-aws-unmodified.yaml-donotuse	2021-09-10 11:38:09.160826297 +0100
+++ ingress-nginx-aws.yaml	2021-09-10 11:38:09.332826596 +0100
@@ -263,7 +263,8 @@
   annotations:
     service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
     service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: 'true'
-    service.beta.kubernetes.io/aws-load-balancer-type: nlb
+    service.beta.kubernetes.io/aws-load-balancer-type: "nlb-ip"
+    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
   labels:
     helm.sh/chart: ingress-nginx-4.0.1
     app.kubernetes.io/name: ingress-nginx
@@ -336,9 +337,11 @@
             - --election-id=ingress-controller-leader
             - --controller-class=k8s.io/ingress-nginx
             - --configmap=$(POD_NAMESPACE)/ingress-nginx-controller
             - --validating-webhook=:8443
             - --validating-webhook-certificate=/usr/local/certificates/cert
             - --validating-webhook-key=/usr/local/certificates/key
+            - --http-port=8080
+            - --https-port=8081
           securityContext:
             capabilities:
               drop:
@@ -346,7 +349,7 @@
               add:
                 - NET_BIND_SERVICE
             runAsUser: 101
-            allowPrivilegeEscalation: true
+            allowPrivilegeEscalation: false
           env:
             - name: POD_NAME
               valueFrom:
@@ -380,10 +383,10 @@
             failureThreshold: 3
           ports:
              - name: http
-               containerPort: 80
+               containerPort: 8080
                protocol: TCP
              - name: https
-               containerPort: 443
+               containerPort: 8081
                protocol: TCP
              - name: webhook
                containerPort: 8443

If you're doing this with the current Helm chart, then the equivalent values.yaml for these changes should be something like:

controller:
  extraArgs:
    http-port: 8080
    https-port: 8081

  containerPort:
    http: 8080
    https: 8081

  image:
    allowPrivilegeEscalation: false

@iainlane

You got it, thanks so much @TBBle ❤️

I didn't understand that those names were actually defined in this file.

@karthikvishal22

@TBBle, using nlb-ip mode on Fargate isn't supported with the NGINX ingress controller. I can't provision an NLB using the above YAML from @sAnti09. Please help with a workaround.

@TBBle

TBBle commented Oct 3, 2021

@InfectedOne nlb-ip mode is supported in Fargate, according to the docs. You haven't given any details on what went wrong, so there's no way to help you at this point.

That said, I haven't tried the YAML files posted above, I use Helm to manage deployments, and actually don't have a current Fargate/NGINX Ingress deployment to test against. So my posts here are mostly guesswork against the docs.

@karthikvishal22

@TBBle, I have used the same YAML that @sAnti09 posted in this thread. With that, I successfully deployed the controller, but no load balancer gets provisioned. I tried the same with the LB type set to nlb and it was successful. I'm sorry that I can't post any files, as it goes against my org's policy. I have raised a case with AWS support, where they mentioned that nlb-ip targets aren't supported for the NGINX ingress controller as of now. But according to this post it should work, right? So please help me.

@karthikvishal22

@TBBle, the docs you shared refer to the AWS Load Balancer Controller, whereas I'm using the NGINX ingress controller. Does this difference create any issue?

@TBBle

TBBle commented Oct 3, 2021

The AWS Load Balancer Controller creates load balancers; it's separate from the NGINX Ingress Controller, which uses load balancers.

You need to have deployed the AWS Load Balancer Controller before you can use NLB-IP on your cluster for anything; otherwise you're limited to the NLB instance mode supported by the (legacy) built-in AWS load balancer support in k8s or the AWS Cloud Controller (depending on which version of k8s you're running).

My guess is that the AWS support response is that they don't support NGINX Ingress with NLB-IP targets, because the AWS Load Balancer Controller is not yet deployed by default, but that doesn't mean it doesn't work, just that they don't manage it for you.

You don't need nlb-ip mode to run the NGINX Ingress pods on Fargate, though. NLB Instance mode will work. But @sAnti09's YAML is for nlb-ip mode, so it depends on the AWS Load Balancer Controller.
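
In case it's useful, a rough sketch of installing the AWS Load Balancer Controller with Helm; my-cluster is a placeholder, and the IAM/IRSA setup for the aws-load-balancer-controller service account is assumed to already exist (see that controller's own docs):

helm repo add eks https://aws.github.io/eks-charts
helm repo update
# Assumes the IRSA-backed service account already exists in kube-system
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  --namespace kube-system \
  --set clusterName=my-cluster \
  --set serviceAccount.create=false \
  --set serviceAccount.name=aws-load-balancer-controller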

@karthikvishal22

Understood, @TBBle. I will try out your suggestion and report back with the output. Thanks.

@timblaktu

timblaktu commented Nov 9, 2021

@TBBle, @iainlane, @InfectedOne, anyone, are you aware of any branch/release of this ingress-nginx helm chart repo which implements the suggested workarounds for installing the chart on Fargate/EKS? Just wanted to ask before I try hacking this together myself. Thanks.

EDIT: I see that I don't need a branch; what is needed to fix this is simply to provide this YAML as a values file at helm install time, and the Helm docs indicate that -f values will override the values in the chart fetched from the repo.

@TBBle

TBBle commented Nov 9, 2021

For the Helm chart, as far as I know all the changes needed can be made by passing in values when deploying, so there isn't any need for a separate chart repo.

I haven't tested it, and don't currently have a working deployment of Ingress NGINX on EKS Fargate at-hand, but the values overrides at the end of this comment should be a good starting place.

As well as those settings, you'll also need to use the appropriate annotations to run using NLB rather than ELB, so all-up it ends up looking something like

controller:
  extraArgs:
    http-port: 8080
    https-port: 8081

  containerPort:
    http: 8080
    https: 8081

  image:
    allowPrivilegeEscalation: false

  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb-ip"

Edit: Fixed the aws-load-balancer-type to be nlb-ip, as that's required for Fargate. It probably should be

service.beta.kubernetes.io/aws-load-balancer-type: "external"
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"

for current versions of the AWS Load Balancer controller (2.2 onwards), but new versions will recognise the nlb-ip annotation.

@timblaktu

This guy suggests doing effectively the same -f values, but takes it a step further and applies several other values at the command line using --set and --set-string:

helm install nginx-ingress ingress-nginx/ingress-nginx \
    --set-string controller.service.externalTrafficPolicy=Local \
    --set-string controller.service.type=NodePort \
    --set controller.publishService.enabled=true \
    --set serviceAccount.create=true \
    --set rbac.create=true \
    --set-string controller.config.server-tokens=false \
    --set-string controller.config.use-proxy-protocol=false \
    --set-string controller.config.compute-full-forwarded-for=true \
    --set-string controller.config.use-forwarded-headers=true \
    --set controller.metrics.enabled=true \
    --set controller.autoscaling.maxReplicas=1 \
    --set controller.autoscaling.minReplicas=1 \
    --set controller.autoscaling.enabled=true \
    --namespace kube-system \
    -f nginx-values.yaml

Not at all sure how much of this is necessary. Seems like the folks here are saying just the ports (privileged-->unprivileged) and allowPrivilegeEscalation: false are necessary. Right?

@TBBle

TBBle commented Nov 9, 2021

That link is setting up a different system, putting an ALB in front of NGINX Ingress (apparently the author has some business requirement to route all incoming traffic through a single ALB), rather than an NLB as I've been assuming here. So rather than an NLB routing the TCP traffic to NGINX Ingress to handle the HTTP, that setup has ALB handling HTTP and then proxying onwards to NGINX Ingress.

In the linked setup, NGINX Ingress is being used like any other Ingress-exposed Service, and doesn't even need to be NodePort; it could be ClusterIP. So unless you're trying to do that specific setup, the controller.service settings being used there don't apply to you. (I suspect NodePort won't directly function with Fargate, and will need to have EC2 Nodes available to host those ports and then forward the traffic into Fargate. But I've never tested this.)

It seems overly complicated, but it's possible that guide was written before NLB-IP was viable for EKS Fargate, and ALB was the only option? This ticket is definitely that old, but my involvement was only after NLB-IP was available, so I've never looked at using ALB in front of NGINX Ingress.

The rest of the settings being used are up to you, they're not related to Fargate at all, just general Ingress NGINX config.

I definitely would not deploy it into kube-system though, that seems like a bad practice. Better to put it in its own namespace, and create an EKS Fargate profile to match it, I think. That makes it much easier to experiment/test without disrupting other parts of the system.
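
As an illustration only, creating a dedicated namespace plus a matching EKS Fargate profile with eksctl could look like this (cluster and namespace names are placeholders):

kubectl create namespace ingress-nginx
eksctl create fargateprofile \
  --cluster my-cluster \
  --name ingress-nginx \
  --namespace ingress-nginx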

@timblaktu

timblaktu commented Nov 12, 2021

@TBBle sorry for commenting on a closed issue, but I need help that may be related. I've worked around the original problem by reinstalling the Helm chart with the values below, but the resulting ingress-nginx controller pod on my Fargate-only EKS cluster always fails its liveness and readiness probes on startup.

I first increased the livenessProbe:initialDelaySeconds and readinessProbe:initialDelaySeconds Helm chart values to 60s, then 20m, and am still encountering the probe failures, so I suspect the issue is something related to the VPC or container networking that I have botched.

Right now, I'm seeing the errors below in the nginx-ingress-ingress-nginx-controller pod/container logs, saying that binding 0.0.0.0:8443 fails because the address is in use, but I'm at a loss as to what in the sea of configuration I need to adapt/correct.

2021/11/12 17:15:02 [emerg] 27#27: bind() to 0.0.0.0:8443 failed (98: Address in use)
2021/11/12 17:15:02 [emerg] 27#27: bind() to [::]:8443 failed (98: Address in use)
.
.

Here is my helm-chart values file (helm/nginx-values.yaml):

controller: 
  extraArgs: 
    http-port: 8080 
    https-port: 8443 
  containerPort: 
    http: 8080 
    https: 8443 
  service: 
    ports: 
      http: 80 
      https: 443 
    targetPorts: 
      http: 8080 
      https: 8443 
  image: 
    allowPrivilegeEscalation: false
    # https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes
    livenessProbe:
      initialDelaySeconds: 60  # 30
    readinessProbe:
      initialDelaySeconds: 60  # 0
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb"

which I am installing into my fresh cluster using:

helm install nginx-ingress ingress-nginx/ingress-nginx --namespace ingress --set controller.replicaCount=2 --values helm/nginx-values.yaml

Any ideas why the nginx-ingress-ingress-nginx-controller pod/container is not able to bind to zero, i.e. INADDR_ANY?

EDIT: I'm able to execute the netstat command inside the running container as user www-data to confirm that 8443 is indeed already bound, but because I haven't yet figured out how to get root access, the PID/name of the processes are not available to me:

> kubectl exec -n ingress --stdin --tty nginx-ingress-ingress-nginx-controller-74d46b8fd8-85tkh -- netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:10245         0.0.0.0:*               LISTEN      -
tcp        3      0 127.0.0.1:10246         0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:10247         0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8181            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8181            0.0.0.0:*               LISTEN      -
tcp        0      0 :::8443                 :::*                    LISTEN      -
tcp        0      0 :::10254                :::*                    LISTEN      -
tcp        0      0 :::8080                 :::*                    LISTEN      -
tcp        0      0 :::8080                 :::*                    LISTEN      -
tcp        0      0 :::8181                 :::*                    LISTEN      -
tcp        0      0 :::8181                 :::*                    LISTEN      -
> kubectl exec -n ingress --stdin --tty nginx-ingress-ingress-nginx-controller-74d46b8fd8-85tkh -- /bin/bash
bash-5.1$ whoami
www-data
bash-5.1$ ps aux
PID   USER     TIME  COMMAND
    1 www-data  0:00 /usr/bin/dumb-init -- /nginx-ingress-controller --publish-service=ingress/nginx-ingress-ingress-nginx-controller --election-id=ingress-controller-leader --controller-class=k8s.io/ingress-nginx
    8 www-data  0:00 /nginx-ingress-controller --publish-service=ingress/nginx-ingress-ingress-nginx-controller --election-id=ingress-controller-leader --controller-class=k8s.io/ingress-nginx --configmap=ingress/n
   28 www-data  0:00 nginx: master process /usr/local/nginx/sbin/nginx -c /etc/nginx/nginx.conf
   30 www-data  0:00 nginx: worker process
   45 www-data  0:00 /bin/bash
   56 www-data  0:00 ps aux

EDIT: I ended up creating a new issue for what I describe above: #7913

@TBBle

TBBle commented Nov 13, 2021

The problem is that 8443 is already bound for the webhook. That's why I used 8081 in my suggestion, not 8443. The examples using 8443 here had to also move the webhook, which introduces more complexity to the changes, and can lead to weird issues if you get it wrong.

@timblaktu

Thanks so much @TBBle. Clearly a cockpit error on my part, not understanding the innards of the ingress-nginx application and not being able to inspect the running container sufficiently. I have finally installed ingress-nginx on EKS Fargate with your suggestion (and a few other values):

controller:
  extraArgs:
    http-port: 8080
    # Cannot use 8443 because the nginx ingress webhook already binds to 8443:
    #   https://github.com/kubernetes/ingress-nginx/issues/4888#issuecomment-968059561
    https-port: 8081
  containerPort:
    http: 8080
    https: 8081
  image:
    allowPrivilegeEscalation: false
    # https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes
    livenessProbe:
      initialDelaySeconds: 60  # 30
    readinessProbe:
      initialDelaySeconds: 60  # 0
  service:
    annotations:
      # TODO: check if alb type "external" "ip" works, per this comment:
      #       https://github.com/kubernetes/ingress-nginx/issues/4888#issuecomment-964535071
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb-ip"

I was slow getting back here because I had to learn how to troubleshoot broken Helm installations. Thanks again.

@TBBle

TBBle commented Nov 17, 2021

No worries. And thank you for sharing a final, working setup. That's very valuable to have in a single place.

@timblaktu

Well, @TBBle, the setup is "working" in the sense that the helm install process succeeds and the ingress controller pods schedule and run, but the ingress controller Service is stuck indefinitely with a "Pending" EXTERNAL-IP; apparently "Ensuring load balancer" never completes:

> kubectl describe service -n ingress nginx-ingress-ingress-nginx-controller
Name:                     nginx-ingress-ingress-nginx-controller
Namespace:                ingress
Labels:                   app.kubernetes.io/component=controller
                          app.kubernetes.io/instance=nginx-ingress
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=ingress-nginx
                          app.kubernetes.io/version=1.0.4
                          helm.sh/chart=ingress-nginx-4.0.6
Annotations:              meta.helm.sh/release-name: nginx-ingress
                          meta.helm.sh/release-namespace: ingress
                          service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: true
                          service.beta.kubernetes.io/aws-load-balancer-internal: true
                          service.beta.kubernetes.io/aws-load-balancer-type: nlb-ip
Selector:                 app.kubernetes.io/component=controller,app.kubernetes.io/instance=nginx-ingress,app.kubernetes.io/name=ingress-nginx
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       172.20.49.237
IPs:                      172.20.49.237
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  30241/TCP
Endpoints:                10.94.191.54:8080
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  31598/TCP
Endpoints:                10.94.191.54:8081
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
  Type    Reason                Age   From                Message
  ----    ------                ----  ----                -------
  Normal  EnsuringLoadBalancer  10m   service-controller  Ensuring load balancer

This is the same behavior I had seen in an identical EKS cluster that I configured separately in another environment/workspace to use EC2 exclusively for its node implementation (instead of Fargate), with everything else the same in the Terraform project provisioning it... until I added kubernetes.io/cluster/<cluster-name> tags on the subnets in use, because of this.

But I've got the same cluster-specific tags on my fargate-based EKS cluster as well, and the above behavior is still happening. You can see in the above ingress controller service description what annotations I'm using. Perhaps there's another annotation I need here?

Any ideas?
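
For anyone checking the same thing, a hedged example of adding the cluster tag to a subnet with the AWS CLI; the subnet ID and cluster name below are placeholders:

aws ec2 create-tags \
  --resources subnet-0123456789abcdef0 \
  --tags Key=kubernetes.io/cluster/my-cluster,Value=shared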

@TBBle

TBBle commented Nov 20, 2021

Just to be certain, you do have the AWS Load Balancer Controller installed and functional on this cluster, right? That's the prerequisite for using nlb-ip or external for service.beta.kubernetes.io/aws-load-balancer-type.

Note that per secure by default, you may also need to add an annotation service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing to ensure you get an Internet-facing NLB, which I'm assuming is what you want.
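
A hedged sketch of those annotations in Helm values form, assuming the AWS Load Balancer Controller (v2.2+) is installed and an internet-facing NLB with IP targets is wanted:

controller:
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "external"
      service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
      service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"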

@domeales-paloit

Well, @TBBle, the setup is "working" in the sense that the helm install process succeeds and the ingress controller pods schedule and run, but the ingress controller Service is stuck indefinitely with a "Pending" EXTERNAL-IP; apparently "Ensuring load balancer" never completes:

[snip]

This is the same behavior I had seen in an identical EKS cluster that I configured separately in another environment/workspace to use EC2 exclusively for its node implementation (instead of Fargate), with everything else the same in the Terraform project provisioning it... until I added kubernetes.io/cluster/<cluster-name> tags on the subnets in use, because of this.

But I've got the same cluster-specific tags on my fargate-based EKS cluster as well, and the above behavior is still happening. You can see in the above ingress controller service description what annotations I'm using. Perhaps there's another annotation I need here?

Any ideas?

@timblaktu Did you ever solve this? I have the same issue where the service load balancer is never created or becomes ready. The only thing I can think of is that the start-up time of the Fargate pod for the controller is impacting the ability for the load balancer to be created/resolved.

I also have the same setup on another cluster, and I got it working eventually, but I have no idea how I did it.

@TBBle

TBBle commented Aug 9, 2022

You might be able to be better helped if you share the description of the Ingress Nginx controller service, as @timblaktu did earlier.

Looking back at that message, I realise that I was probably right about it being a missing AWS Load Balancer Controller, because there was an EnsuringLoadBalancer message from service-controller, which I think is the one in the in-tree AWS Cloud Provider, but nothing from the AWS Load Balancer Controller. The latter is necessary to support nlb-ip mode, which is necessary to support Fargate. (If an identical setup is working with EC2 nodes elsewhere, then I'm reasonably sure this is the problem, because the in-tree AWS Cloud Provider possibly ignores that tag and sets up an ELB instead of the NLB requested, which'd work fine and transparently for EC2.)

@domeales-paloit

Actually, it turned out that in my case the AWS Load Balancer Controller was only half deployed, and this was the problem. I thought it had been deployed, but it was missing a ServiceAccount from another process.

This works as expected on Fargate for me when using #4888 (comment) (and a properly deployed AWS Load Balancer Controller).
