Ignore cert-manager in LBC's webhooks #16179

Merged 3 commits on Dec 20, 2023

rifelpet
Member

LBC depends on cert-manager, but kops can get into a circular dependency loop when applying these manifests on a new cluster.
The cert-manager pods won't be created because the LBC webhook on "CREATE services" isn't working yet, but the LBC pod can't be created because it depends on a secret volume mount created by cert-manager.

Observe the errors in these protokube logs:

W1218 22:01:20.531717 10658 results.go:63] error from apply on /v1, Kind=Service kube-system/cert-manager: error from apply: error patching object: Internal error occurred: failed calling webhook "mservice.elbv2.k8s.aws": failed to call webhook: Post "https://aws-load-balancer-webhook-service.kube-system.svc:443/mutate-v1-service?timeout=10s": dial tcp 100.69.153.238:443: connect: connection refused

W1218 22:01:22.130490 10658 results.go:63] error from apply on cert-manager.io/v1, Kind=Certificate kube-system/aws-load-balancer-serving-cert: error from apply: error patching object: Internal error occurred: failed calling webhook "webhook.cert-manager.io": failed to call webhook: Post "https://cert-manager-webhook.kube-system.svc:443/mutate?timeout=10s": service "cert-manager-webhook" not found

and kube-system event:

- count: 15
  eventTime: null
  firstTimestamp: "2023-12-18T21:48:54Z"
  involvedObject:
    apiVersion: v1
    kind: Pod
    name: aws-load-balancer-controller-d7cfbcbd4-s42pv
    namespace: kube-system
    resourceVersion: "932"
    uid: ffa0fc59-ed11-42d8-8114-5fc65d33bda4
  lastTimestamp: "2023-12-18T22:03:13Z"
  message: 'MountVolume.SetUp failed for volume "cert" : secret "aws-load-balancer-webhook-tls"
    not found'
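
For context, the missing secret in the event above is the kind of object cert-manager produces from a Certificate resource. A minimal sketch of that relationship follows, using the Certificate name, secret name, and service DNS name that appear in the logs and event. The issuer name is a hypothetical placeholder and the exact spec in the kops addon manifest may differ.

```yaml
# Sketch only: names marked "from the logs" are taken from the output above;
# everything else is an assumption for illustration.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: aws-load-balancer-serving-cert      # from the logs
  namespace: kube-system
spec:
  # cert-manager writes the issued key pair into this Secret, which the
  # LBC pod mounts as its "cert" volume.
  secretName: aws-load-balancer-webhook-tls # from the event
  dnsNames:
  - aws-load-balancer-webhook-service.kube-system.svc  # from the webhook URL
  issuerRef:
    kind: Issuer
    name: example-selfsigned-issuer         # hypothetical issuer name
```

Until cert-manager is running and has processed this Certificate, the Secret does not exist, so the LBC pod stays stuck on the volume mount.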

This should fix the flakiness in this prow job, which started after we upgraded LBC (#16155) and picked up the new "CREATE services" webhook configuration.
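
One way to break the cycle described above is to make the LBC webhook skip cert-manager's objects with an `objectSelector`. The fragment below is a rough sketch of that idea, not the actual diff from this PR. The webhook name, service name, and path come from the protokube log; the configuration name, the label key, and the label value are assumptions about how the cert-manager manifests are labeled.

```yaml
# Sketch only: not the actual change merged in this PR.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: aws-load-balancer-webhook           # assumed configuration name
webhooks:
- name: mservice.elbv2.k8s.aws              # from the protokube log
  admissionReviewVersions: ["v1beta1"]
  sideEffects: None
  failurePolicy: Fail                       # assumed; consistent with the apply error
  clientConfig:
    service:
      name: aws-load-balancer-webhook-service
      namespace: kube-system
      path: /mutate-v1-service
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["services"]
  # Skip any object labeled as belonging to cert-manager, so its Service can
  # be created while the LBC webhook backend is still down.
  # NotIn also matches objects that do not carry the label at all.
  objectSelector:
    matchExpressions:
    - key: app.kubernetes.io/instance       # assumed label key
      operator: NotIn
      values: ["cert-manager"]              # assumed label value
```

With a selector like this, the API server never calls the LBC webhook for cert-manager's own objects, so cert-manager can come up first and issue the certificate that LBC needs.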

LBC depends on cert-manager, but kops can get into a circular dependency loop when applying these manifests on a new cluster.
The cert-manager pods won't be created because the LBC webhook on "CREATE pods" isn't working yet, but the LBC pod can't be created because it depends on a secret volume mount created by cert-manager.
Signed-off-by: Peter Rifel <[email protected]>
@k8s-ci-robot added the labels cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.) and size/M (Denotes a PR that changes 30-99 lines, ignoring generated files.) on Dec 19, 2023
@rifelpet
Member Author

/test pull-kops-e2e-aws-load-balancer-controller

@k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged.) on Dec 19, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hakman


@k8s-ci-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) on Dec 19, 2023
@hakman
Member

hakman commented Dec 19, 2023

/test pull-kops-e2e-aws-load-balancer-controller

@hakman
Member

hakman commented Dec 19, 2023

/lgtm cancel

@k8s-ci-robot removed the lgtm label ("Looks good to me", indicates that a PR is ready to be merged.) on Dec 19, 2023
@rifelpet
Member Author

/test pull-kops-e2e-aws-load-balancer-controller

using amd64 nodes for this job because LBC's e2e test uses single-arch images (example)

@rifelpet
Member Author

/test pull-kops-e2e-aws-load-balancer-controller

@rifelpet
Member Author

/cc @hakman

@k8s-ci-robot requested a review from hakman on December 20, 2023, 02:28
@hakman
Member

hakman commented Dec 20, 2023

/lgtm

@k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged.) on Dec 20, 2023
@k8s-ci-robot merged commit 40ec87b into kubernetes:master on Dec 20, 2023
25 checks passed
@k8s-ci-robot added this to the v1.29 milestone on Dec 20, 2023
@jams008

jams008 commented Aug 22, 2024

Hi, I hit the same issue after enabling the ALB controller via the kops addons. How can I fix this?
I am using kops client version 1.29.0.

MountVolume.SetUp failed for volume "cert" : secret "aws-load-balancer-webhook-tls" not found

Labels
approved (Indicates a PR has been approved by an approver from all required OWNERS files.)
area/addons
cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
lgtm ("Looks good to me", indicates that a PR is ready to be merged.)
size/M (Denotes a PR that changes 30-99 lines, ignoring generated files.)
4 participants