failed calling webhook "mservice.elbv2.k8s.aws" #458

Open
mayurbhagia opened this issue Feb 28, 2024 · 4 comments
Labels
bug Something isn't working

Comments


mayurbhagia commented Feb 28, 2024

I'm installing Spark Operator with YuniKorn from Cloud9 in my AWS account, and install.sh ends with the following two errors:

Error: 2 errors occurred:
│ * Internal error occurred: failed calling webhook "mservice.elbv2.k8s.aws": failed to call webhook: Post "https://aws-load-balancer-webhook-service.kube-system.svc:443/mutate-v1-service?timeout=10s": dial tcp 100.64.184.123:9443: connect: connection refused
│ * Internal error occurred: failed calling webhook "mservice.elbv2.k8s.aws": failed to call webhook: Post "https://aws-load-balancer-webhook-service.kube-system.svc:443/mutate-v1-service?timeout=10s": dial tcp 100.64.184.123:9443: connect: connection refused

@vara-bonthu
Collaborator

I think it's a timing issue. Rerunning terraform apply or install.sh should fix it.

Please feel free to update the troubleshooting guide at https://github.com/awslabs/data-on-eks/blob/main/website/docs/blueprints/troubleshooting/troubleshooting.md if the above approach resolves the issue.

@raykrueger
Contributor

This consistently requires two executions of install.sh currently.
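
Until the ordering issue is fixed, the second run can be scripted. The following is only a minimal sketch, assuming a POSIX shell; the `retry` helper and its attempt and delay arguments are hypothetical and not part of the blueprint.

```shell
# Hypothetical helper (not part of the blueprint): rerun a command until it
# succeeds or the attempt limit is reached.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  # The condition of `until` runs the command; the body only runs on failure.
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempt(s): $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    # Pause so the webhook pod has time to start listening before the rerun.
    sleep "$delay"
  done
  return 0
}
```

For example, `retry 2 30 ./install.sh` would run install.sh at most twice with a 30-second pause in between; the 30 seconds is an arbitrary placeholder, not a measured value.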

@raykrueger
Contributor

raykrueger commented Mar 18, 2024

I'm betting we need to bump up that 10s timeout, but currently we'd be blocked on...
kubernetes-sigs/aws-load-balancer-controller#2711

askulkarni2 added the bug label Mar 19, 2024
@askulkarni2
Collaborator

This is due to a mutating webhook introduced in AWS Load Balancer Controller (LBC) v2.5+. Per the docs:

The AWS LBC provides a mutating webhook for service resources to set the spec.loadBalancerClass field for service of type LoadBalancer on create. This makes the AWS LBC the default controller for service of type LoadBalancer. You can disable this feature and revert to set Cloud Controller Manager (in-tree controller) as the default by setting the helm chart value enableServiceMutatorWebhook to false with --set enableServiceMutatorWebhook=false. You will no longer be able to provision new Classic Load Balancer (CLB) from your kubernetes service unless you disable this feature. Existing CLB will continue to work fine.

If you do not need the webhook, you can disable it as shown here:

  # Turn off mutation webhook for services to avoid ordering issue
  enable_aws_load_balancer_controller = true
  aws_load_balancer_controller = {
    set = [{
      name  = "enableServiceMutatorWebhook"
      value = "false"
    }]
  }

Ref: https://github.com/aws-ia/terraform-aws-eks-blueprints-addons/blob/257677adeed1be54326637cf919cf24df6ad7c06/tests/complete/main.tf#L120-L125

We should add this to our blueprints. I will mark this as a bug for tracking.
