Webhook Errors on Clean Install #2902

Closed
bwagner5 opened this issue Nov 21, 2022 · 35 comments · Fixed by kubernetes-sigs/karpenter#91
Labels: bug, v1, webhook

Comments

@bwagner5
Contributor

Version

Karpenter Version: v0.19.1

Kubernetes Version: v1.23.13

Expected Behavior

Expect Karpenter to start without error logs on a clean install.

Actual Behavior

Karpenter errors, seemingly on a race condition with the webhook controller trying to update the CA bundle.

> kubectl logs deploy/karpenter -n karpenter -f
2022-11-21T18:49:55.965Z        ERROR   webhook.ValidationWebhook       Reconcile error {"commit": "27a51c0", "knative.dev/traceid": "18a04250-5241-42dd-a30a-cfcbf244bb4a", "knative.dev/key": "validation.webhook.karpenter.sh", "duration": "19.873µs", "error": "secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
2022-11-21T18:49:55.965Z        ERROR   webhook.ValidationWebhook       Reconcile error {"commit": "27a51c0", "knative.dev/traceid": "a5d1b623-be09-46e3-aff5-126cdd954644", "knative.dev/key": "karpenter/karpenter-cert", "duration": "40.708µs", "error": "secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
2022-11-21T18:49:55.965Z        ERROR   webhook.ConfigMapWebhook        Reconcile error {"commit": "27a51c0", "knative.dev/traceid": "00b44b85-49cd-42c4-b279-c059caa21d1a", "knative.dev/key": "karpenter/karpenter-cert", "duration": "9.842µs", "error": "secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
2022-11-21T18:49:55.965Z        ERROR   webhook.ConfigMapWebhook        Reconcile error {"commit": "27a51c0", "knative.dev/traceid": "2bb50735-e836-4dd0-9b0f-f766c69f9bff", "knative.dev/key": "validation.webhook.config.karpenter.sh", "duration": "21.691µs", "error": "secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
2022-11-21T18:49:56.040Z        ERROR   webhook.ValidationWebhook       Reconcile error {"commit": "27a51c0", "knative.dev/traceid": "cb6357f6-ee74-4439-865a-8a37bc4b3414", "knative.dev/key": "validation.webhook.karpenter.k8s.aws", "duration": "66.640235ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2022-11-21T18:49:56.052Z        INFO    controller.aws.pricing  updated spot pricing with instance types and offerings  {"commit": "27a51c0", "instance-type-count": 561, "offering-count": 1436}
2022-11-21T18:49:56.055Z        INFO    controller      Starting workers        {"commit": "27a51c0", "controller": "provisioner-state", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner", "worker count": 10}
2022-11-21T18:49:56.056Z        ERROR   webhook.ConfigMapWebhook        Reconcile error {"commit": "27a51c0", "knative.dev/traceid": "664c932b-0c4e-483a-99e8-3ce1c24f6670", "knative.dev/key": "validation.webhook.config.karpenter.sh", "duration": "81.512896ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.config.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2022-11-21T18:49:56.060Z        ERROR   webhook.ValidationWebhook       Reconcile error {"commit": "27a51c0", "knative.dev/traceid": "b5fb8bcc-c5ea-47e2-bea2-a93ea1eb8aee", "knative.dev/key": "validation.webhook.karpenter.sh", "duration": "84.146703ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2022-11-21T18:49:56.060Z        ERROR   webhook.DefaultingWebhook       Reconcile error {"commit": "27a51c0", "knative.dev/traceid": "17105fbd-855a-48ef-806b-e3157f35e09c", "knative.dev/key": "defaulting.webhook.karpenter.sh", "duration": "82.45796ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2022-11-21T18:49:56.065Z        INFO    controller      Starting workers        {"commit": "27a51c0", "controller": "node", "controllerGroup": "", "controllerKind": "Node", "worker count": 10}
2022-11-21T18:49:56.066Z        INFO    controller      Starting workers        {"commit": "27a51c0", "controller": "termination", "controllerGroup": "", "controllerKind": "Node", "worker count": 10}
2022-11-21T18:49:56.066Z        INFO    controller      Starting workers        {"commit": "27a51c0", "controller": "counter", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner", "worker count": 10}
2022-11-21T18:49:56.066Z        INFO    controller      Starting workers        {"commit": "27a51c0", "controller": "provisionermetrics", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner", "worker count": 1}
2022-11-21T18:49:56.066Z        INFO    controller      Starting workers        {"commit": "27a51c0", "controller": "inflightchecks", "controllerGroup": "", "controllerKind": "Node", "worker count": 10}
2022-11-21T18:49:56.092Z        ERROR   webhook.DefaultingWebhook       Reconcile error {"commit": "27a51c0", "knative.dev/traceid": "f6435a18-8bac-4b0e-bcff-47b4f662a930", "knative.dev/key": "defaulting.webhook.karpenter.k8s.aws", "duration": "114.811691ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2022-11-21T18:49:57.143Z        INFO    controller.aws.pricing  updated on-demand pricing       {"commit": "27a51c0", "instance-type-count": 499}
2022-11-21T18:51:17.179Z        DEBUG   controller.deprovisioning       discovered EC2 instance types   {"commit": "27a51c0", "instance-type-count": 499}
2022-11-21T18:51:17.250Z        DEBUG   controller.deprovisioning       discovered subnets      {"commit": "27a51c0", "subnets": ["subnet-02fd4171d23ef0007 (us-east-2a)", "subnet-068de41e5a1d85cfd (us-east-2b)", "subnet-0a92ff703b80768c1 (us-east-2a)", "subnet-067ac2435f80fbe02 (us-east-2b)"]}
2022-11-21T18:51:17.369Z        DEBUG   controller.deprovisioning       discovered EC2 instance types zonal offerings for subnets       {"commit": "27a51c0", "subnet-selector": "{\"alpha.eksctl.io/cluster-name\":\"eksworkshop-eksctl\"}"}

Steps to Reproduce the Problem

...
export KARPENTER_VERSION=v0.19.1
> helm upgrade --install --namespace karpenter --create-namespace \
>   karpenter oci://public.ecr.aws/karpenter/karpenter \
>   --version ${KARPENTER_VERSION}\
>   --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=${KARPENTER_IAM_ROLE_ARN} \
>   --set settings.aws.clusterName=${CLUSTER_NAME} \
>   --set settings.aws.clusterEndpoint=${CLUSTER_ENDPOINT} \
>   --set settings.aws.defaultInstanceProfile=KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
>   --set settings.aws.interruptionQueueName=${CLUSTER_NAME} \
>   --set nodeSelector.intent=control-apps \
>   --wait
Release "karpenter" does not exist. Installing it now.
NAME: karpenter
LAST DEPLOYED: Mon Nov 21 18:49:51 2022
NAMESPACE: karpenter
STATUS: deployed
REVISION: 1
TEST SUITE: None

Resource Specs and Logs

See above for Actual Behavior

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@bwagner5 bwagner5 added the bug Something isn't working label Nov 21, 2022
@bwagner5
Contributor Author

#2876

@ellistarn
Contributor

Been playing with this a fair bit. I think that this is due to leader election being disabled in the knative reconcilers. However, if we reenable leader election, things get a lot noisier with the two replicas fighting over the lease.

@njtran
Contributor

njtran commented Dec 6, 2022

Reopening this as the issue is still here and lies with knative.

@Dmitry1987

This happens on a clean install on AWS EKS with version 0.19.3. How can we fix it? Is it critical, or just noise that settles after leader election? These messages are only shown when the pods start; afterwards Karpenter works as expected and nodes are spawned.

2022-12-09T11:37:32.796Z ERROR webhook.ConfigMapWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "55e2e637-4c5e-4da8-87e2-3df075627951", "knative.dev/key": "validation.webhook.config.karpenter.sh", "duration": "55.431539ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.config.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:32.803Z ERROR webhook.ValidationWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "c0df4868-8646-46a2-9d8d-1a4e5f5c4944", "knative.dev/key": "validation.webhook.karpenter.k8s.aws", "duration": "62.174113ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:32.816Z INFO controller.aws.pricing updated spot pricing with instance types and offerings {"commit": "683d4b0", "instance-type-count": 562, "offering-count": 1680}
2022-12-09T11:37:32.835Z ERROR webhook.ValidationWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "77892349-fd32-4fb8-88f7-f37a32980014", "knative.dev/key": "validation.webhook.karpenter.sh", "duration": "94.138809ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:32.835Z ERROR webhook.ValidationWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "17b4d468-f8cf-4009-9fc0-e9ffd8bceb0a", "knative.dev/key": "karpenter/karpenter-cert", "duration": "91.610995ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:32.835Z ERROR webhook.DefaultingWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "190d534f-c21c-44ac-b241-bc2250ebc841", "knative.dev/key": "defaulting.webhook.karpenter.k8s.aws", "duration": "92.665285ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:32.836Z ERROR webhook.DefaultingWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "8a823689-fd28-4942-baac-432d2eab067f", "knative.dev/key": "defaulting.webhook.karpenter.sh", "duration": "92.806907ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:32.851Z ERROR webhook.DefaultingWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "3cf2997a-a103-4b91-924b-36e884779475", "knative.dev/key": "karpenter/karpenter-cert", "duration": "59.566759ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:41.824Z INFO controller.aws.pricing updated on-demand pricing {"commit": "683d4b0", "instance-type-count": 595}
I1209 11:37:49.507135       1 leaderelection.go:258] successfully acquired lease karpenter/karpenter-leader-election

@NaiduVeeraVishnuVardhan

NaiduVeeraVishnuVardhan commented Dec 19, 2022

Happens on upgrading karpenter from 0.16.3 to 0.20.0 as well. Is there any fix for the issue?
2022-12-09T11:37:32.796Z ERROR webhook.ConfigMapWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "55e2e637-4c5e-4da8-87e2-3df075627951", "knative.dev/key": "validation.webhook.config.karpenter.sh", "duration": "55.431539ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.config.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:32.803Z ERROR webhook.ValidationWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "c0df4868-8646-46a2-9d8d-1a4e5f5c4944", "knative.dev/key": "validation.webhook.karpenter.k8s.aws", "duration": "62.174113ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:32.816Z INFO controller.aws.pricing updated spot pricing with instance types and offerings {"commit": "683d4b0", "instance-type-count": 562, "offering-count": 1680}
2022-12-09T11:37:32.835Z ERROR webhook.ValidationWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "77892349-fd32-4fb8-88f7-f37a32980014", "knative.dev/key": "validation.webhook.karpenter.sh", "duration": "94.138809ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:32.835Z ERROR webhook.ValidationWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "17b4d468-f8cf-4009-9fc0-e9ffd8bceb0a", "knative.dev/key": "karpenter/karpenter-cert", "duration": "91.610995ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:32.835Z ERROR webhook.DefaultingWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "190d534f-c21c-44ac-b241-bc2250ebc841", "knative.dev/key": "defaulting.webhook.karpenter.k8s.aws", "duration": "92.665285ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:32.836Z ERROR webhook.DefaultingWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "8a823689-fd28-4942-baac-432d2eab067f", "knative.dev/key": "defaulting.webhook.karpenter.sh", "duration": "92.806907ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:32.851Z ERROR webhook.DefaultingWebhook Reconcile error {"commit": "683d4b0", "knative.dev/traceid": "3cf2997a-a103-4b91-924b-36e884779475", "knative.dev/key": "karpenter/karpenter-cert", "duration": "59.566759ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-09T11:37:41.824Z INFO controller.aws.pricing updated on-demand pricing {"commit": "683d4b0", "instance-type-count": 595}
I1209 11:37:49.507135 1 leaderelection.go:258] successfully acquired lease karpenter/karpenter-leader-election

@armujahid
Contributor

armujahid commented Dec 20, 2022

This happened to me on a clean install of Karpenter v0.20.0, deployed using v4.18.1 of https://github.com/aws-ia/terraform-aws-eks-blueprints/releases/tag/v4.18.1
That repo has an `examples/karpenter` directory that can be used to create a new EKS cluster with Karpenter.

ev/key": "karpenter/karpenter-cert", "duration": "117.122341ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-19T07:52:09.323Z	INFO	controller.provisioner	found provisionable pod(s)	{"commit": "f60dacd", "pods": 3}
2022-12-19T07:52:09.323Z	INFO	controller.provisioner	computed new node(s) to fit pod(s)	{"commit": "f60dacd", "nodes": 1, "pods": 3}
2022-12-19T07:52:09.326Z	INFO	controller.provisioner	launching node with 3 pods requesting {"cpu":"3205m","memory":"170Mi","pods":"8"} from types t3a.xlarge, m5a.xlarge, m6a.xlarge, m5ad.xlarge, t3.xlarge and 102 other(s)	{"commit": "f60dacd", "provisioner": "default-lt"}
2022-12-19T07:52:11.155Z	INFO	controller.provisioner.cloudprovider	launched new instance	{"commit": "f60dacd", "provisioner": "default-lt", "launched-instance": "i-0b42e194fd56487d3", "hostname": "ip-10-20-22-228.ap-south-1.compute.internal", "type": "t3a.xlarge", "zone": "ap-south-1b", "capacity-type": "on-demand"}
2022-12-19T07:53:34.643Z	INFO	controller.deprovisioning	deprovisioning via consolidation delete, terminating 1 nodes ip-10-20-22-228.ap-south-1.compute.internal/t3a.xlarge/on-demand	{"commit": "f60dacd"}
2022-12-19T07:53:34.693Z	INFO	controller.termination	cordoned node	{"commit": "f60dacd", "node": "ip-10-20-22-228.ap-south-1.compute.internal"}
2022-12-19T07:53:34.906Z	INFO	controller.termination	deleted node	{"commit": "f60dacd", "node": "ip-10-20-22-228.ap-south-1.compute.internal"}
2022-12-19T07:56:09.444Z	DEBUG	controller.deprovisioning	discovered EC2 instance types	{"commit": "f60dacd", "instance-type-count": 369}
2022-12-16T10:46:32.331Z	ERROR	webhook.DefaultingWebhook	Reconcile error	{"commit": "f60dacd", "knative.dev/traceid": "20595618-b5cf-44f2-ac69-7623f346f6a9", "knative.dev/key": "defaulting.webhook.karpenter.k8s.aws", "duration": "92.305087ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-16T10:46:32.397Z	ERROR	webhook.ConfigMapWebhook	Reconcile error	{"commit": "f60dacd", "knative.dev/traceid": "2c584730-2e50-4621-801d-721dbe31106b", "knative.dev/key": "validation.webhook.config.karpenter.sh", "duration": "184.799247ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.config.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-16T10:46:32.397Z	ERROR	webhook.ValidationWebhook	Reconcile error	{"commit": "f60dacd", "knative.dev/traceid": "47a61cd2-8c8b-4a20-bb02-0e0407dad5de", "knative.dev/key": "karpenter/karpenter-cert", "duration": "74.579494ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-16T10:46:32.398Z	ERROR	webhook.ValidationWebhook	Reconcile error	{"commit": "f60dacd", "knative.dev/traceid": "1c45f60b-736e-4a07-86ac-ef186a49f4de", "knative.dev/key": "karpenter/karpenter-cert", "duration": "70.071203ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-16T10:46:32.410Z	ERROR	webhook.DefaultingWebhook	Reconcile error	{"commit": "f60dacd", "knative.dev/traceid": "34bda325-4907-428e-b47f-ad6e29e07f42", "knative.dev/key": "karpenter/karpenter-cert", "duration": "82.480505ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2022-12-16T10:46:33.006Z	INFO	controller.aws.pricing	updated on-demand pricing	{"commit": "f60dacd", "instance-type-count": 369}
2022-12-19T08:01:14.828Z	DEBUG	controller.deprovisioning	discovered EC2 instance types	{"commit": "f60dacd", "instance-type-count": 369}

These logs are from two pods. Note that the errors appear in the logs of only one pod; the other pod's logs don't have them. Also, these errors were only seen on 2022-12-16 (when I created a fresh cluster with Karpenter). I am not seeing them now.

This comment is copied from my message at https://kubernetes.slack.com/archives/C02SFFZSA2K/p1671437472589809

@jonathan-innis jonathan-innis added the webhook Issues related to the webhooks label Dec 20, 2022
@FernandoMiguel
Contributor

Upgraded from v0.20 to v0.21 and enabled DriftEnabled. Got this error:

2022-12-28T13:02:14.804Z    DEBUG    controller    karpenter-global-settings config "karpenter-global-settings" config was added or updated: settings.Settings{BatchMaxDuration:v1.Duration{Duration:10000000000}, BatchIdleDuration:v1.Duration{Duration:5000000000}, DriftEnabled:true}    {"commit": "0c8536a-dirty"}
2022-12-28T13:02:14.804Z    DEBUG    controller    karpenter-global-settings config "karpenter-global-settings" config was added or updated: settings.Settings{ClusterName:"fernando", ClusterEndpoint:"https://XXXX.gr7.us-east-1.eks.amazonaws.com", DefaultInstanceProfile:"", EnablePodENI:false, EnableENILimitedPodDensity:true, IsolatedVPC:false, NodeNameConvention:"resource-name", VMMemoryOverheadPercent:0.075, InterruptionQueueName:"Karpenter-fernando", Tags:map[string]string{}}    {"commit": "0c8536a-dirty"}
2022-12-28T13:02:22.879Z    ERROR    webhook.DefaultingWebhook    Reconcile error    {"commit": "0c8536a-dirty", "knative.dev/traceid": "cf5dc332-ab4a-4da8-91c1-2dd25ac95fc3", "knative.dev/key": "defaulting.webhook.karpenter.sh", "duration": "20.341173ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}

restarting both pods seems to fix it

@ellistarn
Contributor

Did the controller not work after these errors? They should just be transient errors that self heal, since both controllers are trying to reconcile the same webhook.
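
For readers unfamiliar with why two replicas reconciling the same webhook produce "the object has been modified", here is a minimal sketch of optimistic concurrency with a retry. It is a toy in-memory model, not Karpenter or knative code: `FakeAPIServer`, `ConflictError`, and `reconcile` are hypothetical names, and the inline retry loop is an assumption standing in for whatever requeue behavior the real reconcilers use.

```python
from dataclasses import dataclass


class ConflictError(Exception):
    """Raised when an update carries a stale resource version."""


@dataclass
class FakeAPIServer:
    """Hypothetical stand-in for an API server holding one webhook config."""
    ca_bundle: str = ""
    resource_version: int = 1

    def get(self):
        # Return the current value together with the version it was read at.
        return self.ca_bundle, self.resource_version

    def update(self, ca_bundle, seen_version):
        # Reject writes based on an outdated read (optimistic concurrency).
        if seen_version != self.resource_version:
            raise ConflictError("the object has been modified")
        self.ca_bundle = ca_bundle
        self.resource_version += 1


def reconcile(server, ca_bundle, stale_read=None, max_attempts=3):
    """Write ca_bundle, re-reading and retrying on conflict.

    Returns the number of conflicts hit before the write succeeded.
    """
    conflicts = 0
    read = stale_read if stale_read is not None else server.get()
    for _ in range(max_attempts):
        _, version = read
        try:
            server.update(ca_bundle, version)
            return conflicts
        except ConflictError:
            conflicts += 1
            read = server.get()  # fetch the latest version and try again
    raise ConflictError("gave up after %d attempts" % max_attempts)


# Two replicas read the same version before either one writes.
server = FakeAPIServer()
read_a = server.get()
read_b = server.get()
conflicts_a = reconcile(server, "CA", stale_read=read_a)
conflicts_b = reconcile(server, "CA", stale_read=read_b)
print(conflicts_a, conflicts_b, server.ca_bundle)  # prints: 0 1 CA
```

In this model the second writer hits exactly one conflict, re-reads, and succeeds, which is the "transient, self-healing" pattern described above.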

@FernandoMiguel
Contributor

I didn't wait long enough. I restarted both pods and they went instantly OK

@ellistarn
Contributor

kubernetes-sigs/karpenter#142 moves the problem, but results in a new class of error. Still self healing.

@mbevc1
Contributor

mbevc1 commented Jan 29, 2023

Is this still an issue in 0.23.0? Still getting occasional errors:

2023-01-29T14:21:21.161Z	ERROR	webhook.DefaultingWebhook	Reconcile error	{"commit": "5a7faa0-dirty", "knative.dev/traceid": "6bfa8c67-4d2d-40e9-b8b8-5bd366011d3c", "knative.dev/key": "karpenter/karpenter-cert", "duration": "22.955234ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2023-01-29T14:21:21.164Z	ERROR	webhook.DefaultingWebhook	Reconcile error	{"commit": "5a7faa0-dirty", "knative.dev/traceid": "ed36960c-bba5-443d-937f-599515ddd11e", "knative.dev/key": "karpenter/karpenter-cert", "duration": "25.715147ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2023-01-29T14:21:21.165Z	ERROR	webhook.ConfigMapWebhook	Reconcile error	{"commit": "5a7faa0-dirty", "knative.dev/traceid": "d0fbf05c-30dd-4d36-9c54-c80b3944dbfe", "knative.dev/key": "karpenter/karpenter-cert", "duration": "26.418149ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.config.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2023-01-29T14:21:21.167Z	ERROR	webhook.ValidationWebhook	Reconcile error	{"commit": "5a7faa0-dirty", "knative.dev/traceid": "96175f82-8ac1-41c2-810a-64623019a8f5", "knative.dev/key": "validation.webhook.karpenter.k8s.aws", "duration": "28.288395ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2023-01-29T14:21:21.180Z	ERROR	webhook.ValidationWebhook	Reconcile error	{"commit": "5a7faa0-dirty", "knative.dev/traceid": "556a1f6c-b2b4-437b-ab59-b26e841b489c", "knative.dev/key": "karpenter/karpenter-cert", "duration": "41.295194ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}

@ellistarn
Contributor

Still an issue -- we need to deep dive this w/ knative. Tbh, I'd prefer to just do kubernetes-sigs/karpenter#103. These webhooks are a pain.

@mbevc1
Contributor

mbevc1 commented Jan 29, 2023

Sounds like a more straightforward and perhaps less complex approach 🤔. Unless there is something specific we want the webhooks for?

@ellistarn
Contributor

We haven't tried to migrate them to just crd builtins. There are some nontrivial defaults that may be hard. In the short term, can you live with the errors?

@mbevc1
Contributor

mbevc1 commented Jan 29, 2023

Hey @ellistarn . Seems it's provisioning fine, it's just a bit noisy with the errors ATM. Short term should be fine and looking forward to solving this later on. Cheers!

@robertd
Contributor

robertd commented Feb 16, 2023

I saw this in one of my clusters too... it's just noise atm

2023-02-16T04:10:45.532Z	ERROR	webhook.DefaultingWebhook	Reconcile error	{"commit": "5a7faa0-dirty", "knative.dev/traceid": "681c03be-5f2c-4919-8169-82e6f0b5468d", "knative.dev/key": "defaulting.webhook.karpenter.sh", "duration": "81.929004ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2023-02-16T04:10:45.532Z	ERROR	webhook.DefaultingWebhook	Reconcile error	{"commit": "5a7faa0-dirty", "knative.dev/traceid": "ef42549f-439b-4cc4-be33-3bdb81a2ede6", "knative.dev/key": "karpenter/karpenter-cert", "duration": "81.772322ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}

@nalshamaajc

Same issue here on Karpenter controller:v0.27.0 and EKS v1.25.6-eks-48e63af

2023-03-24T20:14:41.316Z	ERROR	webhook.ValidationWebhook	Reconcile error	{"commit": "dc3af1a", "knative.dev/traceid": "e7f736ae-cdaa-4403-b38b-6739e034f389", "knative.dev/key": "validation.webhook.karpenter.sh", "duration": "314.833408ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2023-03-29T00:14:41.121Z	ERROR	webhook.ConfigMapWebhook	Reconcile error	{"commit": "dc3af1a", "knative.dev/traceid": "c4790c77-02c0-4e6f-8249-92c48cc862c3", "knative.dev/key": "karpenter/karpenter-cert", "duration": "92.135963ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.config.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2023-03-29T00:14:41.125Z	ERROR	webhook.DefaultingWebhook	Reconcile error	{"commit": "dc3af1a", "knative.dev/traceid": "5ba55c17-e9c8-4ea7-b4ee-9c9b12e65f22", "knative.dev/key": "karpenter/karpenter-cert", "duration": "93.722463ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}

Is there a way to validate if these errors are just noise?

@nalshamaajc

nalshamaajc commented Apr 4, 2023

It seems that these errors were gone after updating to v0.27.1

But I encountered this issue (leaving it here for anyone who needs it).

EDIT
The issue is back.
@ellistarn does the latest release resolve this issue, and is the error below related to it?

2023-04-12T10:23:44.882Z	ERROR	webhook.WebhookCertificates	Reconcile error	{"commit": "7131be2-dirty", "knative.dev/traceid": "ce9f2f51-3ee5-4fdc-a1ce-19caa7807db5", "knative.dev/key": "karpenter/karpenter-cert", "duration": "45.405825ms", "error": "Operation cannot be fulfilled on secrets \"karpenter-cert\": the object has been modified; please apply your changes to the latest version and try again"}

@ellistarn
Contributor

For folks concerned about this error, know that it's just noise unless it happens continuously without going away.

Ideally, we'd prevent it from happening in the first place, but this requires changes upstream to knative/pkg.
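
One rough way to tell startup noise from a continuous problem is to pull the timestamps of the webhook ERROR lines and check whether they all fall shortly after the pod started. This is a minimal sketch, assuming the tab-separated log format pasted in this thread; the function names and the 5-minute grace window are hypothetical, not anything Karpenter defines.

```python
import re
from datetime import datetime, timedelta

# Leading timestamp of the tab-separated log lines shown in this thread,
# restricted to ERROR entries from a "webhook..." logger.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)Z\s+ERROR\s+webhook")


def webhook_error_times(lines):
    """Return the timestamps of all webhook ERROR lines."""
    times = []
    for line in lines:
        match = TS_RE.match(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f"))
    return times


def looks_transient(times, startup, grace=timedelta(minutes=5)):
    """True if every error happened within `grace` of startup.

    The 5-minute default is an arbitrary assumption; tune it for your cluster.
    """
    return all(t - startup <= grace for t in times)
```

Feed it the output of `kubectl logs deploy/karpenter -n karpenter` split into lines, plus the pod start time. If `looks_transient` returns False, the errors are recurring well after startup, which is the case described above as more than noise.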

@mohnishbasha

Ran into this as well on a clean install, following the instructions in the docs here:
https://karpenter.sh/v0.27.5/getting-started/getting-started-with-karpenter/


2023-05-19T20:38:13.310Z	ERROR	webhook.ValidationWebhook	Reconcile error	{"commit": "698f22f-dirty", "knative.dev/traceid": "24b53c0d-c181-4540-bd90-37c07c9c5259", "knative.dev/key": "karpenter/karpenter-cert", "duration": "13.593µs", "error": "secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
2023-05-19T20:38:13.310Z	ERROR	webhook.ValidationWebhook	Reconcile error	{"commit": "698f22f-dirty", "knative.dev/traceid": "018ae8f2-2ef6-4d8a-bf69-deec6982cb9f", "knative.dev/key": "validation.webhook.karpenter.sh", "duration": "6.312µs", "error": "secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
2023-05-19T20:38:13.310Z	ERROR	webhook.ConfigMapWebhook	Reconcile error	{"commit": "698f22f-dirty", "knative.dev/traceid": "f82c5d03-2c01-460f-bf21-2f2c6559612a", "knative.dev/key": "validation.webhook.config.karpenter.sh", "duration": "7.864µs", "error": "secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
2023-05-19T20:38:13.310Z	ERROR	webhook.ConfigMapWebhook	Reconcile error	{"commit": "698f22f-dirty", "knative.dev/traceid": "1f387258-a1d3-4379-834a-f236d7bf3562", "knative.dev/key": "karpenter/karpenter-cert", "duration": "7.302µs", "error": "secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
2023-05-19T20:38:13.310Z	ERROR	webhook.DefaultingWebhook	Reconcile error	{"commit": "698f22f-dirty", "knative.dev/traceid": "ee05a42d-e18f-41f3-999d-42463506b78c", "knative.dev/key": "defaulting.webhook.karpenter.k8s.aws", "duration": "6.831µs", "error": "secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
2023-05-19T20:38:13.310Z	ERROR	webhook.DefaultingWebhook	Reconcile error	{"commit": "698f22f-dirty", "knative.dev/traceid": "bee7256e-0029-40be-bbd3-724bde6d72fe", "knative.dev/key": "karpenter/karpenter-cert", "duration": "5.072µs", "error": "secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
2023-05-19T20:38:13.314Z	ERROR	webhook.ValidationWebhook	Reconcile error	{"commit": "698f22f-dirty", "knative.dev/traceid": "9803f39d-453b-4c1d-9190-ce4f74bd90d1", "knative.dev/key": "karpenter/karpenter-cert", "duration": "8.842µs", "error": "secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
2023-05-19T20:38:13.314Z	ERROR	webhook.ValidationWebhook	Reconcile error	{"commit": "698f22f-dirty", "knative.dev/traceid": "933ab433-39f1-4748-9284-91f532ae7266", "knative.dev/key": "validation.webhook.karpenter.k8s.aws", "duration": "4.613µs", "error": "secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
2023-05-19T20:38:13.341Z	ERROR	webhook.WebhookCertificates	Reconcile error	{"commit": "698f22f-dirty", "knative.dev/traceid": "a415465c-750f-461a-812b-5db585705048", "knative.dev/key": "karpenter/karpenter-cert", "duration": "32.510804ms", "error": "Operation cannot be fulfilled on secrets \"karpenter-cert\": the object has been modified; please apply your changes to the latest version and try again"}

The pods themselves are running OK:

 k get pods -n karpenter --context $iCP                       
NAME                         READY   STATUS    RESTARTS   AGE
karpenter-8676768564-f6tkp   1/1     Running   0          4m39s
karpenter-8676768564-zt22d   1/1     Running   0          4m39s

@vumdao

vumdao commented Aug 27, 2023

I got this issue too on Karpenter v0.27.6

2023-08-27T08:28:10.528Z	ERROR	webhook.ValidationWebhook	Reconcile error	{"commit": "5a2fe84-dirty", "knative.dev/traceid": "ae6b8685-42d2-42d5-8e2e-b788e777a59f", "knative.dev/key": "validation.webhook.karpenter.k8s.aws", "duration": "20.021538ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2023-08-27T08:28:10.529Z	ERROR	webhook.ValidationWebhook	Reconcile error	{"commit": "5a2fe84-dirty", "knative.dev/traceid": "f2516b7d-9ee1-48a5-bcb7-9b0f8b855c67", "knative.dev/key": "validation.webhook.karpenter.sh", "duration": "20.491148ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
2023-08-27T08:28:10.536Z	ERROR	webhook.DefaultingWebhook	Reconcile error	{"commit": "5a2fe84-dirty", "knative.dev/traceid": "08d994b8-4847-4a96-b396-08b44ae3e0dd", "knative.dev/key": "karpenter/karpenter-cert", "duration": "27.551199ms", "error": "failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
2023-08-27T08:28:10.536Z	ERROR	webhook.ConfigMapWebhook	Reconcile error	{"commit": "5a2fe84-dirty", "knative.dev/traceid": "845e104e-97aa-4987-9eca-0df2ae887638", "knative.dev/key": "validation.webhook.config.karpenter.sh", "duration": "27.849126ms", "error": "failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.config.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
# k get pods -n karpenter
NAME                         READY   STATUS    RESTARTS   AGE
karpenter-846d564cbd-68h5n   1/1     Running   0          164m
karpenter-846d564cbd-f6msp   1/1     Running   0          164m

@johnjeffers

This is happening to me with version 0.30.0 in EKS, clean installs, multiple clusters having the same problem. I installed via the helm chart. Only thing I did that's a little unusual is that the helm chart is installed via ArgoCD.

The problem has been happening for about a week and persists after multiple restarts, so whatever is supposed to be self-healing isn't in my case.

@ellistarn
Contributor

Typically this happens if you have webhooks leaked from old karpenter versions. ArgoCD can leak these. Can you print out your webhooks?

@johnjeffers

@ellistarn This is a brand new install. Never used karpenter before. Started on v0.30.0, no upgrades, no old versions to leak from.

I redacted the caBundle data, but I can tell you that it's identical in all the webhooks.

---
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  labels:
    app.kubernetes.io/instance: karpenter-prod
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: karpenter
    app.kubernetes.io/version: 0.30.0
    argocd.argoproj.io/instance: karpenter-prod
    helm.sh/chart: karpenter-v0.30.0
  name: defaulting.webhook.karpenter.k8s.aws
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: true
    controller: true
    kind: Namespace
    name: karpenter
    uid: 6954e09a-2745-487f-a8c5-638eed42f028
  resourceVersion: "536867099"
  uid: 79908808-3985-40df-bd4c-f0c0d4ae0fa2
webhooks:
- admissionReviewVersions:
  - v1
  clientConfig:
    caBundle: REDACTED
    service:
      name: karpenter
      namespace: karpenter
      path: /default/karpenter.k8s.aws
      port: 8443
  failurePolicy: Fail
  matchPolicy: Equivalent
  name: defaulting.webhook.karpenter.k8s.aws
  namespaceSelector:
    matchExpressions:
    - key: webhooks.knative.dev/exclude
      operator: DoesNotExist
  objectSelector: {}
  reinvocationPolicy: IfNeeded
  rules:
  - apiGroups:
    - karpenter.k8s.aws
    apiVersions:
    - v1alpha1
    operations:
    - CREATE
    - UPDATE
    resources:
    - awsnodetemplates
    - awsnodetemplates/status
    scope: '*'
  - apiGroups:
    - karpenter.sh
    apiVersions:
    - v1alpha5
    operations:
    - CREATE
    - UPDATE
    resources:
    - provisioners
    - provisioners/status
    scope: '*'
  sideEffects: None
  timeoutSeconds: 10
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  labels:
    app.kubernetes.io/instance: karpenter-prod
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: karpenter
    app.kubernetes.io/version: 0.30.0
    argocd.argoproj.io/instance: karpenter-prod
    helm.sh/chart: karpenter-v0.30.0
  name: validation.webhook.config.karpenter.sh
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: true
    controller: true
    kind: Namespace
    name: karpenter
    uid: 6954e09a-2745-487f-a8c5-638eed42f028
  resourceVersion: "536867096"
  uid: 028de90c-37f6-496e-80ad-b0187491d817
webhooks:
- admissionReviewVersions:
  - v1
  clientConfig:
    caBundle: REDACTED
    service:
      name: karpenter
      namespace: karpenter
      path: /validate/config.karpenter.sh
      port: 8443
  failurePolicy: Fail
  matchPolicy: Equivalent
  name: validation.webhook.config.karpenter.sh
  namespaceSelector: {}
  objectSelector:
    matchLabels:
      app.kubernetes.io/part-of: karpenter
  rules:
  - apiGroups:
    - ""
    apiVersions:
    - v1
    operations:
    - CREATE
    - UPDATE
    resources:
    - configmaps/*
    scope: Namespaced
  sideEffects: None
  timeoutSeconds: 10
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  labels:
    app.kubernetes.io/instance: karpenter-prod
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: karpenter
    app.kubernetes.io/version: 0.30.0
    argocd.argoproj.io/instance: karpenter-prod
    helm.sh/chart: karpenter-v0.30.0
  name: validation.webhook.karpenter.sh
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: true
    controller: true
    kind: Namespace
    name: karpenter
    uid: 6954e09a-2745-487f-a8c5-638eed42f028
  resourceVersion: "536867097"
  uid: df4ef29f-ae39-4819-8231-e4c033379621
webhooks:
- admissionReviewVersions:
  - v1
  clientConfig:
    caBundle: REDACTED
    service:
      name: karpenter
      namespace: karpenter
      path: /validate/karpenter.sh
      port: 8443
  failurePolicy: Fail
  matchPolicy: Equivalent
  name: validation.webhook.karpenter.sh
  namespaceSelector:
    matchExpressions:
    - key: webhooks.knative.dev/exclude
      operator: DoesNotExist
  objectSelector: {}
  rules:
  - apiGroups:
    - karpenter.sh
    apiVersions:
    - v1alpha5
    operations:
    - CREATE
    - UPDATE
    resources:
    - provisioners
    - provisioners/status
    scope: '*'
  sideEffects: None
  timeoutSeconds: 10
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  labels:
    app.kubernetes.io/instance: karpenter-prod
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: karpenter
    app.kubernetes.io/version: 0.30.0
    argocd.argoproj.io/instance: karpenter-prod
    helm.sh/chart: karpenter-v0.30.0
  name: validation.webhook.karpenter.k8s.aws
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: true
    controller: true
    kind: Namespace
    name: karpenter
    uid: 6954e09a-2745-487f-a8c5-638eed42f028
  resourceVersion: "536867098"
  uid: 533fb637-4811-441a-9866-62491b741331
webhooks:
- admissionReviewVersions:
  - v1
  clientConfig:
    caBundle: REDACTED
    service:
      name: karpenter
      namespace: karpenter
      path: /validate/karpenter.k8s.aws
      port: 8443
  failurePolicy: Fail
  matchPolicy: Equivalent
  name: validation.webhook.karpenter.k8s.aws
  namespaceSelector:
    matchExpressions:
    - key: webhooks.knative.dev/exclude
      operator: DoesNotExist
  objectSelector: {}
  rules:
  - apiGroups:
    - karpenter.k8s.aws
    apiVersions:
    - v1alpha1
    operations:
    - CREATE
    - UPDATE
    resources:
    - awsnodetemplates
    - awsnodetemplates/status
    scope: '*'
  - apiGroups:
    - karpenter.sh
    apiVersions:
    - v1alpha5
    operations:
    - CREATE
    - UPDATE
    resources:
    - provisioners
    - provisioners/status
    scope: '*'
  sideEffects: None
  timeoutSeconds: 10
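
The claim that the caBundle is identical across all webhooks can be checked mechanically, and also compared against the `ca-cert.pem` key of the `karpenter-cert` secret named in the error logs. The following is a sketch only, assuming the input is the JSON from `kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations -o json`; the function names are hypothetical.

```python
import base64
import json


def ca_bundles(webhook_list_json):
    """Map 'kind/config-name/webhook-name' to the decoded caBundle bytes."""
    doc = json.loads(webhook_list_json)
    # kubectl returns a List with "items" for multiple objects;
    # a single object has no "items" key, so wrap it.
    items = doc.get("items", [doc])
    bundles = {}
    for item in items:
        for hook in item.get("webhooks", []):
            raw = hook.get("clientConfig", {}).get("caBundle", "")
            key = f'{item["kind"]}/{item["metadata"]["name"]}/{hook["name"]}'
            bundles[key] = base64.b64decode(raw)
    return bundles


def mismatched(bundles, expected_ca):
    """Return the sorted keys whose caBundle is empty or differs from expected_ca."""
    return sorted(k for k, v in bundles.items() if not v or v != expected_ca)
```

For `expected_ca`, one option is to decode `data["ca-cert.pem"]` from `kubectl get secret karpenter-cert -n karpenter -o json` with `base64.b64decode`. An empty result from `mismatched` means every webhook carries the same CA as the secret.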

@ellistarn
Contributor

Are the errors persistent, or do they eventually go away?

@johnjeffers

They're not constant but they're persistent. Last ones were about 10 min ago, 15 min ago, and an hour ago. Approximately. They're not on any regular schedule I can see. Sometimes only one, sometimes a couple of them just a few seconds apart.

@ellistarn
Contributor

This is due to a known bug in the knative certificate reconciliation. We're moving towards deprecating these webhooks in a future release. If it's not blocking your operations, you can safely ignore them for now.
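
For readers wondering what "the object has been modified; please apply your changes to the latest version and try again" means: the API server rejects an update whose resourceVersion is stale, and the writer is expected to re-read and retry. The toy model below illustrates that mechanism only. It is not Karpenter or knative code, and the `Store` class and `set_ca_bundle` function are hypothetical names.

```python
class Conflict(Exception):
    """Stand-in for the API server's 'object has been modified' rejection."""


class Store:
    """Toy API server: a single object guarded by a resourceVersion."""

    def __init__(self, data):
        self.data = data
        self.resource_version = 1

    def get(self):
        return dict(self.data), self.resource_version

    def update(self, data, resource_version):
        # Reject writes based on a stale read.
        if resource_version != self.resource_version:
            raise Conflict("the object has been modified")
        self.data = data
        self.resource_version += 1


def set_ca_bundle(store, ca, retries=3):
    """Read-modify-write with retry on conflict; returns the attempts used."""
    for attempt in range(1, retries + 1):
        data, version = store.get()
        data["caBundle"] = ca
        try:
            store.update(data, version)
            return attempt
        except Conflict:
            continue
    raise Conflict("retries exhausted")
```

If two writers read the same version and both try to write, the second one gets the conflict. A single logged conflict followed by a successful retry is harmless, which fits the advice that occasional occurrences can be ignored.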

@njtran
Contributor

njtran commented Nov 1, 2023

This should be closed and fixed with v0.33.0, since the webhooks will be disabled by default.

@jonathan-innis
Contributor

We can close this out now since we just released v0.33.

@jonathan-innis
Contributor

Closed by #5159

@rppietrzak

Hi,

> This should be closed and fixed with v0.33.0, since the webhooks will be disabled by default.

I think the problem still exists: the migration procedure to v1 requires enabling the conversion webhook.

I am updating from 0.37.0 to 0.37.2 according to this procedure, with webhooks enabled.
This gives me the same errors as those mentioned in this ticket:

{"level":"ERROR","time":"2024-09-09T11:40:54.198Z","logger":"webhook.ValidationWebhook","message":"Reconcile error","commit":"17dd42b","knative.dev/traceid":"220e1f2e-2bb7-4fd8-95b0-23946f3c6c4b","knative.dev/key":"kube-system/karpenter-cert","duration":"49.36µs","error":"secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
{"level":"ERROR","time":"2024-09-09T11:40:54.198Z","logger":"webhook.DefaultingWebhook","message":"Reconcile error","commit":"17dd42b","knative.dev/traceid":"56df304d-8ca8-4833-8d9a-36ecbe8bdc76","knative.dev/key":"kube-system/karpenter-cert","duration":"34.121µs","error":"secret \"karpenter-cert\" is missing \"ca-cert.pem\" key"}
{"level":"ERROR","time":"2024-09-09T11:40:54.276Z","logger":"webhook.ConfigMapWebhook","message":"Reconcile error","commit":"17dd42b","knative.dev/traceid":"1655effd-271f-49cf-96d7-694f87455a83","knative.dev/key":"validation.webhook.config.karpenter.sh","duration":"66.943301ms","error":"failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.config.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
{"level":"ERROR","time":"2024-09-09T11:40:54.277Z","logger":"webhook.ValidationWebhook","message":"Reconcile error","commit":"17dd42b","knative.dev/traceid":"11d1c743-dc15-4105-b76e-e6431dcd5c06","knative.dev/key":"validation.webhook.karpenter.sh","duration":"67.293104ms","error":"failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
{"level":"ERROR","time":"2024-09-09T11:40:54.277Z","logger":"webhook.DefaultingWebhook","message":"Reconcile error","commit":"17dd42b","knative.dev/traceid":"53a67769-2b41-4877-b42a-896de089a1de","knative.dev/key":"defaulting.webhook.karpenter.k8s.aws","duration":"67.182353ms","error":"failed to update webhook: Operation cannot be fulfilled on mutatingwebhookconfigurations.admissionregistration.k8s.io \"defaulting.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
{"level":"ERROR","time":"2024-09-09T11:40:54.297Z","logger":"webhook.ValidationWebhook","message":"Reconcile error","commit":"17dd42b","knative.dev/traceid":"d8275716-38c1-43dc-a067-41a15a6003b3","knative.dev/key":"validation.webhook.karpenter.k8s.aws","duration":"88.525063ms","error":"failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
{"level":"ERROR","time":"2024-09-09T11:40:54.473Z","logger":"webhook.ValidationWebhook","message":"Reconcile error","commit":"17dd42b","knative.dev/traceid":"c8afe76c-b6d2-4ddb-a3fa-06aa9e7093f4","knative.dev/key":"kube-system/karpenter-cert","duration":"197.294627ms","error":"failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
{"level":"ERROR","time":"2024-09-09T11:40:54.496Z","logger":"webhook.ValidationWebhook","message":"Reconcile error","commit":"17dd42b","knative.dev/traceid":"97bdc097-70fc-4a05-9fe4-2d2f204ff94d","knative.dev/key":"kube-system/karpenter-cert","duration":"225.325787ms","error":"failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.karpenter.k8s.aws\": the object has been modified; please apply your changes to the latest version and try again"}
{"level":"ERROR","time":"2024-09-09T11:40:54.496Z","logger":"webhook.ConfigMapWebhook","message":"Reconcile error","commit":"17dd42b","knative.dev/traceid":"9b66dc1a-dcd7-4f08-b586-3ea77ee52e9b","knative.dev/key":"kube-system/karpenter-cert","duration":"220.199499ms","error":"failed to update webhook: Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io \"validation.webhook.config.karpenter.sh\": the object has been modified; please apply your changes to the latest version and try again"}
{"level":"ERROR","time":"2024-09-09T11:43:42.006Z","logger":"webhook","message":"http: TLS handshake error from 10.112.8.31:42014: remote error: tls: bad certificate\n","commit":"17dd42b"}
{"level":"ERROR","time":"2024-09-09T11:43:42.427Z","logger":"webhook","message":"http: TLS handshake error from 10.112.8.31:42018: remote error: tls: bad certificate\n","commit":"17dd42b"}

@madhavdas

Using 1.0.2, the same errors are present in the deployment logs, and Terraform also breaks while applying the manifest.
Logs:
"https://karpenter.kube-system.svc:8443/?timeout=10s": tls: failed to verify certificate: x509: certificate signed by unknown authority
Terraform applying k8s manifests:
kubectl_manifest.karpenter_node_class: Creating...

│ Error: default failed to run apply: error when creating "/tmp/102807832kubectl_manifest.yaml": Internal error occurred: failed calling webhook "defaulting.webhook.karpenter.k8s.aws": failed to call webhook: Post "https://karpenter.kube-system.svc:8443/?timeout=10s": tls: failed to verify certificate: x509: certificate signed by unknown authority

│ with kubectl_manifest.karpenter_node_class,
│ on ekskarpenter.tf line 96, in resource "kubectl_manifest" "karpenter_node_class":
│ 96: resource "kubectl_manifest" "karpenter_node_class" {

@Wolfgang1966

Wolfgang1966 commented Oct 4, 2024

We see the same error, and the most probable reason is that path and caBundle are missing from the webhook config. I have a working version in one cluster:

Spec:
  Conversion:
    Strategy:  Webhook
    Webhook:
      Client Config:
        Ca Bundle:  LS0tL...
        Service:
          Name:       karpenter-karpenter
          Namespace:  karpenter
          Path:       /conversion/karpenter.k8s.aws
          Port:       8443

And another one giving the tls verification errors @madhavdas reports:

Spec:
  Conversion:
    Strategy:  Webhook
    Webhook:
      Client Config:
        Service:
          Name:       karpenter-karpenter
          Namespace:  karpenter
          Port:       8443

Karpenter version 1.0.2, but also seen with 0.36.5. Installed from the Helm chart using Flux.

From the description in https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/#write-a-conversion-webhook-server I assume that path and caBundle are required, but I was not able to find how to get them there.
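
To compare the two clusters without eyeballing `describe` output, a small check over the CRD JSON can report which conversion fields are absent. This is a sketch under assumptions: the input is the JSON from `kubectl get crd <crd-name> -o json`, and the function name is hypothetical. Note that the Kubernetes API marks both `caBundle` and `service.path` as optional, so the check only reports what is missing; whether that is the cause of the TLS errors is the hypothesis from this comment, not something the code proves.

```python
import json


def missing_conversion_fields(crd_json):
    """Return the conversion webhook fields absent from a CRD."""
    crd = json.loads(crd_json)
    conversion = crd.get("spec", {}).get("conversion", {})
    if conversion.get("strategy") != "Webhook":
        return ["strategy is not Webhook"]
    client = conversion.get("webhook", {}).get("clientConfig", {})
    missing = []
    if not client.get("caBundle"):
        missing.append("caBundle")
    if not client.get("service", {}).get("path"):
        missing.append("service.path")
    return missing
```

For the working cluster above this would return an empty list, and for the failing one it would return `["caBundle", "service.path"]`.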

Maybe noteworthy: Karpenter is running in namespace "karpenter" in both clusters. Configs are practically identical except for the cluster names.

Any idea?

@AnkitBhalla22

Should we reopen it? We are seeing this same issue in the Karpenter 0.37.3 upgrade. @ellistarn, please comment.

@Wolfgang1966

The error (and then the solution) in my case turned out to be that we used images from a local registry and forgot to update the image reference together with the Helm chart version. After we corrected this, the update went through successfully. So double-check that you are really running the correct image versions when updating the Helm charts.
Update the CRD chart BEFORE you update the Karpenter chart and ensure that the CRDs show both versions and the conversion hook before you continue. Obeying that, we updated without problems from 0.36.2 via 0.36.5 to 1.0.5.
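
The image-versus-chart mismatch described here can be caught with a quick comparison of the Deployment's version label and its container image tags. This is a minimal sketch, assuming the Deployment carries the same `app.kubernetes.io/version` label that the webhook configurations earlier in this thread show, and that the input is the JSON from `kubectl get deploy karpenter -n karpenter -o json`. The function names are hypothetical.

```python
import json


def image_tag(image):
    """Extract the tag from an image reference, ignoring digest and registry port."""
    ref = image.split("@", 1)[0]      # drop "@sha256:..." if present
    last = ref.rsplit("/", 1)[-1]     # a registry host:port sits before the last "/"
    return last.split(":", 1)[1] if ":" in last else ""


def stale_images(deployment_json):
    """Return container images whose tag differs from the chart's version label."""
    dep = json.loads(deployment_json)
    labels = dep["metadata"].get("labels", {})
    want = labels.get("app.kubernetes.io/version", "").lstrip("v")
    containers = dep["spec"]["template"]["spec"]["containers"]
    return [c["image"] for c in containers if image_tag(c["image"]).lstrip("v") != want]
```

An empty list means the running image tags line up with the chart version. Anything returned is an image that was probably not bumped along with the chart, for example a reference to a local registry mirror.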
