Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Webhook is causing random diff for provisioner's server-side apply #1729

Closed
tbondarchuk opened this issue Apr 27, 2022 · 3 comments · Fixed by #1731
Closed

Webhook is causing random diff for provisioner's server-side apply #1729

tbondarchuk opened this issue Apr 27, 2022 · 3 comments · Fixed by #1731
Assignees
Labels
bug Something isn't working

Comments

@tbondarchuk
Copy link
Contributor

Version

Karpenter: v0.9.0

Kubernetes: v1.22.6-eks-7d68063

Expected Behavior

No diffs for provisoner's server-side apply.

Actual Behavior

Provisioner created with FluxCD or with kubectl but with removed last-applied-configuration annotation is randomly showing diffs for kubectl diff due to constantly changing order of list items in requirements. (diff result provided below)

I suspect that webhook is somehow changing order of list items on validation, but since I have no experience with webhooks development I wasn't able to trace the issue further. Happy to provide any additional information if needed.

Steps to Reproduce the Problem

  1. create provisioner test with kubectl apply -f test.yaml
  2. run kubectl edit provisioner test and remove annotation kubectl.kubernetes.io/last-applied-configuration
  3. run kubectl diff -f test.yaml multiple times
  4. random diff is shown, sometimes no changes (provided below)
  5. delete Karpenter's webhooks: kubectl delete MutatingWebhookConfiguration defaulting.webhook.provisioners.karpenter.sh && kubectl delete ValidatingWebhookConfiguration validation.webhook.provisioners.karpenter.sh && kubectl delete ValidatingWebhookConfiguration validation.webhook.config.karpenter.sh
  6. now kubectl diff shows no diffs at all as expected.

Same issue when deploying provisioners with FluxCD: since Flux does not add last-applied-configuration annotation, it always updates provisioners on reconciliation, in my case every 15 min, resulting in multiple notifications. (I've first created an issue in Flux repo but was able to trace it to webhook)

Resource Specs and Logs

Provisioner manifest
---
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
name: test-flux
spec:
provider:
  securityGroupSelector:
    karpenter.sh/discovery: development
  subnetSelector:
    karpenter.sh/discovery: development
requirements:
- key: kubernetes.io/arch
  operator: In
  values: ["amd64", "arm64"]
- key: karpenter.sh/capacity-type
  operator: In
  values: ["on-demand","spot"]
- key: node.kubernetes.io/instance-type
  operator: In
  values: ["t3.small", "t3.medium", "t3.large"]
taints:
- effect: NoSchedule
  key: dedicated
  value: test-flux
kubectl diff results
➜  kubectl diff -f test-flux.yaml
diff -u -N /var/folders/1w/c8g5cv1n05n47_0mms4ws5fm0000gn/T/LIVE-4237038111/karpenter.sh.v1alpha5.Provisioner..test-flux /var/folders/1w/c8g5cv1n05n47_0mms4ws5fm0000gn/T/MERGED-1128983386/karpenter.sh.v1alpha5.Provisioner..test-flux
--- /var/folders/1w/c8g5cv1n05n47_0mms4ws5fm0000gn/T/LIVE-4237038111/karpenter.sh.v1alpha5.Provisioner..test-flux       2022-04-27 15:37:04.000000000 +0300
+++ /var/folders/1w/c8g5cv1n05n47_0mms4ws5fm0000gn/T/MERGED-1128983386/karpenter.sh.v1alpha5.Provisioner..test-flux     2022-04-27 15:37:04.000000000 +0300
@@ -2,7 +2,7 @@
kind: Provisioner
metadata:
 creationTimestamp: "2022-04-27T11:36:07Z"
-  generation: 1
+  generation: 2
 labels:
   kustomize.toolkit.fluxcd.io/name: karpenter-provisioner
   kustomize.toolkit.fluxcd.io/namespace: flux-system
@@ -45,22 +45,22 @@
   subnetSelector:
     karpenter.sh/discovery: development
 requirements:
+  - key: node.kubernetes.io/instance-type
+    operator: In
+    values:
+    - t3.small
+    - t3.medium
+    - t3.large
 - key: kubernetes.io/arch
   operator: In
   values:
-    - amd64
   - arm64
+    - amd64
 - key: karpenter.sh/capacity-type
   operator: In
   values:
   - on-demand
   - spot
-  - key: node.kubernetes.io/instance-type
-    operator: In
-    values:
-    - t3.small
-    - t3.medium
-    - t3.large
 taints:
 - effect: NoSchedule
   key: dedicated
Webhook logs when no diff is shown
2022-04-27T12:36:51.537Z	INFO	webhook	Kind: "karpenter.sh/v1alpha5, Kind=Provisioner" PatchBytes: null	{"commit": "9b1f078", "knative.dev/kind": "karpenter.sh/v1alpha5, Kind=Provisioner", "knative.dev/namespace": "", "knative.dev/name": "test-flux", "knative.dev/operation": "UPDATE", "knative.dev/resource": "karpenter.sh/v1alpha5, Resource=provisioners", "knative.dev/subresource": "", "knative.dev/userinfo": "{eksadmin aws-iam-authenticator:REDUCTED:REDUCTED [system:masters system:authenticated] map[accessKeyId:[REDUCTED] arn:[arn:aws:sts::REDUCTED:assumed-role/eks-admin/EKSGetTokenAuth] canonicalArn:[arn:aws:iam::REDUCTED:role/eks-admin] sessionName:[EKSGetTokenAuth]]}"}
2022-04-27T12:36:51.537Z	INFO	webhook	remote admission controller audit annotations=map[string]string(nil)	{"commit": "9b1f078", "knative.dev/kind": "karpenter.sh/v1alpha5, Kind=Provisioner", "knative.dev/namespace": "", "knative.dev/name": "test-flux", "knative.dev/operation": "UPDATE", "knative.dev/resource": "karpenter.sh/v1alpha5, Resource=provisioners", "knative.dev/subresource": "", "knative.dev/userinfo": "{eksadmin aws-iam-authenticator:REDUCTED:REDUCTED [system:masters system:authenticated] map[accessKeyId:[REDUCTED] arn:[arn:aws:sts::REDUCTED:assumed-role/eks-admin/EKSGetTokenAuth] canonicalArn:[arn:aws:iam::REDUCTED:role/eks-admin] sessionName:[EKSGetTokenAuth]]}", "admissionreview/uid": "c16568a3-6b46-4cca-b313-c45b52a5b163", "admissionreview/allowed": true, "admissionreview/result": "nil"}
2022-04-27T12:36:51.537Z	DEBUG	webhook	AdmissionReview patch={ type: JSONPatch, body: null }	{"commit": "9b1f078", "knative.dev/kind": "karpenter.sh/v1alpha5, Kind=Provisioner", "knative.dev/namespace": "", "knative.dev/name": "test-flux", "knative.dev/operation": "UPDATE", "knative.dev/resource": "karpenter.sh/v1alpha5, Resource=provisioners", "knative.dev/subresource": "", "knative.dev/userinfo": "{eksadmin aws-iam-authenticator:REDUCTED:REDUCTED [system:masters system:authenticated] map[accessKeyId:[REDUCTED] arn:[arn:aws:sts::REDUCTED:assumed-role/eks-admin/EKSGetTokenAuth] canonicalArn:[arn:aws:iam::REDUCTED:role/eks-admin] sessionName:[EKSGetTokenAuth]]}", "admissionreview/uid": "c16568a3-6b46-4cca-b313-c45b52a5b163", "admissionreview/allowed": true, "admissionreview/result": "nil"}
webhook logs when diff IS shown
2022-04-27T12:37:04.197Z	INFO	webhook	Kind: "karpenter.sh/v1alpha5, Kind=Provisioner" PatchBytes: [{"op":"replace","path":"/spec/requirements/0/key","value":"node.kubernetes.io/instance-type"},{"op":"add","path":"/spec/requirements/0/values/2","value":"t3.large"},{"op":"replace","path":"/spec/requirements/0/values/1","value":"t3.medium"},{"op":"replace","path":"/spec/requirements/0/values/0","value":"t3.small"},{"op":"replace","path":"/spec/requirements/1/key","value":"kubernetes.io/arch"},{"op":"replace","path":"/spec/requirements/1/values/1","value":"amd64"},{"op":"replace","path":"/spec/requirements/1/values/0","value":"arm64"},{"op":"remove","path":"/spec/requirements/2/values/2"},{"op":"replace","path":"/spec/requirements/2/values/1","value":"spot"},{"op":"replace","path":"/spec/requirements/2/values/0","value":"on-demand"},{"op":"replace","path":"/spec/requirements/2/key","value":"karpenter.sh/capacity-type"}]	{"commit": "9b1f078", "knative.dev/kind": "karpenter.sh/v1alpha5, Kind=Provisioner", "knative.dev/namespace": "", "knative.dev/name": "test-flux", "knative.dev/operation": "UPDATE", "knative.dev/resource": "karpenter.sh/v1alpha5, Resource=provisioners", "knative.dev/subresource": "", "knative.dev/userinfo": "{eksadmin aws-iam-authenticator:REDUCTED:REDUCTED [system:masters system:authenticated] map[accessKeyId:[REDUCTED] arn:[arn:aws:sts::REDUCTED:assumed-role/eks-admin/EKSGetTokenAuth] canonicalArn:[arn:aws:iam::REDUCTED:role/eks-admin] sessionName:[EKSGetTokenAuth]]}"}
2022-04-27T12:37:04.197Z	INFO	webhook	remote admission controller audit annotations=map[string]string(nil)	{"commit": "9b1f078", "knative.dev/kind": "karpenter.sh/v1alpha5, Kind=Provisioner", "knative.dev/namespace": "", "knative.dev/name": "test-flux", "knative.dev/operation": "UPDATE", "knative.dev/resource": "karpenter.sh/v1alpha5, Resource=provisioners", "knative.dev/subresource": "", "knative.dev/userinfo": "{eksadmin aws-iam-authenticator:REDUCTED:REDUCTED [system:masters system:authenticated] map[accessKeyId:[REDUCTED] arn:[arn:aws:sts::REDUCTED:assumed-role/eks-admin/EKSGetTokenAuth] canonicalArn:[arn:aws:iam::REDUCTED:role/eks-admin] sessionName:[EKSGetTokenAuth]]}", "admissionreview/uid": "e7ecb32f-10d4-411c-ae87-37d0d4c79ba5", "admissionreview/allowed": true, "admissionreview/result": "nil"}
2022-04-27T12:37:04.197Z	DEBUG	webhook	AdmissionReview patch={ type: JSONPatch, body: [{"op":"replace","path":"/spec/requirements/0/key","value":"node.kubernetes.io/instance-type"},{"op":"add","path":"/spec/requirements/0/values/2","value":"t3.large"},{"op":"replace","path":"/spec/requirements/0/values/1","value":"t3.medium"},{"op":"replace","path":"/spec/requirements/0/values/0","value":"t3.small"},{"op":"replace","path":"/spec/requirements/1/key","value":"kubernetes.io/arch"},{"op":"replace","path":"/spec/requirements/1/values/1","value":"amd64"},{"op":"replace","path":"/spec/requirements/1/values/0","value":"arm64"},{"op":"remove","path":"/spec/requirements/2/values/2"},{"op":"replace","path":"/spec/requirements/2/values/1","value":"spot"},{"op":"replace","path":"/spec/requirements/2/values/0","value":"on-demand"},{"op":"replace","path":"/spec/requirements/2/key","value":"karpenter.sh/capacity-type"}] }	{"commit": "9b1f078", "knative.dev/kind": "karpenter.sh/v1alpha5, Kind=Provisioner", "knative.dev/namespace": "", "knative.dev/name": "test-flux", "knative.dev/operation": "UPDATE", "knative.dev/resource": "karpenter.sh/v1alpha5, Resource=provisioners", "knative.dev/subresource": "", "knative.dev/userinfo": "{eksadmin aws-iam-authenticator:REDUCTED:REDUCTED [system:masters system:authenticated] map[accessKeyId:[REDUCTED] arn:[arn:aws:sts::REDUCTED:assumed-role/eks-admin/EKSGetTokenAuth] canonicalArn:[arn:aws:iam::REDUCTED:role/eks-admin] sessionName:[EKSGetTokenAuth]]}", "admissionreview/uid": "e7ecb32f-10d4-411c-ae87-37d0d4c79ba5", "admissionreview/allowed": true, "admissionreview/result": "nil"}
@tbondarchuk tbondarchuk added the bug Something isn't working label Apr 27, 2022
@tbondarchuk
Copy link
Contributor Author

UPD: traced it specifically to MutatingWebhookConfiguration defaulting.webhook.provisioners.karpenter.sh: removing other webhooks still produced diffs, only stopped after removing this one.

@tbondarchuk
Copy link
Contributor Author

Well, judging by the comments https://github.com/aws/karpenter/blob/089f4e7c10f3f86086c49e9e4bc98c1ec7b9e3c0/pkg/apis/provisioning/v1alpha5/requirements.go#L113 this behavior is by design.

It does make it hard to use with GitOps like FluxCD or ArgoCD (possible workaround is to disable notifications for just provisioner changes I guess).

Perhaps make it to sort requirements alphabetically or something?

@tzneal tzneal self-assigned this Apr 27, 2022
tzneal added a commit to tzneal/karpenter that referenced this issue Apr 27, 2022
@tzneal
Copy link
Contributor

tzneal commented Apr 27, 2022

@aliusmiles Thanks for the great bug report and reproduction steps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants