
Do not compare generated resources against API server but against spec hash #500

pmalek opened this issue Aug 16, 2024 · 6 comments

@pmalek
Member

pmalek commented Aug 16, 2024

Problem statement

Comparing generated resources against what the API server returns, in order to decide whether a patch is required, has proven difficult to get right (e.g. #239).

Instead of this approach, we can use a different technique:

  • Attach a spec hash annotation to generated resources
  • When a patch/update is triggered, compare the newly computed hash against that annotation's value
  • If they differ, patch/update (see the sketch below)
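
A minimal sketch of the annotation step, assuming a hypothetical annotation key and helper names (not taken from the operator's code):

```go
package spechash

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
)

// specHashAnnotation is a hypothetical annotation key used only for illustration.
const specHashAnnotation = "example.com/spec-hash"

// hashSpec serializes the generated spec and hashes the result. encoding/json
// is used for brevity; a real implementation would want a canonical, stable
// serialization so that semantically equal specs always produce the same hash.
func hashSpec(spec appsv1.DeploymentSpec) (string, error) {
	b, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return fmt.Sprintf("%x", sum), nil
}

// annotateWithSpecHash stores the hash of the generated spec on the object
// before it is submitted to the API server.
func annotateWithSpecHash(d *appsv1.Deployment) error {
	h, err := hashSpec(d.Spec)
	if err != nil {
		return err
	}
	if d.Annotations == nil {
		d.Annotations = map[string]string{}
	}
	d.Annotations[specHashAnnotation] = h
	return nil
}
```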

Acceptance criteria

@akunszt
Contributor

akunszt commented Aug 26, 2024

What about fetching the information from the managedFields? I did not check any code, I'm just thinking out loud at the moment. Using a hash of the values can be problematic, as it is common practice to use mutating webhooks to modify resources at runtime (e.g. add an image prefix or modify the image so it is fetched from a local cache, change resource settings, remove CPU limits, use the same value for memory requests and limits, add sidecar containers for logging, observability, or other nefarious reasons, etc.).

One reason for using mutating webhooks is to standardize workloads: many third-party software components lack the required level of configurability (their focus lies elsewhere), and you want to provide a simpler interface to your users than the raw Kubernetes API.
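
For reference, the managedFields information is available on every object's metadata; below is a minimal sketch of reading it (illustrative only, using only standard apimachinery types):

```go
package managedfields

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// printFieldOwners lists which field manager (the operator, kubectl, a
// mutating webhook's controller, etc.) last set which fields, as recorded in
// metadata.managedFields. It only prints ownership metadata; deciding whether
// "our" fields were changed would require parsing the FieldsV1 payload, e.g.
// with the structured-merge-diff library.
func printFieldOwners(entries []metav1.ManagedFieldsEntry) {
	for _, e := range entries {
		fmt.Printf("manager=%s operation=%s", e.Manager, e.Operation)
		if e.FieldsV1 != nil {
			fmt.Printf(" fields=%s", string(e.FieldsV1.Raw))
		}
		fmt.Println()
	}
}
```

It would be called with something like `printFieldOwners(deployment.GetManagedFields())` on an object fetched from the cluster.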

@pmalek
Member Author

pmalek commented Sep 9, 2024

> What about fetching the information from the managedFields? I did not check any code, I'm just thinking out loud at the moment. Using a hash of the values can be problematic, as it is common practice to use mutating webhooks to modify resources at runtime (e.g. add an image prefix or modify the image so it is fetched from a local cache, change resource settings, remove CPU limits, use the same value for memory requests and limits, add sidecar containers for logging, observability, or other nefarious reasons, etc.).
>
> One reason for using mutating webhooks is to standardize workloads: many third-party software components lack the required level of configurability (their focus lies elsewhere), and you want to provide a simpler interface to your users than the raw Kubernetes API.

@akunszt Thanks for your comment. The idea behind this issue is to:

  • generate the podTemplateSpec for a particular resource (which is already done today, as of 1.3.0)
  • calculate a hash of that generated spec, prior to submitting it to the kube API server
  • add that hash to the resource, e.g. via an annotation

Then, upon an update, we repeat the steps above and compare the newly computed hash with the hash stored on the object in the cluster. If they are equal, there's no need to change anything. This method does not produce false positives caused by mutating webhooks changing resources on admission.
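
A rough sketch of that comparison step, reusing the hypothetical `hashSpec` helper and `specHashAnnotation` key from the earlier sketch (same package):

```go
package spechash

import (
	appsv1 "k8s.io/api/apps/v1"
)

// needsUpdate regenerates the desired spec, hashes it, and compares the
// result with the hash annotation stored on the in-cluster object. Only a
// mismatch triggers a patch/update, so fields mutated on admission (by
// mutating webhooks, policy engines, etc.) do not cause false positives.
func needsUpdate(inCluster *appsv1.Deployment, desiredSpec appsv1.DeploymentSpec) (bool, error) {
	desiredHash, err := hashSpec(desiredSpec)
	if err != nil {
		return false, err
	}
	storedHash := inCluster.Annotations[specHashAnnotation]
	return desiredHash != storedHash, nil
}
```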

@scottaubrey

I spent a good chunk of my day wondering why the statuses of the data plane and control plane never go to ready despite the pods being ready. When I finally stumbled upon #239 and this issue, it finally made sense.

I support the idea of the hash so that resources can be mutated at runtime (in my instance, because of a Kyverno policy).

@pmalek
Member Author

pmalek commented Sep 24, 2024

Hi @scottaubrey 👋

Thanks for commenting on the issue. We want to get this implemented to solve these pain points for users, but we're currently working through the prioritized backlog. We hope to have some time to work on this in the not-so-distant future.

@pmalek
Member Author

pmalek commented Nov 19, 2024

I feel the need to comment on this one just to make sure everyone involved is aware of the consequences of implementing this change:

When "fixed", this will change the current behavior of KGO where it continuously enforces its configuration (and therefore corrects in cluster changes of resource that it manages) into enforcing it only on spec updates.

This will make KGO's reconciliation loop not fight with any solutions that are installed in users' clusters that could potentially come in conflict with KGO's actions on its managed resources but this will in turn make its actions not perform the self healing/correction.
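
To make the trade-off concrete, a rough contrast of the two update predicates (illustrative only; the deep comparison below is exactly the kind of check that has proven hard to get right, see #239):

```go
package spechash

import (
	"reflect"

	appsv1 "k8s.io/api/apps/v1"
)

// Continuous enforcement (current behavior): any drift of the live object
// from the freshly generated spec gets patched back, including drift
// introduced by other controllers, policy engines, or manual edits.
func needsUpdateEnforcing(inCluster *appsv1.Deployment, desired appsv1.DeploymentSpec) bool {
	return !reflect.DeepEqual(inCluster.Spec, desired)
}

// With the hash-based approach (needsUpdate from the earlier sketch), the
// same drift is left alone: the operator-generated spec, and therefore its
// hash annotation, did not change, so no patch is issued and no self-healing
// happens.
```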

@pmalek
Member Author

pmalek commented Jan 24, 2025

I've prepared a short design doc for this: https://docs.google.com/document/d/1MSq_kPcJoCFPh4XqQtjWE5nsEaSMTGHs9qblPJ6DHDY/edit?tab=t.0. Let me know your thoughts.
