📖 Remove simplistic advice about multiple controllers reconciling same CR #4537

Merged
merged 1 commit into kubernetes-sigs:master from fixup
Feb 3, 2025

Conversation

alvaroaleman
Member

This advice is oversimplifying things and making an "It depends" situation look as if there were a clearly good and a clearly bad way that is the same in all situations. Pretty much none of the issues stated will get better if each controller gets its own CR:

  • Race conditions: Conflict errors can always happen and all controllers need to be able to deal with them. If a full reconciliation is too expensive, they can use something like `retry.RetryOnConflict` (see the sketch after this list)
  • Concurrency issues with different interpretations of state: This example just sounds like buggy software. Copying the state into a new CR doesn't eliminate the problem
  • Maintenance and support difficulties: This is definitely not going to get any better by adding more CRDs into the mix; if anything, it will get more complicated
  • Status tracking complications: This is why conditions exist, and the Kubernetes API conventions explicitly state that controllers need to ignore unknown conditions: "Objects may report multiple conditions, and new types of conditions may be added in the future or by 3rd party controllers." (ref: https://github.com/kubernetes/community/blob/322066e7dba7c5043071392fec427a57f8660734/contributors/devel/sig-architecture/api-conventions.md#typical-status-properties). A sketch of per-controller conditions follows the Pod illustration below
  • Performance issues: If multiple controllers do the same thing, that is a bug regardless of all other considerations and can easily lead to correctness and performance issues. The workqueue locks items while they are reconciled to avoid exactly that, but obviously it doesn't work cross-controller
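
To make the conflict-handling point concrete, here is a minimal sketch of retrying only the conflicting write with `retry.RetryOnConflict` from `k8s.io/client-go/util/retry` instead of re-running a full reconciliation. The `Widget` CRD, its API package, and the `updateStatus` helper are hypothetical and only used for illustration:

```go
import (
	"context"

	"k8s.io/client-go/util/retry"
	"sigs.k8s.io/controller-runtime/pkg/client"

	myv1 "example.com/project/api/v1" // hypothetical API package for the Widget CRD
)

// updateStatus retries only the status write on a 409 Conflict instead of
// re-running the whole, potentially expensive, reconciliation.
func updateStatus(ctx context.Context, c client.Client, desired *myv1.Widget) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		// Re-fetch the latest revision before every attempt so the update is
		// based on the current resourceVersion.
		latest := &myv1.Widget{}
		if err := c.Get(ctx, client.ObjectKeyFromObject(desired), latest); err != nil {
			return err
		}
		latest.Status.Phase = desired.Status.Phase
		return c.Status().Update(ctx, latest)
	})
}
```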

To illustrate the situation, think about the Pod object: in the lifecycle of a pod we usually have at least the cluster-autoscaler, the scheduler, and the kubelet acting on it. Making the cluster-autoscaler act on a PodScaleRequest and the scheduler on a PodScheduleRequest would be a complication, not a simplification.
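
And on the status-tracking point above, a minimal sketch of several controllers sharing one object's status by each owning its own condition type. `ScalingReady`, `markScalingReady`, and the `Widget` type are hypothetical; `meta.SetStatusCondition` comes from `k8s.io/apimachinery/pkg/api/meta`:

```go
import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	myv1 "example.com/project/api/v1" // hypothetical API package for the Widget CRD
)

// markScalingReady shows how one controller updates only the condition type it
// owns; other controllers acting on the same object ignore condition types
// they don't know, as the API conventions require.
func markScalingReady(widget *myv1.Widget) {
	meta.SetStatusCondition(&widget.Status.Conditions, metav1.Condition{
		Type:               "ScalingReady", // condition type owned by this controller
		Status:             metav1.ConditionTrue,
		ObservedGeneration: widget.Generation,
		Reason:             "ScaledUp",
		Message:            "replica count matches the desired count",
	})
}
```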

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 3, 2025
@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Feb 3, 2025
@sbueringer
Member

👍 from my side

Member

@camilamacedo86 camilamacedo86 left a comment


@alvaroaleman

Good catch! Thank you 🥇
I totally agree. I think the original intention might have gotten lost somewhere along the way, or I may have just overlooked it.

Anyway, good catch.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 3, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, camilamacedo86

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 3, 2025
@alvaroaleman
Member Author

@camilamacedo86 thanks for the prompt review. Could you maybe restart the failed CI job? I don't think it's related to the PR, and I don't have perms to do that myself.

@camilamacedo86
Member

@alvaroaleman

I am looking into that.
But it should not block this PR.
Thank you again 👍

@camilamacedo86 camilamacedo86 merged commit 54dbd1f into kubernetes-sigs:master Feb 3, 2025
8 of 10 checks passed
@camilamacedo86
Member

Hi @alvaroaleman

@camilamacedo86 thanks for the prompt review. Could you maybe restart the failed CI job? I don't think it's related to the PR, and I don't have perms to do that myself.

It is solved, too.
You will no longer face this issue on any other PR you push.
Please feel free to keep contributing. Your help is very welcome.

@alvaroaleman alvaroaleman deleted the fixup branch February 3, 2025 19:59