Harden the reconciliation loop to work best with the APIserver #29

metral · 2020-07-23T00:05:02Z

While building out the integration testing suite (#28), we uncovered issues in how the finalizer and status updates operated. Most were loop failures that spawned extra reconciliation loops and caused nondeterministic outcomes.

This PR hardens the reconciliation loop based on the findings from running the tests, borrowing approaches from the Helm operator in the operator-sdk examples.

fix(stack-cntlr): do not err on CreateStack if stack already exists
fix(stack-cntlr): harden status and finalizers based on Helm operator

GET and UPDATE API calls to the apiserver rely on optimistic
locking, so an operation can fail if the object it acts on has become stale. This is a feature in k8s, not a bug.

We harden status updates and finalizers using a best-effort approach: fetch
the latest resource, then attempt to update it.
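
As a rough illustration of the get-latest-then-update approach (not the controller's actual code), the sketch below simulates the apiserver's optimistic locking with an in-memory store. `Store`, `Stack`, `ErrConflict`, and `UpdateStateLatest` are hypothetical names made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrConflict stands in for the HTTP 409 returned on a stale write (hypothetical name).
var ErrConflict = errors.New("conflict: object has been modified")

// Stack is a hypothetical, pared-down resource.
type Stack struct {
	ResourceVersion int
	State           string
}

// Store mimics optimistic locking for a single object.
type Store struct{ current Stack }

// Get returns a copy of the latest stored object.
func (s *Store) Get() Stack { return s.current }

// Update accepts the write only if the caller holds the latest version.
func (s *Store) Update(obj Stack) error {
	if obj.ResourceVersion != s.current.ResourceVersion {
		return ErrConflict
	}
	obj.ResourceVersion++
	s.current = obj
	return nil
}

// UpdateStateLatest fetches the latest object first, then applies the change.
// It is best-effort: another writer can still win the race between Get and Update.
func UpdateStateLatest(s *Store, state string) error {
	latest := s.Get()
	latest.State = state
	return s.Update(latest)
}

// StaleWriteConflicts reports whether writing a cached, outdated copy is rejected.
func StaleWriteConflicts() bool {
	s := &Store{}
	cached := s.Get()
	_ = UpdateStateLatest(s, "initializing") // another writer bumps the version
	cached.State = "succeeded"
	return errors.Is(s.Update(cached), ErrConflict)
}

func main() {
	fmt.Println("stale write rejected:", StaleWriteConflicts())
	s := &Store{}
	fmt.Println("fresh write error:", UpdateStateLatest(s, "succeeded"))
}
```

Fetching first narrows the window in which the object can go stale, but does not close it, which is why the approach is only best-effort.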

However, loops can still fail on outdated objects, albeit less frequently, and practically
all events on a Stack can invoke yet another reconciliation loop. We defensively avoid loops
where possible, and lean on the resourceGeneration predicate to keep events from invoking more loops than necessary.
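
To show the idea behind a generation-based predicate, here is a minimal stdlib-only sketch. `Meta` and `GenerationChanged` are hypothetical names. It assumes status is served as a subresource, so status writes change the resource version but not the generation. controller-runtime ships a `GenerationChangedPredicate` built on the same comparison.

```go
package main

import "fmt"

// Meta is a hypothetical stand-in for the metadata carried in an update event.
type Meta struct {
	Generation      int64  // bumped when the spec changes
	ResourceVersion string // bumped on every write, including status writes
}

// GenerationChanged reports whether an update event should trigger a reconcile.
// Events that only touch status or metadata leave Generation untouched and are skipped.
func GenerationChanged(oldObj, newObj Meta) bool {
	return oldObj.Generation != newObj.Generation
}

func main() {
	before := Meta{Generation: 1, ResourceVersion: "100"}
	statusOnly := Meta{Generation: 1, ResourceVersion: "101"}
	specChange := Meta{Generation: 2, ResourceVersion: "102"}

	fmt.Println("reconcile on status-only update:", GenerationChanged(before, statusOnly))
	fmt.Println("reconcile on spec update:", GenerationChanged(before, specChange))
}
```

The trade-off is that the controller's own status and finalizer writes no longer wake the loop, so anything that must react to those changes needs another trigger.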

For more details, see: https://git.io/JJlcx

fix(stack-cntlr): retry HTTP 409s optionally, and requeue HTTP 404s
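
One possible reading of this commit, sketched with the standard library only: conflicts (409) are retried when the caller opts in, and a missing object (404) results in a delayed requeue instead of an error. `Result`, `HandleUpdate`, `retryOnConflict`, the attempt count, and the 5 second delay are all assumptions for illustration. client-go provides a real `retry.RetryOnConflict` helper for the retry part.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Sentinel errors standing in for apiserver responses (hypothetical names).
var (
	ErrConflict = errors.New("409 conflict")
	ErrNotFound = errors.New("404 not found")
)

// Result is a simplified reconcile result: a zero value means "done".
type Result struct {
	RequeueAfter time.Duration
}

// retryOnConflict calls fn up to attempts times, retrying only on ErrConflict.
func retryOnConflict(attempts int, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); !errors.Is(err, ErrConflict) {
			return err
		}
	}
	return err
}

// HandleUpdate runs update, optionally retrying conflicts, and maps a
// not-found error to a delayed requeue instead of a failure.
func HandleUpdate(retry bool, update func() error) (Result, error) {
	attempts := 1
	if retry {
		attempts = 3 // assumed retry budget
	}
	err := retryOnConflict(attempts, update)
	switch {
	case err == nil:
		return Result{}, nil
	case errors.Is(err, ErrNotFound):
		// The object may not be visible yet; look again shortly.
		return Result{RequeueAfter: 5 * time.Second}, nil
	default:
		return Result{}, err
	}
}

func main() {
	calls := 0
	flaky := func() error {
		calls++
		if calls < 3 {
			return ErrConflict
		}
		return nil
	}
	res, err := HandleUpdate(true, flaky)
	fmt.Println(res, err, calls)

	res, err = HandleUpdate(false, func() error { return ErrNotFound })
	fmt.Println(res.RequeueAfter, err)
}
```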

lblackstone

Looks good overall

pkg/controller/stack/stack_controller.go

metral requested review from lukehoban and lblackstone July 23, 2020 00:06

metral changed the title ~~Harden the reconciliation loop to work best with the APIserver and stack updates~~ Harden the reconciliation loop to work best with the APIserver Jul 23, 2020

lblackstone requested changes Jul 23, 2020

metral requested a review from lblackstone July 24, 2020 04:19

metral added 3 commits July 24, 2020 04:20

fix(stack-cntlr): do not err on CreateStack if stack already exists

76a63c3

fix(stack-cntlr): retry HTTP 409s optionally, and requeue HTTP 404s

ef4f834
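
For the CreateStack commit above, the general pattern is to treat "already exists" as success so that a repeated reconcile stays idempotent. The sketch below is a stand-alone illustration: `Backend`, `EnsureStack`, and `ErrAlreadyExists` are hypothetical, and this `CreateStack` is an in-memory stand-in, not the controller's real call.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrAlreadyExists is a hypothetical sentinel for a duplicate create.
var ErrAlreadyExists = errors.New("stack already exists")

// Backend is a hypothetical in-memory stand-in for wherever stacks live.
type Backend struct{ stacks map[string]bool }

// CreateStack fails if the named stack was created before.
func (b *Backend) CreateStack(name string) error {
	if b.stacks[name] {
		return ErrAlreadyExists
	}
	if b.stacks == nil {
		b.stacks = map[string]bool{}
	}
	b.stacks[name] = true
	return nil
}

// EnsureStack creates the stack and treats "already exists" as success,
// so running it again on a later loop is harmless.
func EnsureStack(b *Backend, name string) error {
	if err := b.CreateStack(name); err != nil && !errors.Is(err, ErrAlreadyExists) {
		return err
	}
	return nil
}

func main() {
	b := &Backend{}
	fmt.Println("first loop:", EnsureStack(b, "dev"))
	fmt.Println("second loop:", EnsureStack(b, "dev"))
}
```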

metral force-pushed the metral/rework-reconcile-loop branch 2 times, most recently from e4afee8 to 271c098 Compare July 24, 2020 15:20

lblackstone approved these changes Jul 24, 2020

metral added 3 commits July 24, 2020 15:32

Address feedback

d232572

fix(reconciler): don't UpdateStack if desired state reached

942ec9e

fix(reconciler): don't update status if outputs are empty

532533f
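
The last two commits add guards that skip work when there is nothing to do. A minimal sketch of such guards follows. It assumes, purely for illustration, that "desired state reached" can be checked by comparing a desired commit against the last successful one. `Status`, `NeedsUpdate`, and `ShouldWriteOutputs` are hypothetical names and not taken from the PR.

```go
package main

import "fmt"

// Status is a hypothetical, pared-down Stack status.
type Status struct {
	LastSuccessfulCommit string
	Outputs              map[string]string
}

// NeedsUpdate reports whether an update should run at all.
// Assumption: the desired state is identified by a commit hash.
func NeedsUpdate(desiredCommit string, st Status) bool {
	return st.LastSuccessfulCommit != desiredCommit
}

// ShouldWriteOutputs reports whether there is anything worth writing to status.
// Skipping empty writes avoids an API call and the event it would generate.
func ShouldWriteOutputs(outputs map[string]string) bool {
	return len(outputs) > 0
}

func main() {
	st := Status{LastSuccessfulCommit: "abc123"}
	fmt.Println("update needed (same commit):", NeedsUpdate("abc123", st))
	fmt.Println("update needed (new commit):", NeedsUpdate("def456", st))
	fmt.Println("write empty outputs:", ShouldWriteOutputs(nil))
}
```

Both guards serve the same goal as the rest of the PR: fewer writes to the apiserver mean fewer events and fewer follow-up loops.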

metral force-pushed the metral/rework-reconcile-loop branch from 271c098 to 532533f Compare July 24, 2020 15:32

metral merged commit 3b48172 into master Jul 24, 2020

pulumi-bot deleted the metral/rework-reconcile-loop branch July 24, 2020 15:37

metral mentioned this pull request Jul 29, 2020

Add more StackController loop hardening #34

Merged

viveklak mentioned this pull request Nov 30, 2020

Fix stack state message regression #107

Merged
