Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

controllers: fix dynamic crd predicate #2911

Merged
merged 2 commits into from
Dec 5, 2024

Conversation

leelavg
Copy link
Contributor

@leelavg leelavg commented Nov 28, 2024

if a single predicate reject the event then the runtime doesn't queue the reconcile for that controller and for dynamic crd feature multiple crds should be checked for event and at the same time presence of a single matching crd with corresponding predicate functions should initiate the reconcile.

this commit uses helper function in predicate and creates the required logical conditions and refactor the crd predicate to directly take bools based on crd availability.

@leelavg
Copy link
Contributor Author

leelavg commented Nov 28, 2024

/hold

for testing.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 28, 2024
@leelavg
Copy link
Contributor Author

leelavg commented Nov 28, 2024

@rewantsoni ptal.

@leelavg
Copy link
Contributor Author

leelavg commented Nov 28, 2024

if a single predicate reject the event then the runtime doesn't queue the reconcile

	for _, p := range e.predicates {
		if !p.Create(c) {
			return
		}
	}

similar for other events can be found in same file.

@rewantsoni
Copy link
Member

rewantsoni commented Nov 28, 2024

We have moved to EventTypePredicate in ocs-client op to make the predicate more generic, could we align?

@leelavg
Copy link
Contributor Author

leelavg commented Nov 28, 2024

that doesn't solve the issue, possible to take a look at the description and mentioned EventTypePredicate solves it?

@leelavg leelavg force-pushed the predicate-test branch 2 times, most recently from c8765f5 to c170dcf Compare November 28, 2024 11:10
@rewantsoni
Copy link
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 28, 2024
@leelavg
Copy link
Contributor Author

leelavg commented Dec 4, 2024

/unhold

Loaned a cluster which already had provider-client setup and didn't test for virt sc creation but tested for restart when the crd state changes as follows

disregard the slight time diff, seems to be time skew b/n my lab machine and ocp in another region.

================================= when crd is installed, ocs-op restarts within a second

> date --utc && k create -f virt-crd.yaml
Wed Dec  4 07:06:11 UTC 2024
customresourcedefinition.apiextensions.k8s.io/virtualmachines.kubevirt.io created

[...]

{"level":"error","ts":"2024-12-04T07:05:18Z","msg":"Reconciler error","controller":"storagecluster","controllerGroup":"ocs.openshift.io","controllerKind":"[...]
{"level":"info","ts":"2024-12-04T07:05:19Z","logger":"cmd","msg":"Go Version: go1.22.10"}
{"level":"info","ts":"2024-12-04T07:05:19Z","logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":"2024-12-04T07:05:19Z","logger":"cmd","msg":"Cluster is running on OpenShift."}


================================= when crd is deleted, ocs-op restarts within 5 seconds

> date --utc && k delete -f virt-crd.yaml
Wed Dec  4 07:09:04 UTC 2024
customresourcedefinition.apiextensions.k8s.io "virtualmachines.kubevirt.io" deleted

[...]

{"level":"error","ts":"2024-12-04T07:08:08Z","msg":"Reconciler error","controller":"storagecluster","controllerGroup":"ocs.openshift.io","controllerKind":"[...]
{"level":"info","ts":"2024-12-04T07:08:12Z","logger":"cmd","msg":"Go Version: go1.22.10"}
{"level":"info","ts":"2024-12-04T07:08:12Z","logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":"2024-12-04T07:08:12Z","logger":"cmd","msg":"Cluster is running on OpenShift."}

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 4, 2024
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 4, 2024
@leelavg
Copy link
Contributor Author

leelavg commented Dec 4, 2024

#2911 (comment)

  • rebased and added a comment.

@leelavg
Copy link
Contributor Author

leelavg commented Dec 4, 2024

/jira backport release-4.18

@openshift-ci-robot
Copy link

@leelavg: The following backport issues have been created:

Queuing cherrypicks to the requested branches to be created after this PR merges:
/cherrypick release-4.18

In response to this:

/jira backport release-4.18

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-cherrypick-robot

@openshift-ci-robot: once the present PR merges, I will cherry-pick it on top of release-4.18 in a new PR and assign it to you.

In response to this:

@leelavg: The following backport issues have been created:

Queuing cherrypicks to the requested branches to be created after this PR merges:
/cherrypick release-4.18

In response to this:

/jira backport release-4.18

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@nb-ohad
Copy link
Contributor

nb-ohad commented Dec 4, 2024

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 4, 2024
@nb-ohad
Copy link
Contributor

nb-ohad commented Dec 4, 2024

@leelavg we do not want to couple to predicate, each one should stay separate. The event predicate is a generic predicate to filter events based on event type. If the relationship between the predicates is wrong (and AND operation) let's fix that and not change the predicate themselves. Same goes for the order

@leelavg
Copy link
Contributor Author

leelavg commented Dec 4, 2024

let's fix that and not change the predicate themselves

  • the fix is to change the predicate, older predicate function is updated in client-op and we reusing it here as well.

each one should stay separate.

  • we have the Or, And to mix the predicate and we aren't clubbing both

@nb-ohad
Copy link
Contributor

nb-ohad commented Dec 4, 2024

the fix is to change the predicate, the older predicate function is updated in client-op and we reusing it here as well.

Maybe I am misunderstanding the problem or the change. I can see the move from the coupled predicate to the simpler decoupled predicate (name predicate and event type predicate) What I don't understand is the need for an AND(OR(AND, AND))) structure. This does not seem reasonable to me

Can you please describe your filtering logic in words first (maybe in the description), that will help in deciphering the change and reviewing the code

@nb-ohad
Copy link
Contributor

nb-ohad commented Dec 4, 2024

Looking at the code again. You are using the tools the wrong way which complicates things.

Your condition is logically 2 distinct conditions:

  • filter for a fixed list of CRD names
  • filter events for creation only if CRD is does not exists or it is a delete event

This boils down to filtering first on names, then filtering on event types:

predicate.Or(
    util.NamePredicate(crd1Name),
    util.NamePredicaet(crd2Name),
),
util.EventTypePredicate(
    !crd1Available || !crd2Available2,
    false,
    true,
    false,
)

@leelavg
Copy link
Contributor Author

leelavg commented Dec 4, 2024

Can you please describe your filtering logic in words first

  • I mentioned the "why" in pr description and "what" is being done in the code comments.

Your condition is logically 2 distinct conditions:
filter for a fixed list of CRD names
filter events for creation only if CRD is does not exists or it is a delete event

  • I don't think so, the event (create/delete) for a crd (name) is one distinct set. With the suggestion, I need the change for addition or deletion on watching new CRD in two places

or it is a delete event

  • not for all delete events, if the required crd didn't exist when we started the controller we don't want a reconcile if a delete event is trigerred on that

@nb-ohad
Copy link
Contributor

nb-ohad commented Dec 4, 2024

I don't think so, the event (create/delete) for a crd (name) is one distinct set. With the suggestion, I need the change for addition or deletion on watching new CRD in two places

No it doesn't, the queued event does not care

not for all delete events, if the required crd didn't exist when we started the controller we don't want a reconcile if a delete event is trigerred on that

This is why we have the named filter to only trigger for named CRDs. Add to that the fact that you cannot get a DELETE event for a CR that does not exists from the start

@leelavg leelavg force-pushed the predicate-test branch 2 times, most recently from ba149f8 to 299f35f Compare December 4, 2024 10:37
if a single predicate reject the event then the runtime doesn't queue
the reconcile for that controller and for dynamic crd feature multiple
crds should be checked for event and at the same time presence of a
single matching crd with corresponding predicate functions should
initiate the reconcile.

this commit uses helper function in predicate and creates the required
logical conditions and refactor the crd predicate to directly take
bools based on crd availability.

Signed-off-by: Leela Venkaiah G <[email protected]>
Signed-off-by: Leela Venkaiah G <[email protected]>
@leelavg
Copy link
Contributor Author

leelavg commented Dec 4, 2024

This boils down to filtering first on names, then filtering on event types:

  • took the suggestion based on an offline discussion, thanks.

@nb-ohad
Copy link
Contributor

nb-ohad commented Dec 4, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 4, 2024
Copy link
Contributor

openshift-ci bot commented Dec 4, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: leelavg, nb-ohad, rewantsoni

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 4, 2024
@leelavg
Copy link
Contributor Author

leelavg commented Dec 5, 2024

/unhold
based on #2911 (comment)

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 5, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 13988d9 into red-hat-storage:main Dec 5, 2024
11 checks passed
@openshift-cherrypick-robot

@openshift-ci-robot: new pull request created: #2922

In response to this:

@leelavg: The following backport issues have been created:

Queuing cherrypicks to the requested branches to be created after this PR merges:
/cherrypick release-4.18

In response to this:

/jira backport release-4.18

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants