-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Remove FeatureTracker code from Kserve reconciler #1521
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@grdryn: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
OwnsGVK(gvk.KnativeServing, reconciler.Dynamic()). | ||
OwnsGVK(gvk.ServiceMeshMember, reconciler.Dynamic()). | ||
OwnsGVK(gvk.AuthorizationPolicy, reconciler.Dynamic()). | ||
OwnsGVK(gvk.AuthorizationPolicyv1beta1, reconciler.Dynamic()). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably those dynamic watching requires a custom predicate that check for serverless being enabled or not ? similar to https://github.com/opendatahub-io/opendatahub-operator/blob/main/controllers/services/auth/auth_controller.go#L61
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the things that I was going to call out in a self-review, that these dynamic ones didn't seem to actually be triggering if the resources changed. Thanks for the suggestion & link, I'll take a look 👍
knativeIngressDomain := getKnativeDomain(ctx, cli, k) | ||
knativeCertificateSecret := getKnativeCertSecretName(k) | ||
|
||
return map[string]any{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the component and dsci are already part of the template data, so it may not be necessary to add additional elements (but would require to change the templates)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I noticed that, and had originally started down that path of modifying the templates here also. I hit some issues that I didn't care to troubleshoot at the time, so I instead just declared the data that the current templates expect, for now. I could try to address that here, or alternatively it could be done in a follow-up PR so as not to bloat this one (I slightly prefer the idea of that being in a separate PR, because of how we squash commits on merge here...I think it's a reasonable thing to have as a separate commit).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fine with me, in the meantime I could add an additional method to the rendering action to be able to support dynamically computed values, so maybe we can have a good middle-ground
template.WithData(data), | ||
) | ||
|
||
return actionFn(ctx, rr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will cause the action to ignore any caching
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this isn't ideal (and it's the reason I didn't bother adding the template.WithCache()
ActionOpts): ideally we'd be able to add the action directly, rather than having it be recreated on each rr.
I did it this way to be able to have access to the Kserve and DSCI from the rr (and the ctx that gets passed to the action along with the rr at reconcile time), to be able to set the template data.
Do you have any suggestions on how this could be done better? One possibility might be to get the data in an earlier action, and store that somewhere. Is there any reasonable place to store it? The idea of it being some sort of global var makes me a bit nervous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll open a PR to add such support in the rendering action
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -241,28 +294,3 @@ func hashConfigMap(cm *corev1.ConfigMap) (string, error) { | |||
|
|||
return base64.RawURLEncoding.EncodeToString(h), nil | |||
} | |||
|
|||
func ownedViaFT(cli client.Client) handler.MapFunc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably need an action to remove FT ownership in case of an upgrade
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that the FT owner references on the resources that it had previously created will be removed when the FTs are removed, so I think all we need to do here is remove the FTs themselves, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that if the FT is removed, then all the owned object are removed as well so, to aboid side effects while upgrading, we should probably:
- re-own existing resources
- remove FTs
There is some logic in the deploy action , not sure if it is generically applicable to FTs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant about the ownership being removed when the FT is removed, is that the Kserve CR is already there as an additional owner at this stage (actually as the main owner because it has the controller: true
set, where the FT ref didn't) because this new iteration of the reconciler has deployed it now.
So I think as long as we run the deploy action first, and then remove the FTs, in my understanding everything will already have been re-owned. Does that sound correct?
There is some logic in the deploy action , not sure if it is generically applicable to FTs
The FTs weren't being added to the resources, so I don't believe it is: it was using the pre-existing Feature machinery to deploy them directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant about the ownership being removed when the FT is removed, is that the Kserve CR is already there as an additional owner at this stage (actually as the main owner because it has the controller: true set, where the FT ref didn't) because this new iteration of the reconciler has deployed it now.
So I think as long as we run the deploy action first, and then remove the FTs, in my understanding everything will already have been re-owned. Does that sound correct?
Ah yes, now I got it, so yes it should work. The only things is to be sure there is not such controller: true
in the current ownership list, in such case I guess the deploy would fail (well, the SetControllerReference
would)
Description
Jira: https://issues.redhat.com/browse/RHOAIENG-18045
This is a work in progress
How Has This Been Tested?
Screenshot or short clip
Merge criteria