-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for multiple gRPC client management #93
Add support for multiple gRPC client management #93
Conversation
Awesome! Thanks for putting in the effort on this one, let's run CI and see how we do. Also please do check out #42 (comment), we should be really close to completing our donation to the CNCF and we're trying to document CNCF CLA completion for all contributors as a requirement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for jumping in to do this one @levikobi 🥳
One thing I noticed while reviewing is that we've added a bit of machinery of our own to reconcile things, e.t.c the observer
interface and the SetupReconcilation
method as opposed to leaning into the controller-runtime
machinery more as we had been doing previously. Could you tell me more about the decisions made here, if there were difficulties with trying to rely more on the traditional reconciliation approach?
Thank you for letting me get involved @shaneutt. Regarding your question, this is my first interaction with the controller-runtime SDK, so I tried doing my best while also keeping in mind that this change might not stay for too long. I do understand from your comment though that I could have made a better usage out of the controller runtime.😅 If you can give me some tips about improving things, I will happily apply them. Anyway, I will try to go over the documentation again and see what I've missed. Thanks :) |
We really appreciate you jumping in here! We use operator-sdk (which uses kubebuilder under the hood, which then uses controller-runtime under the hood) and you can have that generate controllers for you that will provide a standard I wouldn't say we're strictly opposed to the solution you've put forth here, but I do think it changes how our controllers work in a way that handles more of the reconciliation machinery more manually than I would have expected. If we continued using our previous pattern I think we could just add one new controller that watches the
I think ideally we would try to do all this within the confines of the higher level mechanisms (reconciler callback types, and watches), though I admit number 4 is the thing that makes this hard to do. I think I would have to experiment a bit with the various watch sources provided by controller-runtime: using a channel source seems like it could be one reasonably simple way to do it, all dataplane client updates resulting in a channel message that triggers re-reconcilation of How about this: do some thinking and looking around let me know if anything stands out to you, and I'll do the same plus also do another round of review here and see if I can come up with anything. Ultimately I don't want you to feel like "you stepped in it" with this one like you're getting pulled into more than you bargained for, so just know that I'm prioritizing your time as a contributor as well as working through some maintenance concerns. |
Thanks for the honest feedback @shaneutt! I understand. |
3ffb31f
to
4c74635
Compare
Hi @shaneutt, I hope this version better fits your intentions. |
4c74635
to
8d65029
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a first pass of review, mostly I think we're looking pretty solid. I haven't dug deep into the route controllers yet. Most of my comments so far are minor, I do think the channel approach should be a workable solution thank you for putting the time and energy into this we really appreciate it! I'll try to get to my next pass soon.
4b3003b
to
8cc1915
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So keeping in mind that this functionality is not meant to exist long term, I think this is close to being ready and we can move forward with this. I added a comment about a TODO we should resolve or follow-up on. Also @astoycos should take a look too since this is a fairly large change (and there's multiple ways we could approach this) and two heads are better than one.
At a higher level though, it seems we're missing an integration/e2e style test that validates these changes. One way we might do this is make the default kind
cluster that's created for the tests include 3 nodes instead of 1, but by default cordon off 2/3 of them.
Our testing framework provides a WithConfig()
option for the kind
cluster builder:
I think it might be reasonable for us to provide a 3-node configuration to that in an e2e test that then cordons off the nodes, then removes the cordon to instigate the Daemonset
deleting, and then creating the pods, and then we could verify the configuration (for TCPRoute
/UDPRoute
) gets applied to the Pods
as they come up, and we could test the traffic and make sure things are working as expected?
Let me know your thoughts on this @levikobi, and if that sounds like a lot or if it sounds pretty straightforward? Curious if you have any other thoughts on testing as well. 🤔
There are also some lint issues that need to be addressed, according to CI: Error: : # github.com/kong/blixt
Error: ./main.go:95:54: undefined: Tee (typecheck) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work @levikobi, thanks so much for tackling this :)
I generally have a bunch of nits and a few questions otherwise I think this is a pretty cool design 👍. Lets just make sure we try and keep as clear of a separation as possible between operator-sdk enabled funcations/vars/uses and the custom channel logic you've added here with comments and smart variable naming. I just push this to not confuse folks in the future.
Additionally we are adding a little complexity so I'd like to make sure we have at least some unit/integration testing added to make sure all of the channel functionality is working as expected. Feel free to make an issue for this and tackle it in a follow up or ask someone else to :)
@@ -35,7 +36,9 @@ type TCPRouteReconciler struct { | |||
client.Client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We no longer neeed RBAC for daemonset's here right? (on the //+kubebuilder lines above
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make sure that all controller specific RBAC tags are aligned to each dedicated controller file and then any RBAC tags that are used in all of the controllers maybe goes in the dataplane.
Sorry I know it wasn't really a problem you introduced :/
|
||
return ctrl.NewControllerManagedBy(mgr). | ||
For(&appsv1.DaemonSet{}, | ||
builder.WithPredicates(predicate.NewPredicateFuncs(r.daemonsetHasMatchingAnnotations)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you possibly filter this even more? i.e Only care about events where the DS's status changes since it's where we get the ready pods from anyways?
status:
currentNumberScheduled: 1
desiredNumberScheduled: 1
numberAvailable: 1
numberMisscheduled: 0
numberReady: 1
observedGeneration: 1
updatedNumberScheduled: 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still relatively new to the tooling, did you mean something like that?
return ctrl.NewControllerManagedBy(mgr).
For(&appsv1.DaemonSet{},
builder.WithPredicates(predicate.NewPredicateFuncs(r.daemonsetHasMatchingAnnotations)),
).
WithEventFilter(predicate.Funcs{
UpdateFunc: func(e event.UpdateEvent) bool {
return true
},
}).
Complete(r)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah close to that see https://sdk.operatorframework.io/docs/building-operators/golang/references/event-filtering/
In bpfd we do something like
// Only reconcile if a bpfprogram has been created for the controller's program type.
func BpfProgramTypePredicate(kind string) predicate.Funcs {
return predicate.Funcs{
GenericFunc: func(e event.GenericEvent) bool {
return e.Object.(*bpfdiov1alpha1.BpfProgram).Spec.Type == kind
},
CreateFunc: func(e event.CreateEvent) bool {
return e.Object.(*bpfdiov1alpha1.BpfProgram).Spec.Type == kind
},
UpdateFunc: func(e event.UpdateEvent) bool {
return e.ObjectNew.(*bpfdiov1alpha1.BpfProgram).Spec.Type == kind
},
DeleteFunc: func(e event.DeleteEvent) bool {
return e.Object.(*bpfdiov1alpha1.BpfProgram).Spec.Type == kind
},
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(which is a bit verbose but has worked for us in the past)
Sound like a good idea, testing is definitely a must for this change. |
Thanks for the review @astoycos! |
Ok, reach out to me if you need any help with this there's some room for interpretation here, it's not terribly straightforward. |
Hey! 👋 Just checking in. Need any help to move this one forward? |
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
d711cda
to
c5b10cf
Compare
439805b
to
2b2324f
Compare
This reverts commit c3e13af.
Co-authored-by: Shane Utt <[email protected]> Signed-off-by: Kobi Levi <[email protected]>
2b2324f
to
28f2ddb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked with @astoycos and rather than hold this PR up and making it bigger (since it's already pretty good size) we're aligned on taking what we have here, and following up (thanks for creating follow-ups like #127).
So we'll follow up with some tweaks and tests, but so far integration is passing, thank you for all your effort on this @levikobi! 🖖
/approve
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: levikobi, shaneutt The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Fixes #46.