Rewrote ISM Policy reconciler #846
Conversation
I'm taking a second look at my code to see if there's anything I'm not happy with, and I see that naked returns are used in some places and not in others. The first reconciler (the cluster reconciler) does not use naked returns, and elsewhere the choice between naked returns and explicit return values seems somewhat random. I personally find naked returns quite difficult to read, since I have to backtrack through the code to see where the return value comes from, so I would like to use explicit return values everywhere in the ISM Policy reconciler. Would you have an issue with that for this PR? @swoehrl-mw (I'm mentioning you since you are the one I've seen most here)
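To illustrate the readability concern, here is a minimal sketch of the same function written both ways. `parsePortNaked` and `parsePortExplicit` are hypothetical helpers made up for this comparison; they are not code from the operator.

```go
package main

import (
	"errors"
	"strconv"
)

// parsePortNaked uses named results and naked returns. To know what a
// given `return` yields, the reader has to trace every assignment to
// port and err above it.
func parsePortNaked(s string) (port int, err error) {
	port, err = strconv.Atoi(s)
	if err != nil {
		return
	}
	if port < 1 || port > 65535 {
		err = errors.New("port out of range")
		return // port still holds the out-of-range value here
	}
	return
}

// parsePortExplicit states the returned values at every exit point.
func parsePortExplicit(s string) (int, error) {
	port, err := strconv.Atoi(s)
	if err != nil {
		return 0, err
	}
	if port < 1 || port > 65535 {
		return 0, errors.New("port out of range")
	}
	return port, nil
}
```

Both versions accept the same inputs, but on an out-of-range value the naked version hands back the rejected number alongside the error, which is easy to miss when reading it. The explicit version makes each exit self-describing.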
I also don't understand why there are so many cases where the reconciliation is not requeued. Is this intended? And do I understand correctly that r.updateStatus is a boolean indicating whether the operator should update the CR, or is it something else?
One last question: is there a reason why we check that the cluster ref has not changed? Would we not expect the CR to be applied to any OpenSearch cluster with the given name, especially since the name has to be unique within each namespace?
@swoehrl-mw Is there a possibility of a community call or similar to discuss this? We would like to help improve this, but we need to align on the approach.
Hi @cthtrifork. There currently is no community call for the operator. I'm quite swamped with work, so I don't really have time for a 1:1 call. The best I can offer is an async discussion, either here or in the OpenSearch Slack.
Hi @rkthtrifork. Sorry for not getting back to you, not much time and your comments got lost among all my messages.
This boils down to basically personal style. I'm fine with using explicit returns.
The
That was an early design decision to avoid inconsistencies and make things safer for the operator (for example, what should it do if the cluster changes? Should it try to delete the role from the old one?).
Got it
Got it
We have been talking about why it does this, since we run a test cluster where we sometimes kill OpenSearch completely during downgrades (when we retry an upgrade) or after testing something that didn't go as planned. When we kill OpenSearch completely, we have to delete all the custom resources as well, and we were unsure why it was designed this way. I have no issue whatsoever with adding the check back, but I wanted to raise the question first.
@swoehrl-mw Should I add back the cluster ref check and fix the test and then you can take a look at the PR?
@rkthtrifork Yes, please add back the ref check. That way all reconcilers are consistent.
3cca2d0 to a3b8382
cc6b2c8 to 623268a
Signed-off-by: rkthtrifork <[email protected]>
623268a to c0cc5bb
@swoehrl-mw I fixed up the unit tests and added back the cluster ref check. Would you or someone else take a look?
LGTM. Thanks @rkthtrifork for your contribution.
Hey @cthtrifork, I'm open to setting up a community call. We even have a Slack channel.
I joined the Slack channel as well. I don't know if there's a reason for a community meeting to discuss this now that it's already merged, but I still think there are a few inconsistencies in how the reconcilers handle different cases, and I think I also saw some cases where a reconciler didn't requeue when I would expect it to. Getting those aligned would be cool. I would have to check up on it before I can point to concrete places in the code.
Description
The ISM Policy reconciler was constantly trying to update the ISM Policy, and in some cases it did not requeue the reconciliation. There were possibly other issues as well. Below I have described what caused the different issues I encountered.
One thing I am wondering about: why would we want to create a CR without specifying the cluster ID and then have the operator automatically link it to that cluster ID, so that it breaks if the OpenSearch CR is deleted? Is this intended, and if so, why? I'm talking about the section with the comment "Check cluster ref has not changed".
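For readers unfamiliar with that section, the behaviour in question can be sketched roughly as follows. This is a simplified illustration under assumptions, not the operator's actual code: `policyStatus`, `checkClusterRef`, and `errClusterChanged` are hypothetical names, and the cluster ID is assumed to be something like the UID of the OpenSearch cluster object.

```go
package main

import "errors"

// errClusterChanged signals that the CR now resolves to a different
// cluster than the one it was first linked to. (Hypothetical name.)
var errClusterChanged = errors.New("cluster ref has changed")

// policyStatus is a hypothetical stand-in for the CR's status; the ID
// is empty until the first successful reconcile.
type policyStatus struct {
	ManagedClusterID string
}

// checkClusterRef links the CR to the cluster on first use and rejects
// any later reconcile that resolves to a different cluster ID.
func checkClusterRef(status *policyStatus, clusterID string) error {
	if status.ManagedClusterID == "" {
		status.ManagedClusterID = clusterID
		return nil
	}
	if status.ManagedClusterID != clusterID {
		return errClusterChanged
	}
	return nil
}
```

Under the assumption that the ID is an object UID, deleting the OpenSearch CR and recreating it under the same name produces a new ID, so the check fails until the dependent CR is recreated as well.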
Tested cases:
The test for ISM Policies is currently failing miserably, but I decided to create the PR to get feedback before I dive into fixing it.
Issues Resolved
#833
#732
Possibly other issues
Check List
make lint
If CRDs are changed:
make manifests, and also copied into the helm chart
Please refer to the PR guidelines before submitting this pull request.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.