-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Namespace access #201
Namespace access #201
Conversation
This is a great and needed change since we require this as part of our Knative RabbitMQ eventing setup in the VEBA project which has the cluster in the vmware-system namespace but broker in a different namespace (vmware-functions). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial feedback, I'm still reviewing part of the code.
I ran the integration/controllers tests from this branch, and I have red tests for almost all (perhaps all?) controller tests, in the cases when X references a cluster from an allowed namespace
and when X references a cluster that allows all namespaces
.
I had to slightly tweak the Go mod file to compile the operator (see below). I also think some of the assertions in those tests are not correct. In the "happy case" when an object references a RabbitmqCluster
in the allow-list, the object should be created, and the status condition should reflect "success". I left comments in the sections for the binding controller, and I believe the same applies to the others (still have to go through all the code).
go mod edit -replace=github.com/rabbitmq/[email protected]=github.com/njbarber/cluster-operator@namespace-access
Yes that's what I did locally. I also imagine https://github.com/njbarber/messaging-topology-operator/blob/main/controllers/suite_test.go#L65 will have to be changed based on the version increment for the cluster operator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finished reviewing the code, everything is looking good with the latest changes. There's one last item I'm unsure about:
By adding the field .spec.messagingTopologyNamespaces
in cluster operator, we are adding a field in Cluster Operator that only makes sense in the context of Topology Operator. I feel we are creating a sort-of dependency/relationship in the Cluster Operator, towards the Topology Operator. Right now, the Cluster Operator doesn't "know" anything about the Topology Operator (which is a good thing).
I'm wondering if, in order to maintain this zero-topology-operator-knowledge in the Cluster Operator, should we take a different approach to declare the allow list?
For example, using annotations.
Sure that makes sense to me. Let me know if the recent changes match what you had in mind. Since I'm not familiar with the annotations pattern generally, my initial questions would be
|
The error message in the logs and the status condition seem sufficient to me from the Operator side. We will have to document this feature in RabbitMQ docs. I'm happy to take on that task. The current approach looks good to me. I usually look at AWS Load Balancer Controller for inspiration in annotations. Annotations have an optional prefix, that has to be a DNS domain. We could use |
fafe0c3
to
1761736
Compare
I updated the annotation name with your suggestion and rebased. Let me know if there's anything else. I also closed rabbitmq/cluster-operator#785 as it's no longer relevant. |
I'm happy with the current state. I will go one last time through the PR with somebody else within our team, and we'll merge if nothing else stands out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR workflow in GitHub Actions found a small issue. We have to include the following bits in the Deployment
manifest, otherwise the Topology Operator won't when you deploy it in a Kubernetes cluster:
Hi @njbarber, I was doing some QA with somebody from our team and we found that, since we are returning an error when the resource is not allowed to be created, Kubernetes requeues the request, and the Operator keeps trying indefinitely. If we know the request will never succeed (because it is a permission violation), we should stop trying. By retrying indefinitely, we will flood the logs with the same message over and over, and we can't know if the annotation in If the annotation in |
So just return I believe the default behavior when returning an error is to go into an exponential backoff (which is counterintuitive since there's no reason to try repeatedly initially when there's no chance of the problem being fixed). It will eventually space out the retries though. Were you seeing something different in your testing? I agree with forcing a reconcile on the relevant resource as being an acceptable approach as well - if you just want to go that route. |
@njbarber we don't have reasons to believe that in our testing the back-off was not happening. However, retrying an operation that you don't have the permissions to perform just generally seems pointless and counterproductive (will pollute logs, potentially generate alerts over and over, that kind of thing). |
Just returning Edit: The error in the checks seems to be a flake, I can run the same locally in a GKE cluster and it succeeds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did the last pass of QA. I'm happy with the current state. I will trigger another run of the PR workflow and merge once it passes.
This is what most users expect. Without this, the Messaging Topology Operator will not be able to manage objects from a different namespace. The implications are that all Brokers / Triggers / Sinks need to be created in the same namespace as the RabbitmqCluster. This annotation addresses that limitation. More context: - knative-extensions/eventing-rabbitmq#454 - rabbitmq/messaging-topology-operator#201 Signed-off-by: Gerhard Lazu <[email protected]>
#4532) This is what most users expect. Without this, the Messaging Topology Operator will not be able to manage objects from a different namespace. The implications are that all Brokers / Triggers / Sinks need to be created in the same namespace as the RabbitmqCluster. This annotation addresses that limitation. More context: - knative-extensions/eventing-rabbitmq#454 - rabbitmq/messaging-topology-operator#201 Signed-off-by: Gerhard Lazu <[email protected]>
This closes #147
Note to reviewers: remember to look at the commits in this PR and consider if they can be squashed
Note to contributors: remember to re-generate client set if there are any API changes
Summary Of Changes
When creating a new resource the namespace field of the cluster reference is checked against the
allowedTopologyNamespaces
field for a cluster (this effectively inverts the changes made as part of #83). If a namespace is not specified it assumes the cluster is in the same namespace as the resource being created.Additional Context
Related to rabbitmq/cluster-operator#785.