-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove cert-manager as dependency #727
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #727 +/- ##
==========================================
+ Coverage 80.20% 83.03% +2.82%
==========================================
Files 64 77 +13
Lines 4492 5978 +1486
==========================================
+ Hits 3603 4964 +1361
- Misses 600 669 +69
- Partials 289 345 +56
Flags with carried forward coverage won't be shown. Click here to find out more.
|
This makes sense to me @eguzki I guess once a release that supports all namespaces is available we can move back to having it as a dependency? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
@mikenairn might be worth a look over also?
Seems fine, but should we not consider just adding this as a pattern for all polices that have a dependency on CRDs not provided by the kuadrant operator i.e. all of them? I know we control the other operators currently required, and therefore less likely to be an issue, but a consistent message on any policy that can't work due to missing CRDs might be beneficial no? Presumably this could be an issue for these operators also if they happened to be installed on cluster already, at least it seems we can't be sure what OLM is going to do. |
Added verification steps. I like doing that, only just for the record and future reference. |
Once this PR is merged, @maleck13 can you add the new "documentation requirement"? Or give me a hint about where should I add that. |
I can verify this fixes the automatic installation of RH cert-manager on Openshift. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look fine, but as pointed out in chat, some of the changes here might be an issue #715
Indeed, if the TLS policy controller starts watching not exiting types, the controller needs to be disabled, and therefore the status would not be populated. I proposed an independent common status controller to fulfill that need in the following issue #730 |
makes sense
Could be done as some other issue, but I think so yes; if someone installs our operator with some kube distro with a UI, and doesn't install stuff that put other CRDs we need for certain policies in place, we don't really surface that well right now. easy for users to end up with a broken install, with little idea as to why. |
I think it should probably go in here https://docs.kuadrant.io/0.8.0/kuadrant-operator/doc/install/install-openshift/ |
@@ -243,9 +261,9 @@ spec: | |||
storage: | |||
redis-cached: | |||
configSecretRef: | |||
name: redis-config | |||
EOF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My IDE does not like trailing spaces :) And me neither.
@maleck13 kindly asking for review regarding the last commit about the update on the openshift installation doc. |
|
||
Install one of the different flavours of the Cert-Manager. | ||
|
||
#### Install community version of the cert-manager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to specify a version or min version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We did not specify version or require min version until now.
Is there really a min version of cert-manager that kuadrant supports? @mikenairn ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we did when the dependency was based on the operator, not on the API:
- type: olm.package
value:
packageName: cert-manager
version: "1.14.2"
Adding that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The version of cert manager we are currently testing with, at least from this repo, is "1.12.1" https://github.com/Kuadrant/kuadrant-operator/blob/main/config/dependencies/cert-manager/kustomization.yaml#L2. Not sure what min version we support really, but I'd probably just say that version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Looks good. Minor comment but not critical |
b760133
to
52ac40c
Compare
rebased to fix conflicts with current Please review at your leisure @mikenairn @maleck13 |
controllers/tlspolicy_status.go
Outdated
|
||
return kuadrant.AcceptedCondition(tlsPolicy, specErr) | ||
} | ||
|
||
func (r *TLSPolicyReconciler) enforcedCondition(ctx context.Context, tlsPolicy *v1alpha1.TLSPolicy, targetNetworkObject client.Object) *metav1.Condition { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this relevant now since #715. The controller no longer gets started if the cert manager CRDs are missing so should never get here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, reverting the changes done for the status reporting as they are useless. Pending tasks for a good reporting captured in #730
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually the way we've chosen to go with the Helm Charts too. Same issue with namespacing is affecting Helm Chart dependencies, so it'd be explicitly stated as a dependency in the docs.
52ac40c
to
a0e847a
Compare
a0e847a
to
ef04624
Compare
@mikenairn some changes were reverted. The status logic does not change. Looking for review here |
What
While this older #680 change was removing cert-manager operator as dependency and, instead, add cert-manager API as dependency, this PR is completely removing cert-manager as dependency for the operator.
Additionally, TLS policy status reports with some understandable error when the cert-manager is not available in the cluster.
Why
The cert-manager is, in fact, a direct dependency of the kuadrant controller (regarding the TLS policy) and indirectly for authorino (required for the webhook deployment). Unfortunately, the dependency declaration of the cert-manager, either as operator or as API, creates unsolvable (from our side) issues in Openshift.
And the issue comes from the RH build of the cert-manager operator . This operator, only available for openshift, provides the same APIs as the upstream community version of the cert-manager operator. That overlapping APIs, was an issue when kuadrant had dependency on the operator. When the openshift cluster had the RH build of the cert-manager operator installed, the kuadrant requirement could not be met as upstream cert-manager was unable to install. On #680 we fixed this issue changing the dependency type from operator to API level. Unfortunately, there are still issues. And the main issue is that The RH build of the cert-manager cannot be installed in the same namespace as the kuadrant operator. The reason being that the kuadrant operator is cluster wide and the RH cert-manager is namespace scoped operator. The OperatorGroup does not enable installing different scoped operators in the same namespace.
Even if the RH cert manager is installed in another NS A, when installing kuadrant in some NS K, OLM tries to install yet another RH cert-manager operator instance in the NS K. Which fails for the reasons explained above.
Extracted from the installplan resource:
My guess (just a guess) is that OLM thinks that even if the GVK exists, nobody is watching for it in the specified namespace, hence it tries to install an operator that fulfill the needs in the specified namespace.
Interesting to note that even if the community cert-manager is installed before Kuadrant, the RH cert-manager will still try to be installed in the same namespace as kuadrant. Not sure OLM is behaving as it should. We might have found a bug here. If we re-add cert-manager APIs as dependencies, we need to verify that when the upstream operator is pre-installed, the RH build operator is not being installed in openshift.
The Plan
The plan now is to have the cert-manager API as documentation requirement. The pre-requisite for kuadrant to work as expected is that the cert-manager APIs are in place in the cluster.
The RH build of the cert-manager is planning cluster wide scope feature. Tracked here: openshift/cert-manager-operator#188.
Once a release of RH cert-manager that supports all namespaces is available, we can move back to having it as a dependency. One test we will need to do after bringing the cert-manager dependency back: verify that when the upstream operator is pre-installed, the RH build operator is not being installed in openshift. Created issue for this #729
Verification steps
The operators should be installed correctly. Wait until the CSV reports all the operators succeeded.
This is the key change of this PR: the cert-manager has NOT been installed
Let's test the TLSPolicy on this cert-manager-less scenario:
TLSPolicy
to configure TLS:The TLSPolicy controller should be disabled with the following error log
The TLSPolicy instance status should be empty as the controller is not enabled. There is an open issue #730 for adding some understandable error when the cert-manager is not available in the cluster.