Following OCP Route feature, Controller crash #30

Closed
tarilabs opened this issue Nov 18, 2023 · 2 comments · Fixed by #34

@tarilabs
Member

Describe the bug
Following #19
The MR Operator on OCP goes into CrashLoopBackOff.

To Reproduce
Steps to reproduce the behavior:

  1. Using :latest (:main-21f99da) containing 21f99da results in:
W1118 19:07:33.394044       1 reflector.go:535] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229: failed to list *v1.Route: routes.route.openshift.io is forbidden: User "system:serviceaccount:model-registry-operator-system:model-registry-operator-controller-manager" cannot list resource "routes" in API group "route.openshift.io" at the cluster scope
E1118 19:07:33.394073       1 reflector.go:147] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229: Failed to watch *v1.Route: failed to list *v1.Route: routes.route.openshift.io is forbidden: User "system:serviceaccount:model-registry-operator-system:model-registry-operator-controller-manager" cannot list resource "routes" in API group "route.openshift.io" at the cluster scope
2023-11-18T19:08:13Z	ERROR	Could not wait for Cache to sync	{"controller": "modelregistry", "controllerGroup": "modelregistry.opendatahub.io", "controllerKind": "ModelRegistry", "error": "failed to wait for modelregistry caches to sync: timed out waiting for cache to be synced for Kind *v1alpha1.ModelRegistry"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:203
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:208
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234
sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/runnable_group.go:223
2023-11-18T19:08:13Z	INFO	Stopping and waiting for non leader election runnables
2023-11-18T19:08:13Z	INFO	Stopping and waiting for leader election runnables
2023-11-18T19:08:13Z	INFO	Stopping and waiting for caches
2023-11-18T19:08:13Z	ERROR	controller-runtime.source.EventHandler	failed to get informer from cache	{"error": "Timeout: failed waiting for *v1.Route Informer to sync"}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/source/kind.go:68
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/loop.go:49
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/loop.go:50
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/source/kind.go:56
2023-11-18T19:08:13Z	INFO	Stopping and waiting for webhooks
2023-11-18T19:08:13Z	INFO	Stopping and waiting for HTTP servers
2023-11-18T19:08:13Z	INFO	shutting down server	{"kind": "health probe", "addr": "[::]:8081"}
2023-11-18T19:08:13Z	INFO	controller-runtime.metrics	Shutting down metrics server with timeout of 1 minute
2023-11-18T19:08:13Z	INFO	Wait completed, proceeding to shutdown the manager
2023-11-18T19:08:13Z	ERROR	setup	problem running manager	{"error": "failed to wait for modelregistry caches to sync: timed out waiting for cache to be synced for Kind *v1alpha1.ModelRegistry"}
main.main
	/workspace/cmd/main.go:151
runtime.main
	/usr/local/go/src/runtime/proc.go:267

With CrashLoopBackOff behaviour:

[Screenshot: operator pod in CrashLoopBackOff]

As a result, creating a ModelRegistry CR has no effect in the destination ODH Project namespace.
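
The "forbidden" messages in the log above suggest that the operator's service account has no cluster-scoped permission to list (and therefore watch) `routes` in the `route.openshift.io` API group, so the Route informer never syncs and the manager exits. A minimal sketch of RBAC that would grant that access follows. The service account name and namespace are taken from the log; the `ClusterRole` and `ClusterRoleBinding` names are hypothetical, and the actual fix in #34 may look different (for example, it may be generated from kubebuilder RBAC markers).

```yaml
# Sketch only: resource names below are hypothetical, not taken from the repository.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: model-registry-operator-route-access   # hypothetical name
rules:
  - apiGroups:
      - route.openshift.io
    resources:
      - routes
    verbs:
      # get/list/watch are what the controller's informer needs to sync its cache
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: model-registry-operator-route-access   # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: model-registry-operator-route-access
subjects:
  # Service account and namespace as reported in the error log
  - kind: ServiceAccount
    name: model-registry-operator-controller-manager
    namespace: model-registry-operator-system
```

If the operator also creates or modifies Routes, verbs such as `create`, `update`, `patch`, and `delete` would presumably be needed as well.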

Expected behavior
Using the previous container image main-5c1b0db, which did not contain the OCP Route feature, works as expected:

diff --git a/config/manager/manager.yaml b/config/manager/manager.yaml
index 9154cef..663784b 100644
--- a/config/manager/manager.yaml
+++ b/config/manager/manager.yaml
@@ -70,7 +70,7 @@ spec:
         - /manager
         args:
         - --leader-elect
-        image: quay.io/opendatahub/model-registry-operator:latest
+        image: quay.io/opendatahub/model-registry-operator:main-5c1b0db
         name: manager
         env:
           - name: GRPC_IMAGE

Now creating a ModelRegistry CR creates the MR deployment in the destination ODH Project namespace:

[Screenshot 2023-11-18 at 17 09 47]

Additional context
I hope this is of service :) glad to provide more info as needed

@dhirajsb
Contributor

Saw this error earlier, forgot to push the PR to fix it.
