
Unable to access openshift master via URL #16946

Closed
Carol007robot opened this issue Oct 19, 2017 · 9 comments
Assignees
Labels
kind/question lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. priority/P2 sig/master

Comments

@Carol007robot

Unable to access the OpenShift master after rebooting the cluster. Over the course of one day I restarted one machine at a time, covering all the masters and nodes.

Version

oc v3.6.0+c4dd4cf
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ooc-lb.com.cn:8443
kubernetes v1.6.1+5115d708d7

#### My environment
3 masters
1 load balancer
3 nodes

### In the kube-service-catalog project only one pod is running; a controller-manager pod should also be running in this project:

# oc get pods
NAME              READY     STATUS    RESTARTS   AGE
apiserver-6qbdp   1/1       Running   20         37d
[root@ooc-master01 ansible]# oc project 
Steps To Reproduce
  1. shutdown system
  2. bring back system
Current Result

When I access my master via its URL I get the error below. I'll provide the log in the attached file.
master.log

Error

Unable to load details about the server. If the problem continues, please contact your system administrator.
Check Server Connection

Return to the console.

Expected Result

Any suggestion on how to fix this would be appreciated.

Additional Information

$ oc adm diagnostics

ERROR: [DNet3001 from diagnostic NetworkCheck@openshift/origin/pkg/diagnostics/network/setup.go:64]
       Failed to create network diags test pod and service: [Creating network diagnostic test pod 'network-diag-ns-4nz96/network-diag-test-pod-nrtx6' on node "ooc-master01.com.cn" failed: pods "network-diag-test-pod-nrtx6" is forbidden: service account network-diag-ns-4nz96/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-4nz96/network-diag-test-pod-mtvd3' on node "ooc-master01.com.cn" failed: pods "network-diag-test-pod-mtvd3" is forbidden: service account network-diag-ns-4nz96/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-61j80/network-diag-test-pod-dww53' on node "ooc-master01.com.cn" failed: pods "network-diag-test-pod-dww53" is forbidden: service account network-diag-ns-61j80/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-61j80/network-diag-test-pod-xfnvz' on node "ooc-master01.com.cn" failed: pods "network-diag-test-pod-xfnvz" is forbidden: service account network-diag-ns-61j80/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-4nz96/network-diag-test-pod-kgfrd' on node "ooc-master03.com.cn" failed: pods "network-diag-test-pod-kgfrd" is forbidden: service account network-diag-ns-4nz96/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-4nz96/network-diag-test-pod-p3509' on node "ooc-master03.com.cn" failed: pods "network-diag-test-pod-p3509" is forbidden: service account network-diag-ns-4nz96/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-61j80/network-diag-test-pod-wv46g' on node "ooc-master03.com.cn" failed: pods "network-diag-test-pod-wv46g" is forbidden: service account network-diag-ns-61j80/default was not found, retry after the 
service account is created, Creating network diagnostic test pod 'network-diag-ns-61j80/network-diag-test-pod-n8htz' on node "ooc-master03.com.cn" failed: pods "network-diag-test-pod-n8htz" is forbidden: service account network-diag-ns-61j80/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-4nz96/network-diag-test-pod-30mwd' on node "ooc-node01.com.cn" failed: pods "network-diag-test-pod-30mwd" is forbidden: service account network-diag-ns-4nz96/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-4nz96/network-diag-test-pod-g7p4q' on node "ooc-node01.com.cn" failed: pods "network-diag-test-pod-g7p4q" is forbidden: service account network-diag-ns-4nz96/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-61j80/network-diag-test-pod-fksbd' on node "ooc-node01.com.cn" failed: pods "network-diag-test-pod-fksbd" is forbidden: service account network-diag-ns-61j80/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-61j80/network-diag-test-pod-xvghg' on node "ooc-node01.com.cn" failed: pods "network-diag-test-pod-xvghg" is forbidden: service account network-diag-ns-61j80/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-4nz96/network-diag-test-pod-h7d28' on node "ooc-node02.com.cn" failed: pods "network-diag-test-pod-h7d28" is forbidden: service account network-diag-ns-4nz96/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-4nz96/network-diag-test-pod-g0pc4' on node "ooc-node02.com.cn" failed: pods "network-diag-test-pod-g0pc4" is forbidden: service account network-diag-ns-4nz96/default was not found, retry after the service account is created, Creating network 
diagnostic test pod 'network-diag-ns-61j80/network-diag-test-pod-v0c59' on node "ooc-node02.com.cn" failed: pods "network-diag-test-pod-v0c59" is forbidden: service account network-diag-ns-61j80/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-61j80/network-diag-test-pod-92jvb' on node "ooc-node02.com.cn" failed: pods "network-diag-test-pod-92jvb" is forbidden: service account network-diag-ns-61j80/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-4nz96/network-diag-test-pod-w9tgc' on node "ooc-node03.com.cn" failed: pods "network-diag-test-pod-w9tgc" is forbidden: service account network-diag-ns-4nz96/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-4nz96/network-diag-test-pod-t6snx' on node "ooc-node03.com.cn" failed: pods "network-diag-test-pod-t6snx" is forbidden: service account network-diag-ns-4nz96/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-61j80/network-diag-test-pod-ccpwv' on node "ooc-node03.com.cn" failed: pods "network-diag-test-pod-ccpwv" is forbidden: service account network-diag-ns-61j80/default was not found, retry after the service account is created, Creating network diagnostic test pod 'network-diag-ns-61j80/network-diag-test-pod-dz37v' on node "ooc-node03.com.cn" failed: pods "network-diag-test-pod-dz37v" is forbidden: service account network-diag-ns-61j80/default was not found, retry after the service account is created]
              
              ERROR: [DNet2006 from diagnostic NetworkCheck@openshift/origin/pkg/diagnostics/network/run_pod.go:136]
                     Creating network diagnostic pod "network-diag-pod-d62h5" on node "ooc-master01.com.cn" with command "openshift infra network-diagnostic-pod -l 1" failed: pods "network-diag-pod-d62h5" is forbidden: service account network-diag-ns-4nz96/default was not found, retry after the service account is created


I've been debugging for a couple of days, but I really don't know how to fix it, and I can't believe this broke just because I rebooted the cluster...

@Carol007robot
Author

I noticed the controller-manager pod keeps restarting. Here is the log of the controller-manager pod:
I1023 09:10:42.757858 1 flags.go:52] FLAG: --address="0.0.0.0"
I1023 09:10:42.758310 1 flags.go:52] FLAG: --alsologtostderr="false"
I1023 09:10:42.758316 1 flags.go:52] FLAG: --api-content-type="application/json"
I1023 09:10:42.758321 1 flags.go:52] FLAG: --broker-relist-interval="24h0m0s"
I1023 09:10:42.758325 1 flags.go:52] FLAG: --contention-profiling="false"
I1023 09:10:42.758329 1 flags.go:52] FLAG: --enable-osb-api-context-profile="true"
I1023 09:10:42.758332 1 flags.go:52] FLAG: --feature-gates=""
I1023 09:10:42.758337 1 flags.go:52] FLAG: --k8s-api-server-url=""
I1023 09:10:42.758339 1 flags.go:52] FLAG: --k8s-kubeconfig=""
I1023 09:10:42.758345 1 flags.go:52] FLAG: --leader-elect="true"
I1023 09:10:42.758348 1 flags.go:52] FLAG: --leader-elect-lease-duration="15s"
I1023 09:10:42.758351 1 flags.go:52] FLAG: --leader-elect-renew-deadline="10s"
I1023 09:10:42.758353 1 flags.go:52] FLAG: --leader-elect-resource-lock="endpoints"
I1023 09:10:42.758356 1 flags.go:52] FLAG: --leader-elect-retry-period="2s"
I1023 09:10:42.758359 1 flags.go:52] FLAG: --leader-election-namespace="kube-service-catalog"
I1023 09:10:42.758361 1 flags.go:52] FLAG: --log-backtrace-at=":0"
I1023 09:10:42.758367 1 flags.go:52] FLAG: --log-dir=""
I1023 09:10:42.758370 1 flags.go:52] FLAG: --log-flush-frequency="5s"
I1023 09:10:42.758373 1 flags.go:52] FLAG: --logtostderr="true"
I1023 09:10:42.758375 1 flags.go:52] FLAG: --osb-api-preferred-version="2.13"
I1023 09:10:42.758378 1 flags.go:52] FLAG: --port="10000"
I1023 09:10:42.758383 1 flags.go:52] FLAG: --profiling="true"
I1023 09:10:42.758386 1 flags.go:52] FLAG: --reconciliation-retry-duration="168h0m0s"
I1023 09:10:42.758388 1 flags.go:52] FLAG: --resync-interval="5m0s"
I1023 09:10:42.758391 1 flags.go:52] FLAG: --service-catalog-api-server-url=""
I1023 09:10:42.758393 1 flags.go:52] FLAG: --service-catalog-insecure-skip-verify="false"
I1023 09:10:42.758397 1 flags.go:52] FLAG: --service-catalog-kubeconfig=""
I1023 09:10:42.758399 1 flags.go:52] FLAG: --stderrthreshold="2"
I1023 09:10:42.758405 1 flags.go:52] FLAG: --v="5"
I1023 09:10:42.758408 1 flags.go:52] FLAG: --version="false"
I1023 09:10:42.758410 1 flags.go:52] FLAG: --vmodule=""
I1023 09:10:42.758429 1 controller_manager.go:96] Building k8s kubeconfig
I1023 09:10:42.759530 1 controller_manager.go:124] Building service-catalog kubeconfig for url:
I1023 09:10:42.759540 1 controller_manager.go:131] Using inClusterConfig to talk to service catalog API server -- make sure your API server is registered with the aggregator
I1023 09:10:42.759621 1 controller_manager.go:144] Starting http server and mux
I1023 09:10:42.759633 1 controller_manager.go:173] Creating event broadcaster
I1023 09:10:42.759708 1 controller_manager.go:213] Using namespace kube-service-catalog for leader election lock
I1023 09:10:42.759721 1 leaderelection.go:174] attempting to acquire leader lease...
I1023 09:10:42.759801 1 healthz.go:74] Installing healthz checkers:"ping", "checkAPIAvailableResources"
E1023 09:10:42.812037 1 event.go:260] Could not construct reference to: '&v1.Endpoints{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"service-catalog-controller-manager", GenerateName:"", Namespace:"kube-service-catalog", SelfLink:"/api/v1/namespaces/kube-service-catalog/endpoints/service-catalog-controller-manager", UID:"0e3ccb61-b5e2-11e7-8d47-001a4a16019e", ResourceVersion:"4691457", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{sec:63644133616, nsec:0, loc:(*time.Location)(0x1a265c0)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"control-plane.alpha.kubernetes.io/leader":"{\"holderIdentity\":\"controller-manager-gs044-external-service-catalog-controller\",\"leaseDurationSeconds\":15,\"acquireTime\":\"2017-10-20T22:00:19Z\",\"renewTime\":\"2017-10-23T09:10:42Z\",\"leaderTransitions\":0}"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Subsets:[]v1.EndpointSubset(nil)}' due to: 'no kind is registered for the type v1.Endpoints'. Will not report event: 'Normal' 'LeaderElection' 'controller-manager-gs044-external-service-catalog-controller became leader'
I1023 09:10:42.812154 1 leaderelection.go:184] successfully acquired lease kube-service-catalog/service-catalog-controller-manager
I1023 09:10:42.812206 1 controller_manager.go:297] Getting available resources
I1023 09:10:42.812830 1 controller_manager.go:259] Created client for API discovery
F1023 09:10:43.140136 1 controller_manager.go:198] error running controllers: failed to get supported resources from server: unable to retrieve the complete list of server APIs: servicecatalog.k8s.io/v1alpha1: the server could not find the requested resource
May I get some help soon? I've been stuck here for almost two weeks and still can't find a solution, and I'm unable to use the cluster until this is fixed.

Thank you!

@seans3

seans3 commented Oct 24, 2017

I ran into this same problem today. I believe this is caused by service catalog (servicecatalog.k8s.io) API recently changing from "v1alpha1" to "v1beta1".

@Carol007robot
Author

Carol007robot commented Oct 25, 2017 via email

@lalib

lalib commented Oct 26, 2017

Hey, I am having the same problem. Any updates?

@kibbles-n-bytes

@Carolhug @lalib You should search your Service Catalog installation YAML files for servicecatalog.k8s.io/v1alpha1. If that is present, your YAMLs are stale compared to your images and you should update to the latest versions in the Service Catalog repo.
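The search-and-update step suggested above can be sketched with `grep` and `sed`. This is a minimal illustration only: the directory, file name, and `ServiceInstance` kind below are hypothetical, and the `sed` rewrite assumes the only change needed is the API group version string (the upstream manifests may have other schema changes, so compare against the Service Catalog repo rather than blindly rewriting).

```shell
# Illustrative only: create a sample manifest containing the stale API version.
# In practice, point the grep/sed at your real Service Catalog install YAMLs.
mkdir -p /tmp/catalog-manifests
cat > /tmp/catalog-manifests/instance.yaml <<'EOF'
apiVersion: servicecatalog.k8s.io/v1alpha1
kind: ServiceInstance
metadata:
  name: example-instance
EOF

# List files still referencing the removed v1alpha1 group/version...
grep -rl 'servicecatalog.k8s.io/v1alpha1' /tmp/catalog-manifests

# ...and bump them to v1beta1 to match the newer images.
sed -i 's|servicecatalog.k8s.io/v1alpha1|servicecatalog.k8s.io/v1beta1|g' \
  /tmp/catalog-manifests/instance.yaml

grep 'apiVersion' /tmp/catalog-manifests/instance.yaml
```

Note that GNU `sed -i` edits in place; on BSD/macOS `sed` the flag requires a backup suffix argument (`sed -i ''`).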

@Carol007robot
Author

Carol007robot commented Oct 27, 2017 via email

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 24, 2018
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 26, 2018
@openshift-bot
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close
