Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] You must be logged in to the server (the server has asked for the client to provide credentials #508

Closed
SataQiu opened this issue Sep 28, 2021 · 8 comments · Fixed by #509
Labels
kind/bug kind/bug

Comments

@SataQiu
Copy link
Member

SataQiu commented Sep 28, 2021

What happened:
When I upgraded OpenYurt to the latest version(v0.5.0), I found that kubectl logs could not work properly and the following error occurred:

% kubectl logs -f -n kube-system          yurt-tunnel-agent-x9xkr  
error: You must be logged in to the server (the server has asked for the client to provide credentials ( pods/log yurt-tunnel-agent-x9xkr))

What you expected to happen:
kubectl logs can work well.

How to reproduce it (as minimally and precisely as possible):
Install OpenYurt v0.5.0 in kubernetes 1.20.2.

Anything else we need to know?:

Environment:

  • OpenYurt version: v0.5.0
  • Kubernetes version (use kubectl version): v1.20.2
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

others

/kind bug

@SataQiu SataQiu added the kind/bug kind/bug label Sep 28, 2021
@rambohe-ch
Copy link
Member

@SataQiu Thank you for raising issue.
Would you be able to upload logs of yurt-tunnel-agent?

@SataQiu
Copy link
Member Author

SataQiu commented Sep 28, 2021

tunnel-agent logs:

I0928 12:54:31.355689       1 start.go:51] yurttunnel-agent version: projectinfo.Info{GitVersion:"v0.5.0", GitCommit:"095cc4c", BuildDate:"2021-09-28T07:29:49Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}
I0928 12:54:31.356569       1 options.go:136] ipv4=172.18.0.3&host=kind-worker3 is set for agent identifies
I0928 12:54:31.357121       1 options.go:141] neither --kube-config nor --apiserver-addr is set, will use /etc/kubernetes/kubelet.conf as the kubeconfig
I0928 12:54:31.357437       1 options.go:145] create the clientset based on the kubeconfig(/etc/kubernetes/kubelet.conf).
I0928 12:54:31.400735       1 start.go:87] yurttunnel-server address: 172.18.0.4:30749
W0928 12:54:31.401125       1 filestore_wrapper.go:49] unexpected error occurred when loading the certificate: no cert/key files read at "/var/lib/yurttunnel-agent/pki/yurttunnel-agent-current.pem", ("", "") or ("/var/lib/yurttunnel-agent/pki", "/var/lib/yurttunnel-agent/pki"), will regenerate it
I0928 12:54:36.401603       1 start.go:106] certificate yurttunnel-agent ok
I0928 12:54:36.402656       1 anpagent.go:57] start serving grpc request redirected from yurttunnel-server: 172.18.0.4:30749
I0928 12:54:36.403019       1 util.go:71] "start handling meta requests(metrics/pprof)" server endpoint="127.0.0.1:10266"
I0928 12:54:36.428922       1 client.go:224] "Connect to" server="88aa9e09-d873-45be-9c61-f141f6b0ec0f"
I0928 12:54:36.428996       1 clientset.go:190] "sync added client connecting to proxy server" serverID="88aa9e09-d873-45be-9c61-f141f6b0ec0f"
I0928 12:54:36.429085       1 client.go:326] "Start serving" serverID="88aa9e09-d873-45be-9c61-f141f6b0ec0f"
I0928 12:54:41.500828       1 client.go:224] "Connect to" server="5c55b9cf-11b6-49da-bc2f-430896ae81cc"
I0928 12:54:41.500854       1 clientset.go:190] "sync added client connecting to proxy server" serverID="5c55b9cf-11b6-49da-bc2f-430896ae81cc"
I0928 12:54:41.500880       1 client.go:326] "Start serving" serverID="5c55b9cf-11b6-49da-bc2f-430896ae81cc"
I0928 12:55:25.318094       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=1555795336100853140 and connID=1
I0928 12:55:25.332164       1 client.go:382] "close connection" connectionID=1
I0928 12:55:27.194041       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=3056332746016649150 and connID=2
I0928 12:55:27.213998       1 client.go:464] "connection read EOF" connID=2
I0928 12:55:27.214064       1 client.go:382] "close connection" connectionID=2
I0928 12:55:27.958057       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=7636288850561476182 and connID=3
I0928 12:55:27.966007       1 client.go:464] "connection read EOF" connID=3
I0928 12:55:27.966151       1 client.go:382] "close connection" connectionID=3
I0928 13:02:13.202390       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=4562621698571384881 and connID=4
I0928 13:02:13.222481       1 client.go:464] "connection read EOF" connID=4
I0928 13:02:13.222586       1 client.go:382] "close connection" connectionID=4
I0928 13:02:14.422322       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=6555455521637047278 and connID=5
I0928 13:02:14.428887       1 client.go:382] "close connection" connectionID=5
I0928 13:02:15.282107       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=2522543887406335640 and connID=6
I0928 13:02:15.291205       1 client.go:382] "close connection" connectionID=6
I0928 13:02:16.231664       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=3759749631911308224 and connID=7
I0928 13:02:16.238898       1 client.go:382] "close connection" connectionID=7
I0928 13:02:17.429475       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=8391066773399974457 and connID=8
I0928 13:02:17.437016       1 client.go:382] "close connection" connectionID=8
I0928 13:10:42.812960       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=2509171113612625690 and connID=9
I0928 13:10:42.822683       1 client.go:382] "close connection" connectionID=9
I0928 13:12:54.990022       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=6343871253887740107 and connID=10
I0928 13:12:55.001160       1 client.go:382] "close connection" connectionID=10
I0928 13:14:21.162805       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=4125643631056849026 and connID=11
I0928 13:14:21.176111       1 client.go:382] "close connection" connectionID=11
I0928 13:14:22.208597       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=1746652745667222531 and connID=12
I0928 13:14:22.217670       1 client.go:382] "close connection" connectionID=12
I0928 13:14:23.142232       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=911755937249800616 and connID=13
I0928 13:14:23.151149       1 client.go:464] "connection read EOF" connID=13
I0928 13:14:23.151378       1 client.go:382] "close connection" connectionID=13
I0928 13:14:29.711384       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=7226258822720332319 and connID=14
I0928 13:14:29.721247       1 client.go:464] "connection read EOF" connID=14
I0928 13:14:29.721294       1 client.go:382] "close connection" connectionID=14
I0928 13:16:03.354157       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=2504855395218381734 and connID=15
I0928 13:16:03.362926       1 client.go:382] "close connection" connectionID=15
I0928 13:16:04.889611       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=5991972744052607848 and connID=16
I0928 13:16:04.901694       1 client.go:464] "connection read EOF" connID=16
I0928 13:16:04.901747       1 client.go:382] "close connection" connectionID=16
I0928 13:16:07.067271       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=5685897123094948608 and connID=17
I0928 13:16:07.075385       1 client.go:382] "close connection" connectionID=17
I0928 13:16:12.315329       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=3288479499029815372 and connID=18
I0928 13:16:12.326525       1 client.go:464] "connection read EOF" connID=18
I0928 13:16:12.326572       1 client.go:382] "close connection" connectionID=18
I0928 13:24:54.871983       1 client.go:412] received dial request to tcp:172.18.0.3:10250 with random=675441569950723324 and connID=19
I0928 13:24:54.878810       1 client.go:382] "close connection" connectionID=19

close connection is OK, but connection read EOF is unexpected.

@DrmagicE
Copy link
Member

Same issue when I try to convert cluster with the yurtctl convert command.
tunnel-server logs:

I0928 13:54:13.406567       1 start.go:54] yurttunnel-server version: projectinfo.Info{GitVersion:"latest", GitCommit:"095cc4c", BuildDate:"2021-09-27T07:19:50Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}
W0928 13:54:13.407283       1 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0928 13:54:13.409294       1 options.go:155] yurttunnel server config: &config.Config{EgressSelectorEnabled:false, EnableIptables:true, EnableDNSController:true, IptablesSyncPeriod:60, DNSSyncPeriod:1800, CertDNSNames:[]string{}, CertIPs:[]net.IP{}, ListenAddrForAgent:"192.168.33.220:10262", ListenAddrForMaster:"192.168.33.220:10263", ListenInsecureAddrForMaster:"192.168.33.220:10264", ListenMetaAddr:"192.168.33.220:10265", RootCert:(*x509.CertPool)(0xc00040f800), Client:(*kubernetes.Clientset)(0xc0002f3ce0), SharedInformerFactory:(*informers.sharedInformerFactory)(0xc0004692c0), ServerCount:1, ProxyStrategy:"destHost", InterceptorServerUDSFile:""}
I0928 13:54:13.409721       1 leaderelection.go:242] attempting to acquire leader lease  kube-system/tunnel-dns-controller...
I0928 13:54:13.421617       1 leaderelection.go:252] successfully acquired lease kube-system/tunnel-dns-controller
I0928 13:54:13.421999       1 dns.go:195] starting tunnel dns controller
I0928 13:54:13.422013       1 shared_informer.go:223] Waiting for caches to sync for tunnel-dns-controller
E0928 13:54:13.448876       1 iptables.go:200] failed to delete rule that nat chain OUTPUT jumps to TUNNEL-PORT: error checking rule: exit status 2: iptables v1.6.0: Couldn't load target `TUNNEL-PORT':No such file or directory

Try `iptables -h' or 'iptables --help' for more information.
I0928 13:54:18.461317       1 certmanager.go:82] subject of tunnel server certificate, ips=[]net.IP{net.IP{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0xc0, 0xa8, 0x21, 0xdc}, net.IP{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0xa, 0x63, 0xf9, 0xbb}, net.IP{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0x7f, 0x0, 0x0, 0x1}, net.IP{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0xc0, 0xa8, 0x21, 0xdc}, net.IP{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0xa, 0x64, 0xde, 0x4c}}, dnsNames=[]string{"x-tunnel-server-svc", "x-tunnel-server-svc.kube-system", "x-tunnel-server-svc.kube-system.svc", "x-tunnel-server-svc.kube-system.svc.cluster.local", "x-tunnel-server-internal-svc", "x-tunnel-server-internal-svc.kube-system", "x-tunnel-server-internal-svc.kube-system.svc", "x-tunnel-server-internal-svc.kube-system.svc.cluster.local"}
W0928 13:54:18.461401       1 filestore_wrapper.go:49] unexpected error occurred when loading the certificate: no cert/key files read at "/var/lib/yurttunnel-server/pki/yurttunnel-server-current.pem", ("", "") or ("/var/lib/yurttunnel-server/pki", "/var/lib/yurttunnel-server/pki"), will regenerate it
I0928 13:54:18.462949       1 csrapprover.go:55] starting the crsapprover
I0928 13:54:18.504387       1 csrapprover.go:125] non-approved and non-denied csr, enqueue: csr-6txcs
I0928 13:54:18.522259       1 shared_informer.go:230] Caches are synced for tunnel-dns-controller
I0928 13:54:18.579981       1 csrapprover.go:193] successfully approve yurttunnel csr(csr-6txcs)
I0928 13:54:19.297302       1 iptables.go:456] clear conntrack entries for ports ["10250" "10255"] and nodes ["192.168.33.220" "192.168.33.221"]
E0928 13:54:19.302247       1 iptables.go:473] clear conntrack for 192.168.33.220:10250 failed: "conntrack v1.4.4 (conntrack-tools): 0 flow entries have been deleted.\n", error message: exit status 1
E0928 13:54:19.304269       1 iptables.go:473] clear conntrack for 192.168.33.221:10250 failed: "conntrack v1.4.4 (conntrack-tools): 0 flow entries have been deleted.\n", error message: exit status 1
E0928 13:54:19.306106       1 iptables.go:473] clear conntrack for 192.168.33.220:10255 failed: "conntrack v1.4.4 (conntrack-tools): 0 flow entries have been deleted.\n", error message: exit status 1
E0928 13:54:19.308030       1 iptables.go:473] clear conntrack for 192.168.33.221:10255 failed: "conntrack v1.4.4 (conntrack-tools): 0 flow entries have been deleted.\n", error message: exit status 1
I0928 13:54:19.308045       1 iptables.go:525] directly access nodes changed, [192.168.33.220 192.168.33.221] for ports [10250 10255]
E0928 13:54:20.053038       1 dns.go:355] failed to sync dns record as whole, failed to update configmap kube-system/yurt-tunnel-nodes, Operation cannot be fulfilled on configmaps "yurt-tunnel-nodes": the object has been modified; please apply your changes to the latest version and try again
I0928 13:54:23.491833       1 anpserver.go:106] start handling request from interceptor
I0928 13:54:23.491869       1 tracereq.go:63] 1 informer synced in traceReqMiddleware
I0928 13:54:23.492139       1 anpserver.go:142] start handling https request from master at 192.168.33.220:10263
I0928 13:54:23.492175       1 anpserver.go:156] start handling http request from master at 192.168.33.220:10264
I0928 13:54:23.492441       1 anpserver.go:194] start handling connection from agents
I0928 13:54:23.492676       1 util.go:71] "start handling meta requests(metrics/pprof)" server endpoint="192.168.33.220:10265"
I0928 13:55:14.196766       1 csrapprover.go:119] csr(csr-t7z44) is not yurttunnel csr
I0928 13:55:14.205125       1 csrapprover.go:119] csr(csr-t7z44) is not yurttunnel csr
I0928 13:55:14.209079       1 csrapprover.go:119] csr(csr-t7z44) is not yurttunnel csr
I0928 13:55:19.197838       1 csrapprover.go:119] csr(csr-lgb2j) is not yurttunnel csr
I0928 13:55:19.209308       1 csrapprover.go:119] csr(csr-lgb2j) is not yurttunnel csr
I0928 13:55:19.215775       1 csrapprover.go:119] csr(csr-lgb2j) is not yurttunnel csr
I0928 13:55:45.276241       1 csrapprover.go:125] non-approved and non-denied csr, enqueue: csr-2hhvr
I0928 13:55:45.287162       1 csrapprover.go:193] successfully approve yurttunnel csr(csr-2hhvr)
I0928 13:56:19.343746       1 iptables.go:456] clear conntrack entries for ports ["10250" "10255"] and nodes ["192.168.33.221"]
E0928 13:56:19.346479       1 iptables.go:473] clear conntrack for 192.168.33.221:10250 failed: "conntrack v1.4.4 (conntrack-tools): 0 flow entries have been deleted.\n", error message: exit status 1
E0928 13:56:19.348568       1 iptables.go:473] clear conntrack for 192.168.33.221:10255 failed: "conntrack v1.4.4 (conntrack-tools): 0 flow entries have been deleted.\n", error message: exit status 1
I0928 13:56:19.348584       1 iptables.go:525] directly access nodes changed, [192.168.33.220] for ports [10250 10255]
E0928 13:56:31.450596       1 server.go:725] "send to client stream failure" err="write unix /tmp/interceptor-proxier.sock->@: use of closed network connection"
E0928 13:56:31.450619       1 server.go:735] "could not get frontent client" err="can't find agentID worker-1 in the frontends"

tunnel-agent logs:

I0928 13:55:45.254244       1 start.go:51] yurttunnel-agent version: projectinfo.Info{GitVersion:"latest", GitCommit:"095cc4c", BuildDate:"2021-09-27T07:19:50Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}
I0928 13:55:45.254308       1 options.go:136] ipv4=192.168.33.221&host=worker-1 is set for agent identifies
I0928 13:55:45.254313       1 options.go:141] neither --kube-config nor --apiserver-addr is set, will use /etc/kubernetes/kubelet.conf as the kubeconfig
I0928 13:55:45.254316       1 options.go:145] create the clientset based on the kubeconfig(/etc/kubernetes/kubelet.conf).
I0928 13:55:45.282917       1 start.go:87] yurttunnel-server address: 192.168.33.220:31008
W0928 13:55:45.283048       1 filestore_wrapper.go:49] unexpected error occurred when loading the certificate: no cert/key files read at "/var/lib/yurttunnel-agent/pki/yurttunnel-agent-current.pem", ("", "") or ("/var/lib/yurttunnel-agent/pki", "/var/lib/yurttunnel-agent/pki"), will regenerate it
I0928 13:55:50.286450       1 start.go:106] certificate yurttunnel-agent ok
I0928 13:55:50.286705       1 anpagent.go:57] start serving grpc request redirected from yurttunnel-server: 192.168.33.220:31008
I0928 13:55:50.286895       1 util.go:71] "start handling meta requests(metrics/pprof)" server endpoint="127.0.0.1:10266"
E0928 13:55:50.287754       1 util.go:75] "meta server could not listen" err="listen tcp 127.0.0.1:10266: bind: address already in use"
I0928 13:55:50.287767       1 util.go:77] "meta server stopped listening" server endpoint="127.0.0.1:10266"

@DrmagicE
Copy link
Member

After some digging, I found this bug is caused by "key usage" missing of the yurt-tunnel-server certificate.
The kubelet is complaining:

9月 28 22:11:53 worker-1 kubelet[11551]: E0928 22:11:53.851528   11551 server.go:253] Unable to authenticate the request due to an error: x509: certificate specifies an incompatible key usage
9月 28 22:14:18 worker-1 kubelet[11551]: E0928 22:14:18.833533   11551 server.go:253] Unable to authenticate the request due to an error: x509: certificate specifies an incompatible key usage
9月 28 22:22:21 worker-1 kubelet[11551]: E0928 22:22:21.122086   11551 server.go:253] Unable to authenticate the request due to an error: x509: certificate specifies an incompatible key usage
9月 28 22:22:25 worker-1 kubelet[11551]: E0928 22:22:25.126461   11551 server.go:253] Unable to authenticate the request due to an error: x509: certificate specifies an incompatible key usage
9月 28 22:22:30 worker-1 kubelet[11551]: E0928 22:22:30.554924   11551 server.go:253] Unable to authenticate the request due to an error: x509: certificate specifies an incompatible key usage

If I am not wrong, the certificate of yurt-tunnel-server will be used to establish TLS connection between yurt-tunnel-agent and kubelet when executing kubectl logs/exec.
The kubelet will request client certificate during the handshake, so I think certificates.UsageClientAuth is missing from the certmanager:

return newCertManager(
clientset,
projectinfo.GetServerName(),
fmt.Sprintf(constants.YurttunnelServerCertDir, projectinfo.GetServerName()),
constants.YurttunneServerCSRCN,
[]string{constants.YurttunneServerCSROrg, constants.YurttunnelCSROrg},
dnsNames,
[]certificates.KeyUsage{
certificates.UsageKeyEncipherment,
certificates.UsageDigitalSignature,
certificates.UsageServerAuth,
},
ips)
}

It works again after adding certificates.UsageClientAuth to certmanager.

@SataQiu
Copy link
Member Author

SataQiu commented Sep 29, 2021

Related PR: #475

@DrmagicE Can you submit a PR to fix this issue?

@rambohe-ch
Copy link
Member

@SataQiu @DrmagicE I'd like to use commit: 6926d2ae319d2b66b906eac4cd5454a9a32015d2 to release images of OpenYurt v0.5.0 again in order to make yurt-tunnel work.

@DrmagicE
Copy link
Member

@rambohe-ch Yes, I think we should.

@rambohe-ch
Copy link
Member

@rambohe-ch Yes, I think we should.

@DrmagicE @SataQiu OpenYurt v0.5.0 images have been updated based on commit: 6926d2a

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug kind/bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants