
panic: newURLs [redacted] should never fail: URL scheme must be http, https, unix, or unixs: redacted #9173

Closed
tomwilkie opened this issue Jan 19, 2018 · 6 comments

Comments

@tomwilkie
Contributor

tomwilkie commented Jan 19, 2018

In the process of updating the etcd cluster to use certificates, https, etc.:

  • Deployed (peer-)trusted-ca-file, (peer-)cert-file and (peer-)key-file to each machine with appropriate certs.
  • One by one, ran # etcdctl member update b8b607c8533bf21f --peer-urls=https://redacted and then restarted the corresponding node with https in ETCD_LISTEN_PEER_URLS, ETCD_LISTEN_CLIENT_URLS and ETCD_ADVERTISE_CLIENT_URLS (a sketch of the intended per-node commands is below).
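
For reference, a sketch of what each per-node step was meant to look like; <redacted-host> and the default 2379/2380 ports stand in for the redacted values, and the environment-file location varies by install, so treat this as illustrative rather than the exact commands:

    # 1) point the member's peer URL at the TLS endpoint (note the https:// scheme)
    etcdctl member update b8b607c8533bf21f --peer-urls=https://<redacted-host>:2380

    # 2) in that node's etcd environment file, switch its own URLs to https
    ETCD_LISTEN_PEER_URLS=https://<redacted-host>:2380
    ETCD_LISTEN_CLIENT_URLS=https://<redacted-host>:2379
    ETCD_ADVERTISE_CLIENT_URLS=https://<redacted-host>:2379

    # 3) restart that node's etcd
    systemctl restart etcd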

The process was going fine until I accidentally forgot to include the https:// in the etcdctl member update command. All nodes are now in a panic restart loop with:

panic: newURLs [redacted] should never fail: URL scheme must be http, https, unix, or unixs: redacted

goroutine 115 [running]:
panic(0xd460a0, 0xc820f3f500)
	/usr/local/go/src/runtime/panic.go:481 +0x3e6
github.com/coreos/etcd/cmd/vendor/github.com/coreos/pkg/capnslog.(*PackageLogger).Panicf(0xc8201113c0, 0x1232b20, 0x22, 0xc82059b508, 0x2, 0x2)
	/home/gyuho/go/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/pkg/capnslog/pkg_logger.go:75 +0x191
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/rafthttp.(*Transport).UpdatePeer(0xc82015e240, 0xb8b607c8533bf21f, 0xc8202ad200, 0x1, 0x4)
	/home/gyuho/go/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/rafthttp/transport.go:288 +0x2d0
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).applyConfChange(0xc820188b40, 0x5791610eb29ce902, 0x2, 0xb8b607c8533bf21f, 0xc82019fd40, 0x5a, 0x60, 0x0, 0x0, 0x0, ...)
	/home/gyuho/go/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/server.go:1150 +0x725
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).apply(0xc820188b40, 0xc820428048, 0xc6d, 0xc70, 0xc820282e80, 0x0, 0x0)
	/home/gyuho/go/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/server.go:1036 +0x22c
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).applyEntries(0xc820188b40, 0xc820282e80, 0xc820076420)
	/home/gyuho/go/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/server.go:756 +0x2bd
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).applyAll(0xc820188b40, 0xc820282e80, 0xc820076420)
	/home/gyuho/go/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/server.go:617 +0xb4
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).run.func2(0x7f070bdcba88, 0xc820282d80)
	/home/gyuho/go/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/server.go:596 +0x32
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/schedule.(*fifo).run(0xc82019f440)
	/home/gyuho/go/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/schedule/schedule.go:160 +0x323
created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/schedule.NewFIFOScheduler
	/home/gyuho/go/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/schedule/schedule.go:71 +0x27d

It's an old etcd version, 3.0.17.

@gyuho
Contributor

gyuho commented Jan 19, 2018

Were you able to recover your cluster (e.g. from a previous snapshot)? We don't have any docs around this case (e.g. how to recover) or early warnings to prevent it.

@tomwilkie
Contributor Author

I tried various things, including taking it back to one node with --force-new-cluster, but ended up blowing it away. It was only our test env.

@gyuho
Contributor

gyuho commented Jan 19, 2018

@tomwilkie

--force-new-cluster

The recommended way is to restore a new cluster from a snapshot (snapshot restore, #9177 (comment)). That said, I think this kind of misconfiguration should have been prevented on the client side, rather than waiting for a panic in the server (which is proposed in #9174).
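
For anyone who lands here, a minimal sketch of that snapshot-based recovery path, assuming a snapshot saved before the bad member update; the member name, host, and data directory below are placeholders, and on 3.0.x the v3 subcommands need ETCDCTL_API=3:

    # save a snapshot while the cluster is still healthy
    ETCDCTL_API=3 etcdctl --endpoints=https://<redacted-host>:2379 snapshot save backup.db

    # seed a brand-new cluster from that snapshot (run once per member, with its own name/URLs)
    ETCDCTL_API=3 etcdctl snapshot restore backup.db \
      --name <member-name> \
      --initial-cluster <member-name>=https://<redacted-host>:2380 \
      --initial-advertise-peer-urls https://<redacted-host>:2380 \
      --data-dir /var/lib/etcd-restored

    # then start each member against its restored --data-dir

Because the restored members form a new cluster with a fresh raft log, the bad conf change is never replayed, which is what breaks the panic loop.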

@tomwilkie
Contributor Author

tomwilkie commented Jan 19, 2018 via email

@gyuho
Contributor

gyuho commented Jan 22, 2018

@tomwilkie We've added a client-side safeguard to handle this (backported to the 3.2/3.3 branches).
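
In practice that means the original mistyped command should now be refused up front rather than applied and replayed into a server panic (the exact error text may differ by version):

    # missing https:// scheme: rejected by the client-side URL validation instead of reaching the servers
    etcdctl member update b8b607c8533bf21f --peer-urls=<redacted-host>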

Thanks for reporting!

@gyuho gyuho closed this as completed Jan 22, 2018
@tomwilkie
Contributor Author

tomwilkie commented Jan 22, 2018 via email
