Replace etcd operator with StatefulSet #50
Comments
The friendly folk at Bitnami have a Helm chart for etcd which looks promising (and uses a stateful set). I'm testing it out now. https://github.com/bitnami/charts/tree/master/bitnami/etcd |
I have a work in progress PR here: #53 The verification scripts don't seem to be running properly. It just hangs:
All pods are up and running, the weaviate logs don't look super interesting:
Any thoughts on what might be happening or how to debug @etiennedi ? |
I'll take a look now, @idcrosby. Will let you know what I find. |
Simply from looking at the build logs, my assumption would be that something around the distributed locking doesn't work as intended. This would lead to weaviate not starting up the HTTP server properly, which in turn would leave the verification script unable to download the swagger JSON. I'll check out the PR branch, build the Helm chart and try to apply it to a minikube cluster. Maybe I can reproduce it. If not, we'd have to trigger a "real cluster" build without the destroy step and see the state there. |
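To make a hang like this fail fast in CI, the verification step could poll for the swagger JSON against a deadline instead of waiting indefinitely. The following is only a minimal sketch of that idea in Python, not the actual verification script from this repo. The names `swagger_available` and `wait_until` and the example URL are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def swagger_available(url, request_timeout_s=5.0):
    """Return True if `url` answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=request_timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or a non-2xx response.
        return False


def wait_until(check, deadline_s, interval_s=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call `check` repeatedly until it returns True or `deadline_s` has elapsed.

    Returns True on success and False on timeout, so the caller can exit
    non-zero with a clear message instead of hanging the build.
    """
    start = clock()
    while True:
        if check():
            return True
        if clock() - start >= deadline_s:
            return False
        sleep(interval_s)


# Hypothetical usage (the URL is an assumption, adjust to the real service):
#   url = "http://localhost:8080/weaviate/v1/swagger.json"
#   if not wait_until(lambda: swagger_available(url), deadline_s=300):
#       raise SystemExit("weaviate did not serve the swagger JSON in time")
```

Passing `sleep` and `clock` as parameters is just a convenience so the loop can be exercised without real waiting.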
@etiennedi I can quickly bring up a real cluster with this setup and debug. How could I verify the distributed locking setup? |
My first approach would simply be to check the logs for weaviate to see how far it gets. Additionally, the locks are regular key-value entries in |
Good idea with the real cluster, the resource requirements are definitely quite big, so minikube with the default VM doesn't work. I'll let you run the setup and instead watch this issue closely for your updates. Thanks. |
Etcd was configured with client auth enabled, but weaviate is not configured to authenticate (certs) with etcd. I think for our use case (etcd not being exposed externally and not storing any sensitive data) we can disable client authentication. I'm configuring this now. Also, it would be a good idea to have weaviate log an error if it cannot connect to etcd. cc @etiennedi |
Agreed. This reminds me, I noticed some of our k8s
What was the current behavior? I would have expected it to fail with an error because of https://github.com/creativesoftwarefdn/weaviate/blob/develop/restapi/configure_weaviate.go#L680-L683 |
@etiennedi weaviate stays running and doesn't log any errors, only thing in the logs is:
|
Interesting. So client creation doesn't error, but it simply never acquires the lock. Weird. But thanks for noticing. I'll create a separate issue over in creativesoftwarefdn/weaviate. |
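One way to surface this kind of silent hang is to wrap the blocking lock acquisition in a deadline and log an error when it expires. Below is a rough sketch of that pattern, written in Python for brevity (weaviate itself is Go, and this is not its actual code). The helper name `acquire_or_fail` and the `acquire` callable are hypothetical stand-ins for whatever call blocks on the etcd lock.

```python
import logging
import threading

logger = logging.getLogger("startup")


def acquire_or_fail(acquire, timeout_s):
    """Run the blocking `acquire` callable, raising if it does not finish in time.

    `acquire` is a hypothetical zero-argument callable that blocks until the
    distributed lock is held. It runs in a daemon thread so that a call which
    never returns cannot keep the process alive.
    """
    done = threading.Event()
    errors = []

    def worker():
        try:
            acquire()
        except Exception as exc:  # hand the failure back to the caller
            errors.append(exc)
        finally:
            done.set()

    threading.Thread(target=worker, daemon=True).start()

    if not done.wait(timeout_s):
        logger.error(
            "could not acquire lock within %.1fs; "
            "is etcd reachable and is client auth configured?", timeout_s)
        raise TimeoutError("lock not acquired within %.1fs" % timeout_s)
    if errors:
        logger.error("lock acquisition failed: %s", errors[0])
        raise errors[0]
```

With something like this in the startup path, a misconfigured etcd connection would produce a log line and a non-zero exit rather than a pod that stays running without serving HTTP.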
Looks like everything is working now: https://travis-ci.com/SeMI-network/weaviate-infra/builds/103239676 I'll remove the WIP from the PR and assign to you @etiennedi to review. |
As described in #40, the etcd operator is unfortunately not suitable for production use. A simple StatefulSet (I think there is such a chart in incubator) might be better: coreos/etcd-operator#1323 (comment)