Inconsistent revision and data after --force-new-cluster #14009
I've done some additional hacking at this and eliminated a couple of possibilities:
However, one interesting finding is that it appears to be possible to take a snapshot from any node in an affected cluster, restore that snapshot, start with `--force-new-cluster`, and then rejoin the other nodes, and the cluster then returns consistent results.
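For illustration, that recovery sequence might look roughly like the following; the hostnames, ports, and paths are placeholders (not taken from this issue), and `etcdutl` is assumed for 3.5 (older releases use `etcdctl snapshot restore`).

```sh
# Take a snapshot from any member of the affected cluster.
etcdctl --endpoints=http://etcd-1:2379 snapshot save /tmp/recover.db

# Restore it into a fresh data dir for the first member.
etcdutl snapshot restore /tmp/recover.db \
  --name etcd-1 \
  --initial-cluster etcd-1=http://etcd-1:2380 \
  --initial-advertise-peer-urls http://etcd-1:2380 \
  --data-dir /var/lib/etcd-restored

# Start the first member from the restored data dir as a single-member cluster.
etcd --name etcd-1 --data-dir /var/lib/etcd-restored --force-new-cluster

# Re-add the other members (after wiping their data dirs) and start them with
# --initial-cluster-state existing so they sync from the restored member.
etcdctl --endpoints=http://etcd-1:2379 member add etcd-2 --peer-urls=http://etcd-2:2380
etcdctl --endpoints=http://etcd-1:2379 member add etcd-3 --peer-urls=http://etcd-3:2380
```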
I did not reproduce this issue with 3.5.4. Please provide the detailed steps and the commands you executed in each step.
I have not been able to reproduce it without a Kubernetes cluster pointed at the etcd cluster. I'm not sure if it is a load issue, or something to do with the way Kubernetes uses transactions for its create/update operations. Is there a suggested load-simulation tool that I could test with?
You can try the benchmark tool. See an example command below:
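The example command itself isn't preserved in this copy of the thread; a representative invocation of etcd's benchmark tool (built from `tools/benchmark` in the etcd repo), with a placeholder endpoint and sizes, might be:

```sh
# Placeholder endpoint and sizes; adjust to approximate the apiserver's write load.
benchmark --endpoints=http://etcd-1:2379 \
  --conns=10 --clients=100 \
  put --key-size=32 --val-size=256 --total=100000
```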
brandond@dev01:~/etcd-split-brain$ etcd-dump-db iterate-bucket etcd-1 members
brandond@dev01:~/etcd-split-brain$ etcd-dump-db iterate-bucket etcd-1 members_removed
It's interesting that there are two learners on etcd-2 and one learner on etcd-1. Could you provide detailed steps (with commands) on how to reproduce this issue?
I really wish I knew how to reproduce it using just etcd and a bare etcd3 client or the benchmark tool. At the moment I can only reproduce it when I have a Kubernetes cluster pointed at etcd, and use the …
What happened?
After starting etcd with `--force-new-cluster`, removing the database files off the secondary nodes, and rejoining them to the cluster, the cluster is now in a split-brain state. Reads from the first node (the one that was started with `--force-new-cluster`) return different data for some keys than reads from the nodes that were deleted and subsequently rejoined to the cluster. The end result feels identical to #13766, but it can be reproduced with a fairly trivial amount of traffic in conjunction with using `--force-new-cluster`.

Examining the database from each node with etcd-dump-db, and the logs with etcd-dump-logs, shows the same event sequence in the WAL, but the db itself shows different values in the keystore. I'm not pasting the WAL dump here, but will attach the data-dir from both cluster members.
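For reference, comparing the backends and WALs with those tools might look roughly like this; the data-dir paths are placeholders.

```sh
# Dump the "key" bucket from each member's backend and diff the results.
etcd-dump-db iterate-bucket ./etcd-1 key --decode > etcd-1-keys.txt
etcd-dump-db iterate-bucket ./etcd-2 key --decode > etcd-2-keys.txt
diff etcd-1-keys.txt etcd-2-keys.txt

# Dump the WAL entries from each member for comparison.
etcd-dump-logs ./etcd-1 > etcd-1-wal.txt
etcd-dump-logs ./etcd-2 > etcd-2-wal.txt
diff etcd-1-wal.txt etcd-2-wal.txt
```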
Also, the datastore for both nodes shows different values for the members and members_removed keys. I'm not sure if this is normal or not:
What did you expect to happen?
Consistent data returned by both cluster members.
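One quick way to check whether the members' keyspaces actually agree, assuming etcdctl v3's `endpoint hashkv` subcommand; the endpoints are placeholders.

```sh
# Compare the KV-store hash reported by each member; in a healthy cluster they match.
etcdctl --endpoints=http://etcd-1:2379,http://etcd-2:2379,http://etcd-3:2379 \
  endpoint hashkv -w table
```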
How can we reproduce it (as minimally and precisely as possible)?
Start etcd with `--force-new-cluster` while a running Kubernetes apiserver is pointed at the etcd server. I have not been able to reproduce this ad-hoc with direct writes to a single key.
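For illustration only, the sequence described above might look roughly like this; the hostnames, ports, paths, and sample key are placeholders, not taken from the issue.

```sh
# The Kubernetes apiserver keeps writing throughout, e.g. started with
# --etcd-servers=http://etcd-1:2379,http://etcd-2:2379,http://etcd-3:2379

# 1. Restart the first member alone as a new single-member cluster.
etcd --name etcd-1 --data-dir /var/lib/etcd --force-new-cluster

# 2. Wipe the data dirs on the other members.
ssh etcd-2 'rm -rf /var/lib/etcd'
ssh etcd-3 'rm -rf /var/lib/etcd'

# 3. Re-add them, then start them with --initial-cluster-state existing.
etcdctl --endpoints=http://etcd-1:2379 member add etcd-2 --peer-urls=http://etcd-2:2380
etcdctl --endpoints=http://etcd-1:2379 member add etcd-3 --peer-urls=http://etcd-3:2380

# 4. Compare reads from each member; in the failure mode they disagree.
etcdctl --endpoints=http://etcd-1:2379 get /registry/example
etcdctl --endpoints=http://etcd-2:2379 get /registry/example
```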
Anything else we need to know?

No response
Etcd version (please run commands below)
Etcd configuration (command line flags or environment variables)
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
Relevant log output
data-dir from both cluster members:
etcd-split-brain.zip