
Concurrent deletion of indices and master failure can cause indices to be reimported #11665

Closed
brwe opened this issue Jun 15, 2015 · 5 comments
Assignees
Labels
>bug :Distributed Indexing/Distributed help wanted adoptme v2.3.0 v5.0.0-alpha1

Comments

@brwe
Contributor

brwe commented Jun 15, 2015

Currently, a data node deletes indices by evaluating the cluster state. When a new cluster state comes in, it is compared to the last known cluster state; if the new state does not contain an index that was present in the last known state, that index is deleted.
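The diff described above can be sketched roughly as follows. This is an illustrative reduction, not Elasticsearch's actual API; the class and method names are made up, and the real implementation compares full index metadata rather than plain index names.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the deletion-by-diff logic: any index present in
// the previously known cluster state but absent from the new one is
// treated as deleted on the data node.
class IndexDiff {
    static Set<String> indicesToDelete(Set<String> previousState, Set<String> newState) {
        Set<String> toDelete = new HashSet<>(previousState);
        toDelete.removeAll(newState);
        return toDelete;
    }
}
```

Under this scheme, an empty incoming cluster state makes every local index a deletion candidate, which is exactly the failure mode described next.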

This could cause data to be deleted if the data folder of all master nodes was lost (#8823):

All master nodes of a cluster go down at the same time and their data folders cannot be recovered.
A new master is brought up but it does not have any indices in its cluster state because the data was lost.
Because all other nodes are data nodes, the new master cannot recover the cluster state from them either, and therefore sends a cluster state without any indices to the data nodes. The data nodes then delete all their data.

On the master branch we prevent this now by checking if the current cluster state comes from a different master than the previous one and if so, we keep the indices and import them as dangling (see #9952, ClusterChangedEvent).
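The master-change guard from #9952 can be sketched as below. This is a hedged simplification with illustrative names; the real check lives in ClusterChangedEvent and operates on node identity within the cluster state.

```java
// Sketch of the guard described above: deletions are only honored when the
// new cluster state comes from the same master as the previous one;
// otherwise the locally present indices are kept and imported as dangling.
class MasterChangeGuard {
    static boolean safeToActOnDeletions(String previousMasterId, String newMasterId) {
        // A null previous master means we have no baseline to trust.
        return previousMasterId != null && previousMasterId.equals(newMasterId);
    }
}
```

The trade-off is visible directly in this predicate: every legitimate deletion that happens to coincide with a master change is also suppressed, which is the problem the scenario below demonstrates.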

While this prevents the deletion, it also means that we might in other cases not delete indices although we should.

Example:

  1. Two master-eligible nodes, of which m1 is the elected master, and one data node (d).
  2. m1, m2, and d are all on cluster state version 1, which contains an index.
  3. The index is deleted through the API, causing m1 to send cluster state 2, which does not contain the index, to m2 and d; receiving it should trigger the actual index deletion.
  4. m1 goes down.
  5. m2 receives the new cluster state, but d does not (network issues etc.).
  6. m2 is elected master and sends cluster state 3, which again does not contain the index, to d.
  7. Because cluster state 3 comes from a different master than cluster state 1 (the last state d knows of), d does not delete the index and instead imports it back into the cluster.

Currently there is no way for a data node to decide whether an index should actually be deleted when the cluster state that triggers the deletion comes from a new master. We had to choose between: (1) deleting all data whenever a node receives an empty cluster state, or (2) running the risk of keeping indices around that should actually have been deleted.

We decided on (2) in #9952. I'm opening this issue so that this behavior is documented.

@clintongormley
Contributor

@brwe what about making the delete index request wait for responses from the data nodes? then the request can report success/failure?

@bleskes
Contributor

bleskes commented Jun 15, 2015

@clintongormley the delete index API does wait for data nodes to confirm the deletion. The above scenario will cause the call to time out (it waits for an ack from the data node that never comes). If people then check the cluster state, they will see that the index was deleted. However, at a later stage, once the data node rejoins the cluster under the new master, the index will be reimported.

@clintongormley
Contributor

Ok understood. +1

brwe added a commit to brwe/elasticsearch that referenced this issue Jun 15, 2015
Some of the tests for metadata are redundant. Also, since they
somewhat test service disruptions (starting a master with an empty
data folder), we might move them to DiscoveryWithServiceDisruptionsTests.
This commit also adds a test for
elastic#11665
brwe added a commit to brwe/elasticsearch that referenced this issue Jul 27, 2015
Some of the tests for metadata are redundant. Also, since they
somewhat test service disruptions (starting a master with an empty
data folder), we might move them to DiscoveryWithServiceDisruptionsTests.
This commit also adds a test for
elastic#11665
@clintongormley
Contributor

@bleskes is this still an issue?

@bleskes
Contributor

bleskes commented Jan 19, 2016

Sadly it is. However, thinking about it again, I realized that we can easily detect the “new empty master” danger by comparing cluster UUIDs - a new master will generate a new one. Agreed with marking as adoptme. Although it sounds scary, it's quite an easy fix and a good entry point into the cluster state universe. If anyone wants to pick this up, please ping me :)
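The cluster-UUID idea can be sketched like this. Names are illustrative, not the actual Elasticsearch implementation: the point is only that a freshly bootstrapped master generates a new cluster UUID, so a data node can tell a genuine deletion (same UUID, index gone) from a "reset" master that simply never knew about the index (different UUID).

```java
// Hedged sketch of the fix proposed above: honor a deletion only when the
// incoming cluster state carries the same cluster UUID the node already
// knows; a different UUID signals a freshly started ("reset") master whose
// empty state must not be trusted for deletions.
class ClusterUuidCheck {
    static boolean isResetMaster(String knownClusterUuid, String incomingClusterUuid) {
        return !knownClusterUuid.equals(incomingClusterUuid);
    }

    static boolean shouldDeleteMissingIndex(String knownClusterUuid, String incomingClusterUuid) {
        // Same UUID: the master that deleted the index is part of the same
        // cluster incarnation, so the deletion is genuine.
        return !isResetMaster(knownClusterUuid, incomingClusterUuid);
    }
}
```

This resolves the dilemma from the issue description: scenario step 6 (same cluster, new master) now correctly deletes, while the lost-data-folder case (new cluster UUID) still imports the indices as dangling.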

On 18 Jan 2016, at 21:28, Clinton Gormley [email protected] wrote:

@bleskes is this still an issue?



@abeyad abeyad self-assigned this Feb 21, 2016
abeyad pushed a commit to abeyad/elasticsearch that referenced this issue Mar 1, 2016
If a node was isolated from the cluster while a delete was happening,
the node used to ignore the delete operation when rejoining, because we
couldn't detect whether the new master genuinely deleted the indices or
is a fresh "reset" master that was started without the old data folder.
We can now detect these reset masters and actually delete the indices
on the node when it is not the reset-master case.

Note that this new protection doesn't hold if the node was shut down. In
that case its indices will still be imported as dangling indices.

Closes elastic#11665
@abeyad abeyad closed this as completed in 83d1e09 Mar 1, 2016
@clintongormley clintongormley added :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. and removed :Cluster labels Feb 13, 2018
fixmebot bot referenced this issue in VectorXz/elasticsearch Apr 22, 2021
fixmebot bot referenced this issue in VectorXz/elasticsearch May 28, 2021
fixmebot bot referenced this issue in VectorXz/elasticsearch Aug 4, 2021