Index deletes not applied when cluster UUID has changed #16825

Closed

@abeyad abeyad commented Feb 26, 2016

If a node was isolated from the cluster while a delete was happening, the node would ignore the delete operation when rejoining, because we couldn't detect whether the new master had genuinely deleted the indices or was a fresh "reset" master started without the old data folder. We can now be smarter: we detect these reset masters and actually delete the indices on the node if the master is not a reset master.

Note that this new protection doesn't hold if the node was shut down. In that case its indices will still be imported as dangling indices.
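
A minimal sketch of the decision described above, under stated assumptions: `StateSnapshot`, `isSameCluster`, and `indicesToDelete` are hypothetical names invented for illustration and are not the classes changed in this PR. The sketch only models the rule that deletes are applied when the cluster UUID is unchanged and skipped when the UUID differs (a reset master).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ClusterUuidDeleteSketch {

    /** Hypothetical, simplified stand-in for a cluster state: a cluster UUID plus index names. */
    public static final class StateSnapshot {
        final String clusterUuid;
        final Set<String> indices;

        public StateSnapshot(String clusterUuid, Set<String> indices) {
            this.clusterUuid = clusterUuid;
            this.indices = indices;
        }
    }

    /** An unchanged cluster UUID means the master belongs to the cluster the node already knew. */
    public static boolean isSameCluster(StateSnapshot previous, StateSnapshot current) {
        return previous.clusterUuid.equals(current.clusterUuid);
    }

    /**
     * Returns the indices the node should delete locally after rejoining.
     * If the cluster UUID differs (a "reset" master), nothing is deleted, so the
     * local indices stay on disk and can be handled as dangling indices instead.
     */
    public static List<String> indicesToDelete(StateSnapshot previous, StateSnapshot current) {
        List<String> toDelete = new ArrayList<>();
        if (!isSameCluster(previous, current)) {
            return toDelete; // reset master: do not trust the absence of indices
        }
        // Sorted copy so the result order is deterministic.
        for (String index : new TreeSet<>(previous.indices)) {
            if (!current.indices.contains(index)) {
                toDelete.add(index);
            }
        }
        return toDelete;
    }
}
```

The sketch compares only the node's last known state with the state received on rejoin, which is why it cannot cover a node that was shut down during the delete.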

Closes #11665

@abeyad
Author

abeyad commented Feb 26, 2016

@bleskes @dakrone @brwe feedback would be appreciated :)

@abeyad abeyad force-pushed the bug-concurrent-index-del-reimports branch 4 times, most recently from dddd41b to dddf9de Compare February 26, 2016 18:09
@@ -152,46 +168,35 @@ public boolean indexMetaDataChanged(IndexMetaData current) {
return true;
}

public boolean blocksChanged() {
Member

Please don't remove this method here; it would make this a breaking change, since it would break any plugins that rely on watching for ClusterChangedEvents where the blocks have changed.

@abeyad abeyad force-pushed the bug-concurrent-index-del-reimports branch from dddf9de to 011ac55 Compare February 26, 2016 18:23
@@ -134,10 +142,18 @@ public boolean indexRoutingTableChanged(String index) {
return deleted == null ? Collections.<String>emptyList() : deleted;
}

/**
* Returns <code>true</code> iff the metadata for the cluster has changed between
Contributor

Can we add a note saying this is a reference-level equality check and not a true equals?

Author

Done. I also added that note to every other reference-equality check in the class.
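
To illustrate the distinction raised in this review thread, here is a small sketch assuming a hypothetical `MetaSnapshot` value class and a hypothetical `metaDataChanged` helper; neither is the code under review. It shows that a reference-level check reports a change whenever the two states hold different objects, even when those objects are equal by value.

```java
public class ReferenceEqualitySketch {

    /** Hypothetical value type standing in for cluster metadata. */
    public static final class MetaSnapshot {
        final long version;

        public MetaSnapshot(long version) {
            this.version = version;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof MetaSnapshot && ((MetaSnapshot) other).version == version;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(version);
        }
    }

    /**
     * Returns true iff the two states hold different metadata objects.
     * Note: this is a reference-level comparison, not a true equals() check.
     */
    public static boolean metaDataChanged(MetaSnapshot previous, MetaSnapshot current) {
        return previous != current;
    }
}
```

With this helper, two distinct `MetaSnapshot` instances with the same version are `equals()` to each other but still count as "changed", which is the behavior the requested Javadoc note would warn about.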

@bleskes
Contributor

bleskes commented Feb 29, 2016

I left some very minor comments. Looks good. I thought we had a test somewhere about index deletion with master change (which we should be able to enable now), but maybe I'm wrong. @brwe do you remember?

@bleskes
Contributor

bleskes commented Feb 29, 2016

also re the pr description:

This commit fixes the issue by only reimporting indices from data nodes if and only
if the cluster UUID on the master node is different from the cluster
UUID of the previous cluster state on the data node

This is inaccurate: on-disk dangling indices will still be imported, for example if a node was off while the index was deleted. This change only deals with the case where a node was isolated from the cluster while a delete happened (but was up!) and where a master failure happened during a delete operation (but after the delete was already committed). I think we should say something like:

If a node was isolated from the cluster while a delete was happening, the node will ignore the delete operation when rejoining, as we couldn't detect whether the new master genuinely deleted the indices or is a fresh "reset" master that was started without the old data folder. We can now be smarter and detect these reset masters and actually delete the indices if this is not the case.

Note that this new protection doesn't hold if the node was shut down. In that case its indices will still be imported as dangling indices.

@brwe
Contributor

brwe commented Feb 29, 2016

@brwe do you remember?

I think that is the one that @abeyad enabled here: https://github.com/elastic/elasticsearch/pull/16825/files#diff-c454159f60a0127854028ccb5502500eL1076 I know of no other one.

@bleskes
Contributor

bleskes commented Feb 29, 2016

@brwe I missed that one line in the white space fixing. Jet lag... :( thanks!

@abeyad
Author

abeyad commented Feb 29, 2016

I fixed the issues raised by @bleskes and @dakrone and all tests pass. I also updated the PR description and will change the commit message as well once the PR is approved.

@bleskes
Contributor

bleskes commented Mar 1, 2016

LGTM. Thanks @abeyad

If a node was isolated from the cluster while a delete was happening,
the node would ignore the delete operation when rejoining, as we couldn't
detect whether the new master genuinely deleted the indices or was a
fresh "reset" master that was started without the old data folder.
We can now be smarter and detect these reset masters and actually delete
the indices on the node if it's not the case of a reset master.

Note that this new protection doesn't hold if the node was shut down. In
that case its indices will still be imported as dangling indices.

Closes elastic#11665
@abeyad abeyad force-pushed the bug-concurrent-index-del-reimports branch from 67dd695 to 45ae99c Compare March 1, 2016 14:21
@abeyad abeyad closed this in 83d1e09 Mar 1, 2016
abeyad pushed a commit to abeyad/elasticsearch that referenced this pull request Mar 1, 2016
This commit backports commit 83d1e09
from master to 2.x.

Relates elastic#16825
abeyad pushed a commit that referenced this pull request Mar 10, 2016
This commit backports commit 83d1e09
from master to 2.x.

Relates #16825
@clintongormley clintongormley added :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. and removed :Cluster labels Feb 13, 2018
Labels
>bug :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. v2.3.0 v5.0.0-alpha1
5 participants