Write index metadata on data nodes where shards allocated #8823
Are you running dedicated masters with no query/data load?
@bobpoekert can you elaborate more on what happened before the data was lost - did you update the mapping (the title suggests so)? If you can share your cluster layout (dedicated master nodes or not) it would be great. Also, please grab a copy of the logs of the nodes and save them. They might give more insight.
Did you have data turned off on the master candidate as well in the configuration?
@vjanelle No. The master candidate is also a data node.
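For readers following along: the roles being discussed map to two node settings. A minimal sketch of the layout described here, assuming 1.x-style setting names rather than the reporter's actual config:

```yaml
# Sketch only -- 1.x-style settings assumed, not the reporter's actual files.
# Node A (the single master candidate, which also holds data):
node.master: true
node.data: true

# All other nodes (data only, never master eligible):
# node.master: false
# node.data: true
```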
@bobpoekert thx. Let me make sure I understand what you are saying:
From this point on I'm not clear. Why was node B elected as master? What happened to the residing master, node A?
@bobpoekert I'm confused. You said before you had nodes that are master candidates and are also data nodes. Can you confirm that node A has …
@bleskes
OK. Clear. The cluster meta data (which we call cluster state) is stored and maintained on the master nodes. That meta data contains which indices are out there in the cluster. We only write the metadata on master-eligible nodes, and rely on multiple masters for redundancy of this data set (as opposed to specific shard data). Since you only have one master node, there is no redundancy in your cluster meta data storage. After you deleted it, there was nowhere to get it back from, so the cluster became empty.

We do have a feature called dangling indices, which is in charge of scanning data folders for indices that are found on disk but are not part of the cluster state, and automatically importing them into the cluster. As it is today, this feature needs to find some part of the index meta data to work, but that is also stored only on the master-eligible nodes, of which in your case there were none.

Thinking about it, we can be more resilient in situations where users run only a single master node (though we highly recommend running more than one) and store the index metadata wherever a shard copy is stored, so also on data nodes. Then the dangling indices feature can identify those as well. Let's keep this issue open and we will work on a PR to improve things based on the above.
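To make the recommendation above concrete: a sketch (my own illustration, not from this thread) of a layout where the cluster state is stored redundantly on three dedicated master-eligible nodes, using 1.x-era settings:

```yaml
# elasticsearch.yml on each of three dedicated master-eligible nodes (sketch).
node.master: true
node.data: false

# Require a majority of master-eligible nodes before a master can be elected,
# so a single isolated node can never publish an empty cluster state on its own.
discovery.zen.minimum_master_nodes: 2

# On the data nodes, the roles are reversed:
# node.master: false
# node.data: true
```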
This definitely feels like a documentation issue as well. I asked on Twitter, and the reason for running a single master was essentially to avoid other bugs. I don't think it's an unfair assumption that a user would expect a quorum of preexisting data nodes to be enough to promote a new master or rebuild it without data loss. I would, from a purely semantic perspective, expect that data nodes would have ALL the data needed for the cluster, and that the master's state would live with the rest of the "data".

Also, would it not make sense for non-master-eligible nodes to at least provide a backup of the master's cluster metadata, for this case and as a safety precaution?

FTR, I have no direct impact from this issue. Just another production ES user who tracks this stuff.
Sorry, missed the part where you mention possibly storing the backup on data nodes.
Repro of this bug: https://gist.github.com/grantr/a53a9b6b91005ad9807f

This is more than a documentation issue. Even when running in a degraded configuration, shards should never be deleted if their metadata can't be found.
This is indeed the plan.
Today, if a shard contains a segment that is from Lucene 3.x and therefore throws an `IndexFormatTooOldException`, the node goes into a wild allocation loop if the index is recovered directly from the gateway. If the problematic shard is allocated later for other reasons, the shard will fail allocation, and downgrading the cluster might be impossible since new segments in other indices have already been written.

This commit adds sanity checks to the GatewayMetaState that try to read the SegmentInfos for every shard on the node and fail if a shard is corrupted, the index is too new, etc. With the new data_path-per-index feature, nodes might not have enough information unless they are master eligible, since we used to not persist the index and global state on nodes that are not master eligible. This commit changes that behavior and writes the state on all nodes that hold data. This is an enhancement in itself, since data nodes that are not master eligible are not self-contained today.

This change also fixes the issue seen in elastic#8823, since metadata is now written on all data nodes.

Closes elastic#8823
When a node was a data-only node, the index state was not written. If such a node connected to a master that did not have the index in its cluster state, for example because the master was restarted and its data folder was lost, the indices were not imported as dangling but were deleted instead. This commit makes sure that the index state is also written on data nodes if they hold at least one shard of the index. Closes elastic#8823
+1111
Can the deletion be delayed for a longer period by setting a high value for …
I'm running 1.4 and cannot update now, so I would prefer to change the settings and have more time to react.
@polgl you can modify the setting yourself and just set it to import always?
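The setting name is cut off in the question above; if it refers to the 1.x local-gateway dangling-index options (the names and values below are my assumption and should be checked against the 1.4 documentation), the change would look roughly like this:

```yaml
# Assumed 1.x local-gateway settings -- verify the names against the 1.4 docs.
# Import indices found on disk but missing from the cluster state:
gateway.local.auto_import_dangled: "yes"
# Keep dangling index data around longer before it may be deleted:
gateway.local.dangling_timeout: 48h
```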
Hi, I did some tests and it seems like our cluster does not show this behavior. Thanks for your help.
When a node was a data-only node, the index state was not written. If such a node connected to a master that did not have the index in its cluster state, for example because the master was restarted and its data folder was lost, the indices were not imported as dangling but were deleted instead. This commit makes sure that the index state is also written on data nodes if they hold at least one shard of the index. Closes #8823, closes #9952
ES version 2.2. Previous state: a cluster with 6 nodes (3 master-cum-data nodes, 3 client nodes). What's the problem?
Correction - the version is 2.1.1.
@saurabh24292 I'm not sure what your problem is, but maybe you can ask on discuss.elastic.co? If we figure out it's related to this issue, or is caused by some other problem, we can re-open this or (more likely) open a new one.
If the master node thinks that an index does not exist, and another node thinks that it does, the conflict is currently resolved by having the node that has the index delete it. This can easily result in sudden unexpected data loss. The correct behavior would be for the conflict to be resolved by both nodes accepting the state of the node that thinks that the index exists.