CorruptIndexException after upgrade from 0.20.6 to 1.4.2 #9140
Comments
I don't think 0.20.x computed a proper Adler checksum for files like .tis/.tii that aren't "append-only". This is because those formats seeked backwards and rewrote earlier bytes. I think we should only verify the length for those extensions. The other case is segments_N, but it never wrote a checksum for that, so it's OK.
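To illustrate why a running checksum breaks for files that are not append-only, here is a minimal Python sketch. It is not Lucene's writer code: the `write_with_backpatch` helper and the 4-byte placeholder header are assumptions made up for this example. It only shows that a checksum accumulated in write order disagrees with the checksum of the final bytes once earlier bytes are rewritten.

```python
import io
import zlib


def write_with_backpatch(payload: bytes, header_size: int = 4):
    """Hypothetical writer: emit a placeholder header, then the payload,
    then seek back and overwrite the header with the payload length.

    Returns (final_bytes, running_checksum), where running_checksum is
    an Adler-32 accumulated over every write call in the order issued.
    """
    buf = io.BytesIO()
    running = 1  # Adler-32 initial value

    def write(data: bytes) -> None:
        nonlocal running
        buf.write(data)
        running = zlib.adler32(data, running)

    write(b"\x00" * header_size)   # placeholder for a value not yet known
    write(payload)
    end = buf.tell()
    buf.seek(0)                    # seek backwards ...
    write(len(payload).to_bytes(header_size, "big"))  # ... and rewrite earlier bytes
    buf.seek(end)
    return buf.getvalue(), running


final_bytes, running = write_with_backpatch(b"term dictionary bytes")
on_disk = zlib.adler32(final_bytes)   # what a verifier computes from the file
print(running == on_disk)             # the two values disagree for this input
```

For a purely append-only writer the two values always agree, which is why the stored checksum can be trusted for those files but not for the rewritten ones.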
I will investigate using @gboanea's procedure to see if the metadata length is reliable as well. We don't want an off-by-8, but we need to at least verify the length of the file to detect e.g. a network disconnect or another problem transferring the file.
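A rough sketch of the check being proposed, in Python rather than the actual Elasticsearch code. The function name, the `LENGTH_ONLY_SUFFIXES` set, and the convention that a missing checksum is passed as `None` are all assumptions for illustration. The idea is: always compare lengths, and compare the Adler-32 only when the file type is known to have a trustworthy one.

```python
import zlib
from typing import Optional

# Assumption: legacy extensions whose stored checksum cannot be trusted
# because the file was not written append-only.
LENGTH_ONLY_SUFFIXES = (".tis", ".tii")


def verify_copied_file(name: str, data: bytes, expected_length: int,
                       expected_adler: Optional[int]) -> bool:
    """Return True if the copied bytes pass the checks that apply to this file."""
    # The length check always applies; it catches truncated transfers.
    if len(data) != expected_length:
        return False
    # Skip the checksum for legacy non-append-only files, and for files
    # that never had one recorded (for example segments_N).
    if name.endswith(LENGTH_ONLY_SUFFIXES) or expected_adler is None:
        return True
    return zlib.adler32(data) == expected_adler
```

With this shape, a truncated .tis file is still rejected on length, while a complete one is accepted even though its stored checksum would not match.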
@rmuir Don't know if this helps, but I just had a go at replicating this. With a single node, the upgrade was smooth, but as soon as I added a second node (and thus replicas) this was easy to reproduce.
@clintongormley was the problematic file a .tis or .tii? It should always fail with the 3.x index format (0.20.x). That's because those files were never append-only, and so the Adler checksums were incorrect. But they were used essentially only as hash values at the time, so it was no problem.
@rmuir yes, both
OK, let me try to make a quick patch. The issue does not impact master, only 1.x. Only Lucene 3.x (and 4.0) indexes are impacted. As of 4.1, all files are append-only.
I made an untested patch here: #9142
I manually tested and the patch works (on 0.20.x you just have to index enough to trigger a merge, or turn off cfs-on-flush, so you get .tis and .tii files). I opened #9143 to fix the bigger test issue.
I believe this has been solved, but either way 0.20.6 is too old to worry about anymore.
After updating from ES 0.20.6 to 1.4.2, the cluster remains in a RED state and CorruptIndexExceptions are generated:
I reproduced the error with a vanilla ES. These are the steps:
Start cluster ES 0.20.6 nodes
Index data (one index ~100MB)
Cluster health
Update
Start cluster ES 1.4.2 nodes
Cluster health
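The "Cluster health" steps above can be scripted against the `_cluster/health` endpoint. This is only a sketch: the base URL is an assumption (a node on localhost with the default HTTP port), and the helper names are made up for this example.

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(base_url: str = "http://localhost:9200") -> dict:
    """Fetch the cluster health document. The base_url default is an
    assumption; point it at any node of the cluster under test."""
    with urlopen(f"{base_url}/_cluster/health") as resp:
        return json.load(resp)


def describe_health(health: dict) -> str:
    """Summarize the fields relevant to this report: the overall status
    and how many shards are still unassigned."""
    status = health.get("status", "unknown")
    unassigned = health.get("unassigned_shards", 0)
    return f"{status} ({unassigned} unassigned shards)"


# Usage, with a node running:
#   print(describe_health(fetch_cluster_health()))
```

Running it once before the upgrade and once after makes the change in status easy to compare.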
Some logs from the log files: