Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow truncation of clean translog #47866

Merged

Conversation

DaveCTurner
Copy link
Contributor

Today the elasticsearch-shard remove-corrupted-data tool will only truncate a
translog it determines to be corrupt. However there may be other cases in which
it is desirable to truncate the translog, for instance if an operation in the
translog cannot be replayed for some reason other than corruption. This commit
adds a --truncate-clean-translog option to skip the corruption check on the
translog and blindly truncate it.

Today the `elasticsearch-shard remove-corrupted-data` tool will only truncate a
translog it determines to be corrupt. However there may be other cases in which
it is desirable to truncate the translog, for instance if an operation in the
translog cannot be replayed for some reason other than corruption. This commit
adds a `--truncate-clean-translog` option to skip the corruption check on the
translog and blindly truncate it.
@DaveCTurner DaveCTurner added >bug :Distributed Indexing/Store Issues around managing unopened Lucene indices. If it touches Store.java, this is a likely label. v8.0.0 v7.5.0 v6.8.4 v7.4.1 labels Oct 10, 2019
@DaveCTurner DaveCTurner requested review from ywelsch and dnhatn October 10, 2019 14:28
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Store)

@DaveCTurner
Copy link
Contributor Author

I've marking this as a bug since we used to support this but lost the ability to do so in #33373.

@DaveCTurner
Copy link
Contributor Author

@elasticmachine please run elasticsearch-ci/2

@ywelsch
Copy link
Contributor

ywelsch commented Oct 10, 2019

However there may be other cases in which it is desirable to truncate the translog, for instance if an operation in the translog cannot be replayed for some reason other than corruption

Can you provide an example of this? If a single operation is problematic, would we truly like to throw away the full translog?

@DaveCTurner
Copy link
Contributor Author

Yes, we had a case recently of this exception:

Caused by: java.lang.IllegalArgumentException: number of documents in the index cannot exceed 2147483519
	at org.apache.lucene.index.DocumentsWriterPerThread.reserveOneDoc(DocumentsWriterPerThread.java:225) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
	at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:234) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
	at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:494) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
	at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1594) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1213) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
	at org.elasticsearch.index.engine.InternalEngine.innerNoOp(InternalEngine.java:1516) ~[elasticsearch-7.3.0.jar:7.3.0]
	... 144 more

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@DaveCTurner DaveCTurner merged commit 6943732 into elastic:master Oct 11, 2019
@DaveCTurner DaveCTurner deleted the 2019-10-10-truncate-clean-translog branch October 11, 2019 14:44
DaveCTurner added a commit that referenced this pull request Oct 11, 2019
Today the `elasticsearch-shard remove-corrupted-data` tool will only truncate a
translog it determines to be corrupt. However there may be other cases in which
it is desirable to truncate the translog, for instance if an operation in the
translog cannot be replayed for some reason other than corruption. This commit
adds a `--truncate-clean-translog` option to skip the corruption check on the
translog and blindly truncate it.
DaveCTurner added a commit that referenced this pull request Oct 11, 2019
Today the `elasticsearch-shard remove-corrupted-data` tool will only truncate a
translog it determines to be corrupt. However there may be other cases in which
it is desirable to truncate the translog, for instance if an operation in the
translog cannot be replayed for some reason other than corruption. This commit
adds a `--truncate-clean-translog` option to skip the corruption check on the
translog and blindly truncate it.
howardhuanghua pushed a commit to TencentCloudES/elasticsearch that referenced this pull request Oct 14, 2019
Today the `elasticsearch-shard remove-corrupted-data` tool will only truncate a
translog it determines to be corrupt. However there may be other cases in which
it is desirable to truncate the translog, for instance if an operation in the
translog cannot be replayed for some reason other than corruption. This commit
adds a `--truncate-clean-translog` option to skip the corruption check on the
translog and blindly truncate it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed Indexing/Store Issues around managing unopened Lucene indices. If it touches Store.java, this is a likely label. v7.4.1 v7.5.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants