Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run CheckIndex on metadata index before loading #73239

Conversation

DaveCTurner
Copy link
Contributor

The metadata index is small and important and only read at startup.
Today we rely on Lucene to spot if any of its components is corrupt, but
Lucene does not necesssarily verify all checksums in order to catch a
corruption. With this commit we run CheckIndex on the metadata index
first, and fail on startup if a corruption is detected.

Closes #29358

The metadata index is small and important and only read at startup.
Today we rely on Lucene to spot if any of its components is corrupt, but
Lucene does not necesssarily verify all checksums in order to catch a
corruption. With this commit we run `CheckIndex` on the metadata index
first, and fail on startup if a corruption is detected.

Closes elastic#29358
@DaveCTurner DaveCTurner added >enhancement :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. v8.0.0 v7.14.0 labels May 19, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label May 19, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

"] in [" + dataPath + "] but expected [" + nodeId + "]");
if (isClean == false) {
if (logger.isErrorEnabled()) {
outputStream.bytes().utf8ToString().lines().forEach(l -> logger.error("checkIndex: {}", l));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't great: we materialise all the bytes, then convert them to a string, and then split them into lines. A streaming implementation is definitely possible but doesn't seem worth the effort here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I wonder if we want a tool that reads the state without check index and rewrites it fully? Perhaps an option to the "unsafe bootstrap master" tool?

if (logger.isErrorEnabled()) {
outputStream.bytes().utf8ToString().lines().forEach(l -> logger.error("checkIndex: {}", l));
}
throw new IllegalStateException("metadata index at [" + dataPath +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We often refer to this as either cluster state or global state. I wonder if we could call it global state metadata index, just to be sure this message is interpreted correctly by operators?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cluster state includes the metadata contained in this index but also includes ephemeral things like the routing table; conversely the metadata in this index comprises global metadata and index metadata. "Metadata index" is more correct than either, but I tried a few ideas and settled on the index containing the cluster metadata, see 95378d2.

onDiskState = loadOnDiskState(dataPath, directoryReader);

if (nodeId.equals(onDiskState.nodeId) == false) {
throw new IllegalStateException("unexpected node ID in metadata, found [" + onDiskState.nodeId +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment on metadata as above, perhaps global state metadata?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You had to be doing something pretty weird to hit this message anyway, and doubly so today since the node IDs that we're comparing both come from the user-data from the latest commit of this index. Harmonised with the other message in 9e89fed.

"] in [" + dataPath + "] but expected [" + nodeId + "]");
if (isClean == false) {
if (logger.isErrorEnabled()) {
outputStream.bytes().utf8ToString().lines().forEach(l -> logger.error("checkIndex: {}", l));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

@DaveCTurner
Copy link
Contributor Author

I wonder if we want a tool that reads the state without check index and rewrites it fully?

I have my doubts that we could implement such a thing robustly enough to be useful. I ran the corruption test 2000 times and saw only 44 cases (<2.5%) in which the state was readable after corruption anyway. Of course this test makes a pretty small state so the chances of hitting something vital with a one-byte error are maximised, but even so many of the exceptions seen are failures to parse a document in the index. Conversely in reality a corruption likely hits multiple bytes, perhaps even a whole disk sector at once, so I reckon we'd not be able to read the state in almost every case.

With a larger state there's a better chance that there are multiple segments, so CheckIndex#exorciseIndex might yield something useful, but I still think the probability of success is very low.

We could also perhaps extract the global metadata from an otherwise-broken index with slightly higher probability since we always write it first so it should be near the start of the stored fields of whichever segment it's in, and then we could fall back on dangling indices to recover the index metadata. This all seems overly heroic tho, I'd rather not go down this path.

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@DaveCTurner DaveCTurner merged commit a81fbb9 into elastic:master Jun 16, 2021
@DaveCTurner DaveCTurner deleted the 2021-05-19-check-metadata-index-before-loading branch June 16, 2021 09:27
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Jun 16, 2021
The metadata index is small and important and only read at startup.
Today we rely on Lucene to spot if any of its components is corrupt, but
Lucene does not necesssarily verify all checksums in order to catch a
corruption. With this commit we run `CheckIndex` on the metadata index
first, and fail on startup if a corruption is detected.

Closes elastic#29358
DaveCTurner added a commit that referenced this pull request Jun 16, 2021
The metadata index is small and important and only read at startup.
Today we rely on Lucene to spot if any of its components is corrupt, but
Lucene does not necesssarily verify all checksums in order to catch a
corruption. With this commit we run `CheckIndex` on the metadata index
first, and fail on startup if a corruption is detected.

Closes #29358
limingnihao pushed a commit to limingnihao/elasticsearch that referenced this pull request Jun 17, 2021
* master: (284 commits)
  [DOCS] Update central reporting image (elastic#74195)
  [DOCS] SQL: Document `null` handing for string functions (elastic#74201)
  Fix Snapshot Docs Listing Query Params in Body Incorrectly (elastic#74196)
  [DOCS] EQL: Note EQL uses `fields` parameter (elastic#74194)
  Mute failing MixedClusterClientYamlTestSuiteIT test {p0=indices.split/20_source_mapping/Split index ignores target template mapping} test (elastic#74198)
  Cleanup Duplicate Constants in Snapshot XContent Params (elastic#74114)
  [DOC] Add watcher to the threadpool doc (elastic#73935)
  [Rest Api Compatibility] Validate Query typed api (elastic#74171)
  Replace deprecated `script.cache.*` settings with `script.context.$constext.cache_*` in documentation. (elastic#74144)
  Pin Alpine Linux version in Docker builds (elastic#74169)
  Fix clone API settings docs bug (elastic#74175)
  [ML] refactor internal datafeed management (elastic#74018)
  Disable query cache for FunctionScoreQuery and ScriptScoreQuery (elastic#74060)
  Fork the sending of file chunks during recovery (elastic#74164)
  RuntimeField.Builder should not extend FieldMapper.Builder (elastic#73840)
  Run CheckIndex on metadata index before loading (elastic#73239)
  Deprecate setting version on analyzers (elastic#74073)
  Add test with null transform id in stats request (elastic#74130)
  Order imports when reformatting (elastic#74059)
  Move deprecation code from xpack core to deprecation module. (elastic#74120)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.14.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enhance BufferedChecksumIndexInput error when a state file is empty
4 participants