-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Run CheckIndex on metadata index before loading #73239
Run CheckIndex on metadata index before loading #73239
Conversation
The metadata index is small and important and only read at startup. Today we rely on Lucene to spot if any of its components is corrupt, but Lucene does not necesssarily verify all checksums in order to catch a corruption. With this commit we run `CheckIndex` on the metadata index first, and fail on startup if a corruption is detected. Closes elastic#29358
Pinging @elastic/es-distributed (Team:Distributed) |
"] in [" + dataPath + "] but expected [" + nodeId + "]"); | ||
if (isClean == false) { | ||
if (logger.isErrorEnabled()) { | ||
outputStream.bytes().utf8ToString().lines().forEach(l -> logger.error("checkIndex: {}", l)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't great: we materialise all the bytes, then convert them to a string, and then split them into lines. A streaming implementation is definitely possible but doesn't seem worth the effort here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. I wonder if we want a tool that reads the state without check index and rewrites it fully? Perhaps an option to the "unsafe bootstrap master" tool?
if (logger.isErrorEnabled()) { | ||
outputStream.bytes().utf8ToString().lines().forEach(l -> logger.error("checkIndex: {}", l)); | ||
} | ||
throw new IllegalStateException("metadata index at [" + dataPath + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We often refer to this as either cluster state or global state. I wonder if we could call it global state metadata index
, just to be sure this message is interpreted correctly by operators?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cluster state includes the metadata contained in this index but also includes ephemeral things like the routing table; conversely the metadata in this index comprises global metadata and index metadata. "Metadata index" is more correct than either, but I tried a few ideas and settled on the index containing the cluster metadata
, see 95378d2.
onDiskState = loadOnDiskState(dataPath, directoryReader); | ||
|
||
if (nodeId.equals(onDiskState.nodeId) == false) { | ||
throw new IllegalStateException("unexpected node ID in metadata, found [" + onDiskState.nodeId + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment on metadata
as above, perhaps global state metadata
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You had to be doing something pretty weird to hit this message anyway, and doubly so today since the node IDs that we're comparing both come from the user-data from the latest commit of this index. Harmonised with the other message in 9e89fed.
"] in [" + dataPath + "] but expected [" + nodeId + "]"); | ||
if (isClean == false) { | ||
if (logger.isErrorEnabled()) { | ||
outputStream.bytes().utf8ToString().lines().forEach(l -> logger.error("checkIndex: {}", l)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
I have my doubts that we could implement such a thing robustly enough to be useful. I ran the corruption test 2000 times and saw only 44 cases (<2.5%) in which the state was readable after corruption anyway. Of course this test makes a pretty small state so the chances of hitting something vital with a one-byte error are maximised, but even so many of the exceptions seen are failures to parse a document in the index. Conversely in reality a corruption likely hits multiple bytes, perhaps even a whole disk sector at once, so I reckon we'd not be able to read the state in almost every case. With a larger state there's a better chance that there are multiple segments, so We could also perhaps extract the global metadata from an otherwise-broken index with slightly higher probability since we always write it first so it should be near the start of the stored fields of whichever segment it's in, and then we could fall back on dangling indices to recover the index metadata. This all seems overly heroic tho, I'd rather not go down this path. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
The metadata index is small and important and only read at startup. Today we rely on Lucene to spot if any of its components is corrupt, but Lucene does not necesssarily verify all checksums in order to catch a corruption. With this commit we run `CheckIndex` on the metadata index first, and fail on startup if a corruption is detected. Closes elastic#29358
The metadata index is small and important and only read at startup. Today we rely on Lucene to spot if any of its components is corrupt, but Lucene does not necesssarily verify all checksums in order to catch a corruption. With this commit we run `CheckIndex` on the metadata index first, and fail on startup if a corruption is detected. Closes #29358
* master: (284 commits) [DOCS] Update central reporting image (elastic#74195) [DOCS] SQL: Document `null` handing for string functions (elastic#74201) Fix Snapshot Docs Listing Query Params in Body Incorrectly (elastic#74196) [DOCS] EQL: Note EQL uses `fields` parameter (elastic#74194) Mute failing MixedClusterClientYamlTestSuiteIT test {p0=indices.split/20_source_mapping/Split index ignores target template mapping} test (elastic#74198) Cleanup Duplicate Constants in Snapshot XContent Params (elastic#74114) [DOC] Add watcher to the threadpool doc (elastic#73935) [Rest Api Compatibility] Validate Query typed api (elastic#74171) Replace deprecated `script.cache.*` settings with `script.context.$constext.cache_*` in documentation. (elastic#74144) Pin Alpine Linux version in Docker builds (elastic#74169) Fix clone API settings docs bug (elastic#74175) [ML] refactor internal datafeed management (elastic#74018) Disable query cache for FunctionScoreQuery and ScriptScoreQuery (elastic#74060) Fork the sending of file chunks during recovery (elastic#74164) RuntimeField.Builder should not extend FieldMapper.Builder (elastic#73840) Run CheckIndex on metadata index before loading (elastic#73239) Deprecate setting version on analyzers (elastic#74073) Add test with null transform id in stats request (elastic#74130) Order imports when reformatting (elastic#74059) Move deprecation code from xpack core to deprecation module. (elastic#74120) ...
The metadata index is small and important and only read at startup.
Today we rely on Lucene to spot if any of its components is corrupt, but
Lucene does not necesssarily verify all checksums in order to catch a
corruption. With this commit we run
CheckIndex
on the metadata indexfirst, and fail on startup if a corruption is detected.
Closes #29358