Prevent in-place downgrades and invalid upgrades #41731

Merged on May 21, 2019 (14 commits)
63 changes: 60 additions & 3 deletions docs/reference/commands/node-tool.asciidoc
@@ -4,22 +4,23 @@
The `elasticsearch-node` command enables you to perform certain unsafe
operations on a node that are only possible while it is shut down. This command
allows you to adjust the <<modules-node,role>> of a node and may be able to
recover some data after a disaster.
recover some data after a disaster or start a node even if it is incompatible
with the data on disk.

[float]
=== Synopsis

[source,shell]
--------------------------------------------------
bin/elasticsearch-node repurpose|unsafe-bootstrap|detach-cluster
bin/elasticsearch-node repurpose|unsafe-bootstrap|detach-cluster|override-version
[--ordinal <Integer>] [-E <KeyValuePair>]
[-h, --help] ([-s, --silent] | [-v, --verbose])
--------------------------------------------------

[float]
=== Description

This tool has three modes:
This tool has four modes:

* `elasticsearch-node repurpose` can be used to delete unwanted data from a
node if it used to be a <<data-node,data node>> or a
@@ -36,6 +37,11 @@ This tool has three modes:
cluster bootstrapping was not possible, it also enables you to move nodes
into a brand-new cluster.

* `elasticsearch-node override-version` enables you to start up a node
even if the data in the data path was written by an incompatible version of
{es}. This may sometimes allow you to downgrade to an earlier version of
{es}.

[[node-tool-repurpose]]
[float]
==== Changing the role of a node
@@ -109,6 +115,25 @@ way forward that does not risk data loss, but it may be possible to use the
`elasticsearch-node` tool to construct a new cluster that contains some of the
data from the failed cluster.

[[node-tool-override-version]]
[float]
==== Bypassing version checks

The data that {es} writes to disk is designed to be read by the current version
and a limited set of future versions. It cannot generally be read by older
versions, nor by versions that are more than one major version newer. The data
stored on disk includes the version of the node that wrote it, and {es} checks
that it is compatible with this version when starting up.

In rare circumstances it may be desirable to bypass this check and start up an
{es} node using data that was written by an incompatible version. This may not
work if the format of the stored data has changed, and it is a risky process
because it is possible for the format to change in ways that {es} may
misinterpret, silently leading to data loss.

To bypass this check, you can use the `elasticsearch-node override-version`
tool to overwrite the version number stored in the data path with the current
version, causing {es} to believe that it is compatible with the on-disk data.
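
The compatibility rule described above can be sketched in isolation. This is a minimal, hypothetical illustration rather than the code from this change: `SimpleVersion`, `Outcome`, and `check` are invented names, and the "at most one major version behind" rule is an assumed simplification of what Elasticsearch actually checks.

```java
// Hypothetical, simplified stand-in for the start-up version check.
// None of these names come from Elasticsearch itself.
final class VersionCheckSketch {

    /** Minimal major.minor.patch version, assumed here for illustration. */
    static final class SimpleVersion implements Comparable<SimpleVersion> {
        final int major;
        final int minor;
        final int patch;

        SimpleVersion(int major, int minor, int patch) {
            this.major = major;
            this.minor = minor;
            this.patch = patch;
        }

        @Override
        public int compareTo(SimpleVersion other) {
            if (major != other.major) {
                return Integer.compare(major, other.major);
            }
            if (minor != other.minor) {
                return Integer.compare(minor, other.minor);
            }
            return Integer.compare(patch, other.patch);
        }

        @Override
        public String toString() {
            return major + "." + minor + "." + patch;
        }
    }

    enum Outcome { COMPATIBLE, TOO_OLD, TOO_NEW }

    /** Classifies on-disk data written by {@code onDisk} for a node running {@code current}. */
    static Outcome check(SimpleVersion onDisk, SimpleVersion current) {
        if (onDisk.compareTo(current) > 0) {
            return Outcome.TOO_NEW; // starting the node would be an in-place downgrade
        }
        if (onDisk.major < current.major - 1) {
            return Outcome.TOO_OLD; // more than one major version behind (assumed rule)
        }
        return Outcome.COMPATIBLE;
    }

    public static void main(String[] args) {
        SimpleVersion current = new SimpleVersion(7, 2, 0);
        System.out.println(check(new SimpleVersion(6, 8, 0), current)); // COMPATIBLE
        System.out.println(check(new SimpleVersion(5, 6, 0), current)); // TOO_OLD
        System.out.println(check(new SimpleVersion(7, 3, 0), current)); // TOO_NEW
    }
}
```

In this sketch, overwriting the stored version with the current one would make `check` return `COMPATIBLE` regardless of what format the data is really in, which is why the bypass is described as risky.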

[[node-tool-unsafe-bootstrap]]
[float]
@@ -262,6 +287,9 @@ one-node cluster.
`detach-cluster`:: Specifies to unsafely detach this node from its cluster so
it can join a different cluster.

`override-version`:: Overwrites the version number stored in the data path so
that a node can start despite being incompatible with the on-disk data.

`--ordinal <Integer>`:: If there is <<max-local-storage-nodes,more than one
node sharing a data path>> then this specifies which node to target. Defaults
to `0`, meaning to use the first node in the data path.
@@ -423,3 +451,32 @@ Do you want to proceed?
Confirm [y/N] y
Node was successfully detached from the cluster
----

[float]
==== Bypassing version checks

Run the `elasticsearch-node override-version` command to overwrite the version
stored in the data path so that a node can start despite being incompatible
with the data stored in the data path:

[source, txt]
----
node$ ./bin/elasticsearch-node override-version

WARNING: Elasticsearch MUST be stopped before running this tool.

This data path was last written by Elasticsearch version [x.x.x] and may no
longer be compatible with Elasticsearch version [y.y.y]. This tool will bypass
this compatibility check, allowing a version [y.y.y] node to start on this data
path, but a version [y.y.y] node may not be able to read this data or may read
it incorrectly leading to data loss.

You should not use this tool. Instead, continue to use a version [x.x.x] node
on this data path. If necessary, you can use reindex-from-remote to copy the
data from here into an older cluster.

Do you want to proceed?

Confirm [y/N] y
Successfully overwrote this node's metadata to bypass its version compatibility checks.
----

@@ -44,7 +44,7 @@
public abstract class ElasticsearchNodeCommand extends EnvironmentAwareCommand {
private static final Logger logger = LogManager.getLogger(ElasticsearchNodeCommand.class);
protected final NamedXContentRegistry namedXContentRegistry;
static final String DELIMITER = "------------------------------------------------------------------------\n";
protected static final String DELIMITER = "------------------------------------------------------------------------\n";

static final String STOP_WARNING_MSG =
DELIMITER +
@@ -81,9 +81,8 @@ protected void processNodePathsWithLock(Terminal terminal, OptionSet options, En
throw new ElasticsearchException(NO_NODE_FOLDER_FOUND_MSG);
}
processNodePaths(terminal, dataPaths, env);
} catch (LockObtainFailedException ex) {
throw new ElasticsearchException(
FAILED_TO_OBTAIN_NODE_LOCK_MSG + " [" + ex.getMessage() + "]");
} catch (LockObtainFailedException e) {
throw new ElasticsearchException(FAILED_TO_OBTAIN_NODE_LOCK_MSG, e);
}
}

@@ -177,6 +176,17 @@ protected void cleanUpOldMetaData(Terminal terminal, Path[] dataPaths, long newG
MetaData.FORMAT.cleanupOldFiles(newGeneration, dataPaths);
}

protected NodeEnvironment.NodePath[] toNodePaths(Path[] dataPaths) {
return Arrays.stream(dataPaths).map(ElasticsearchNodeCommand::createNodePath).toArray(NodeEnvironment.NodePath[]::new);
}

private static NodeEnvironment.NodePath createNodePath(Path path) {
try {
return new NodeEnvironment.NodePath(path);
} catch (IOException e) {
throw new ElasticsearchException("Unable to investigate path [" + path + "]", e);
}
}

//package-private for testing
OptionParser getParser() {

@@ -22,6 +22,7 @@
import org.elasticsearch.cli.MultiCommand;
import org.elasticsearch.cli.Terminal;
import org.elasticsearch.env.NodeRepurposeCommand;
import org.elasticsearch.env.OverrideNodeVersionCommand;

// NodeToolCli does not extend LoggingAwareCommand, because LoggingAwareCommand performs logging initialization
// after LoggingAwareCommand instance is constructed.
@@ -39,6 +40,7 @@ public NodeToolCli() {
subcommands.put("repurpose", new NodeRepurposeCommand());
subcommands.put("unsafe-bootstrap", new UnsafeBootstrapMasterCommand());
subcommands.put("detach-cluster", new DetachClusterCommand());
subcommands.put("override-version", new OverrideNodeVersionCommand());
}

public static void main(String[] args) throws Exception {
11 changes: 8 additions & 3 deletions server/src/main/java/org/elasticsearch/env/NodeEnvironment.java
@@ -31,6 +31,7 @@
import org.apache.lucene.store.NativeFSLockFactory;
import org.apache.lucene.store.SimpleFSDirectory;
import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.Version;
import org.elasticsearch.cluster.metadata.IndexMetaData;
import org.elasticsearch.cluster.node.DiscoveryNode;
import org.elasticsearch.common.CheckedFunction;
@@ -248,7 +249,7 @@ public NodeEnvironment(Settings settings, Environment environment) throws IOExce
sharedDataPath = null;
locks = null;
nodeLockId = -1;
nodeMetaData = new NodeMetaData(generateNodeId(settings));
nodeMetaData = new NodeMetaData(generateNodeId(settings), Version.CURRENT);
return;
}
boolean success = false;
@@ -393,7 +394,6 @@ private void maybeLogHeapDetails() {
logger.info("heap size [{}], compressed ordinary object pointers [{}]", maxHeapSize, useCompressedOops);
}


/**
* scans the node paths and loads existing metaData file. If not found a new meta data will be generated
* and persisted into the nodePaths
@@ -403,10 +403,15 @@ private static NodeMetaData loadOrCreateNodeMetaData(Settings settings, Logger l
final Path[] paths = Arrays.stream(nodePaths).map(np -> np.path).toArray(Path[]::new);
NodeMetaData metaData = NodeMetaData.FORMAT.loadLatestState(logger, NamedXContentRegistry.EMPTY, paths);
if (metaData == null) {
metaData = new NodeMetaData(generateNodeId(settings));
metaData = new NodeMetaData(generateNodeId(settings), Version.CURRENT);
} else {
metaData = metaData.upgradeToCurrentVersion();
}

// we write again to make sure all paths have the latest state file
assert metaData.nodeVersion().equals(Version.CURRENT) : metaData.nodeVersion() + " != " + Version.CURRENT;
NodeMetaData.FORMAT.writeAndCleanup(metaData, paths);

return metaData;
}

72 changes: 56 additions & 16 deletions server/src/main/java/org/elasticsearch/env/NodeMetaData.java
@@ -19,6 +19,7 @@

package org.elasticsearch.env;

import org.elasticsearch.Version;
import org.elasticsearch.common.ParseField;
import org.elasticsearch.common.xcontent.ObjectParser;
import org.elasticsearch.common.xcontent.XContentBuilder;
@@ -31,66 +32,104 @@
import java.util.Objects;

/**
* Metadata associated with this node. Currently only contains the unique uuid describing this node.
* Metadata associated with this node: its persistent node ID and its version.
* The metadata is persisted in the data folder of this node and is reused across restarts.
*/
public final class NodeMetaData {

private static final String NODE_ID_KEY = "node_id";
private static final String NODE_VERSION_KEY = "node_version";

private final String nodeId;

public NodeMetaData(final String nodeId) {
private final Version nodeVersion;

public NodeMetaData(final String nodeId, final Version nodeVersion) {
this.nodeId = Objects.requireNonNull(nodeId);
this.nodeVersion = Objects.requireNonNull(nodeVersion);
}

@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}

if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
NodeMetaData that = (NodeMetaData) o;

return Objects.equals(this.nodeId, that.nodeId);
return nodeId.equals(that.nodeId) &&
nodeVersion.equals(that.nodeVersion);
}

@Override
public int hashCode() {
return this.nodeId.hashCode();
return Objects.hash(nodeId, nodeVersion);
}

@Override
public String toString() {
return "node_id [" + nodeId + "]";
return "NodeMetaData{" +
"nodeId='" + nodeId + '\'' +
", nodeVersion=" + nodeVersion +
'}';
}

private static ObjectParser<Builder, Void> PARSER = new ObjectParser<>("node_meta_data", Builder::new);

static {
PARSER.declareString(Builder::setNodeId, new ParseField(NODE_ID_KEY));
PARSER.declareInt(Builder::setNodeVersionId, new ParseField(NODE_VERSION_KEY));
}

public String nodeId() {
return nodeId;
}

public Version nodeVersion() {
return nodeVersion;
}

public NodeMetaData upgradeToCurrentVersion() {
if (nodeVersion.equals(Version.V_EMPTY)) {
assert Version.CURRENT.major <= Version.V_7_0_0.major + 1 : "version is required in the node metadata from v9 onwards";
Review comment (Contributor): Why not use V_8_0_0 instead of V_7_0_0 + 1?

Reply (Contributor Author): When we upgrade master to version 9, the constant Version.V_8_0_0 will still exist, but Version.V_7_0_0 should have been removed, so this assertion will fail at compile time.

return new NodeMetaData(nodeId, Version.CURRENT);
}

if (nodeVersion.before(Version.CURRENT.minimumIndexCompatibilityVersion())) {
throw new IllegalStateException(
"cannot upgrade a node from version [" + nodeVersion + "] directly to version [" + Version.CURRENT + "]");
}

if (nodeVersion.after(Version.CURRENT)) {
throw new IllegalStateException(
"cannot downgrade a node from version [" + nodeVersion + "] to version [" + Version.CURRENT + "]");
}

return nodeVersion.equals(Version.CURRENT) ? this : new NodeMetaData(nodeId, Version.CURRENT);
}

private static class Builder {
String nodeId;
Version nodeVersion;

public void setNodeId(String nodeId) {
this.nodeId = nodeId;
}

public void setNodeVersionId(int nodeVersionId) {
this.nodeVersion = Version.fromId(nodeVersionId);
}

public NodeMetaData build() {
return new NodeMetaData(nodeId);
final Version nodeVersion;
if (this.nodeVersion == null) {
assert Version.CURRENT.major <= Version.V_7_0_0.major + 1 : "version is required in the node metadata from v9 onwards";
nodeVersion = Version.V_EMPTY;
} else {
nodeVersion = this.nodeVersion;
}

return new NodeMetaData(nodeId, nodeVersion);
}
}


public static final MetaDataStateFormat<NodeMetaData> FORMAT = new MetaDataStateFormat<NodeMetaData>("node-") {

@Override
@@ -103,10 +142,11 @@ protected XContentBuilder newXContentBuilder(XContentType type, OutputStream str
@Override
public void toXContent(XContentBuilder builder, NodeMetaData nodeMetaData) throws IOException {
builder.field(NODE_ID_KEY, nodeMetaData.nodeId);
builder.field(NODE_VERSION_KEY, nodeMetaData.nodeVersion.id);
}

@Override
public NodeMetaData fromXContent(XContentParser parser) throws IOException {
public NodeMetaData fromXContent(XContentParser parser) {
return PARSER.apply(parser, null).build();
}
};
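
The serialization change above writes the version as an integer ID and has to tolerate older metadata files that lack the field. The round trip can be sketched as follows, assuming a plain `Map` in place of the real XContent builder and parser. The class name, the helper names, the `EMPTY_VERSION_ID` sentinel value, and the sample ID are illustrative assumptions, not Elasticsearch APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a Map stands in for the serialized node metadata.
final class NodeMetaDataRoundTripSketch {

    static final String NODE_ID_KEY = "node_id";
    static final String NODE_VERSION_KEY = "node_version";

    // Assumed sentinel meaning "no version was recorded in this file".
    static final int EMPTY_VERSION_ID = 0;

    /** Serializes both fields, mirroring the two builder.field(...) calls. */
    static Map<String, Object> write(String nodeId, int versionId) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put(NODE_ID_KEY, nodeId);
        fields.put(NODE_VERSION_KEY, versionId);
        return fields;
    }

    /** Reads the version ID, falling back to the sentinel when the field is absent. */
    static int readVersionId(Map<String, Object> fields) {
        Object raw = fields.get(NODE_VERSION_KEY);
        return raw == null ? EMPTY_VERSION_ID : ((Number) raw).intValue();
    }

    public static void main(String[] args) {
        // 7020099 is an arbitrary sample integer, not a claim about real version IDs.
        Map<String, Object> current = write("abc123", 7020099);

        // A file written before this change has no version field at all.
        Map<String, Object> legacy = new LinkedHashMap<>();
        legacy.put(NODE_ID_KEY, "abc123");

        System.out.println(readVersionId(current)); // 7020099
        System.out.println(readVersionId(legacy));  // 0
    }
}
```

Treating a missing field as a distinct "empty" value, rather than as an error, is what lets a node carry over metadata that predates the version field.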

@@ -172,10 +172,6 @@ private String toIndexName(NodeEnvironment.NodePath[] nodePaths, String uuid) {
}
}

private NodeEnvironment.NodePath[] toNodePaths(Path[] dataPaths) {
return Arrays.stream(dataPaths).map(NodeRepurposeCommand::createNodePath).toArray(NodeEnvironment.NodePath[]::new);
}

private Set<String> indexUUIDsFor(Set<Path> indexPaths) {
return indexPaths.stream().map(Path::getFileName).map(Path::toString).collect(Collectors.toSet());
}
@@ -221,19 +217,11 @@ private void removePath(Path path) {

@SafeVarargs
@SuppressWarnings("varargs")
private final Set<Path> uniqueParentPaths(Collection<Path>... paths) {
private Set<Path> uniqueParentPaths(Collection<Path>... paths) {
// equals on Path is good enough here due to the way these are collected.
return Arrays.stream(paths).flatMap(Collection::stream).map(Path::getParent).collect(Collectors.toSet());
}

private static NodeEnvironment.NodePath createNodePath(Path path) {
try {
return new NodeEnvironment.NodePath(path);
} catch (IOException e) {
throw new ElasticsearchException("Unable to investigate path: " + path + ": " + e.getMessage());
}
}

//package-private for testing
OptionParser getParser() {
return parser;