diff --git a/server/src/main/java/org/elasticsearch/snapshots/SnapshotsService.java b/server/src/main/java/org/elasticsearch/snapshots/SnapshotsService.java index a23e4e650cb83..edee69266b54b 100644 --- a/server/src/main/java/org/elasticsearch/snapshots/SnapshotsService.java +++ b/server/src/main/java/org/elasticsearch/snapshots/SnapshotsService.java @@ -476,6 +476,8 @@ private static Map getInFlightIndexIds(List listener) { final String repositoryName = request.repository(); Repository repository = repositoriesService.repository(repositoryName); diff --git a/server/src/main/java/org/elasticsearch/snapshots/package-info.java b/server/src/main/java/org/elasticsearch/snapshots/package-info.java index e9fea864ff7c7..eb50b9821da93 100644 --- a/server/src/main/java/org/elasticsearch/snapshots/package-info.java +++ b/server/src/main/java/org/elasticsearch/snapshots/package-info.java @@ -100,7 +100,39 @@ * * *

Cloning a Snapshot

- * TODO: write up the steps in a snapshot clone properly + * + *

Cloning part of a snapshot is a process executed entirely on the master node. On a high level, the process of cloning a snapshot is + * analogous to that of creating a snapshot from data in the cluster except that the source of data files is the snapshot repository + * instead of the data nodes. It begins with cloning all shards and then finalizes the cloned snapshot the same way a normal snapshot would + * be finalized. Concretely, it is executed as follows:

+ * + *
    + *
  1. First, {@link org.elasticsearch.snapshots.SnapshotsService#cloneSnapshot} is invoked which will place a placeholder entry into + * {@code SnapshotsInProgress} that does not yet contain any shard clone assignments. Note that unlike in the case of snapshot + * creation, the shard level clone tasks in {@link org.elasticsearch.cluster.SnapshotsInProgress.Entry#clones} are not created in the + * initial cluster state update as is done for shard snapshot assignments in + * {@link org.elasticsearch.cluster.SnapshotsInProgress.Entry#shards}. This is due to the fact that shard snapshot assignments are + * computed purely from information in the current cluster state while shard clone assignments require information to be read from the + * repository, which is too slow of a process to be done inside a cluster state update. Loading this information ahead of creating a + * task in the cluster state, runs the risk of race conditions where the source snapshot is being deleted before the clone task is + * enqueued in the cluster state.
  2. + *
  3. Once a placeholder task for the clone operation is put into the cluster state, we must determine the number of shards in each + * index that is to be cloned as well as ensure the health of the index snapshots in the source snapshot. In order to determine the + * shard count for each index that is to be cloned, we load the index metadata for each such index using the repository's + * {@link org.elasticsearch.repositories.Repository#getSnapshotIndexMetaData} method. In order to ensure the health of the source index + * snapshots, we load the {@link org.elasticsearch.snapshots.SnapshotInfo} for the source snapshot and check for shard snapshot + * failures of the relevant indices.
  4. + *
  5. Once all shard counts are known and the health of all source indices data has been verified, we populate the + * {@code SnapshotsInProgress.Entry#clones} map for the clone operation with the the relevant shard clone tasks.
  6. + *
  7. After the clone tasks have been added to the {@code SnapshotsInProgress.Entry}, master executes them on its snapshot thread-pool + * by invoking {@link org.elasticsearch.repositories.Repository#cloneShardSnapshot} for each shard that is to be cloned. Each completed + * shard snapshot triggers a call to the {@link org.elasticsearch.snapshots.SnapshotsService#SHARD_STATE_EXECUTOR} which updates the + * clone's {@code SnapshotsInProgress.Entry} to mark the shard clone operation completed.
  8. + *
  9. Once all the entries in {@code SnapshotsInProgress.Entry#clones} have completed, the clone is finalized just like any other + * snapshot through {@link org.elasticsearch.snapshots.SnapshotsService#endSnapshot}. The only difference being that the metadata that + * is written out for indices and the global metadata are read from the source snapshot in the repository instead of the cluster state. + *
  10. + *
* *

Concurrent Snapshot Operations

*