Speed up Snapshot Finalization #47283
As a result of elastic#45689, snapshot finalization started to take significantly longer than before. This is unfortunate because it increases the likelihood of failing to finalize after all the segment blobs have already been written out. This change parallelizes all the metadata writes that can safely run in parallel in the finalization step to speed that step up again. It will also generally speed up the overall snapshot process when there is a large number of indices.
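The idea described above can be illustrated with a minimal sketch. It is not the code from this PR: the class, the `writeBlob` helper, and the blob names are hypothetical stand-ins, and plain `CompletableFuture` is assumed in place of whatever threading utilities the repository implementation actually uses. It only shows the shape of the approach: the independent metadata writes run concurrently, and the final index generation write happens only after all of them have completed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelFinalizationSketch {

    /** Hypothetical stand-in for a blob write: records the blob name instead of doing I/O. */
    static void writeBlob(String name, List<String> written) {
        written.add(name);
    }

    /**
     * Writes all independent metadata blobs concurrently, then writes the
     * index generation blob once every metadata write has finished.
     */
    static List<String> finalizeSnapshot(String snapshotUuid, List<String> indices, ExecutorService executor) {
        List<String> written = new CopyOnWriteArrayList<>();
        List<CompletableFuture<Void>> writes = new ArrayList<>();

        // Root-level blobs (illustrative names) do not depend on each other.
        writes.add(CompletableFuture.runAsync(() -> writeBlob("meta-" + snapshotUuid + ".dat", written), executor));
        writes.add(CompletableFuture.runAsync(() -> writeBlob("snap-" + snapshotUuid + ".dat", written), executor));

        // One metadata blob per index, also independent of everything else.
        for (String index : indices) {
            writes.add(CompletableFuture.runAsync(
                () -> writeBlob("indices/" + index + "/meta-" + snapshotUuid + ".dat", written), executor));
        }

        // join() rethrows if any write failed, so the final step is skipped on failure.
        CompletableFuture.allOf(writes.toArray(new CompletableFuture[0])).join();

        // The step that makes the snapshot visible must come last.
        writeBlob("index-N", written);
        return written;
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            // The order of the metadata blobs varies between runs; "index-N" is always last.
            System.out.println(finalizeSnapshot("uuid", List.of("a", "b", "c"), executor));
        } finally {
            executor.shutdown();
        }
    }
}
```

With many indices, the per-index writes dominate the finalization time, which is why running them concurrently helps most in that case.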
Pinging @elastic/es-distributed
Jenkins run elasticsearch-ci/bwc
Jenkins run elasticsearch-ci/bwc
final RepositoryData updatedRepositoryData = getRepositoryData().addSnapshot(snapshotId, blobStoreSnapshot.state(), indices);
snapshotFormat.write(blobStoreSnapshot, blobContainer(), snapshotId.getUUID(), false);
writeIndexGen(updatedRepositoryData, repositoryStateId);
} catch (FileAlreadyExistsException ex) {
This catch is gone now. It was dead code because we no longer do the exists check for this blob in the line above, where we write the snap- blob.
indexMetaDataFormat.write(clusterMetaData.index(index.getName()), indexContainer(index), snapshotId.getUUID(), false);
}
} catch (IOException ex) {
throw new SnapshotException(metadata.name(), snapshotId, "failed to write metadata for snapshot", ex);
I removed this specific rethrow because, with this change, we write the index metadata in parallel to the root-level snap- blob anyway, so throwing with a separate message here seemed pointless.
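To illustrate the reasoning in this comment, here is a small sketch of how concurrent writes can share one failure path. It is an assumption-laden example rather than the PR's code: `SnapshotFailure`, `runAll`, and the message text are hypothetical, and `CompletableFuture` on the common pool stands in for the real executor. Once the writes run concurrently, a failure in any of them surfaces at the same join point, so a per-step message carries little extra information.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class SingleFailurePathSketch {

    /** Hypothetical exception standing in for a snapshot-level failure. */
    static class SnapshotFailure extends RuntimeException {
        SnapshotFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Runs all writes concurrently and reports any failure through one shared message. */
    static void runAll(String snapshotName, List<Runnable> writes) {
        CompletableFuture<?>[] futures = writes.stream()
            .map(write -> CompletableFuture.runAsync(write))
            .toArray(CompletableFuture[]::new);
        try {
            // Waits for every write; throws if at least one of them failed.
            CompletableFuture.allOf(futures).join();
        } catch (CompletionException e) {
            // The original exception from the failing write is kept as the cause.
            throw new SnapshotFailure("failed to finalize snapshot [" + snapshotName + "]", e.getCause());
        }
    }
}
```

The underlying exception is preserved as the cause, so the failing write can still be identified from the stack trace.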
public class MockEventuallyConsistentRepositoryTests extends ESTestCase {

private Environment environment;
This is just dead code. I saw it when making adjustments here and removed it because I figured it wasn't worth a separate PR.
I left some comments, nothing to worry about as it looks great already
server/src/main/java/org/elasticsearch/repositories/Repository.java
server/src/main/java/org/elasticsearch/repositories/blobstore/BlobStoreRepository.java
server/src/main/java/org/elasticsearch/snapshots/SnapshotsService.java
x-pack/plugin/core/src/main/java/org/elasticsearch/snapshots/SourceOnlySnapshotRepository.java
Thanks @tlrx, all points addressed I think :)
LGTM, nice change
Jenkins run elasticsearch-ci/packaging-sample
Thanks Tanguy!
As a result of elastic#45689, snapshot finalization started to take significantly longer than before. This is unfortunate because it increases the likelihood of failing to finalize after all the segment blobs have already been written out. This change parallelizes all the metadata writes that can safely run in parallel in the finalization step to speed that step up again. It will also generally speed up the overall snapshot process when there is a large number of indices. This is also a nice-to-have for elastic#46250, since that change adds yet another step (deleting old index- blobs in the shards) to the finalization.
This pull request is a backport of elastic/elasticsearch#47283. Its purpose is to speed up snapshot finalization. This is achieved by parallelizing the metadata writes in the snapshot finalization step. It will also generally speed up the overall snapshot process when there is a large number of indices. This improvement makes sense because snapshot finalization has taken much longer since #9327 was integrated.
This pull request is a backport of elastic/elasticsearch#47283. Its purpose is to speed up snapshot finalization. This is achieved by parallelizing the metadata writes in the snapshot finalization step. It will also generally speed up the overall snapshot process when there is a large number of indices. This improvement makes sense because snapshot finalization has taken much longer since #9327 was integrated. (cherry picked from commit 3091e26)