
Snapshot deletion and creation slow down as number of snapshots in repository grows #8958

Closed
imotov opened this issue Dec 15, 2014 · 4 comments · Fixed by #8969
Labels
>bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs

Comments

@imotov
Contributor

imotov commented Dec 15, 2014

In order to create a new snapshot or delete an existing one, elasticsearch has to load all existing shard-level snapshot files to figure out which files need to be copied and which can be cleaned up. The number of files to check is equal to number_of_shards * number_of_snapshots, which on large clusters with frequent snapshots can lead to very long operation times, especially with non-filesystem repositories. See elastic/elasticsearch-cloud-aws#150 and this group post for examples of the issues this behavior causes.
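The cost described above can be sketched in a few lines. This is an illustrative model, not Elasticsearch code; the function name and cluster sizes are made up for the example:

```python
# Illustrative model of the pre-fix behavior: one shard-level snapshot
# file exists per (shard, snapshot) pair, and every one of them must be
# read before a snapshot can be created or deleted.
def blobs_to_read(number_of_shards: int, number_of_snapshots: int) -> int:
    """Number of per-shard snapshot files that must be read per operation."""
    return number_of_shards * number_of_snapshots

# A modest cluster taking frequent snapshots already pays a large cost:
print(blobs_to_read(200, 500))  # 100000 blob reads per create/delete
```

With a high-latency repository (S3, Azure), each of those reads is a separate remote request, which is why the slowdown compounds as the snapshot count grows.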

@imotov imotov added >bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs labels Dec 15, 2014
@imotov imotov self-assigned this Dec 15, 2014
imotov added a commit to imotov/elasticsearch that referenced this issue Dec 15, 2014
imotov added a commit to imotov/elasticsearch that referenced this issue Jan 23, 2015
@nickcanz
Contributor

Just wanted to chime in: this issue has affected us a great deal as well. It made "sense" once I thought through how ES snapshotting works, but it was an unpleasant surprise.

imotov added a commit to imotov/elasticsearch that referenced this issue Feb 6, 2015
…th large number of snapshots

Each shard repository consists of a snapshot file for each snapshot; this file contains a map between each original physical file that is snapshotted and its representation in the repository. This data includes the original filename, checksum, and length. When a new snapshot is created, elasticsearch needs to read all these snapshot files to figure out which files are already present in the repository and which files still have to be copied there. This change adds a new index file that combines all this information into a single file. So, if a repository has 1000 snapshots with 1000 shards, elasticsearch will only need to read 1000 blobs (one per shard) instead of 1,000,000 to delete a snapshot. This change should also improve snapshot creation speed on repositories with a large number of snapshots and high latency.

Fixes elastic#8958
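The before/after arithmetic in the commit message can be checked directly. A minimal sketch, assuming the fix works as described (one combined index blob per shard replacing one file per (shard, snapshot) pair); the function names are hypothetical:

```python
# Hypothetical model of the change: before, one snapshot file had to be
# read per (shard, snapshot) pair; after, a single combined index file
# per shard holds the same filename/checksum/length map for all snapshots.
def reads_before(shards: int, snapshots: int) -> int:
    return shards * snapshots  # one snapshot file per snapshot, per shard

def reads_after(shards: int, snapshots: int) -> int:
    return shards  # one combined index blob per shard

# The example from the commit message: 1000 shards, 1000 snapshots.
print(reads_before(1000, 1000))  # 1000000 blob reads
print(reads_after(1000, 1000))   # 1000 blob reads
```

Note that the read count after the change no longer depends on the number of snapshots at all, which is why operation time stops degrading as the repository accumulates snapshots.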
imotov added a commit to imotov/elasticsearch that referenced this issue Mar 10, 2015
imotov added a commit to imotov/elasticsearch that referenced this issue Jun 2, 2015
@niemyjski
Contributor

I seem to be seeing this behavior with Azure blob storage after upgrading to 1.7.5.

@imotov
Contributor Author

imotov commented Mar 21, 2016

@niemyjski It was fixed by #8969 in 2.0.0 and above. The fix wasn't backported to 1.7.5.

@tamsky

tamsky commented Jun 29, 2016

And, if you've read this far and were wondering whether the fix for this might ever get backported to 1.x, the answer is apparently not.

In #8969 (comment), imotov says:

this was a significant change that required changing the snapshot file format and it was too big of a change for a patch level release. So we didn't port to 1.x and there are no current plans to do it.
