Snapshot creations have huge heap footprint after abrupt full-cluster restart #89952

DaveCTurner · 2022-09-09T07:47:49Z

If the cluster shuts down while updating the root repository data blob then it will set BlobStoreRepository#uncleanStart on startup, which causes Elasticsearch to skip the caching of RepositoryData in favour of reading the blob afresh from the repository each time it's needed.

If on startup ILM finds indices waiting to move to the searchable snapshot phase then it will attempt to create snapshots of each such index. Each create-snapshot task holds a reference to the RepositoryData it captured when the task was submitted.

The trouble is that each RepositoryData instance could be tens of MBs in size, and while uncleanStart is set there is no sharing between these instances. In the case I saw, RepositoryData was ~58MiB and there were 17 create-snapshot tasks in the queue, so these tasks alone consumed almost 1GiB of heap. There were also 6 snapshot_meta threads all busy loading more copies of RepositoryData, with a total of 530MiB of local state.
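To make the arithmetic concrete, here is a minimal Java sketch of the difference between tasks sharing one cached instance and tasks each capturing a freshly loaded one. `FakeRepositoryData` and the helper names are hypothetical stand-ins, not Elasticsearch classes; only the 17 tasks and ~58MiB figures come from the report above.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.function.Supplier;

final class RepositoryDataSharingSketch {
    // Hypothetical stand-in for RepositoryData; the real class is far larger.
    static final class FakeRepositoryData { }

    // Simulates `tasks` create-snapshot tasks each capturing whatever the loader
    // hands out, and counts how many distinct objects end up referenced.
    static int distinctInstances(int tasks, Supplier<FakeRepositoryData> loader) {
        Set<FakeRepositoryData> identities = Collections.newSetFromMap(new IdentityHashMap<>());
        for (int i = 0; i < tasks; i++) {
            identities.add(loader.get());
        }
        return identities.size();
    }

    // Rough retained heap, assuming every instance has the same size.
    static long retainedMiB(int distinctInstances, long sizeMiB) {
        return distinctInstances * sizeMiB;
    }

    public static void main(String[] args) {
        FakeRepositoryData cached = new FakeRepositoryData();
        int shared = distinctInstances(17, () -> cached);              // caching on: one shared copy
        int unshared = distinctInstances(17, FakeRepositoryData::new); // uncleanStart: one copy per task
        System.out.println(retainedMiB(shared, 58));   // prints 58
        System.out.println(retainedMiB(unshared, 58)); // prints 986
    }
}
```

17 unshared copies at ~58MiB each come to 986MiB, which matches the "almost 1GiB" observed.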

Relates #77466

Workaround

Clearing the uncleanStart flag should restore the caching (and hence sharing) of RepositoryData again:

1. Disable ILM (needs to happen immediately after startup before it triggers any snapshots).
2. Take a single snapshot manually to complete the pending write of the root metadata blob. The content of the snapshot doesn't matter, so you may as well restrict it to just a single small index.
3. When that snapshot completes, it is safe to enable ILM again.
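The workaround steps map onto three REST calls: `POST /_ilm/stop`, `PUT /_snapshot/<repository>/<snapshot>` and `POST /_ilm/start`. The sketch below only builds those requests with the JDK's `java.net.http` API. The base URL, repository name, snapshot name and index name are placeholders, and authentication, TLS and actually sending the requests (for example with `HttpClient.send`) are left out.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;
import java.util.List;

final class UncleanStartWorkaroundSketch {
    // Builds the three workaround requests in the order they should be sent.
    // All four arguments are placeholders for values from your own cluster.
    static List<HttpRequest> workaroundRequests(String baseUrl, String repository,
                                                String snapshot, String smallIndex) {
        // Step 1: stop ILM before it enqueues any create-snapshot tasks.
        HttpRequest stopIlm = HttpRequest.newBuilder(URI.create(baseUrl + "/_ilm/stop"))
            .POST(BodyPublishers.noBody())
            .build();

        // Step 2: one small manual snapshot; wait_for_completion makes the call
        // return only once the snapshot has finished.
        String body = "{\"indices\":\"" + smallIndex + "\",\"include_global_state\":false}";
        URI snapshotUri = URI.create(baseUrl + "/_snapshot/" + repository + "/" + snapshot
            + "?wait_for_completion=true");
        HttpRequest takeSnapshot = HttpRequest.newBuilder(snapshotUri)
            .header("Content-Type", "application/json")
            .PUT(BodyPublishers.ofString(body))
            .build();

        // Step 3: start ILM again once the snapshot has completed.
        HttpRequest startIlm = HttpRequest.newBuilder(URI.create(baseUrl + "/_ilm/start"))
            .POST(BodyPublishers.noBody())
            .build();

        return List.of(stopIlm, takeSnapshot, startIlm);
    }

    public static void main(String[] args) {
        for (HttpRequest r : workaroundRequests("http://localhost:9200", "my-repo",
                                                "reset-unclean-start", "tiny-index")) {
            System.out.println(r.method() + " " + r.uri());
        }
    }
}
```

Stopping ILM is asynchronous, so it may be worth checking `GET /_ilm/status` before taking the snapshot.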

elasticsearchmachine · 2022-09-09T07:48:12Z

Pinging @elastic/es-distributed (Team:Distributed)

Low-effort improvement to #89952, mostly. We should turn the index identifier lookup into an immutable map right away when parsing these. We already do this conversion/copy later on when adding or removing snapshots from the `IndexMetadataGenerations`, so this adds no CPU cost overall. What it does do is save a massive amount of heap for single-index snapshots (where the overhead of a hash map over an immutable map is greatest) when first parsing this structure from the repository, which may then be duplicated on heap many times over due to #89952.
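As a rough illustration of the idea, the sketch below builds a lookup in a mutable `HashMap` during parsing and immediately converts it with the JDK's `Map.copyOf`. The class name, method name and the string-to-string shape of the map are assumptions made for the example. The actual change in Elasticsearch may use different types and helpers.

```java
import java.util.HashMap;
import java.util.Map;

final class CompactLookupSketch {
    // Hypothetical parse step: `entries` stands in for key/value pairs read from
    // the repository JSON. The mutable map exists only while parsing.
    static Map<String, String> parseIdentifiers(String[][] entries) {
        Map<String, String> building = new HashMap<>();
        for (String[] entry : entries) {
            building.put(entry[0], entry[1]);
        }
        // Convert once, up front. Later copies made when adding or removing
        // snapshots would need an immutable map anyway.
        return Map.copyOf(building);
    }

    public static void main(String[] args) {
        Map<String, String> lookup = parseIdentifiers(new String[][] { { "snapshot-uuid", "index-meta-id" } });
        System.out.println(lookup.get("snapshot-uuid")); // prints index-meta-id
    }
}
```

The saving matters most for very small maps, which is the single-index snapshot case the commit message calls out.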

…ng repository instance (#91851) This makes use of the new deduplicator infrastructure to move to more efficient deduplication mechanics. The existing solution hardly ever deduplicated because it would only deduplicate after the repository entered a consistent state. The adjusted solution is much simpler, in that it simply deduplicates such that only a single loading of `RepositoryData` will ever happen at a time, fixing memory issues from massively concurrent loading of the repo data as described in #89952. closes #89952
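The "only a single loading at a time" behaviour can be sketched with a small single-flight wrapper: callers that arrive while a load is running share its result, and the next caller after completion triggers a fresh load. This is an illustrative stand-in built on `CompletableFuture`, not the deduplicator infrastructure the commit refers to, and the `SingleFlightLoader` name is hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class SingleFlightLoader<T> {
    private final AtomicReference<CompletableFuture<T>> inFlight = new AtomicReference<>();
    private final Supplier<CompletableFuture<T>> loader;

    SingleFlightLoader(Supplier<CompletableFuture<T>> loader) {
        this.loader = loader;
    }

    CompletableFuture<T> load() {
        while (true) {
            CompletableFuture<T> existing = inFlight.get();
            if (existing != null) {
                return existing; // join the load that is already running
            }
            CompletableFuture<T> mine = new CompletableFuture<>();
            if (inFlight.compareAndSet(null, mine)) {
                try {
                    loader.get().whenComplete((value, error) -> {
                        // Clear the slot first: nothing is cached, so the next
                        // caller starts a fresh load.
                        inFlight.compareAndSet(mine, null);
                        if (error != null) {
                            mine.completeExceptionally(error);
                        } else {
                            mine.complete(value);
                        }
                    });
                } catch (RuntimeException e) {
                    // The loader failed before returning a future.
                    inFlight.compareAndSet(mine, null);
                    mine.completeExceptionally(e);
                }
                return mine;
            }
            // Lost the race to install a future; retry and join the winner.
        }
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        CompletableFuture<String> pending = new CompletableFuture<>();
        SingleFlightLoader<String> repo = new SingleFlightLoader<>(() -> {
            loads.incrementAndGet();
            return pending;
        });
        for (int i = 0; i < 17; i++) {
            repo.load(); // 17 concurrent requests share one load
        }
        System.out.println(loads.get()); // prints 1
    }
}
```

Unlike a cache, this holds nothing once the load completes, so it fits a repository instance that must re-read the blob each time while still bounding the number of copies being loaded concurrently.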

DaveCTurner added the >bug and :Distributed Coordination/Snapshot/Restore (anything directly related to the `_snapshot/*` APIs) labels Sep 9, 2022

elasticsearchmachine added the Team:Distributed (Obsolete) label (meta label for the distributed team, since replaced by Distributed Indexing/Coordination) Sep 9, 2022

DaveCTurner mentioned this issue Sep 9, 2022

Fix Large Shard Count Scalability Issues #77466

Open

97 tasks

original-brownbear self-assigned this Sep 22, 2022

original-brownbear mentioned this issue Nov 22, 2022

Build more compact RepositoryData when parsing from JSON #91817

Merged

original-brownbear mentioned this issue Nov 23, 2022

Simplify and optimize deduplication of RepositoryData for a non-caching repository instance #91851

Merged

original-brownbear closed this as completed in #91851 Nov 23, 2022

original-brownbear mentioned this issue Jan 4, 2023

[7.17] Simplify and optimize deduplication of RepositoryData for a non-caching repository instance (#91851) (#91866) #92661

Merged
