Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snapshot creations have huge heap footprint after abrupt full-cluster restart #89952

Closed
Tracked by #77466
DaveCTurner opened this issue Sep 9, 2022 · 1 comment · Fixed by #91851
Closed
Tracked by #77466

Snapshot creations have huge heap footprint after abrupt full-cluster restart #89952

DaveCTurner opened this issue Sep 9, 2022 · 1 comment · Fixed by #91851
Assignees
Labels
>bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.

Comments

@DaveCTurner
Copy link
Contributor

DaveCTurner commented Sep 9, 2022

If the cluster shuts down while updating the root repository data blob then it will set BlobStoreRepository#uncleanStart on startup, which causes Elasticsearch to skip the caching of RepositoryData in favour of reading the blob afresh from the repository each time it's needed.

If on startup ILM finds indices waiting to move to the searchable snapshot phase then it will attempt to create snapshots of each such index. Each create-snapshot task holds a reference to the RepositoryData it captured when the task was submitted.

The trouble is that each RepositoryData instance could be tens of MBs in size and while uncleanStart is set there is no sharing between these instances. In the case of this I saw, RepositoryData was ~58MiB and there were 17 create-snapshot tasks in the queue, so these tasks alone consumed almost 1GiB of heap. There were also 6 snapshot_meta threads all busy loading more copies of RepositoryData with a total of 530MiB of local state.

Relates #77466


Workaround

Clearing the uncleanStart flag should restore the caching (and hence sharing) of RepositoryData again:

  1. Disable ILM (needs to happen immediately after startup before it triggers any snapshots).
  2. Take a single snapshot manually to complete the pending write of the root metadata blob. The content of the snapshot doesn't matter, so you may as well restrict it to just a single small index.
  3. When that snapshot completes, it is safe to enable ILM again.
@DaveCTurner DaveCTurner added >bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs labels Sep 9, 2022
@elasticsearchmachine elasticsearchmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Sep 9, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@original-brownbear original-brownbear self-assigned this Sep 22, 2022
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Nov 22, 2022
Low effort imporovement to elastic#89952 mostly.

We should be turning the index identifier lookup into an immutable map
when parsing these right away. We do this conversion/copy for both the adding
and removing of snapshots from the `IndexMetadataGenerations` later on anyways
so this doesn't add any CPU cost overall. What it does however is save a massive
amount of heap for single index snapshots (where the overhead of hash map over
the immutable map is the greatest) when first parsing this structure from the repo
and potentially having it duplicated on heap many times over due to elastic#89952.
original-brownbear added a commit that referenced this issue Nov 23, 2022
Low effort imporovement to #89952 mostly.

We should be turning the index identifier lookup into an immutable map
when parsing these right away. We do this conversion/copy for both the adding
and removing of snapshots from the `IndexMetadataGenerations` later on anyways
so this doesn't add any CPU cost overall. What it does however is save a massive
amount of heap for single index snapshots (where the overhead of hash map over
the immutable map is the greatest) when first parsing this structure from the repo
and potentially having it duplicated on heap many times over due to #89952.
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Nov 23, 2022
…ng repository instance

This makes use of the new deduplicator infrastructure to move to more
efficient deduplication mechanics.
The existing solution hardly ever deduplicated because it would only
deduplicate after the repository entered a consistent state. The
adjusted solution is much simpler, in that it simply deduplicates such
that only a single loading of `RepositoryData` will ever happen at a
time, fixing memory issues from massively concurrent loading of the repo
data as described in elastic#89952.

closes elastic#89952
original-brownbear added a commit that referenced this issue Nov 23, 2022
…ng repository instance (#91851)

This makes use of the new deduplicator infrastructure to move to more
efficient deduplication mechanics.
The existing solution hardly ever deduplicated because it would only
deduplicate after the repository entered a consistent state. The
adjusted solution is much simpler, in that it simply deduplicates such
that only a single loading of `RepositoryData` will ever happen at a
time, fixing memory issues from massively concurrent loading of the repo
data as described in #89952.

closes #89952
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Nov 23, 2022
…ng repository instance (elastic#91851)

This makes use of the new deduplicator infrastructure to move to more
efficient deduplication mechanics.
The existing solution hardly ever deduplicated because it would only
deduplicate after the repository entered a consistent state. The
adjusted solution is much simpler, in that it simply deduplicates such
that only a single loading of `RepositoryData` will ever happen at a
time, fixing memory issues from massively concurrent loading of the repo
data as described in elastic#89952.

closes elastic#89952
elasticsearchmachine pushed a commit that referenced this issue Jan 3, 2023
…-caching repository instance (#91851) (#91866)

* Simplify and optimize deduplication of RepositoryData for a non-caching repository instance (#91851)

This makes use of the new deduplicator infrastructure to move to more
efficient deduplication mechanics.
The existing solution hardly ever deduplicated because it would only
deduplicate after the repository entered a consistent state. The
adjusted solution is much simpler, in that it simply deduplicates such
that only a single loading of `RepositoryData` will ever happen at a
time, fixing memory issues from massively concurrent loading of the repo
data as described in #89952.

closes #89952

* fix compile
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Jan 4, 2023
…-caching repository instance (elastic#91851) (elastic#91866)

* Simplify and optimize deduplication of RepositoryData for a non-caching repository instance (elastic#91851)

This makes use of the new deduplicator infrastructure to move to more
efficient deduplication mechanics.
The existing solution hardly ever deduplicated because it would only
deduplicate after the repository entered a consistent state. The
adjusted solution is much simpler, in that it simply deduplicates such
that only a single loading of `RepositoryData` will ever happen at a
time, fixing memory issues from massively concurrent loading of the repo
data as described in elastic#89952.

closes elastic#89952

* fix compile
elasticsearchmachine pushed a commit that referenced this issue Jan 4, 2023
…-caching repository instance (#91851) (#91866) (#92661)

* Simplify and optimize deduplication of RepositoryData for a non-caching repository instance (#91851)

This makes use of the new deduplicator infrastructure to move to more
efficient deduplication mechanics.
The existing solution hardly ever deduplicated because it would only
deduplicate after the repository entered a consistent state. The
adjusted solution is much simpler, in that it simply deduplicates such
that only a single loading of `RepositoryData` will ever happen at a
time, fixing memory issues from massively concurrent loading of the repo
data as described in #89952.

closes #89952

* fix compile
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
Projects
None yet
3 participants