Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MasterService#patchVersions is rather inefficient #77888

Closed
Tracked by #77466
DaveCTurner opened this issue Sep 16, 2021 · 4 comments
Closed
Tracked by #77466

MasterService#patchVersions is rather inefficient #77888

DaveCTurner opened this issue Sep 16, 2021 · 4 comments
Labels
>bug :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.

Comments

@DaveCTurner
Copy link
Contributor

DaveCTurner commented Sep 16, 2021

Today MasterService#patchVersions creates a new Metadata$Builder, increments the version, and then calls build() which expensively recomputes a bunch of data structures, e.g. Metadata#indicesLookup. This is quite inefficient, we could just copy those data structures from the previous Metadata and save a bunch of work and garbage-generation.

This is what it looks like in a flamegraph:

image

That's only ~7.5% of the total time on this otherwise busy master, but we could still cut this in half.

@DaveCTurner DaveCTurner added >bug :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. labels Sep 16, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Sep 16, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@original-brownbear
Copy link
Member

Link #77466

This one is still very interesting to the many shards effort. I just looked at a recent profile and as we make improvements to various CS update steps this is actually starting to make up a larger and larger portion of what the master update thread spends its time on.
We can probably do a similar thing to what the fix here looks like to org.elasticsearch.cluster.routing.allocation.IndexMetadataUpdater#applyChanges as well, which also gets hit by rebuilding the full index lookup even though it only changes part of the Metadata that have not relevance to the index lookup as well.

@original-brownbear
Copy link
Member

Looks like #79004 is going to mostly address this now from the looks of it (I just benchmarked it and that PR gets us a large chunk of the obvious improvements that are possible for this method). Please hold off on working on this for now.

martijnvg added a commit that referenced this issue Oct 26, 2021
In cases when indices, aliases and data streams aren't modified then
the indices lookup can be reused.

For example in:
* The IndexMetadataUpdater#applyChanges(...) method builds a new metadata
instance, but only primary term or insync allocations may be updated.
No new indices, aliases or data streams are added, so re-building indices
lookup is not necessary.
* MasterService#patchVersions

Additionally the logic that checks when indices lookup can be reused,
this logic also checks the hidden and system flags of indices/datastreams.

In clusters with many indices the cost of building indices lookup is
non-neglectable and should be avoided in this case.

Closes #78980
Partially addresses to #77888
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Oct 26, 2021
Backporting elastic#79004 to 7.x branch.

In cases when indices, aliases and data streams aren't modified then
the indices lookup can be reused.

For example in:
* The IndexMetadataUpdater#applyChanges(...) method builds a new metadata
instance, but only primary term or insync allocations may be updated.
No new indices, aliases or data streams are added, so re-building indices
lookup is not necessary.
* MasterService#patchVersions

Additionally the logic that checks when indices lookup can be reused,
this logic also checks the hidden and system flags of indices/datastreams.

In clusters with many indices the cost of building indices lookup is
non-neglectable and should be avoided in this case.

Closes elastic#78980
Partially addresses elastic#77888
martijnvg added a commit that referenced this issue Oct 26, 2021
Backporting #79004 to 7.16 branch.

In cases when indices, aliases and data streams aren't modified then
the indices lookup can be reused.

For example in:
* The IndexMetadataUpdater#applyChanges(...) method builds a new metadata
instance, but only primary term or insync allocations may be updated.
No new indices, aliases or data streams are added, so re-building indices
lookup is not necessary.
* MasterService#patchVersions

Additionally the logic that checks when indices lookup can be reused,
this logic also checks the hidden and system flags of indices/datastreams.

In clusters with many indices the cost of building indices lookup is
non-neglectable and should be avoided in this case.

Closes #78980
Partially addresses #77888
lockewritesdocs pushed a commit to lockewritesdocs/elasticsearch that referenced this issue Oct 28, 2021
In cases when indices, aliases and data streams aren't modified then
the indices lookup can be reused.

For example in:
* The IndexMetadataUpdater#applyChanges(...) method builds a new metadata
instance, but only primary term or insync allocations may be updated.
No new indices, aliases or data streams are added, so re-building indices
lookup is not necessary.
* MasterService#patchVersions

Additionally the logic that checks when indices lookup can be reused,
this logic also checks the hidden and system flags of indices/datastreams.

In clusters with many indices the cost of building indices lookup is
non-neglectable and should be avoided in this case.

Closes elastic#78980
Partially addresses to elastic#77888
@arteam arteam removed their assignment Jan 7, 2022
@original-brownbear
Copy link
Member

Closing this, this has disappeared from profiling entirely in master now due to the linked fixes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
Projects
None yet
Development

No branches or pull requests

4 participants