
Performance Regression for every CS update from ILM's org.elasticsearch.cluster.metadata.Metadata#isIndexManagedByILM #98992

Open
original-brownbear opened this issue Aug 29, 2023 · 4 comments

original-brownbear (Member) commented Aug 29, 2023

Going over the many-shards benchmark bootstrapping, I noticed it has slowed down quite a bit recently.

It turns out that a big contributor to this is org.elasticsearch.cluster.metadata.Metadata#isIndexManagedByILM, called from org.elasticsearch.xpack.ilm.IndexLifecycleService#triggerPolicies on every cluster state update and costing O(N) in the number of indices.
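
For context, the hot path as discussed in this thread looks roughly like the following (a reconstructed sketch based on the snippets quoted below, not the exact source):

    // Reconstructed sketch: resolve the index's parent data stream via the
    // (expensive) indices lookup, then, if that data stream has an enabled
    // lifecycle, read the per-index prefer_ilm setting.
    public static boolean isIndexManagedByILM(IndexMetadata indexMetadata, Metadata metadata) {
        IndexAbstraction indexAbstraction = metadata.getIndicesLookup() // tree-map lookup, O(log N) per index
            .get(indexMetadata.getIndex().getName());
        DataStream parentDataStream = indexAbstraction.getParentDataStream();
        if (parentDataStream != null && parentDataStream.getLifecycle() != null && parentDataStream.getLifecycle().isEnabled()) {
            // prefer_ilm decides whether ILM or the data stream lifecycle manages the index
            return IndexSettings.PREFER_ILM_SETTING.get(indexMetadata.getSettings());
        }
        return true; // no DSL-managed parent: ILM manages the index
    }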


This could be made more efficient in various ways.

At least we should:

  • remove the setting read from the hot loop
  • stop using Metadata.getIndicesLookup; it is extremely expensive on the applier thread

A first quick fix would be to check whether any data streams even use DLM; if none do, the whole logic can be skipped (a sketch follows below). This currently adds roughly 5% overhead to every CS update (relative to operations like index creation and shard allocation in the many-shards benchmark) at 25k indices in a cluster, and the overhead grows in O(number_of_indices).
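
A minimal sketch of that short-circuit, assuming we can cheaply iterate the data streams on the Metadata object (the helper name anyDataStreamUsesLifecycle is illustrative, not an existing method):

    // Illustrative short-circuit before the per-index loop in triggerPolicies:
    // if no data stream has an enabled lifecycle, skip the O(N) scan entirely.
    private static boolean anyDataStreamUsesLifecycle(Metadata metadata) {
        for (DataStream dataStream : metadata.dataStreams().values()) {
            if (dataStream.getLifecycle() != null && dataStream.getLifecycle().isEnabled()) {
                return true;
            }
        }
        return false;
    }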

@original-brownbear original-brownbear added >bug :Data Management/ILM+SLM Index and Snapshot lifecycle management labels Aug 29, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Aug 29, 2023
elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-data-management (Team:Data Management)

nielsbauman (Contributor) commented

I had a look at the options that @original-brownbear suggested and also tried to come up with some other options myself.

  1. remove setting read in the hot loop

    The PREFER_ILM_SETTING is configured in the settings of individual indices, so I don't think it can be hoisted out of the loop.

  2. stop using Metadata.getIndicesLookup

    A way to avoid using getIndicesLookup would be to first loop over all the data streams that have a DSL and collect all the indices they cover. Then, when looping over all indices in the cluster, we can use that precomputed collection to determine whether an index is covered by a DSL (and we would still have to check the index's settings to know which lifecycle wins). See the sketch after this list.

  3. a first quick fix would be to first check if any datastreams even use DLM and if the answer is no, the whole logic can be skipped.

    While definitely a good idea, we ship some data streams with DSL enabled by default, I believe, so that would generally not have any effect, unfortunately.

  4. I see that the validation part of reading a setting involves a number of method calls and object creations. Would it be worthwhile to read the setting without validation? This would assume that the settings are validated when stored.
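
A rough sketch of option 2, assuming iterating the data streams and their backing indices is cheap relative to getIndicesLookup (the variable names here are illustrative):

    // Sketch: precompute the backing indices of all data streams that have an
    // enabled lifecycle, so the hot loop never touches Metadata.getIndicesLookup().
    Set<Index> dslCoveredIndices = new HashSet<>();
    for (DataStream dataStream : metadata.dataStreams().values()) {
        if (dataStream.getLifecycle() != null && dataStream.getLifecycle().isEnabled()) {
            dslCoveredIndices.addAll(dataStream.getIndices());
        }
    }
    for (IndexMetadata indexMetadata : metadata.indices().values()) {
        boolean coveredByDsl = dslCoveredIndices.contains(indexMetadata.getIndex());
        // Still need the per-index setting to decide which lifecycle wins.
        boolean managedByIlm = coveredByDsl == false
            || IndexSettings.PREFER_ILM_SETTING.get(indexMetadata.getSettings());
        // ... trigger the policy only if managedByIlm ...
    }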

original-brownbear (Member, Author) commented

Maybe this helps:

The many-shards project effectively concluded that ILM is somewhat fundamentally flawed in how it executes policies. Policies are optionally triggered on each cluster state update by inspecting each index individually, so the logic scales O(N) in the number of indices in the cluster, which makes it by far the most expensive CS listener in larger clusters.

The new code that caused this makes the situation considerably worse by introducing another log(N) cost factor per index (for the tree-map lookup), so this thing now scales O(N*log(N)). This cannot end well.

The real fix here should be to stop having that O(N) cost in ILM by fixing #80407. If we want to fix this in isolation because #80407 is too hard right now, I'd strongly suggest trying to cache the information on the IndexMetadata in some form; any other fix that involves allocations or more indirection is probably not going to get us much. The cost here is the index-lookup resolution plus the setting read. Both just need to go away; optimizing them is doomed to fail, in my experience from the many-shards project.
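
Purely as an illustration of the "cache it on IndexMetadata" suggestion (no such field exists today; the field and setter names below are assumptions about one possible shape):

    // Hypothetical caching at metadata-build time: derive the flag once per
    // index, so the per-cluster-state-update check becomes an O(1) field read
    // with no indices-lookup resolution and no validated setting read.
    boolean managedByILM = parentDataStream == null
        || parentDataStream.getLifecycle() == null
        || parentDataStream.getLifecycle().isEnabled() == false
        || IndexSettings.PREFER_ILM_SETTING.get(indexMetadata.getSettings());
    indexMetadataBuilder.managedByILM(managedByILM); // hypothetical builder setter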

andreidan (Contributor) commented

Thanks, Armin for reporting this and Niels for working on it.

++ on making this more efficient. TIL about the cost of getIndicesLookup when called from the cluster applier thread (on the surface it appears to take advantage of memoization).

It's very surprising to me that reading the PREFER_ILM setting costs so much here. That code should only execute if the data stream has a lifecycle, so the setting should only be read 3 times (we ship 3 data streams managed by DSL right now).

This condition guarding the setting read should seldom be true in stateful:

 if (parentDataStream != null && parentDataStream.getLifecycle() != null && parentDataStream.getLifecycle().isEnabled()) {
