Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up ILM cluster task execution #85405

Merged
merged 27 commits into from
Mar 29, 2022
Merged

Conversation

joegallo
Copy link
Contributor

@joegallo joegallo commented Mar 28, 2022

Closes #82708

The central change here is the addition of the special purpose Metadata#withLifecycleState method that bypasses some of the expensive validation that we perform in Metadata.Builder#build -- since we're not adding or removing indices, that validation isn't relevant.

In the many shards scalability profiling that @original-brownbear did against this code, it decreased ILM execution time from 93.33% of the profile to less than 1%, so it's extraordinarily effective at decreasing the time the master spends executing the ILM cluster task batches.

joegallo added 20 commits March 28, 2022 13:33
The other ilm UpdateTasks call it idxMeta in the same context (for
better or worse).
This is only for updating an indexMetadata that already exists and is
associated with the cluster state, it's not for sneakily injecting new
indices.
Since the index already exists, and the lifecycleState is the only
thing that's being changed, we can bypass most of the expense of
building a new cluster state (the internal validation that
Metadata.Builder runs).
A little whitespace, cleanup some comments, rename some variables.
in order to prevent the non-performant code path from resembling the
performant code path. InitializePolicyContextStep now explicitly uses
the non-performant code path, while all the other callers still use
the performant path.
@joegallo joegallo added >enhancement :Data Management/ILM+SLM Index and Snapshot lifecycle management v8.2.0 labels Mar 28, 2022
@joegallo joegallo requested review from dakrone and DaveCTurner March 28, 2022 17:38
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Mar 28, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine
Copy link
Collaborator

Hi @joegallo, I've created a changelog YAML for you.

Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I left a few minor comments but thanks for working on this!

@@ -299,6 +300,47 @@ public Metadata withIncrementedVersion() {
);
}

public Metadata withLifecycleState(final Index index, final LifecycleExecutionState lifecycleState) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a public method, can you add javadoc for it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does ea9c861 look to you?

final ImmutableOpenMap.Builder<String, IndexMetadata> builder = ImmutableOpenMap.builder(indices);
builder.put(index.getName(), indexMetadataBuilder.build());

return new Metadata(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be worth adding a comment about why this doesn't use Metadata.builder(this).etc(...) but directly constructs the Metadata, so that it doesn't get accidentally undone in the future. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I handled this in ea9c861, too.

@joegallo joegallo merged commit cc51c1a into elastic:master Mar 29, 2022
@joegallo joegallo deleted the faster-ilm-batches branch March 29, 2022 14:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/ILM+SLM Index and Snapshot lifecycle management >enhancement Team:Data Management Meta label for data/management team v8.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Large ILM Task Batches are Executed too Slowly
4 participants