Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up DiscoveryNodeFilters.trimTier #78170

Closed

Conversation

original-brownbear
Copy link
Member

@original-brownbear original-brownbear commented Sep 22, 2021

This shows up as expensive in profiling in at least the data tier allocation decider in 7.x.
Made it more efficient and also made copyMapWithRemovedEntry that it uses more
efficient and added tests for it.

This may be unused in `8` but it shows up as expensive in profiling
in at least the data tier allocation decider in 7.x. Made it more
efficient and also made `copyMapWithRemovedEntry` that it uses more
efficient and added tests for it.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in at least the data tier allocation decider in 7.x.

I don't understand this, DataTierAllocationDecider uses neither DiscoveryNodeFilters nor calls the trimTier method? Do you mean the FilterAllocationDecider?

I am a little worried that this is premature optimization, other than JMH benchmarks, how big of an impact does this have in a real-world system?

@@ -82,6 +85,8 @@ private boolean matchByIP(String[] values, @Nullable String hostIp, @Nullable St
return false;
}

private static final String TIER_PREFERENCE = "_tier_preference";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this up to the top where the other class variables are please?

Comment on lines +89 to +91
if (map.containsKey(key) == false) {
return map;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This subtly changes the behavior because someone would assume that the returned value is always a copied map (since it's in the name), and not the original map

Comment on lines +96 to +104
@SuppressWarnings("rawtypes")
final Map.Entry<K, V>[] entries = new Map.Entry[size - 1];
int i = 0;
for (Map.Entry<K, V> entry : map.entrySet()) {
if (key.equals(entry.getKey()) == false) {
entries[i++] = entry;
}
}
return Map.ofEntries(entries);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a lot like premature optimization, is this really a bottleneck here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole thing showed up hot during cluster restart + reroute tests on large shard count benchmarks. The problem with those really is that all these stream things look nice and their overhead might be irrelevant most of the time, now show up all over the place when they get nested inside other loops and we simply only have a single master update thread :)
I think the 7.x code is different for the data tier allocation decider that's why it shows up there, for 8.x this is less of a relevant change probably but as you point out will help with the filter decider.

@danhermann danhermann added v8.1.0 and removed v7.16.0 labels Oct 27, 2021
@original-brownbear
Copy link
Member Author

Closing this in favor of #80179 which is a simpler and much faster solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants