-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up DiscoveryNodeFilters.trimTier #78170
Speed up DiscoveryNodeFilters.trimTier #78170
Conversation
This may be unused in `8` but it shows up as expensive in profiling in at least the data tier allocation decider in 7.x. Made it more efficient and also made `copyMapWithRemovedEntry` that it uses more efficient and added tests for it.
Pinging @elastic/es-data-management (Team:Data Management) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in at least the data tier allocation decider in 7.x.
I don't understand this, DataTierAllocationDecider
uses neither DiscoveryNodeFilters
nor calls the trimTier
method? Do you mean the FilterAllocationDecider
?
I am a little worried that this is premature optimization, other than JMH benchmarks, how big of an impact does this have in a real-world system?
@@ -82,6 +85,8 @@ private boolean matchByIP(String[] values, @Nullable String hostIp, @Nullable St | |||
return false; | |||
} | |||
|
|||
private static final String TIER_PREFERENCE = "_tier_preference"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this up to the top where the other class variables are please?
if (map.containsKey(key) == false) { | ||
return map; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This subtly changes the behavior because someone would assume that the returned value is always a copied map (since it's in the name), and not the original map
@SuppressWarnings("rawtypes") | ||
final Map.Entry<K, V>[] entries = new Map.Entry[size - 1]; | ||
int i = 0; | ||
for (Map.Entry<K, V> entry : map.entrySet()) { | ||
if (key.equals(entry.getKey()) == false) { | ||
entries[i++] = entry; | ||
} | ||
} | ||
return Map.ofEntries(entries); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a lot like premature optimization, is this really a bottleneck here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole thing showed up hot during cluster restart + reroute tests on large shard count benchmarks. The problem with those really is that all these stream
things look nice and their overhead might be irrelevant most of the time, now show up all over the place when they get nested inside other loops and we simply only have a single master update thread :)
I think the 7.x code is different for the data tier allocation decider that's why it shows up there, for 8.x this is less of a relevant change probably but as you point out will help with the filter decider.
Closing this in favor of #80179 which is a simpler and much faster solution. |
This shows up as expensive in profiling in at least the data tier allocation decider in 7.x.
Made it more efficient and also made
copyMapWithRemovedEntry
that it uses moreefficient and added tests for it.