Use String.join() to describe a list of tasks #28941
This change replaces the use of string concatenation with a call to String.join(). String concatenation might be quadratic, unless the compiler can optimise it away, whereas String.join() is more reliably linear. There can sometimes be a large number of pending ClusterState update tasks and elastic#28920 includes a report that this operation sometimes takes a long time.
NB I haven't been able to reproduce the slowness reported in #28920, because I don't have easy access to a partitionable cluster with 10k shards. @danielmitterdorfer do you have any opinions about this change and/or good ideas about how to quantify the benefit? If this change looks ok, I can ask the OP of #28920 to try it.
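To make the quadratic-versus-linear point concrete, here is a minimal sketch of the two approaches. It is not the code from this PR: the class name `JoinSketch`, both method names, and the plain `List<String>` input are assumptions made for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class JoinSketch {
    // Pairwise concatenation via reduce: every step copies the string built so far,
    // so the total amount of copying can grow quadratically with the list size.
    public static String describeByConcat(List<String> descriptions) {
        return descriptions.stream()
            .reduce((s1, s2) -> s1 + ", " + s2)
            .orElse("");
    }

    // String.join builds the result in a single pass over the elements.
    public static String describeByJoin(List<String> descriptions) {
        return String.join(", ", descriptions);
    }

    public static void main(String[] args) {
        // Hypothetical task descriptions, used only for this illustration.
        List<String> descriptions = IntStream.range(0, 5)
            .mapToObj(i -> "task-" + i)
            .collect(Collectors.toList());
        System.out.println(describeByConcat(descriptions));
        System.out.println(describeByJoin(descriptions));
    }
}
```

Both methods return the same string for the same input; only the amount of intermediate copying differs.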
I am not familiar with this part of the code. But if this is the bottleneck then your change makes sense to me.
I think this would be a good candidate for a microbenchmark? We have some infrastructure for that in place in the |
LGTM
return s1 + ", " + s2;
}
}).orElse("");
return String.join(", ", tasks.stream().map(t -> (CharSequence)t.toString()).filter(t -> t.length() == 0)::iterator);
Out of curiosity, didn't T::toString work?
No, apparently Iterator<String> cannot be converted to Iterator<CharSequence>:
> Task :server:compileJava
/Users/davidturner/src/elasticsearch-master/server/src/main/java/org/elasticsearch/cluster/ClusterStateTaskExecutor.java:59: error: no suitable method found for join(String,tasks.stre[...]rator)
return String.join(", ", tasks.stream().map(T::toString).filter(t -> t.length() == 0)::iterator);
^
method String.join(CharSequence,CharSequence...) is not applicable
(varargs mismatch; CharSequence is not a functional interface
multiple non-overriding abstract methods found in interface CharSequence)
method String.join(CharSequence,Iterable<? extends CharSequence>) is not applicable
(argument mismatch; bad return type in method reference
Iterator<String> cannot be converted to Iterator<CharSequence>)
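For reference, the cast is not the only way around this error. The sketch below shows two alternatives that rely only on standard JDK calls; neither is what the PR uses, and the class name `CharSequenceJoinSketch` and both method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CharSequenceJoinSketch {
    // Option 1: let the assignment target fix map()'s result type to CharSequence,
    // so that stream::iterator supplies an Iterator<CharSequence>.
    public static <T> String joinViaTypedStream(List<T> tasks) {
        Stream<CharSequence> descriptions = tasks.stream().map(Object::toString);
        Iterable<CharSequence> iterable = descriptions::iterator;
        return String.join(", ", iterable);
    }

    // Option 2: sidestep the Iterable overload by collecting with Collectors.joining.
    public static <T> String joinViaCollector(List<T> tasks) {
        return tasks.stream().map(Object::toString).collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        // Integers stand in for tasks here; any type with a toString() would do.
        List<Integer> tasks = Arrays.asList(1, 2, 3);
        System.out.println(joinViaTypedStream(tasks));
        System.out.println(joinViaCollector(tasks));
    }
}
```

Iterator is invariant in its type parameter, which is why a method reference producing Iterator<String> does not fit a target that expects Iterator<CharSequence>; both options above avoid ever producing the Iterator<String>.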
Thanks @danielmitterdorfer for the pointer to the benchmarks module, that's just what I was after. I just overwrote the existing benchmark with this one: https://gist.github.com/DaveCTurner/de8763bc791d860e4fb0c9a9f98df7cd. At 10,000 tasks, each with a 100-byte description, the improvement is from ~500ms to ~120µs.
This seems worth doing.
You're welcome. That's quite a significant difference indeed.
In elastic#28941 we changed the computation of cluster state task descriptions but this introduced a bug in which we only log the empty descriptions (rather than the non-empty ones). This PR fixes that.
In elastic#28941 we changed the computation of cluster state task descriptions but this introduced a bug in which we only log the empty descriptions (rather than the non-empty ones). This change fixes that. Backport of elastic#34182.
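The bug comes down to the direction of the filter predicate: Stream.filter keeps the elements for which the predicate returns true. The sketch below contrasts the predicate from the earlier change with an inverted one. It is only an illustration; the class and method names are hypothetical, and the inverted predicate is one plausible correction, not necessarily the exact change made in the follow-up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FilterSketch {
    // Predicate as in the earlier change: keeps only the EMPTY descriptions.
    public static String keepEmpty(List<String> descriptions) {
        return descriptions.stream()
            .filter(t -> t.length() == 0)
            .collect(Collectors.joining(", "));
    }

    // Inverted predicate: keeps only the non-empty descriptions.
    public static String keepNonEmpty(List<String> descriptions) {
        return descriptions.stream()
            .filter(t -> t.length() > 0)
            .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        // Hypothetical descriptions, two of them empty.
        List<String> descriptions = Arrays.asList("a", "", "b", "");
        System.out.println("[" + keepEmpty(descriptions) + "]");
        System.out.println("[" + keepNonEmpty(descriptions) + "]");
    }
}
```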