From d6e189ee9f57e6ba89e99844d7dc9fa4cb862937 Mon Sep 17 00:00:00 2001
From: Nik Everett
Date: Mon, 30 Jul 2018 17:25:42 -0400
Subject: [PATCH 1/5] Build: Remove shadowing from benchmarks

Removes shadowing from the benchmarks. It isn't *strictly* needed. We do
have to rework the documentation on how to run the benchmark, but it
still seems to work if you run everything through gradle.
---
 benchmarks/README.md          | 35 ++++++++++++++++++-----------------
 benchmarks/build.gradle       | 28 +++-------------------------
 client/benchmark/README.md    | 31 ++++++++++++++++++-------------
 client/benchmark/build.gradle |  4 ----
 4 files changed, 39 insertions(+), 59 deletions(-)

diff --git a/benchmarks/README.md b/benchmarks/README.md
index 03aaac7f3c4e6..284a95b2f8789 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -4,36 +4,37 @@ This directory contains the microbenchmark suite of Elasticsearch. It relies on
 
 ## Purpose
 
-We do not want to microbenchmark everything but the kitchen sink and should typically rely on our
-[macrobenchmarks](https://elasticsearch-benchmarks.elastic.co/app/kibana#/dashboard/Nightly-Benchmark-Overview) with
-[Rally](http://github.com/elastic/rally). Microbenchmarks are intended to spot performance regressions in performance-critical components.
+We do not want to microbenchmark everything but the kitchen sink and should typically rely on our
+[macrobenchmarks](https://elasticsearch-benchmarks.elastic.co/app/kibana#/dashboard/Nightly-Benchmark-Overview) with
+[Rally](http://github.com/elastic/rally). Microbenchmarks are intended to spot performance regressions in performance-critical components.
 The microbenchmark suite is also handy for ad-hoc microbenchmarks but please remove them again before merging your PR.
 
 ## Getting Started
 
-Just run `gradle :benchmarks:jmh` from the project root directory. It will build all microbenchmarks, execute them and print the result.
+Just run `./gradlew -p benchmarks run` from the project root
+directory. It will build all microbenchmarks, execute them and print
+the result.
 
 ## Running Microbenchmarks
 
-Benchmarks are always run via Gradle with `gradle :benchmarks:jmh`.
-
-Running via an IDE is not supported as the results are meaningless (we have no control over the JVM running the benchmarks).
+Running via an IDE is not supported as the results are meaningless
+because we have no control over the JVM running the benchmarks.
 
-If you want to run a specific benchmark class, e.g. `org.elasticsearch.benchmark.MySampleBenchmark` or have special requirements
-generate the uberjar with `gradle :benchmarks:jmhJar` and run it directly with:
+If you want to run a specific benchmark class like, say,
+`MemoryStatsBenchmark`, you can use `--args`:
 
 ```
-java -jar benchmarks/build/distributions/elasticsearch-benchmarks-*.jar
+./gradlew -p benchmarks run --args 'MemoryStatsBenchmark'
 ```
 
-JMH supports lots of command line parameters. Add `-h` to the command above to see the available command line options.
+Everything in the `'` gets sent on the command line to JMH.
 
 ## Adding Microbenchmarks
 
-Before adding a new microbenchmark, make yourself familiar with the JMH API. You can check our existing microbenchmarks and also the
+Before adding a new microbenchmark, make yourself familiar with the JMH API. You can check our existing microbenchmarks and also the
 [JMH samples](http://hg.openjdk.java.net/code-tools/jmh/file/tip/jmh-samples/src/main/java/org/openjdk/jmh/samples/).
 
-In contrast to tests, the actual name of the benchmark class is not relevant to JMH. However, stick to the naming convention and
+In contrast to tests, the actual name of the benchmark class is not relevant to JMH. However, stick to the naming convention and
 end the class name of a benchmark with `Benchmark`. To have JMH execute a benchmark, annotate the respective methods with `@Benchmark`.
 
 ## Tips and Best Practices
@@ -42,15 +43,15 @@ To get realistic results, you should exercise care when running benchmarks. Here
 
 ### Do
 
-* Ensure that the system executing your microbenchmarks has as little load as possible. Shutdown every process that can cause unnecessary
+* Ensure that the system executing your microbenchmarks has as little load as possible. Shutdown every process that can cause unnecessary
   runtime jitter. Watch the `Error` column in the benchmark results to see the run-to-run variance.
 * Ensure to run enough warmup iterations to get the benchmark into a stable state. If you are unsure, don't change the defaults.
 * Avoid CPU migrations by pinning your benchmarks to specific CPU cores. On Linux you can use `taskset`.
-* Fix the CPU frequency to avoid Turbo Boost from kicking in and skewing your results. On Linux you can use `cpufreq-set` and the
+* Fix the CPU frequency to avoid Turbo Boost from kicking in and skewing your results. On Linux you can use `cpufreq-set` and the
   `performance` CPU governor.
 * Vary the problem input size with `@Param`.
 * Use the integrated profilers in JMH to dig deeper if benchmark results to not match your hypotheses:
-    * Run the generated uberjar directly and use `-prof gc` to check whether the garbage collector runs during a microbenchmarks and skews
+    * Run the generated uberjar directly and use `-prof gc` to check whether the garbage collector runs during a microbenchmarks and skews
       your results. If so, try to force a GC between runs (`-gc true`) but watch out for the caveats.
     * Use `-prof perf` or `-prof perfasm` (both only available on Linux) to see hotspots.
 * Have your benchmarks peer-reviewed.
@@ -59,4 +60,4 @@ To get realistic results, you should exercise care when running benchmarks. Here
 
 * Blindly believe the numbers that your microbenchmark produces but verify them by measuring e.g. with `-prof perfasm`.
 * Run more threads than your number of CPU cores (in case you run multi-threaded microbenchmarks).
-* Look only at the `Score` column and ignore `Error`. Instead take countermeasures to keep `Error` low / variance explainable.
\ No newline at end of file
+* Look only at the `Score` column and ignore `Error`. Instead take countermeasures to keep `Error` low / variance explainable.
diff --git a/benchmarks/build.gradle b/benchmarks/build.gradle
index 80d1982300dd1..0838af7287126 100644
--- a/benchmarks/build.gradle
+++ b/benchmarks/build.gradle
@@ -18,11 +18,8 @@
  */
 
 apply plugin: 'elasticsearch.build'
-
-// order of this section matters, see: https://github.com/johnrengelman/shadow/issues/336
-apply plugin: 'application' // have the shadow plugin provide the runShadow task
+apply plugin: 'application'
 mainClassName = 'org.openjdk.jmh.Main'
-apply plugin: 'com.github.johnrengelman.shadow' // build an uberjar with all benchmarks
 
 // Not published so no need to assemble
 tasks.remove(assemble)
@@ -50,10 +47,8 @@ compileJava.options.compilerArgs << "-Xlint:-cast,-deprecation,-rawtypes,-try,-u
 // needs to be added separately otherwise Gradle will quote it and javac will fail
 compileJava.options.compilerArgs.addAll(["-processor", "org.openjdk.jmh.generators.BenchmarkProcessor"])
 
-forbiddenApis {
-    // classes generated by JMH can use all sorts of forbidden APIs but we have no influence at all and cannot exclude these classes
-    ignoreFailures = true
-}
+// classes generated by JMH can use all sorts of forbidden APIs but we have no influence at all and cannot exclude these classes
+forbiddenApisMain.enabled = false
 
 // No licenses for our benchmark deps (we don't ship benchmarks)
 dependencyLicenses.enabled = false
@@ -69,20 +64,3 @@ thirdPartyAudit.excludes = [
         'org.openjdk.jmh.profile.HotspotRuntimeProfiler',
         'org.openjdk.jmh.util.Utils'
 ]
-
-runShadow {
-    executable = new File(project.runtimeJavaHome, 'bin/java')
-}
-
-// alias the shadowJar and runShadow tasks to abstract from the concrete plugin that we are using and provide a more consistent interface
-task jmhJar(
-        dependsOn: shadowJar,
-        description: 'Generates an uberjar with the microbenchmarks and all dependencies',
-        group: 'Benchmark'
-)
-
-task jmh(
-        dependsOn: runShadow,
-        description: 'Runs all microbenchmarks',
-        group: 'Benchmark'
-)
diff --git a/client/benchmark/README.md b/client/benchmark/README.md
index 06211b9d8fe8c..cea82f3fc69c4 100644
--- a/client/benchmark/README.md
+++ b/client/benchmark/README.md
@@ -2,10 +2,14 @@
 
 1. Build `client-benchmark-noop-api-plugin` with `gradle :client:client-benchmark-noop-api-plugin:assemble`
 2. Install it on the target host with `bin/elasticsearch-plugin install file:///full/path/to/client-benchmark-noop-api-plugin.zip`
-3. Start Elasticsearch on the target host (ideally *not* on the same machine)
-4. Build an uberjar with `gradle :client:benchmark:shadowJar` and execute it.
+3. Start Elasticsearch on the target host (ideally *not* on the machine
+that runs the benchmarks)
+4. Run the benchmark with
+```
+./graldew -p client/branchmark run --args 'params go here'
+```
 
-Repeat all steps above for the other benchmark candidate.
+See below for some example invocations.
 
 ### Example benchmark
 
@@ -13,32 +17,35 @@ In general, you should define a few GC-related settings `-Xms8192M -Xmx8192M -XX
 
 #### Bulk indexing
 
-Download benchmark data from http://benchmarks.elastic.co/corpora/geonames/documents.json.bz2 and decompress them.
+Download benchmark data from http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames and decompress them.
 
-Example command line parameters:
+Example invocation:
 
 ```
-rest bulk 192.168.2.2 ./documents.json geonames type 8647880 5000
+wget http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames/documents-2.json.bz2
+bzip2 -d documents-2.json.bz2
+mv documents-2.json client/benchmark/build
+./gradlew -p client/benchmark run --args 'rest bulk localhost build/documents-2.json geonames type 8647880 5000'
 ```
 
-The parameters are in order:
+The parameters are all in the `'`s and are in order:
 
 * Client type: Use either "rest" or "transport"
 * Benchmark type: Use either "bulk" or "search"
 * Benchmark target host IP (the host where Elasticsearch is running)
 * full path to the file that should be bulk indexed
 * name of the index
-* name of the (sole) type in the index
+* name of the (sole) type in the index
 * number of documents in the file
 * bulk size
 
 
-#### Bulk indexing
+#### Search
 
-Example command line parameters:
+Example invocation:
 
 ```
-rest search 192.168.2.2 geonames "{ \"query\": { \"match_phrase\": { \"name\": \"Sankt Georgen\" } } }\"" 500,1000,1100,1200
+./gradlew -p client/benchmark run --args 'rest search localhost geonames {"query":{"match_phrase":{"name":"Sankt Georgen"}}} 500,1000,1100,1200'
 ```
 
 The parameters are in order:
@@ -49,5 +56,3 @@ The parameters are in order:
 * name of the index
 * a search request body (remember to escape double quotes). The `TransportClientBenchmark` uses `QueryBuilders.wrapperQuery()` internally which automatically adds a root key `query`, so it must not be present in the command line parameter.
 * A comma-separated list of target throughput rates
-
-
diff --git a/client/benchmark/build.gradle b/client/benchmark/build.gradle
index 0c3238d985346..c67120c7cf59b 100644
--- a/client/benchmark/build.gradle
+++ b/client/benchmark/build.gradle
@@ -18,9 +18,6 @@
  */
 
 apply plugin: 'elasticsearch.build'
-// build an uberjar with all benchmarks
-apply plugin: 'com.github.johnrengelman.shadow'
-// have the shadow plugin provide the runShadow task
 apply plugin: 'application'
 
 group = 'org.elasticsearch.client'
@@ -32,7 +29,6 @@ build.dependsOn.remove('assemble')
 archivesBaseName = 'client-benchmarks'
 mainClassName = 'org.elasticsearch.client.benchmark.BenchmarkMain'
 
-
 // never try to invoke tests on the benchmark project - there aren't any
 test.enabled = false
 
From 7bbf874a642952f3e9b90901a08714edb090b597 Mon Sep 17 00:00:00 2001
From: Nik Everett
Date: Mon, 30 Jul 2018 19:01:03 -0400
Subject: [PATCH 2/5] Remove ./

---
 benchmarks/README.md       | 4 ++--
 client/benchmark/README.md | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/benchmarks/README.md b/benchmarks/README.md
index 284a95b2f8789..1ed8bf5d1b233 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -11,7 +11,7 @@ The microbenchmark suite is also handy for ad-hoc microbenchmarks but please rem
 
 ## Getting Started
 
-Just run `./gradlew -p benchmarks run` from the project root
+Just run `gradlew -p benchmarks run` from the project root
 directory. It will build all microbenchmarks, execute them and print
 the result.
 
@@ -24,7 +24,7 @@ If you want to run a specific benchmark class like, say,
 `MemoryStatsBenchmark`, you can use `--args`:
 
 ```
-./gradlew -p benchmarks run --args 'MemoryStatsBenchmark'
+gradlew -p benchmarks run --args 'MemoryStatsBenchmark'
 ```
 
 Everything in the `'` gets sent on the command line to JMH.
diff --git a/client/benchmark/README.md b/client/benchmark/README.md
index cea82f3fc69c4..12af013b3ec3a 100644
--- a/client/benchmark/README.md
+++ b/client/benchmark/README.md
@@ -25,7 +25,7 @@ Example invocation:
 wget http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames/documents-2.json.bz2
 bzip2 -d documents-2.json.bz2
 mv documents-2.json client/benchmark/build
-./gradlew -p client/benchmark run --args 'rest bulk localhost build/documents-2.json geonames type 8647880 5000'
+gradlew -p client/benchmark run --args 'rest bulk localhost build/documents-2.json geonames type 8647880 5000'
 ```
 
 The parameters are all in the `'`s and are in order:
@@ -45,7 +45,7 @@ The parameters are all in the `'`s and are in order:
 Example invocation:
 
 ```
-./gradlew -p client/benchmark run --args 'rest search localhost geonames {"query":{"match_phrase":{"name":"Sankt Georgen"}}} 500,1000,1100,1200'
+gradlew -p client/benchmark run --args 'rest search localhost geonames {"query":{"match_phrase":{"name":"Sankt Georgen"}}} 500,1000,1100,1200'
 ```
 
 The parameters are in order:
From 2f01a54f7e1e95c0332af932a1081de7c4c91462 Mon Sep 17 00:00:00 2001
From: Nik Everett
Date: Mon, 30 Jul 2018 19:01:41 -0400
Subject: [PATCH 3/5] speeling

---
 client/benchmark/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/client/benchmark/README.md b/client/benchmark/README.md
index 12af013b3ec3a..4d5958f70c1b0 100644
--- a/client/benchmark/README.md
+++ b/client/benchmark/README.md
@@ -6,7 +6,7 @@
 that runs the benchmarks)
 4. Run the benchmark with
 ```
-./graldew -p client/branchmark run --args 'params go here'
+./gradlew -p client/benchmark run --args 'params go here'
 ```
 
 See below for some example invocations.
From 642f47b372d3b80d753b22c4312f879c5f5cf249 Mon Sep 17 00:00:00 2001
From: Nik Everett
Date: Tue, 31 Jul 2018 09:49:01 -0400
Subject: [PATCH 4/5] workaround

---
 benchmarks/README.md       | 6 ++++--
 client/benchmark/README.md | 6 +++++-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/benchmarks/README.md b/benchmarks/README.md
index 1ed8bf5d1b233..5be14bab0ecbc 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -24,10 +24,12 @@ If you want to run a specific benchmark class like, say,
 `MemoryStatsBenchmark`, you can use `--args`:
 
 ```
-gradlew -p benchmarks run --args 'MemoryStatsBenchmark'
+gradlew -p benchmarks run --args ' MemoryStatsBenchmark'
 ```
 
-Everything in the `'` gets sent on the command line to JMH.
+Everything in the `'` gets sent on the command line to JMH. The leading ` `
+inside the `'`s is important. Without it parameters are sometimes sent to
+gradle.
 
 ## Adding Microbenchmarks
 
diff --git a/client/benchmark/README.md b/client/benchmark/README.md
index 4d5958f70c1b0..33322dff9e6f4 100644
--- a/client/benchmark/README.md
+++ b/client/benchmark/README.md
@@ -6,9 +6,13 @@
 that runs the benchmarks)
 4. Run the benchmark with
 ```
-./gradlew -p client/benchmark run --args 'params go here'
+./gradlew -p client/benchmark run --args ' params go here'
 ```
 
+Everything in the `'` gets sent on the command line to JMH. The leading ` `
+inside the `'`s is important. Without it parameters are sometimes sent to
+gradle.
+
 See below for some example invocations.
 
 ### Example benchmark
From 116c8af9db5bf67cc4af57f98566d7023e52a020 Mon Sep 17 00:00:00 2001
From: Nik Everett
Date: Tue, 31 Jul 2018 09:53:25 -0400
Subject: [PATCH 5/5] Moar workaround

---
 client/benchmark/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/client/benchmark/README.md b/client/benchmark/README.md
index 33322dff9e6f4..68a910468e0cd 100644
--- a/client/benchmark/README.md
+++ b/client/benchmark/README.md
@@ -29,7 +29,7 @@ Example invocation:
 wget http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames/documents-2.json.bz2
 bzip2 -d documents-2.json.bz2
 mv documents-2.json client/benchmark/build
-gradlew -p client/benchmark run --args 'rest bulk localhost build/documents-2.json geonames type 8647880 5000'
+gradlew -p client/benchmark run --args ' rest bulk localhost build/documents-2.json geonames type 8647880 5000'
 ```
 
 The parameters are all in the `'`s and are in order:
@@ -49,7 +49,7 @@ The parameters are all in the `'`s and are in order:
 Example invocation:
 
 ```
-gradlew -p client/benchmark run --args 'rest search localhost geonames {"query":{"match_phrase":{"name":"Sankt Georgen"}}} 500,1000,1100,1200'
+gradlew -p client/benchmark run --args ' rest search localhost geonames {"query":{"match_phrase":{"name":"Sankt Georgen"}}} 500,1000,1100,1200'
 ```
 
 The parameters are in order: