From 8aa9283597a5fb6851aa7f7d1ba00522db1428fe Mon Sep 17 00:00:00 2001
From: Dale McDiarmid <dalem@elastic.co>
Date: Thu, 20 Aug 2020 12:28:14 +0100
Subject: [PATCH 1/4] Hints for handling errors and identifying queries and
 responses

---
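Note on the metrics-store hint added to docs/recipes.rst in this patch: the
sample document there exposes ``meta.hits`` and ``meta.success`` per sample.
A minimal sketch of how such documents could be screened once they have been
fetched from ``rally-metrics-*`` is shown below. The helper name and the
``term-query`` task name are hypothetical; only the field names come from the
sample document, and how the documents are retrieved is left out on purpose.

```python
from typing import Any, Dict, List


def find_suspicious_samples(docs: List[Dict[str, Any]]) -> List[str]:
    """Return one note per metrics document that looks wrong.

    Hypothetical helper: it only inspects the ``task`` and ``meta`` fields
    that appear in the sample metrics-store document.
    """
    notes = []
    for doc in docs:
        meta = doc.get("meta", {})
        task = doc.get("task", "<unknown>")
        if meta.get("success") is False:
            # The request itself failed, e.g. because of a mapping error.
            notes.append(f"{task}: request reported as unsuccessful")
        elif meta.get("hits") == 0:
            # The request succeeded but the query matched nothing.
            notes.append(f"{task}: query matched 0 hits")
    return notes


# Hand-written documents, trimmed to the fields the helper reads.
samples = [
    {"task": "scroll", "meta": {"hits": 11396503, "success": True}},
    {"task": "term-query", "meta": {"hits": 0, "success": True}},
    {"task": "geo_distance", "meta": {"success": False}},
]

for note in find_suspicious_samples(samples):
    print(note)
```

Zero hits is not always a bug, so the output is meant as a list of tasks to
look at, not as a pass/fail verdict.
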
 docs/recipes.rst | 109 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 109 insertions(+)

diff --git a/docs/recipes.rst b/docs/recipes.rst
index 9b6751764..9eab9513e 100644
--- a/docs/recipes.rst
+++ b/docs/recipes.rst
@@ -172,3 +172,112 @@ Rally will push metrics to the metric store configured in 1. and they can be vis
 To tear down everything issue ``./stop.sh``.
 
 It is possible to specify a different version of Elasticsearch for step 3. by setting ``export ES_VERSION=<the_desired_version>``.
+
+Identifying when errors have been encountered
+--------------------------------------------------------------
+
+Custom track development can be error prone especially if you are testing a new query. A number of reasons can lead to queries returning errors.
+
+Consider a simple example Rally operation:
+
+    {
+      "name": "geo_distance",
+      "operation-type": "search",
+      "index": "logs-*",
+      "body": {
+        "query": {
+           "geo_distance": {
+              "distance": "12km",
+              "source.geo.location": "40,-70"
+           }
+        }
+      }
+    }
+This query requires the field ``source.geo.location`` to be mapped as a ``geo_point`` type. If incorrectly mapped, Elasticsearch will respond with an error. 
+
+Rally will not exit on errors (unless fatal e.g. [http://man7.org/linux/man-pages/man2/connect.2.html](ECONNREFUSED)) by default, instead reporting errors in the summary report via the [Error Rate](https://esrally.readthedocs.io/en/stable/summary_report.html?highlight=on-error#error-rate) statistic. This can potentially leading to misleading results. This behavior is by design and consistent with other load testing tools such as JMeter i.e. In most cases it is desirable that a large long running benchmark should not fail because of a single error response. 
+
+ This behavior can also be changed, by invoking Rally with the [--on-error](https://esrally.readthedocs.io/en/stable/command_line_reference.html?highlight=on-error#on-error) switch e.g.
+
+	esrally --track=geonames --on-error=abort
+	
+Errors can also be investigated if you have configured a [dedicated Elasticsearch metrics store](https://esrally.readthedocs.io/en/stable/configuration.html#advanced-configuration).
+
+Checking Queries and Responses
+--------------------------------------------------------------
+
+As described above, errors can lead to misleading benchmarking results. Some issues, however, are more subtle and the result of queries not behaving and matching as intended.
+
+Consider the following simple Rally operation:
+
+    {
+      "name": "geo_distance",
+      "operation-type": "search",
+      "index": "logs-*",
+      "body": {
+        "query": {
+           "term": {
+		      "http.request.method": {
+		        "value": "GET"
+		      }
+		    }
+        }
+      }
+    }
+For this term query to match the field ``http.request.method`` needs to be type `keyword`. Should this field be [dynamically mapped](https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html), its default type will be ``text`` causing the value `GET` to be [analyzed](https://www.elastic.co/guide/en/elasticsearch/reference/current/text.html), and indexed as `get`. The above query will in turn return `0` hits. The field should either be correctly mapped or the query modified to match on `http.request.method.keyword`.
+
+Issues such as this can lead to misleading benchmarking results. Prior to running any benchmarks for analysis, we therefore recommended users ascertain whether queries are behaving as intended. Rally provides several tools to assist with this.
+
+Firstly, users can modify the [logging level](https://esrally.readthedocs.io/en/stable/configuration.html?highlight=logging#logging) of Rally to `DEBUG`. Specifically, modify the ``elasticsearch`` logger i.e.:
+
+	"loggers": {
+	  "elasticsearch": {
+	    "handlers": ["rally_log_handler"],
+	    "level": "DEBUG",
+	    "propagate": false
+	  },
+	  "rally.profile": {
+	    "handlers": ["rally_profile_handler"],
+	    "level": "INFO",
+	    "propagate": false
+	  }
+	}
+
+This will inturn ensure logs include the Elasticsearch query and accompanying response e.g.
+
+	2019-12-16 14:56:08,389 -not-actor-/PID:9790 elasticsearch DEBUG > {"sort":[{"geonameid":"asc"}],"query":{"match_all":{}}}
+	2019-12-16 14:56:08,389 -not-actor-/PID:9790 elasticsearch DEBUG < {"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":{"value":1000,"relation":"eq"},"max_score":null,"hits":[{"_index":"geonames","_type":"_doc","_id":"Lb81D28Bu7VEEZ3mXFGw","_score":null,"_source":{"geonameid": 2986043, "name": "Pic de Font Blanca", "asciiname": "Pic de Font Blanca", "alternatenames": "Pic de Font Blanca,Pic du Port", "feature_class": "T", "feature_code": "PK", "country_code": "AD", "admin1_code": "00", "population": 0, "dem": "2860", "timezone": "Europe/Andorra", "location": [1.53335, 42.64991]},"sort":[2986043]},
+
+Users should discard any performance metrics collected from a benchmark with DEBUG logging. This will likely cause a client-side bottleneck so once the correctness of the queries have been established, disable this setting and re-run any benchmarks.
+
+The number of hits from queries can also be investigated if you have configured a [dedicated Elasticsearch metrics store](https://esrally.readthedocs.io/en/stable/configuration.html#advanced-configuration). Specifically, documents within the index pattern ``rally-metrics-*`` contain a ``meta`` field with summary of individual responses e.g.
+
+	{
+	  "@timestamp" : 1597681313435,
+	  "relative-time" : 130273374,
+	  "race-id" : "452ad9d7-9c21-4828-848e-89974af3230e",
+	  "race-timestamp" : "20200817T160412Z",
+	  "environment" : "Personal",
+	  "track" : "geonames",
+	  "challenge" : "append-no-conflicts",
+	  "car" : "defaults",
+	  "name" : "latency",
+	  "value" : 270.77871300025436,
+	  "unit" : "ms",
+	  "sample-type" : "warmup",
+	  "meta" : {
+	    "source_revision" : "757314695644ea9a1dc2fecd26d1a43856725e65",
+	    "distribution_version" : "7.8.0",
+	    "distribution_flavor" : "oss",
+	    "pages" : 25,
+	    "hits" : 11396503,
+	    "hits_relation" : "eq",
+	    "timed_out" : false,
+	    "took" : 110,
+	    "success" : true
+	  },
+	  "task" : "scroll",
+	  "operation" : "scroll",
+	  "operation-type" : "Search"
+	}
+

From 5db75b5b11158e39199c03177ce243fa0dd9805d Mon Sep 17 00:00:00 2001
From: Dale McDiarmid <dalem@elastic.co>
Date: Thu, 20 Aug 2020 12:41:29 +0100
Subject: [PATCH 2/4] Fix formatting errors

---
 docs/recipes.rst | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/docs/recipes.rst b/docs/recipes.rst
index 9eab9513e..9b8fddcb3 100644
--- a/docs/recipes.rst
+++ b/docs/recipes.rst
@@ -178,7 +178,7 @@ Identifying when errors have been encountered
 
 Custom track development can be error prone especially if you are testing a new query. A number of reasons can lead to queries returning errors.
 
-Consider a simple example Rally operation:
+Consider a simple example Rally operation::
 
     {
       "name": "geo_distance",
@@ -193,6 +193,7 @@ Consider a simple example Rally operation:
         }
       }
     }
+
 This query requires the field ``source.geo.location`` to be mapped as a ``geo_point`` type. If incorrectly mapped, Elasticsearch will respond with an error. 
 
 Rally will not exit on errors (unless fatal e.g. [http://man7.org/linux/man-pages/man2/connect.2.html](ECONNREFUSED)) by default, instead reporting errors in the summary report via the [Error Rate](https://esrally.readthedocs.io/en/stable/summary_report.html?highlight=on-error#error-rate) statistic. This can potentially leading to misleading results. This behavior is by design and consistent with other load testing tools such as JMeter i.e. In most cases it is desirable that a large long running benchmark should not fail because of a single error response. 
@@ -208,7 +209,7 @@ Checking Queries and Responses
 
 As described above, errors can lead to misleading benchmarking results. Some issues, however, are more subtle and the result of queries not behaving and matching as intended.
 
-Consider the following simple Rally operation:
+Consider the following simple Rally operation::
 
     {
       "name": "geo_distance",
@@ -224,11 +225,12 @@ Consider the following simple Rally operation:
         }
       }
     }
+
 For this term query to match the field ``http.request.method`` needs to be type `keyword`. Should this field be [dynamically mapped](https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html), its default type will be ``text`` causing the value `GET` to be [analyzed](https://www.elastic.co/guide/en/elasticsearch/reference/current/text.html), and indexed as `get`. The above query will in turn return `0` hits. The field should either be correctly mapped or the query modified to match on `http.request.method.keyword`.
 
 Issues such as this can lead to misleading benchmarking results. Prior to running any benchmarks for analysis, we therefore recommended users ascertain whether queries are behaving as intended. Rally provides several tools to assist with this.
 
-Firstly, users can modify the [logging level](https://esrally.readthedocs.io/en/stable/configuration.html?highlight=logging#logging) of Rally to `DEBUG`. Specifically, modify the ``elasticsearch`` logger i.e.:
+Firstly, users can modify the [logging level](https://esrally.readthedocs.io/en/stable/configuration.html?highlight=logging#logging) of Rally to `DEBUG`. Specifically, modify the ``elasticsearch`` logger i.e.::
 
 	"loggers": {
 	  "elasticsearch": {
@@ -250,7 +252,7 @@ This will inturn ensure logs include the Elasticsearch query and accompanying re
 
 Users should discard any performance metrics collected from a benchmark with DEBUG logging. This will likely cause a client-side bottleneck so once the correctness of the queries have been established, disable this setting and re-run any benchmarks.
 
-The number of hits from queries can also be investigated if you have configured a [dedicated Elasticsearch metrics store](https://esrally.readthedocs.io/en/stable/configuration.html#advanced-configuration). Specifically, documents within the index pattern ``rally-metrics-*`` contain a ``meta`` field with summary of individual responses e.g.
+The number of hits from queries can also be investigated if you have configured a [dedicated Elasticsearch metrics store](https://esrally.readthedocs.io/en/stable/configuration.html#advanced-configuration). Specifically, documents within the index pattern ``rally-metrics-*`` contain a ``meta`` field with summary of individual responses e.g.::
 
 	{
 	  "@timestamp" : 1597681313435,

From 7419fcfb14aa3a3fb272969025a71ae50333807e Mon Sep 17 00:00:00 2001
From: Dale McDiarmid <dalem@elastic.co>
Date: Fri, 21 Aug 2020 12:54:27 +0000
Subject: [PATCH 3/4] Fix links and formatting

---
 docs/command_line_reference.rst |  2 ++
 docs/configuration.rst          |  4 ++++
 docs/recipes.rst                | 24 ++++++++++++------------
 docs/summary_report.rst         |  2 ++
 4 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/docs/command_line_reference.rst b/docs/command_line_reference.rst
index 7ec1fbb94..ab87c788d 100644
--- a/docs/command_line_reference.rst
+++ b/docs/command_line_reference.rst
@@ -672,6 +672,8 @@ Client certificates can be presented regardless of the ``verify_certs`` setting,
 * Enable SSL, verify server certificates using private CA: ``--client-options="use_ssl:true,verify_certs:true,ca_certs:'/path/to/cacert.pem'"``
 * Enable SSL, verify server certificates using private CA, present client certificates: ``--client-options="use_ssl:true,verify_certs:true,ca_certs:'/path/to/cacert.pem',client_cert:'/path/to/client_cert.pem',client_key:'/path/to/client_key.pem'"``
 
+.. _command_line_reference_on_error:
+
 ``on-error``
 ~~~~~~~~~~~~
 
diff --git a/docs/configuration.rst b/docs/configuration.rst
index 2a7e43883..cfb20ec1d 100644
--- a/docs/configuration.rst
+++ b/docs/configuration.rst
@@ -39,6 +39,8 @@ Let's go through an example step by step: First run ``esrally``::
 
 Congratulations! Time to :doc:`run your first benchmark </race>`.
 
+.. _advanced_configuration:
+
 Advanced Configuration
 ----------------------
 
@@ -112,6 +114,8 @@ To verify that Rally will connect via the proxy server you can check the log fil
 
    Rally will use this proxy server only for downloading benchmark-related data. It will not use this proxy for the actual benchmark.
 
+.. _logging:
+
 Logging
 -------
 
diff --git a/docs/recipes.rst b/docs/recipes.rst
index 9b8fddcb3..51cdc2fb7 100644
--- a/docs/recipes.rst
+++ b/docs/recipes.rst
@@ -196,13 +196,13 @@ Consider a simple example Rally operation::
 
 This query requires the field ``source.geo.location`` to be mapped as a ``geo_point`` type. If incorrectly mapped, Elasticsearch will respond with an error. 
 
-Rally will not exit on errors (unless fatal e.g. [http://man7.org/linux/man-pages/man2/connect.2.html](ECONNREFUSED)) by default, instead reporting errors in the summary report via the [Error Rate](https://esrally.readthedocs.io/en/stable/summary_report.html?highlight=on-error#error-rate) statistic. This can potentially leading to misleading results. This behavior is by design and consistent with other load testing tools such as JMeter i.e. In most cases it is desirable that a large long running benchmark should not fail because of a single error response. 
+By default, Rally will not exit on errors (unless they are fatal, e.g. `ECONNREFUSED <http://man7.org/linux/man-pages/man2/connect.2.html>`_), instead reporting them in the summary report via the :ref:`Error Rate <summary_report_error_rate>` statistic. This can potentially lead to misleading results. This behavior is by design and consistent with other load testing tools such as JMeter: in most cases it is desirable that a large, long-running benchmark does not fail because of a single error response.
 
- This behavior can also be changed, by invoking Rally with the [--on-error](https://esrally.readthedocs.io/en/stable/command_line_reference.html?highlight=on-error#on-error) switch e.g.
+This behavior can be changed by invoking Rally with the :ref:`--on-error <command_line_reference_on_error>` switch, e.g.::
 
 	esrally --track=geonames --on-error=abort
 	
-Errors can also be investigated if you have configured a [dedicated Elasticsearch metrics store](https://esrally.readthedocs.io/en/stable/configuration.html#advanced-configuration).
+Errors can also be investigated if you have configured a :ref:`dedicated Elasticsearch metrics store <advanced_configuration>`.
 
 Checking Queries and Responses
 --------------------------------------------------------------
@@ -217,20 +217,20 @@ Consider the following simple Rally operation::
       "index": "logs-*",
       "body": {
         "query": {
-           "term": {
-		      "http.request.method": {
-		        "value": "GET"
-		      }
-		    }
+          "term": {
+            "http.request.method": {
+              "value": "GET"
+            }
+          }
         }
       }
     }
 
-For this term query to match the field ``http.request.method`` needs to be type `keyword`. Should this field be [dynamically mapped](https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html), its default type will be ``text`` causing the value `GET` to be [analyzed](https://www.elastic.co/guide/en/elasticsearch/reference/current/text.html), and indexed as `get`. The above query will in turn return `0` hits. The field should either be correctly mapped or the query modified to match on `http.request.method.keyword`.
+For this term query to match the field ``http.request.method`` needs to be type ``keyword``. Should this field be `dynamically mapped <https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html>`_, its default type will be ``text``, causing the value ``GET`` to be `analyzed <https://www.elastic.co/guide/en/elasticsearch/reference/current/text.html>`_ and indexed as ``get``. Because a term query does not analyze its search value, the above query will in turn return ``0`` hits. The field should either be correctly mapped or the query modified to match on ``http.request.method.keyword``.
 
 Issues such as this can lead to misleading benchmarking results. Prior to running any benchmarks for analysis, we therefore recommended users ascertain whether queries are behaving as intended. Rally provides several tools to assist with this.
 
-Firstly, users can modify the [logging level](https://esrally.readthedocs.io/en/stable/configuration.html?highlight=logging#logging) of Rally to `DEBUG`. Specifically, modify the ``elasticsearch`` logger i.e.::
+Firstly, users can modify the :ref:`logging level <logging>` of Rally to ``DEBUG``. Specifically, modify the ``elasticsearch`` logger i.e.::
 
 	"loggers": {
 	  "elasticsearch": {
@@ -245,14 +245,14 @@ Firstly, users can modify the [logging level](https://esrally.readthedocs.io/en/
 	  }
 	}
 
-This will inturn ensure logs include the Elasticsearch query and accompanying response e.g.
+This will in turn ensure logs include the Elasticsearch query and accompanying response e.g.::
 
 	2019-12-16 14:56:08,389 -not-actor-/PID:9790 elasticsearch DEBUG > {"sort":[{"geonameid":"asc"}],"query":{"match_all":{}}}
 	2019-12-16 14:56:08,389 -not-actor-/PID:9790 elasticsearch DEBUG < {"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":{"value":1000,"relation":"eq"},"max_score":null,"hits":[{"_index":"geonames","_type":"_doc","_id":"Lb81D28Bu7VEEZ3mXFGw","_score":null,"_source":{"geonameid": 2986043, "name": "Pic de Font Blanca", "asciiname": "Pic de Font Blanca", "alternatenames": "Pic de Font Blanca,Pic du Port", "feature_class": "T", "feature_code": "PK", "country_code": "AD", "admin1_code": "00", "population": 0, "dem": "2860", "timezone": "Europe/Andorra", "location": [1.53335, 42.64991]},"sort":[2986043]},
 
 Users should discard any performance metrics collected from a benchmark with DEBUG logging. This will likely cause a client-side bottleneck so once the correctness of the queries have been established, disable this setting and re-run any benchmarks.
 
-The number of hits from queries can also be investigated if you have configured a [dedicated Elasticsearch metrics store](https://esrally.readthedocs.io/en/stable/configuration.html#advanced-configuration). Specifically, documents within the index pattern ``rally-metrics-*`` contain a ``meta`` field with summary of individual responses e.g.::
+The number of hits from queries can also be investigated if you have configured a :ref:`dedicated Elasticsearch metrics store <advanced_configuration>`. Specifically, documents within the index pattern ``rally-metrics-*`` contain a ``meta`` field with summary of individual responses e.g.::
 
 	{
 	  "@timestamp" : 1597681313435,
diff --git a/docs/summary_report.rst b/docs/summary_report.rst
index 157a9e277..58891f61d 100644
--- a/docs/summary_report.rst
+++ b/docs/summary_report.rst
@@ -187,6 +187,8 @@ Rally reports several percentile numbers for each task. Which percentiles are sh
 * **Definition**: Time period between start of request processing and receiving the complete response. This metric can easily be mixed up with ``latency`` but does not include waiting time. This is what most load testing tools refer to as "latency" (although it is incorrect).
 * **Corresponding metrics key**: ``service_time``
 
+.. _summary_report_error_rate:
+
 Error rate
 ----------
 

From e2ec43d9ec8526ac1d47a136f688bd04e58958aa Mon Sep 17 00:00:00 2001
From: Dale McDiarmid <dalem@elastic.co>
Date: Fri, 21 Aug 2020 13:16:35 +0000
Subject: [PATCH 4/4] Grammatical fixes

---
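Note on the ``DEBUG`` logging hint reworded in this patch: the request (``>``)
and response (``<``) lines can be paired up to see how many hits each query
returned. The sketch below is only an illustration. The function name is
hypothetical, the regular expression is modelled on the two sample log lines
in docs/recipes.rst, and it assumes that each response line directly follows
its request line, which may not hold when several clients log concurrently.

```python
import json
import re
from typing import Iterable, Iterator, Tuple

# Modelled on the sample log output: "... elasticsearch DEBUG > {...}" for
# requests and "... elasticsearch DEBUG < {...}" for responses.
LINE = re.compile(r" elasticsearch DEBUG (?P<direction>[<>]) (?P<body>\{.*)$")


def hits_per_query(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (request body, total hits) for each request/response pair."""
    pending = None
    for line in lines:
        match = LINE.search(line.rstrip("\n"))
        if match is None:
            continue
        if match.group("direction") == ">":
            pending = match.group("body")
        elif pending is not None:
            request, pending = pending, None
            try:
                response = json.loads(match.group("body"))
            except json.JSONDecodeError:
                # Truncated or non-JSON response line: skip this pair.
                continue
            total = response.get("hits", {}).get("total")
            # The sample response reports {"value": N, "relation": ...};
            # a bare integer is accepted as well.
            value = total.get("value") if isinstance(total, dict) else total
            if isinstance(value, int):
                yield request, value


# Hand-written lines in the same shape as the sample log output.
log_lines = [
    '2019-12-16 14:56:08,389 -not-actor-/PID:9790 elasticsearch DEBUG > '
    '{"query":{"term":{"http.request.method":{"value":"GET"}}}}',
    '2019-12-16 14:56:08,390 -not-actor-/PID:9790 elasticsearch DEBUG < '
    '{"took":1,"timed_out":false,"hits":{"total":{"value":0,"relation":"eq"},"hits":[]}}',
]

for request, hits in hits_per_query(log_lines):
    print(hits, request)
```

A query that reports 0 hits here is a candidate for the mapping problem
described in the "Checking Queries and Responses" section.
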
 docs/recipes.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/recipes.rst b/docs/recipes.rst
index 51cdc2fb7..ccafcdf99 100644
--- a/docs/recipes.rst
+++ b/docs/recipes.rst
@@ -230,7 +230,7 @@ For this term query to match the field ``http.request.method`` needs to be type
 
-Issues such as this can lead to misleading benchmarking results. Prior to running any benchmarks for analysis, we therefore recommended users ascertain whether queries are behaving as intended. Rally provides several tools to assist with this.
+Issues such as this can lead to misleading benchmarking results. Prior to running any benchmarks for analysis, we therefore recommend that users ascertain whether queries are behaving as intended. Rally provides several tools to assist with this.
 
-Firstly, users can modify the :ref:`logging level <logging>` of Rally to ``DEBUG``. Specifically, modify the ``elasticsearch`` logger i.e.::
+Firstly, users can set the :ref:`log level <logging>` for the Elasticsearch client to ``DEBUG`` i.e.::
 
 	"loggers": {
 	  "elasticsearch": {
@@ -250,9 +250,9 @@ This will in turn ensure logs include the Elasticsearch query and accompanying r
 	2019-12-16 14:56:08,389 -not-actor-/PID:9790 elasticsearch DEBUG > {"sort":[{"geonameid":"asc"}],"query":{"match_all":{}}}
 	2019-12-16 14:56:08,389 -not-actor-/PID:9790 elasticsearch DEBUG < {"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":{"value":1000,"relation":"eq"},"max_score":null,"hits":[{"_index":"geonames","_type":"_doc","_id":"Lb81D28Bu7VEEZ3mXFGw","_score":null,"_source":{"geonameid": 2986043, "name": "Pic de Font Blanca", "asciiname": "Pic de Font Blanca", "alternatenames": "Pic de Font Blanca,Pic du Port", "feature_class": "T", "feature_code": "PK", "country_code": "AD", "admin1_code": "00", "population": 0, "dem": "2860", "timezone": "Europe/Andorra", "location": [1.53335, 42.64991]},"sort":[2986043]},
 
-Users should discard any performance metrics collected from a benchmark with DEBUG logging. This will likely cause a client-side bottleneck so once the correctness of the queries have been established, disable this setting and re-run any benchmarks.
+Users should discard any performance metrics collected from a benchmark with ``DEBUG`` logging enabled. Logging at this level will likely cause a client-side bottleneck, so once the correctness of the queries has been established, disable this setting and re-run any benchmarks.
 
-The number of hits from queries can also be investigated if you have configured a :ref:`dedicated Elasticsearch metrics store <advanced_configuration>`. Specifically, documents within the index pattern ``rally-metrics-*`` contain a ``meta`` field with summary of individual responses e.g.::
+The number of hits from queries can also be investigated if you have configured a :ref:`dedicated Elasticsearch metrics store <advanced_configuration>`. Specifically, documents within the index pattern ``rally-metrics-*`` contain a ``meta`` field with a summary of individual responses e.g.::
 
 	{
 	  "@timestamp" : 1597681313435,