From 3aab8187d1448273d5fb22b3ce629b45150573a7 Mon Sep 17 00:00:00 2001
From: Berg Lloyd-Haig
Date: Wed, 16 Aug 2017 05:31:28 +1000
Subject: [PATCH] Docs disambiguate reindex's requests_per_second (#26185)

Reindex's docs were somewhere between unclear and inaccurate around
`requests_per_second`. This makes them much more clear and accurate.
---
 docs/reference/docs/reindex.asciidoc | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/docs/reference/docs/reindex.asciidoc b/docs/reference/docs/reindex.asciidoc
index 86132ea56a228..3fe6e4fe4312c 100644
--- a/docs/reference/docs/reindex.asciidoc
+++ b/docs/reference/docs/reindex.asciidoc
@@ -535,14 +535,21 @@ shards to become available. Both work exactly how they work in the
 <>.
 
 `requests_per_second` can be set to any positive decimal number (`1.4`, `6`,
-`1000`, etc) and throttles the number of requests per second that the reindex
-issues or it can be set to `-1` to disabled throttling. The throttling is done
-waiting between bulk batches so that it can manipulate the scroll timeout. The
-wait time is the difference between the time it took the batch to complete and
-the time `requests_per_second * requests_in_the_batch`. Since the batch isn't
-broken into multiple bulk requests large batch sizes will cause Elasticsearch
-to create many requests and then wait for a while before starting the next set.
-This is "bursty" instead of "smooth". The default is `-1`.
+`1000`, etc) and throttles the number of batches that the reindex issues by
+padding each batch with a wait time. The throttling can be disabled by
+setting `requests_per_second` to `-1`.
+
+The throttling is done waiting between bulk batches so that it can manipulate the
+scroll timeout. The wait time is the difference between the request scroll search
+size divided by the `requests_per_second` and the `batch_write_time`. By default
+the scroll batch size is `1000`, so if the `requests_per_second` is set to `500`:
+
+`target_total_time` = `1000` / `500 per second` = `2 seconds`
+
+`wait_time` = `target_total_time` - `batch_write_time` = `2 seconds` - `.5 seconds` = `1.5 seconds`
+
+Since the batch isn't broken into multiple bulk requests large batch sizes will
+cause Elasticsearch to create many requests and then wait for a while before
+starting the next set. This is "bursty" instead of "smooth". The default is `-1`.
 
 [float]
 [[docs-reindex-response-body]]
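
Review note: the wait-time arithmetic in the added text can be checked with a small sketch. This is not Elasticsearch's implementation. The function name `reindex_wait_time` and its parameters are hypothetical, chosen to mirror the terms in the patch. Clamping the result at zero when a batch takes longer than its target time is an assumption that the patch text does not state.

```python
def reindex_wait_time(batch_size, requests_per_second, batch_write_time):
    """Return the seconds to wait after one batch, following the patched docs.

    batch_size          -- scroll batch size (the docs give 1000 as the default)
    requests_per_second -- throttle setting; -1 disables throttling
    batch_write_time    -- seconds the bulk write for this batch took
    """
    if requests_per_second == -1:
        # Throttling disabled: no padding between batches.
        return 0.0
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive or -1")

    # target_total_time = batch size / requests_per_second
    target_total_time = batch_size / requests_per_second

    # wait_time = target_total_time - batch_write_time.
    # Assumption: never return a negative wait if the write overran the target.
    return max(0.0, target_total_time - batch_write_time)


# The worked example from the patch: 1000 / 500 = 2 s target, minus a 0.5 s write.
print(reindex_wait_time(1000, 500, 0.5))  # 1.5
```

With the default batch size of `1000`, a lower `requests_per_second` gives a longer pause after each batch, which is the "bursty" behaviour the patch describes.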