From e297457e6c0ce95e306e18862f757f3a1fa381ce Mon Sep 17 00:00:00 2001
From: David Turner
Date: Mon, 16 Jul 2018 12:04:38 +0100
Subject: [PATCH] Improve docs for search preferences

Today it is unclear what guarantees are offered by the search preference
feature, and we claim a guarantee that is stronger than what we really offer:

> A custom value will be used to guarantee that the same shards will be used
> for the same custom value.

This commit clarifies this documentation and explains more clearly why
`_primary`, `_replica`, etc. are deprecated in `6.x` and removed in `master`.

Relates #31929 #26335 #26791.
---
 .../search/request/preference.asciidoc | 102 ++++++++++++------
 1 file changed, 68 insertions(+), 34 deletions(-)

diff --git a/docs/reference/search/request/preference.asciidoc b/docs/reference/search/request/preference.asciidoc
index c2bb190351404..ef67d74ebd437 100644
--- a/docs/reference/search/request/preference.asciidoc
+++ b/docs/reference/search/request/preference.asciidoc
@@ -1,56 +1,81 @@
 [[search-request-preference]]
 === Preference
 
-Controls a `preference` of which shard copies on which to execute the
-search. By default, the operation is randomized among the available shard
-copies, unless allocation awareness is used.
+Controls a `preference` of which shard copies on which to execute the search.
+By default, Elasticsearch selects from the available shard copies in an
+unspecified order, taking the <> and
+<> configuration into
+account. However, it may sometimes be desirable to try and route certain
+searches to certain sets of shard copies, for instance to make better use of
+per-copy caches.
+
+Preferences do not _guarantee_ that any particular shard copies are used in a
+search, and on a changing index this may mean that repeated searches may yield
+different results if they are executed on different shard copies which are in
+different refresh states.
 
 The `preference` is a query string parameter which can be set to:
 
 [horizontal]
-`_primary`::
-    The operation will go and be executed only on the primary
-    shards. deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]
+`_primary`::
+    The operation will be executed only on primary shards.
+    deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or
+    `_prefer_nodes`]
 
-`_primary_first`::
-    The operation will go and be executed on the primary
-    shard, and if not available (failover), will execute on other shards.
-    deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]
+`_primary_first`::
+    The operation will be executed on primary shards if possible, but will
+    fall back to other shards if not. deprecated[6.1.0, will be removed in
+    7.0, use `_only_nodes` or `_prefer_nodes`]
 
 `_replica`::
-    The operation will go and be executed only on a replica shard.
-    deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]
+    The operation will be executed only on replica shards. If there are
+    multiple replicas then the order of preference between them is
+    unspecified. deprecated[6.1.0, will be removed in 7.0, use
+    `_only_nodes` or `_prefer_nodes`]
 
 `_replica_first`::
-    The operation will go and be executed only on a replica shard, and if
-    not available (failover), will execute on other shards.
-    deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]
+    The operation will be executed on replica shards if possible, but will
+    fall back to other shards if not. If there are multiple replicas then
+    the order of preference between them is unspecified. deprecated[6.1.0,
+    will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]
+
+`_only_local`::
+    The operation will be executed only on shards allocated to the local
+    node.
 
-`_local`::
-    The operation will prefer to be executed on a local
-    allocated shard if possible.
+`_local`::
+    The operation will be executed on shards allocated to the local node if
+    possible, and will fall back to other shards if not.
 
 `_prefer_nodes:abc,xyz`::
-    Prefers execution on the nodes with the provided
-    node ids (`abc` or `xyz` in this case) if applicable.
+    The operation will be executed on nodes with one of the provided node
+    ids (`abc` or `xyz` in this case) if possible. If suitable shard copies
+    exist on more than one of the selected nodes then the order of
+    preference between these copies is unspecified.
 
-`_shards:2,3`::
-    Restricts the operation to the specified shards. (`2`
-    and `3` in this case). This preference can be combined with other
-    preferences but it has to appear first: `_shards:2,3|_local`
+`_shards:2,3`::
+    Restricts the operation to the specified shards. (`2` and `3` in this
+    case). This preference can be combined with other preferences but it
+    has to appear first: `_shards:2,3|_local`
 
-`_only_nodes`::
-    Restricts the operation to nodes specified in <>
+`_only_nodes:abc*,x*yz,...`::
+    Restricts the operation to nodes specified according to the
+    <>. If suitable shard copies exist on more
+    than one of the selected nodes then the order of preference between
+    these copies is unspecified.
 
-Custom (string) value::
-    A custom value will be used to guarantee that
-    the same shards will be used for the same custom value. This can help
-    with "jumping values" when hitting different shards in different refresh
-    states. A sample value can be something like the web session id, or the
-    user name.
+Custom (string) value::
+    Any value that does not start with `_`. If two searches both give the same
+    custom string value for their preference and the underlying cluster state
+    does not change then the same ordering of shards will be used for the
+    searches. This does not guarantee that the exact same shards will be used
+    each time: the cluster state, and therefore the selected shards, may change
+    for a number of reasons including shard relocations and shard failures, and
+    nodes may sometimes reject searches causing fallbacks to alternative nodes.
+    A good candidate for a custom preference value is something like the web
+    session id or the user name.
 
-For instance, use the user's session ID to ensure consistent ordering of results
-for the user:
+For instance, use the user's session ID `xyzabc123` as follows:
 
 [source,js]
 ------------------------------------------------
@@ -65,3 +90,12 @@ GET /_search?preference=xyzabc123
 ------------------------------------------------
 // CONSOLE
 
+WARNING: The `_primary`, `_primary_first`, `_replica` and `_replica_first` are
+not recommended, and will be removed in a future version. They do not help to
+avoid inconsistent results that arise from the use of shards that have
+different refresh states, and Elasticsearch uses synchronous replication so the
+primary does not in general hold fresher data than its replicas. The
+`_primary_first` and `_replica_first` preferences silently fall back to
+non-preferred copies if it is not possible to search the preferred copies. The
+`_primary` and `_replica` preferences will silently change their preferred
+shards if a replica is promoted to primary, which can happen at any time.
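
Illustrative note, not part of the patch: the custom-value behaviour documented above could be exercised from client code roughly as follows. This is a minimal sketch that only builds the request path and sends nothing. The helper names `custom_preference` and `search_path`, and the `session-` prefix used to keep a value from starting with `_`, are assumptions made for illustration and are not Elasticsearch APIs.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def custom_preference(session_id: str) -> str:
    """Derive a custom preference string from a session ID.

    The docs describe a custom value as any value that does not start
    with `_`, so a hypothetical `session-` prefix is added when needed.
    """
    if not session_id:
        raise ValueError("session_id must be non-empty")
    if session_id.startswith("_"):
        return "session-" + session_id
    return session_id


def search_path(session_id: str, index: Optional[str] = None) -> str:
    """Build a `_search` path carrying the preference as a query parameter."""
    prefix = "/" + quote(index, safe="") if index else ""
    query = urlencode({"preference": custom_preference(session_id)})
    return prefix + "/_search?" + query


# Same session ID gives the same preference string on every call.
print(search_path("xyzabc123"))
```

Reusing one string per session only yields the same shard ordering while the cluster state is unchanged, as the patched text explains; it does not pin a search to particular shard copies.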