From 9d28ba1162a54d1c90f888813f2dc867899ba691 Mon Sep 17 00:00:00 2001 From: Bolarinwa Saheed Date: Sat, 17 Mar 2018 20:57:13 +0100 Subject: [PATCH 01/10] Glossary update, Closes Issue #16891 Terms "Filter" and "Query" were added as described here https://www.elastic.co/guide/en/elasticsearch/guide/current/_queries_and_filters.html --- docs/reference/glossary.asciidoc | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/docs/reference/glossary.asciidoc b/docs/reference/glossary.asciidoc index 0012beebdca98..aaadb9c41c7fc 100644 --- a/docs/reference/glossary.asciidoc +++ b/docs/reference/glossary.asciidoc @@ -61,6 +61,16 @@ `object`. The mapping also allows you to define (amongst other things) how the value for a field should be analyzed. +[[glossary-filter]] filter :: + + A filter is kind of query known as a "non-scoring" query. It does not give a score, + it is only concerned about answering the question - "Does this document match?". + The answer is always a simple, binary yes|no. This kind of query is said to be made + in a "filtering" context, hence it is called a filter. Filtering queries/Filters are + simple checks for set inclusion/exclusion. The goal of filtering is to reduce the + number of documents that have to be examined, as it is in the case of + <>. + [[glossary-index]] index :: An index is like a _table_ in a relational database. It has a @@ -105,6 +115,19 @@ + See also <> +[[glossary-query]] query :: + + Query refers to all queries which not only determine if a document matches, + but also calculate how well the document matches. This calculation is refered + to as scoring, hence these queries are also known as "scoring queries". + A scoring query calculates how relevant each document is to the query, and assigns + it a relevance _score, which is later used to sort matching documents by relevance. + This concept of relevance is well suited to full-text search, where there is seldom a + completely “correct” answer. 
These queries are more heavier than + <> and their query results are not cacheable. + As a general rule, use query clauses for full-text search or for any condition that should + affect the relevance score, and use filters for everything else. + [[glossary-replica-shard]] replica shard :: Each <> can have zero or more From d28152e05055b97ff125225296d93f908cba89ca Mon Sep 17 00:00:00 2001 From: Bolarinwa Saheed Date: Wed, 21 Mar 2018 16:54:38 +0100 Subject: [PATCH 02/10] Updated based on reviews --- docs/reference/glossary.asciidoc | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/docs/reference/glossary.asciidoc b/docs/reference/glossary.asciidoc index aaadb9c41c7fc..67df500199188 100644 --- a/docs/reference/glossary.asciidoc +++ b/docs/reference/glossary.asciidoc @@ -63,12 +63,12 @@ [[glossary-filter]] filter :: - A filter is kind of query known as a "non-scoring" query. It does not give a score, - it is only concerned about answering the question - "Does this document match?". - The answer is always a simple, binary yes|no. This kind of query is said to be made - in a "filtering" context, hence it is called a filter. Filtering queries/Filters are - simple checks for set inclusion/exclusion. The goal of filtering is to reduce the - number of documents that have to be examined, as it is in the case of + A filter is a query. It is a kind of query which does not give a score, so it is + known as a "non-scoring" query.It is only concerned about answering the question - + "Does this document match?". The answer is always a simple, binary yes or no. This kind of query is said to be made + in a "filtering" context (e.g. Is the created date in the range 2013 - 2014?), hence it is called a filter. Filters are + simple checks for set inclusion or exclusion. The goal of filtering is to reduce the + number of documents that have to be examined, instead of what happens in the case of <>. 
[[glossary-index]] index :: @@ -117,14 +117,15 @@ [[glossary-query]] query :: - Query refers to all queries which not only determine if a document matches, - but also calculate how well the document matches. This calculation is refered - to as scoring, hence these queries are also known as "scoring queries". + A query is the basic component of a search. A search can be defined by one or more queries + which can be mixed and matched in endless combinations. The term Query refers to all queries + which not only determine if a document matches, but also calculate how well the document matches. + This calculation is refered to as scoring, hence these queries are also known as "scoring queries". A scoring query calculates how relevant each document is to the query, and assigns - it a relevance _score, which is later used to sort matching documents by relevance. + it a relevance score, which is later used to sort matching documents by relevance. This concept of relevance is well suited to full-text search, where there is seldom a - completely “correct” answer. These queries are more heavier than - <> and their query results are not cacheable. + completely “correct” answer. These queries are takes more resources than + <> and their query results are not cacheable. As a general rule, use query clauses for full-text search or for any condition that should affect the relevance score, and use filters for everything else. From a14b6c76937110313c1416a905282dc1b10767cf Mon Sep 17 00:00:00 2001 From: Bolarinwa Saheed Olayemi Date: Wed, 28 Mar 2018 16:46:02 +0200 Subject: [PATCH 03/10] Updated based on review comments --- docs/reference/glossary.asciidoc | 46 ++++++++++---------------------- 1 file changed, 14 insertions(+), 32 deletions(-) diff --git a/docs/reference/glossary.asciidoc b/docs/reference/glossary.asciidoc index 67df500199188..33c9cbb16cb32 100644 --- a/docs/reference/glossary.asciidoc +++ b/docs/reference/glossary.asciidoc @@ -54,7 +54,6 @@ pairs. 
The value can be a simple (scalar) value (eg a string, integer, date), or a nested structure like an array or an object. A field is similar to a column in a table in a relational database. - + The <> for each field has a field _type_ (not to be confused with document <>) which indicates the type of data that can be stored in that field, eg `integer`, `string`, @@ -63,20 +62,18 @@ [[glossary-filter]] filter :: - A filter is a query. It is a kind of query which does not give a score, so it is - known as a "non-scoring" query.It is only concerned about answering the question - - "Does this document match?". The answer is always a simple, binary yes or no. This kind of query is said to be made - in a "filtering" context (e.g. Is the created date in the range 2013 - 2014?), hence it is called a filter. Filters are - simple checks for set inclusion or exclusion. The goal of filtering is to reduce the - number of documents that have to be examined, instead of what happens in the case of - <>. + A filter is a non-scoring <>, meaning that it does not score documents. + It is only concerned about answering the question - "Does this document match?". + The answer is always a simple, binary yes or no. This kind of query is said to be made + in a https://www.elastic.co/guide/en/elasticsearch/reference/current/query-filter-context.html["filtering" context], + hence it is called a filter. Filters are simple checks for set inclusion or exclusion. + In most cases, the goal of filtering is to reduce the number of documents that have to be examined. [[glossary-index]] index :: An index is like a _table_ in a relational database. It has a <> which contains a <>, which contains the <> in the index. - + An index is a logical namespace which maps to one or more <> and can have zero or more <>. @@ -86,7 +83,6 @@ A mapping is like a _schema definition_ in a relational database. Each <> has a mapping, which defines a <>, plus a number of index-wide settings. 
- + A mapping can either be defined explicitly, or it will be generated automatically when a document is indexed. @@ -96,7 +92,6 @@ <>. Multiple nodes can be started on a single server for testing purposes, but usually you should have one node per server. - + At startup, a node will use unicast to discover an existing cluster with the same cluster name and will try to join that cluster. @@ -105,41 +100,33 @@ Each document is stored in a single primary <>. When you index a document, it is indexed first on the primary shard, then on all <> of the primary shard. - + By default, an <> has 5 primary shards. You can specify fewer or more primary shards to scale the number of <> that your index can handle. - + You cannot change the number of primary shards in an index, once the index is created. - + See also <> [[glossary-query]] query :: A query is the basic component of a search. A search can be defined by one or more queries - which can be mixed and matched in endless combinations. The term Query refers to all queries - which not only determine if a document matches, but also calculate how well the document matches. - This calculation is refered to as scoring, hence these queries are also known as "scoring queries". - A scoring query calculates how relevant each document is to the query, and assigns - it a relevance score, which is later used to sort matching documents by relevance. - This concept of relevance is well suited to full-text search, where there is seldom a - completely “correct” answer. These queries are takes more resources than - <> and their query results are not cacheable. - As a general rule, use query clauses for full-text search or for any condition that should - affect the relevance score, and use filters for everything else. + which can be mixed and matched in endless combinations. 
While <> are + queries that only determine if a document matches, those queries that also calculate or score + how well the document matches are known as "scoring queries". A scoring query calculates how + well a document mathces a query, and assigns it a score, which is later used to sort matching + documents by score. Scoring queries takes more resources than <> and + their query results are not cacheable. As a general rule, use query clauses for full-text search + or for any condition that requires scoring, and use filters for everything else. [[glossary-replica-shard]] replica shard :: Each <> can have zero or more replicas. A replica is a copy of the primary shard, and has two purposes: - + 1. increase failover: a replica shard can be promoted to a primary shard if the primary fails 2. increase performance: get and search requests can be handled by primary or replica shards. - + By default, each primary shard has one replica, but the number of replicas can be changed dynamically on an existing index. A replica shard will never be started on the same node as its primary shard. @@ -152,7 +139,6 @@ the ID of the document or, if the document has a specified parent document, from the ID of the parent document (to ensure that child and parent documents are stored on the same shard). - + This value can be overridden by specifying a `routing` value at index time, or a <> in the <>. @@ -163,11 +149,9 @@ which is managed automatically by Elasticsearch. An index is a logical namespace which points to <> and <> shards. - + Other than defining the number of primary and replica shards that an index should have, you never need to refer to shards directly. Instead, your code should deal only with an index. - + Elasticsearch distributes shards amongst all <> in the <>, and can move shards automatically from one node to another in the case of node failure, or the addition of new @@ -185,7 +169,7 @@ A term is an exact value that is indexed in Elasticsearch. 
The terms `foo`, `Foo`, `FOO` are NOT equivalent. Terms (i.e. exact values) can - be searched for using _term_ queries. + + be searched for using _term_ queries. See also <> and <>. [[glossary-text]] text :: @@ -193,12 +177,10 @@ Text (or full text) is ordinary unstructured text, such as this paragraph. By default, text will be <> into <>, which is what is actually stored in the index. - + Text <> need to be analyzed at index time in order to be searchable as full text, and keywords in full text queries must be analyzed at search time to produce (and search for) the same terms that were generated at index time. - + See also <> and <>. [[glossary-type]] type :: From 6d091ff05f49e8a4dac0e6eebbe3560559853bd3 Mon Sep 17 00:00:00 2001 From: Bolarinwa Saheed Olayemi Date: Fri, 30 Mar 2018 01:22:52 +0200 Subject: [PATCH 04/10] "+" used for styling restored and more editing --- docs/reference/glossary.asciidoc | 25 ++++++++++++++++++------- 1 file changed, 18 insertions(+), 7 deletions(-) diff --git a/docs/reference/glossary.asciidoc b/docs/reference/glossary.asciidoc index 33c9cbb16cb32..822f11ec8f5e7 100644 --- a/docs/reference/glossary.asciidoc +++ b/docs/reference/glossary.asciidoc @@ -74,6 +74,7 @@ An index is like a _table_ in a relational database. It has a <> which contains a <>, which contains the <> in the index. + + An index is a logical namespace which maps to one or more <> and can have zero or more <>. @@ -83,6 +84,7 @@ A mapping is like a _schema definition_ in a relational database. Each <> has a mapping, which defines a <>, plus a number of index-wide settings. + + A mapping can either be defined explicitly, or it will be generated automatically when a document is indexed. @@ -92,6 +94,7 @@ <>. Multiple nodes can be started on a single server for testing purposes, but usually you should have one node per server. + + At startup, a node will use unicast to discover an existing cluster with the same cluster name and will try to join that cluster. 
@@ -100,33 +103,37 @@ Each document is stored in a single primary <>. When you index a document, it is indexed first on the primary shard, then on all <> of the primary shard. + + By default, an <> has 5 primary shards. You can specify fewer or more primary shards to scale the number of <> that your index can handle. + + You cannot change the number of primary shards in an index, once the index is created. + + See also <> [[glossary-query]] query :: A query is the basic component of a search. A search can be defined by one or more queries which can be mixed and matched in endless combinations. While <> are - queries that only determine if a document matches, those queries that also calculate or score - how well the document matches are known as "scoring queries". A scoring query calculates how - well a document mathces a query, and assigns it a score, which is later used to sort matching - documents by score. Scoring queries takes more resources than <> and - their query results are not cacheable. As a general rule, use query clauses for full-text search - or for any condition that requires scoring, and use filters for everything else. + queries that only determine if a document matches, those queries that also calculate how well + the document matches are known as "scoring queries". A scoring query assigns each matching document a score, which is + later used to sort the matched documents. Scoring queries take more resources than <> + and their query results are not cacheable. As a general rule, use query clauses for full-text + search or for any condition that requires scoring, and use filters for everything else. [[glossary-replica-shard]] replica shard :: Each <> can have zero or more replicas. A replica is a copy of the primary shard, and has two purposes: + + 1. increase failover: a replica shard can be promoted to a primary shard if the primary fails 2. increase performance: get and search requests can be handled by primary or replica shards. 
+ + By default, each primary shard has one replica, but the number of replicas can be changed dynamically on an existing index. A replica shard will never be started on the same node as its primary shard. @@ -149,6 +156,7 @@ which is managed automatically by Elasticsearch. An index is a logical namespace which points to <> and <> shards. + + Other than defining the number of primary and replica shards that an index should have, you never need to refer to shards directly. Instead, your code should deal only with an index. @@ -170,17 +178,20 @@ A term is an exact value that is indexed in Elasticsearch. The terms `foo`, `Foo`, `FOO` are NOT equivalent. Terms (i.e. exact values) can be searched for using _term_ queries. - See also <> and <>. + + + See also <> and <>. [[glossary-text]] text :: Text (or full text) is ordinary unstructured text, such as this paragraph. By default, text will be <> into <>, which is what is actually stored in the index. + + Text <> need to be analyzed at index time in order to be searchable as full text, and keywords in full text queries must be analyzed at search time to produce (and search for) the same terms that were generated at index time. + + See also <> and <>. [[glossary-type]] type :: From f57cb47ee4554662850468c1243d57ed1926737c Mon Sep 17 00:00:00 2001 From: Bolarinwa Saheed Olayemi Date: Tue, 3 Apr 2018 23:44:16 +0200 Subject: [PATCH 05/10] Reverted some unrelated changes made and added relative link --- docs/reference/glossary.asciidoc | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/docs/reference/glossary.asciidoc b/docs/reference/glossary.asciidoc index 822f11ec8f5e7..1a4d02822d76d 100644 --- a/docs/reference/glossary.asciidoc +++ b/docs/reference/glossary.asciidoc @@ -54,6 +54,7 @@ pairs. The value can be a simple (scalar) value (eg a string, integer, date), or a nested structure like an array or an object. A field is similar to a column in a table in a relational database. 
+ + The <> for each field has a field _type_ (not to be confused with document <>) which indicates the type of data that can be stored in that field, eg `integer`, `string`, @@ -65,7 +66,7 @@ A filter is a non-scoring <>, meaning that it does not score documents. It is only concerned about answering the question - "Does this document match?". The answer is always a simple, binary yes or no. This kind of query is said to be made - in a https://www.elastic.co/guide/en/elasticsearch/reference/current/query-filter-context.html["filtering" context], + in a <>, hence it is called a filter. Filters are simple checks for set inclusion or exclusion. In most cases, the goal of filtering is to reduce the number of documents that have to be examined. @@ -146,6 +147,7 @@ the ID of the document or, if the document has a specified parent document, from the ID of the parent document (to ensure that child and parent documents are stored on the same shard). + + This value can be overridden by specifying a `routing` value at index time, or a <> in the <>. @@ -160,6 +162,7 @@ Other than defining the number of primary and replica shards that an index should have, you never need to refer to shards directly. Instead, your code should deal only with an index. 
+ + Elasticsearch distributes shards amongst all <> in the <>, and can move shards automatically from one node to another in the case of node failure, or the addition of new From 69a9fc36433d2b92a278d5521cf6574cba68d4f5 Mon Sep 17 00:00:00 2001 From: Bolarinwa Saheed Olayemi Date: Wed, 4 Apr 2018 11:11:58 +0200 Subject: [PATCH 06/10] changed ` "filter" context ` to ` filter context` --- docs/reference/glossary.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/glossary.asciidoc b/docs/reference/glossary.asciidoc index 1a4d02822d76d..34325394d46c9 100644 --- a/docs/reference/glossary.asciidoc +++ b/docs/reference/glossary.asciidoc @@ -66,7 +66,7 @@ A filter is a non-scoring <>, meaning that it does not score documents. It is only concerned about answering the question - "Does this document match?". The answer is always a simple, binary yes or no. This kind of query is said to be made - in a <>, + in a <>, hence it is called a filter. Filters are simple checks for set inclusion or exclusion. In most cases, the goal of filtering is to reduce the number of documents that have to be examined. From 2c406050075a91c78d927fcadeca060eba710ba9 Mon Sep 17 00:00:00 2001 From: Bolarinwa Saheed Olayemi Date: Wed, 4 Apr 2018 17:29:15 +0200 Subject: [PATCH 07/10] Corrected absolute link to the proper relative link --- docs/reference/glossary.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/glossary.asciidoc b/docs/reference/glossary.asciidoc index 34325394d46c9..91ccfb56f7022 100644 --- a/docs/reference/glossary.asciidoc +++ b/docs/reference/glossary.asciidoc @@ -66,7 +66,7 @@ A filter is a non-scoring <>, meaning that it does not score documents. It is only concerned about answering the question - "Does this document match?". The answer is always a simple, binary yes or no. This kind of query is said to be made - in a <>, + in a <>, hence it is called a filter. 
Filters are simple checks for set inclusion or exclusion. In most cases, the goal of filtering is to reduce the number of documents that have to be examined. From b5024e82fa734d617e63fcbe0e7bd4a17b99d0d5 Mon Sep 17 00:00:00 2001 From: Bolarinwa Saheed Date: Thu, 12 Apr 2018 23:21:16 +0200 Subject: [PATCH 08/10] fix bad link syntax --- docs/reference/glossary.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/glossary.asciidoc b/docs/reference/glossary.asciidoc index 91ccfb56f7022..38d4446a3b6ea 100644 --- a/docs/reference/glossary.asciidoc +++ b/docs/reference/glossary.asciidoc @@ -66,7 +66,7 @@ A filter is a non-scoring <>, meaning that it does not score documents. It is only concerned about answering the question - "Does this document match?". The answer is always a simple, binary yes or no. This kind of query is said to be made - in a <>, + in a https://www.elastic.co/guide/en/elasticsearch/guide/current/_queries_and_filters.html[filter context], hence it is called a filter. Filters are simple checks for set inclusion or exclusion. In most cases, the goal of filtering is to reduce the number of documents that have to be examined. From 5bc957ac2f1685374444de9fab773f90afedba42 Mon Sep 17 00:00:00 2001 From: Bolarinwa Saheed Date: Fri, 13 Apr 2018 16:57:35 +0200 Subject: [PATCH 09/10] Edit link to query_filter_context, now getting error --- docs/reference/glossary.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/glossary.asciidoc b/docs/reference/glossary.asciidoc index 38d4446a3b6ea..1e5f27d6467a3 100644 --- a/docs/reference/glossary.asciidoc +++ b/docs/reference/glossary.asciidoc @@ -66,7 +66,7 @@ A filter is a non-scoring <>, meaning that it does not score documents. It is only concerned about answering the question - "Does this document match?". The answer is always a simple, binary yes or no. 
This kind of query is said to be made - in a https://www.elastic.co/guide/en/elasticsearch/guide/current/_queries_and_filters.html[filter context], + in a <>, hence it is called a filter. Filters are simple checks for set inclusion or exclusion. In most cases, the goal of filtering is to reduce the number of documents that have to be examined. From 3813c919ec8822d0a381ae5a5db729d30635b06c Mon Sep 17 00:00:00 2001 From: Bolarinwa Saheed Date: Sat, 14 Apr 2018 07:57:58 +0200 Subject: [PATCH 10/10] fix the problem with the link to query-filter-context --- docs/reference/glossary.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/glossary.asciidoc b/docs/reference/glossary.asciidoc index 1e5f27d6467a3..53164d366cd93 100644 --- a/docs/reference/glossary.asciidoc +++ b/docs/reference/glossary.asciidoc @@ -66,7 +66,7 @@ A filter is a non-scoring <>, meaning that it does not score documents. It is only concerned about answering the question - "Does this document match?". The answer is always a simple, binary yes or no. This kind of query is said to be made - in a <>, + in a <>, hence it is called a filter. Filters are simple checks for set inclusion or exclusion. In most cases, the goal of filtering is to reduce the number of documents that have to be examined.
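The distinction the `filter` and `query` glossary entries draw can be sketched with a `bool` query body, where the `must` clause runs in query context (it contributes to the score) and the `filter` clause runs in filter context (a plain yes-or-no match). This is only an illustrative sketch: the helper name `build_search_body` and the field names `title` and `created` are assumptions that do not come from the patch, and the date range simply mirrors the "created date in the range 2013 - 2014" example from an earlier revision. Nothing is sent to a cluster; the code only builds and prints the JSON body.

```python
import json


def build_search_body(text, created_from, created_to):
    """Build a search body that mixes a scoring clause with a filter.

    `title` and `created` are hypothetical field names used for illustration.
    """
    return {
        "query": {
            "bool": {
                # Query context: scores how well each document matches `text`.
                "must": [{"match": {"title": text}}],
                # Filter context: only answers "does this document match?".
                "filter": [
                    {"range": {"created": {"gte": created_from, "lte": created_to}}}
                ],
            }
        }
    }


if __name__ == "__main__":
    body = build_search_body("full text search", "2013-01-01", "2014-12-31")
    print(json.dumps(body, indent=2))
```

Following the general rule stated in the `query` entry, the full-text condition sits under `must` because it should affect relevance, while the date restriction sits under `filter` because it only narrows the set of documents.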