Glossary update, Closes Issue #16891 #29127

Merged
merged 10 commits into from
Apr 16, 2018
23 changes: 23 additions & 0 deletions docs/reference/glossary.asciidoc
@@ -61,6 +61,16 @@
`object`. The mapping also allows you to define (amongst other things)
how the value for a field should be analyzed.

[[glossary-filter]] filter ::

A filter is kind of query known as a "non-scoring" query. It does not give a score,
Member

This is correct, but could you make it clearer from the beginning that a "filter" is a normal query, just one that doesn't score? You already say this, but in the current order the definition isn't as clear as it could be. I think this just needs a little rephrasing.

Member

also: "is a kind" if you leave that part in anywhere.

it is only concerned about answering the question - "Does this document match?".
The answer is always a simple, binary yes|no. This kind of query is said to be made
Member

In the context of the glossary I would spell "yes|no" out and say yes or no.

in a "filtering" context, hence it is called a filter. Filtering queries/Filters are
Member

Maybe explain "filtering" context, e.g. the "filter" section in a boolean query.

Member

Why the "/" here? I'd simply talk about "filters" here.

Contributor Author

@refactormyself refactormyself Mar 21, 2018

"Filtering queries/Filters" is used to emphasize that Filtering queries = Filters. Maybe I should make it "Filtering queries or Filters".

simple checks for set inclusion/exclusion. The goal of filtering is to reduce the
Member

"inclusion or exclusion"

number of documents that have to be examined, as it is in the case of
<<glossary-query,scoring queries>>.
Member

I don't understand the reference to scoring queries in this sentence. Maybe you can explain or rephrase to make this clearer?

Contributor Author

The goal of filtering is to reduce the number of documents that have to be examined, instead of what happens in the case of <<glossary-query,scoring queries>>.

The reference is also meant to give the reader a link for comparison.
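
For reference, the "filtering" context discussed in this thread can be sketched as a search request body. This is only an illustration, written as a Python dict that serializes to the JSON query DSL; the field names (`title`, `status`, `publish_date`) and their values are hypothetical and do not come from the PR.

```python
import json

# Hypothetical field names and values, used for illustration only.
search_body = {
    "query": {
        "bool": {
            # Query context: clauses here contribute to each hit's _score.
            "must": [
                {"match": {"title": "search"}},
            ],
            # Filter context: clauses here only answer "does this document
            # match?" (yes or no) and do not contribute to _score.
            "filter": [
                {"term": {"status": "published"}},
                {"range": {"publish_date": {"gte": "2015-01-01"}}},
            ],
        }
    }
}

# Serialize to the JSON that would be sent as the body of a search request.
print(json.dumps(search_body, indent=2))
```

The same `term` clause placed under `must` would still restrict the results, but it would also take part in scoring; placing it under `filter` is what makes it a non-scoring check.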


[[glossary-index]] index ::

An index is like a _table_ in a relational database. It has a
@@ -105,6 +115,19 @@
+
See also <<glossary-routing,routing>>

[[glossary-query]] query ::

Query refers to all queries which not only determine if a document matches,
Member

You start by contrasting queries with something else (filters, I guess), but maybe it's possible to start the definition in general? Maybe talk about queries as what defines a search request, and then later differentiate between queries that score and queries that don't score (filters)? Not sure about the best phrasing for this. I think the current formulations are already quite good; maybe just change the ordering a bit?

but also calculate how well the document matches. This calculation is refered
to as scoring, hence these queries are also known as "scoring queries".
A scoring query calculates how relevant each document is to the query, and assigns
Member

I would leave out the notion of "relevance" here; this opens yet another term that we'd need to define more carefully, I think. I would suggest using "how well a document matches a query", "score" and "sort by score" instead.

it a relevance _score, which is later used to sort matching documents by relevance.
Member

nit: use backticks for _score

This concept of relevance is well suited to full-text search, where there is seldom a
Member

Just leave out "concept of relevance" if you agree with my former proposal.

completely “correct” answer. These queries are more heavier than
Member

Not sure if "more heavier" is correct (I'm not a native speaker) but it sounds like using "heavier" would be enough. Maybe "heavier" could also be exchanged for something that makes it clearer that we are talking about performance here.

<<glossary-filter,filters/non scoring queries>> and their query results are not cacheable.
Member

I would either use "filters" or "non-scoring queries" here, probably the latter.

As a general rule, use query clauses for full-text search or for any condition that should
affect the relevance score, and use filters for everything else.
Member

Maybe use "condition that requires scoring" if you agree on leaving out the notion of relevance here.
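
To make the distinction in this thread concrete, here is a toy model of "filter first, then score and sort by score". It is a sketch only: the documents, the `matches_filter` and `toy_score` helpers, and the term-count scoring are all made up for illustration and are not how Elasticsearch computes `_score`.

```python
# Hypothetical in-memory documents, for illustration only.
docs = [
    {"id": 1, "status": "published", "text": "quick brown fox"},
    {"id": 2, "status": "draft", "text": "quick quick fox"},
    {"id": 3, "status": "published", "text": "quick quick quick"},
]


def matches_filter(doc):
    """Non-scoring check: the answer is only ever yes or no."""
    return doc["status"] == "published"


def toy_score(doc, term):
    """Toy score: how many times the term occurs in the text."""
    return doc["text"].split().count(term)


def search(documents, term):
    # Step 1: the filter reduces the set of documents to examine.
    candidates = [d for d in documents if matches_filter(d)]
    # Step 2: only the remaining candidates are scored.
    scored = [(toy_score(d, term), d["id"]) for d in candidates]
    # Step 3: keep the matching documents and sort them by score, best first.
    hits = [(score, doc_id) for score, doc_id in scored if score > 0]
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


results = search(docs, "quick")
print(results)  # list of (score, id) pairs
```

Document 2 never reaches the scoring step because the filter excludes it, which is the sense in which filtering reduces the number of documents that have to be examined.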


[[glossary-replica-shard]] replica shard ::

Each <<glossary-primary-shard,primary shard>> can have zero or more