From 308d87343efa54a949184777a1195d219f1e368d Mon Sep 17 00:00:00 2001
From: looly <loolly@gmail.com>
Date: Sat, 18 Oct 2014 19:46:31 +0800
Subject: [PATCH] Finished 5.0

---
 050_Search/00_Intro.asciidoc                  |  57 -------
 050_Search/00_Intro.md                        |  60 +++-----
 050_Search/05_Empty_search.asciidoc           | 114 --------------
 050_Search/10_Multi_index_multi_type.asciidoc |  61 --------
 050_Search/15_Pagination.asciidoc             |  53 -------
 050_Search/20_Query_string.asciidoc           | 144 ------------------
 6 files changed, 18 insertions(+), 471 deletions(-)
 delete mode 100755 050_Search/00_Intro.asciidoc
 delete mode 100755 050_Search/05_Empty_search.asciidoc
 delete mode 100755 050_Search/10_Multi_index_multi_type.asciidoc
 delete mode 100755 050_Search/15_Pagination.asciidoc
 delete mode 100755 050_Search/20_Query_string.asciidoc
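
Note on the removed 050_Search/15_Pagination.asciidoc: its "Deep paging in distributed systems" sidebar works through an index with 5 primary shards. The arithmetic behind that sidebar can be sketched as below. This is only an illustrative model, not Elasticsearch code; the function name `deep_paging_cost` and its return keys are hypothetical. It assumes every shard returns its own top `from + size` results and the requesting node keeps only `size` of them, as the sidebar describes. Under that assumption the number of gathered results grows linearly with `from`.

```python
def deep_paging_cost(shards: int, from_: int, size: int = 10) -> dict:
    """Model the sorting work for one page of a multi-shard search.

    Hypothetical helper for illustration only; it does not call Elasticsearch.
    """
    if shards < 1 or from_ < 0 or size < 1:
        raise ValueError("shards and size must be >= 1, from_ must be >= 0")
    per_shard = from_ + size       # each shard must produce its own top (from + size)
    gathered = shards * per_shard  # results the requesting node has to sort
    return {
        "per_shard": per_shard,
        "gathered": gathered,
        "discarded": gathered - size,  # everything except the page that is returned
    }


# First page (results 1 to 10) on 5 shards.
first_page = deep_paging_cost(shards=5, from_=0, size=10)

# Results 10,001 to 10,010 on the same 5 shards.
deep_page = deep_paging_cost(shards=5, from_=10_000, size=10)
```

With these inputs the model reproduces the sidebar's figures: 50 results sorted for the first page, and 50,050 sorted with 50,040 discarded for results 10,001 to 10,010.
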
diff --git a/050_Search/00_Intro.asciidoc b/050_Search/00_Intro.asciidoc
deleted file mode 100755
index ad3793d..0000000
--- a/050_Search/00_Intro.asciidoc
+++ /dev/null
@@ -1,57 +0,0 @@
-[[search]]
-== Searching – the basic tools
-
-So far, we have learned how to use Elasticsearch as a simple NoSQL-style
-distributed document store -- we can throw JSON documents at Elasticsearch and
-retrieve each one by ID. But the real power of Elasticsearch lies in its
-ability to make sense out of chaos -- to turn Big Data into Big Information.
-
-This is the reason that we use structured JSON documents, rather than
-amorphous blobs of data.  Elasticsearch doesn't only _store_ the document, it
-also _indexes_ the content of the document in order to make it searchable.
-
-*Every field in a document is indexed and can be queried*.  And it's not just
-that. During a single query, Elasticsearch can use *all* of these indices, to
-return results at breath-taking speed.  That's something that you could never
-consider doing with a traditional database.
-
-A _search_ can be:
-
-* a structured query on concrete fields like `gender` or `age`, sorted by
-  a field like `join_date`, similar to the type of query that you could construct 
-  in SQL
-
-* a full text query, which finds all documents matching the search keywords,
-  and returns them sorted by _relevance_
-
-* or a combination of the two
-
-While many searches will just work out of the box, to use Elasticsearch to
-its full potential you need to understand three subjects:
-
-[horizontal]
-
-_Mapping_::     how the data in each field is interpreted
-_Analysis_::    how full text is processed to make it searchable
-_Query DSL_::   the flexible, powerful query language used by Elasticsearch
-
-Each of the above is a big subject in its own right and we explain them in
-detail in <<search-in-depth>>. The chapters in this section will introduce the
-basic concepts of all three -- just enough to help you to get an overall
-understanding of how search works.
-
-We will start by explaining the `search` API in its simplest form.
-
-.Test data
-
-****
-
-The documents that we will use for test purposes in this chapter can be found
-in this gist: https://gist.github.com/clintongormley/8579281
-
-You can copy the commands and paste them into your shell in order to follow
-along with this chapter.
-
-Alternatively, link:sense_widget.html?snippets/050_Search/Test_data.json[click here to open in Sense].
-
-****
diff --git a/050_Search/00_Intro.md b/050_Search/00_Intro.md
index 1782a7b..78f7058 100755
--- a/050_Search/00_Intro.md
+++ b/050_Search/00_Intro.md
@@ -2,59 +2,35 @@
 == Searching – the basic tools
 ## 搜索——基本的工具
 
-So far, we have learned how to use Elasticsearch as a simple NoSQL-style
-distributed document store -- we can throw JSON documents at Elasticsearch and
-retrieve each one by ID. But the real power of Elasticsearch lies in its
-ability to make sense out of chaos -- to turn Big Data into Big Information.
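+So far, we have learned how to use Elasticsearch as a simple NoSQL-style distributed document store: we can throw JSON documents at Elasticsearch and retrieve each one by ID. But the real power of Elasticsearch lies in its ability to make sense out of chaos, to turn Big Data into Big Information.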
 
-至目前，我们已经学会了如何使用elasticsearch作为一个简单的NoSQL风格的分布式文件存储器——我们可以将一个JSON文档扔给Elasticsearch，也可以根据ID检索它们。
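+This is the reason that we use structured JSON documents, rather than amorphous blobs of data. Elasticsearch doesn't only **store** the document, it also **indexes** the content of the document in order to make it searchable.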
 
-This is the reason that we use structured JSON documents, rather than
-amorphous blobs of data.  Elasticsearch doesn't only _store_ the document, it
-also _indexes_ the content of the document in order to make it searchable.
-
-*Every field in a document is indexed and can be queried*.  And it's not just
-that. During a single query, Elasticsearch can use *all* of these indices, to
-return results at breath-taking speed.  That's something that you could never
-consider doing with a traditional database.
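+**Every field in a document is indexed and can be queried**. And it's not just that. During a single query, Elasticsearch can use **all** of these indices to return results at breathtaking speed. That's something that you could never consider doing with a traditional database.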
 
 A _search_ can be:
+A **search** can be any of the following:
 
-* a structured query on concrete fields like `gender` or `age`, sorted by
-  a field like `join_date`, similar to the type of query that you could construct
-  in SQL
-
-* a full text query, which finds all documents matching the search keywords,
-  and returns them sorted by _relevance_
-
-* or a combination of the two
-
-While many searches will just work out of the box, to use Elasticsearch to
-its full potential you need to understand three subjects:
-
-[horizontal]
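+* a structured query on concrete fields like `gender` or `age`, sorted by a field like `join_date`, similar to the type of query that you could construct in SQL.
+* a full-text query, which finds all documents matching the search keywords and returns them sorted by **relevance**.
+* or a combination of the two.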
 
-_Mapping_::     how the data in each field is interpreted
-_Analysis_::    how full text is processed to make it searchable
-_Query DSL_::   the flexible, powerful query language used by Elasticsearch
+While many searches will just work out of the box, to use Elasticsearch to its full potential you need to understand three concepts:
 
-Each of the above is a big subject in its own right and we explain them in
-detail in <<search-in-depth>>. The chapters in this section will introduce the
-basic concepts of all three -- just enough to help you to get an overall
-understanding of how search works.
 
-We will start by explaining the `search` API in its simplest form.
+| Concept       | Explanation                                                 |
+| ------------- | ----------------------------------------------------------- |
+| **Mapping**   | How the data in each field is interpreted                   |
+| **Analysis**  | How full text is processed to make it searchable            |
+| **Query DSL** | The flexible, powerful query language used by Elasticsearch |
 
-.Test data
 
-****
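+Each of the above is a big subject in its own right, and we explain them in detail in "Search in Depth". The chapters in this section introduce the basic concepts of all three, just enough to give you an overall understanding of how search works.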
 
-The documents that we will use for test purposes in this chapter can be found
-in this gist: https://gist.github.com/clintongormley/8579281
+We will start by explaining the `search` API in its simplest form.
 
-You can copy the commands and paste them into your shell in order to follow
-along with this chapter.
+> ### Test data
 
-Alternatively, link:sense_widget.html?snippets/050_Search/Test_data.json[click here to open in Sense].
+> The documents that we will use for test purposes in this chapter can be found in this gist: [https://gist.github.com/clintongormley/8579281](https://gist.github.com/clintongormley/8579281)
 
-****
+> You can copy the commands and paste them into your shell in order to follow along with this chapter.
diff --git a/050_Search/05_Empty_search.asciidoc b/050_Search/05_Empty_search.asciidoc
deleted file mode 100755
index a71ebf3..0000000
--- a/050_Search/05_Empty_search.asciidoc
+++ /dev/null
@@ -1,114 +0,0 @@
-[[empty-search]]
-=== The empty search
-
-The most basic form of the search API is the _empty search_ which doesn't
-specify any query, but simply returns all documents in all indices in the
-cluster:
-
-[source,js]
---------------------------------------------------
-GET /_search
---------------------------------------------------
-// SENSE: 050_Search/05_Empty_search.json
-
-The response (edited for brevity) looks something like this:
-
-[source,js]
---------------------------------------------------
-{
-   "hits" : {
-      "total" :       14,
-      "hits" : [
-        {
-          "_index":   "us",
-          "_type":    "tweet",
-          "_id":      "7",
-          "_score":   1,
-          "_source": {
-             "date":    "2014-09-17",
-             "name":    "John Smith",
-             "tweet":   "The Query DSL is really powerful and flexible",
-             "user_id": 2
-          }
-       },
-        ... 9 RESULTS REMOVED ...
-      ],
-      "max_score" :   1
-   },
-   "took" :           4,
-   "_shards" : {
-      "failed" :      0,
-      "successful" :  10,
-      "total" :       10
-   },
-   "timed_out" :      false
-}
---------------------------------------------------
-
-
-==== `hits`
-
-The most important section of the response is `hits`, which contains the
-`total` number of documents that matched our query, and a `hits` array
-containing the first 10 of those matching documents -- the results.
-
-Each result in the `hits` array contains the `_index`, `_type` and `_id` of
-the document, plus the `_source` field.  This means that the whole document is
-immediately available to us directly from the search results. This is unlike
-other search engines which return just the document ID, requiring you to fetch
-the document itself in a separate step.
-
-Each element also has a `_score`.  This is the _relevance score_, which is a
-measure of how well the document matches the query.  By default, results are
-returned with the most relevant documents first; that is, in descending order
-of `_score`. In this case, we didn't specify any query so all documents are
-equally relevant, hence the neutral `_score` of `1` for all results.
-
-The `max_score` value is the highest `_score` of any document that matches our
-query.
-
-==== `took`
-
-The `took` value tells us how many milliseconds the entire search request took
-to execute.
-
-==== `shards`
-
-The `_shards` element tells us the `total` number of shards that were involved
-in the query and, of them, how many were `successful` and how many `failed`.
-We wouldn't normally expect shards to fail, but it can happen. If we were to
-suffer a major disaster in which we lost both the primary and the replica copy
-of the same shard, there would be no copies of that shard available to respond
-to search requests. In this case, Elasticsearch would report the shard as
-`failed`, but continue to return results from the remaining shards.
-
-==== `timeout`
-
-The `timed_out` value tells us whether the query timed out or not.  By
-default, search requests do not timeout.  If low response times are more
-important to you than complete results, you can specify a `timeout` as `10`
-or `"10ms"` (10 milliseconds), or `"1s"` (1 second):
-
-[source,js]
---------------------------------------------------
-GET /_search?timeout=10ms
---------------------------------------------------
-
-
-Elasticsearch will return any results that it has managed to gather from
-shards which responded before the request timed out.
-
-.Timeout is not a circuit breaker
-[WARNING]
-================================================
-
-It should be noted that this `timeout` does not halt the execution of the
-query, it merely tells the coordinating node to return the results collected
-_so far_ and to close the connection.  In the background, other shards may
-still be processing the query even though results have been sent.
-
-Use the timeout because it is important to your SLA, not because you want
-to abort the execution of long running queries.
-
-================================================
-
diff --git a/050_Search/10_Multi_index_multi_type.asciidoc b/050_Search/10_Multi_index_multi_type.asciidoc
deleted file mode 100755
index 65d52a3..0000000
--- a/050_Search/10_Multi_index_multi_type.asciidoc
+++ /dev/null
@@ -1,61 +0,0 @@
-[[multi-index-multi-type]]
-=== Multi-index, multi-type
-
-Did you notice that the results from the <<empty-search,empty search>> above
-contained documents of different types -- `user` and `tweet` -- from two
-different indices -- `us` and `gb`?
-
-By not limiting our search to a particular index or type, we have searched
-across *all* documents in the cluster. Elasticsearch forwarded the search
-request in parallel to a primary or replica of every shard in the cluster,
-gathered the results to select the overall top ten, and returned them to us.
-
-Usually, however, you will want to search within one or more specific indices,
-and probably one or more specific types. We can do this by specifying the
-index and type in the URL, as follows:
-
-[horizontal]
-`/_search`::
-
-    search all types in all indices
-
-`/gb/_search`::
-
-    search all types in the `gb` index
-
-`/gb,us/_search`::
-
-    search all types in the `gb` and `us` indices
-
-`/g*,u*/_search`::
-
-    search all types in any indices beginning with `g` or beginning with `u`
-
-`/gb/user/_search`::
-
-    search type `user` in the `gb` index
-
-`/gb,us/user,tweet/_search`::
-
-    search types `user` and `tweet` in the `gb` and `us` indices
-
-`/_all/user,tweet/_search`::
-
-    search types `user` and `tweet` in all indices
-
-
-When you search within a single index, Elasticsearch forwards the search
-request to a primary or replica of every shard in that index, then gathers the
-results from each shard. Searching within multiple indices works in exactly
-the same way -- there are just more shards involved.
-
-[IMPORTANT]
-================================================
-
-Searching one index which has 5 primary shards is *exactly equivalent* to
-searching 5 indices which have one primary shard each.
-
-================================================
-
-Later, you will see how this simple fact makes it easy to scale flexibly
-as your requirements change.
diff --git a/050_Search/15_Pagination.asciidoc b/050_Search/15_Pagination.asciidoc
deleted file mode 100755
index 4f9e11e..0000000
--- a/050_Search/15_Pagination.asciidoc
+++ /dev/null
@@ -1,53 +0,0 @@
-[[pagination]]
-=== Pagination
-
-Our <<empty-search,empty search above>> told us that there are 14 documents in the
-cluster which match our (empty) query.  But there were only 10 documents in
-the `hits` array.  How can we see the other documents?
-
-In the same way as SQL uses the `LIMIT` keyword to return a single ``page'' of
-results, Elasticsearch accepts the `from` and `size` parameters:
-
-[horizontal]
-`size`:: How many results should be returned, defaults to `10`
-`from`:: How many initial results should be skipped, defaults to `0`
-
-If you wanted to show 5 results per page, then pages 1 to 3
-could be requested as:
-
-[source,js]
---------------------------------------------------
-GET /_search?size=5
-GET /_search?size=5&from=5
-GET /_search?size=5&from=10
---------------------------------------------------
-// SENSE: 050_Search/15_Pagination.json
-
-
-Beware of paging too deep or requesting too many results at once. Results are
-sorted before being returned. But remember that a search request usually spans
-multiple shards. Each shard generates its own sorted results, which then need
-to be sorted centrally to ensure that the overall order is correct.
-
-.Deep paging in distributed systems
-****
-
-To understand why deep paging is problematic, let's imagine that we are
-searching within a single index with 5 primary shards.  When we request the
-first page of results (results 1 to 10), each shard produces its own top 10
-results and returns them to the _requesting node_, which then sorts all 50
-results in order to select the overall top 10.
-
-Now imagine that we ask for page 1,000 -- results 10,001 to 10,010. Everything
-works in the same way except that each shard has to produce its top 10,010
-results. The requesting node then sorts through all 50,050 results and
-discards 50,040 of them!
-
-You can see that, in a distributed system, the cost of sorting results
-grows exponentially the deeper we page.  There is a very good reason
-why web search engines don't return more than 1,000 results for any query.
-
-****
-
-TIP: In <<reindex>> we will explain how you *can* retrieve large numbers of
-documents efficiently.
diff --git a/050_Search/20_Query_string.asciidoc b/050_Search/20_Query_string.asciidoc
deleted file mode 100755
index a763b23..0000000
--- a/050_Search/20_Query_string.asciidoc
+++ /dev/null
@@ -1,144 +0,0 @@
-[[search-lite]]
-=== Search _Lite_
-
-There are two forms of the `search` API: a ``lite'' _query string_ version
-that expects all its parameters to be passed in the query string, and the full
-_request body_ version that expects a JSON request body and uses a
-rich search language called the query DSL.
-
-The query string search is useful for running _ad hoc_ queries from the
-command line. For instance this query finds all documents of type `tweet` that
-contain the word `"elasticsearch"` in the `tweet` field:
-
-[source,js]
---------------------------------------------------
-GET /_all/tweet/_search?q=tweet:elasticsearch
---------------------------------------------------
-// SENSE: 050_Search/20_Query_string.json
-
-The next query looks for `"john"` in the `name` field and `"mary"` in the
-`tweet` field. The actual query is just:
-
-    +name:john +tweet:mary
-
-but the _percent encoding_ needed for query string parameters makes it appear
-more cryptic than it really is:
-
-[source,js]
---------------------------------------------------
-GET /_search?q=%2Bname%3Ajohn+%2Btweet%3Amary
---------------------------------------------------
-// SENSE: 050_Search/20_Query_string.json
-
-
-The `"+"` prefix indicates conditions which _must_ be satisfied for our query to
-match. Similarly a `"-"` prefix would indicate conditions that _must not_
-match.  All conditions without a `+` or `-` are optional -- the more that match,
-the more relevant the document.
-
-[[all-field-intro]]
-==== The `_all` field
-
-This simple search returns all documents which contain the word `"mary"`:
-
-[source,js]
---------------------------------------------------
-GET /_search?q=mary
---------------------------------------------------
-// SENSE: 050_Search/20_All_field.json
-
-
-In the previous examples, we searched for words in the `tweet` or
-`name` fields. However, the results from this query mention `"mary"` in
-three different fields:
-
-* a user whose name is "Mary"
-* six tweets by "Mary"
-* one tweet directed at "@mary"
-
-How has Elasticsearch managed to find results in three different fields?
-
-When you index a document, Elasticsearch takes the string values of all of
-its fields and concatenates them into one big string which it indexes as
-the special `_all` field. For example, when we index this document:
-
-[source,js]
---------------------------------------------------
-{
-    "tweet":    "However did I manage before Elasticsearch?",
-    "date":     "2014-09-14",
-    "name":     "Mary Jones",
-    "user_id":  1
-}
---------------------------------------------------
-
-
-it's as if we had added an extra field called `_all` with the value:
-
-[source,js]
---------------------------------------------------
-"However did I manage before Elasticsearch? 2014-09-14 Mary Jones 1"
---------------------------------------------------
-
-
-The query string search uses the `_all` field unless another
-field name has been specified.
-
-TIP: The `_all` field is a useful feature while you are getting started with
-a new application. Later, you will find that you have more control over
-your search results if you query specific fields instead of the `_all`
-field.  When the `_all` field is no longer useful to you, you can
-disable it, as explained in <<all-field>>.
-
-[[query-string-query]]
-==== More complicated queries
-
-The next query searches for tweets:
-
-* where the `name` field contains `"mary"` or `"john"`
-* and where the `date` is greater than `2014-09-10`
-* and which contain either of the words `"aggregations"` or `"geo"` in the
-  `_all` field
-
-[source,js]
---------------------------------------------------
-+name:(mary john) +date:>2014-09-10 +(aggregations geo)
---------------------------------------------------
-// SENSE: 050_Search/20_All_field.json
-
-which, as a properly encoded query string looks like the slightly less
-readable:
-
-[source,js]
---------------------------------------------------
-?q=%2Bname%3A(mary+john)+%2Bdate%3A%3E2014-09-10+%2B(aggregations+geo)
---------------------------------------------------
-
-As you can see from the above examples, this _lite_ query string search is
-surprisingly powerful. Its query syntax, which is explained in detail in the
-{ref}/query-dsl-query-string-query.html#query-string-syntax[Query String Syntax]
-reference docs, allows us to express quite complex queries succinctly. This
-makes it great for throwaway queries from the command line or during
-development.
-
-However, you can also see that its terseness can make it cryptic and
-difficult to debug. And it's fragile -- a slight syntax error in the query
-string, such as a misplaced `-`, `:`, `/` or `"` and it will return an error
-instead of results.
-
-Lastly, the query string search allows any user to run potentially slow heavy
-queries on any field in your index, possibly exposing private information or
-even bringing your cluster to its knees!
-
-[TIP]
-==================================================
-For these reasons, we don't recommend exposing query string search directly to
-your users, unless they are power users who can be trusted with your data and
-with your cluster.
-==================================================
-
-Instead, in production we usually rely on the full-featured _request body_
-search API, which does all of the above, plus a lot more. Before we get there
-though, we first need to take a look at how our data is indexed in
-Elasticsearch.
-