Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[7.x] [DOCS] Add PIT to search after docs (#61593) #62101

Merged
merged 5 commits into from
Sep 14, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions docs/reference/search.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ exception of the <<search-explain,explain API>>.
* <<search-search>>
* <<search-multi-search>>
* <<async-search>>
* <<point-in-time-api>>
* <<scroll-api>>
* <<clear-scroll-api>>
* <<search-suggesters>>
Expand Down Expand Up @@ -51,6 +52,8 @@ include::search/search.asciidoc[]

include::search/async-search.asciidoc[]

include::search/point-in-time-api.asciidoc[]

include::search/scroll-api.asciidoc[]

include::search/clear-scroll-api.asciidoc[]
Expand Down
120 changes: 120 additions & 0 deletions docs/reference/search/point-in-time-api.asciidoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,120 @@
[role="xpack"]
[testenv="basic"]
[[point-in-time-api]]
=== Point in time API
++++
<titleabbrev>Point in time</titleabbrev>
++++

A search request by default executes against the most recent visible data of
the target indices, which is called point in time. Elasticsearch pit (point in time)
is a lightweight view into the state of the data as it existed when initiated.
In some cases, it's preferred to perform multiple search requests using
the same point in time. For example, if <<indices-refresh,refreshes>> happen between
search_after requests, then the results of those requests might not be consistent as
changes happening between searches are only visible to the more recent point in time.

A point in time must be opened explicitly before being used in search requests. The
keep_alive parameter tells Elasticsearch how long it should keep a point in time alive,
e.g. `?keep_alive=5m`.

[source,console]
--------------------------------------------------
POST /my-index-000001/_pit?keep_alive=1m
--------------------------------------------------
// TEST[setup:my_index]

The result from the above request includes a `id`, which should
be passed to the `id` of the `pit` parameter of a search request.

[source,console]
--------------------------------------------------
POST /_search <1>
{
"size": 100,
"query": {
"match" : {
"title" : "elasticsearch"
}
},
"pit": {
"id": "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", <2>
"keep_alive": "1m" <3>
}
}
--------------------------------------------------
// TEST[catch:missing]

<1> A search request with the `pit` parameter must not specify `index`, `routing`,
and {ref}/search-request-body.html#request-body-search-preference[`preference`]
as these parameters are copied from the point in time.
<2> The `id` parameter tells Elasticsearch to execute the request using contexts
from this point int time.
<3> The `keep_alive` parameter tells Elasticsearch how long it should extend
the time to live of the point in time.

IMPORTANT: The open point in time request and each subsequent search request can
return different `id`; thus always use the most recently received `id` for the
next search request.

[[point-in-time-keep-alive]]
==== Keeping point in time alive
The `keep_alive` parameter, which is passed to a open point in time request and
search request, extends the time to live of the corresponding point in time.
The value (e.g. `1m`, see <<time-units>>) does not need to be long enough to
process all data -- it just needs to be long enough for the next request.

Normally, the background merge process optimizes the index by merging together
smaller segments to create new, bigger segments. Once the smaller segments are
no longer needed they are deleted. However, open point-in-times prevent the
old segments from being deleted since they are still in use.

TIP: Keeping older segments alive means that more disk space and file handles
are needed. Ensure that you have configured your nodes to have ample free file
handles. See <<file-descriptors>>.

Additionally, if a segment contains deleted or updated documents then the
point in time must keep track of whether each document in the segment was live at
the time of the initial search request. Ensure that your nodes have sufficient heap
space if you have many open point-in-times on an index that is subject to ongoing
deletes or updates.

You can check how many point-in-times (i.e, search contexts) are open with the
<<cluster-nodes-stats,nodes stats API>>:

[source,console]
---------------------------------------
GET /_nodes/stats/indices/search
---------------------------------------

[[close-point-in-time-api]]
==== Close point in time API

Point-in-time is automatically closed when its `keep_alive` has
been elapsed. However keeping point-in-times has a cost, as discussed in the
<<point-in-time-keep-alive,previous section>>. Point-in-times should be closed
as soon as they are no longer used in search requests.

[source,console]
---------------------------------------
DELETE /_pit
{
"id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA="
}
---------------------------------------
// TEST[catch:missing]

The API returns the following response:

[source,console-result]
--------------------------------------------------
{
"succeeded": true, <1>
"num_freed": 3 <2>
}
--------------------------------------------------
// TESTRESPONSE[s/"succeeded": true/"succeeded": $body.succeeded/]
// TESTRESPONSE[s/"num_freed": 3/"num_freed": $body.num_freed/]

<1> If true, all search contexts associated with the point-in-time id are successfully closed
<2> The number of search contexts have been successfully closed
4 changes: 4 additions & 0 deletions docs/reference/search/scroll-api.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,10 @@
<titleabbrev>Scroll</titleabbrev>
++++

IMPORTANT: We no longer recommend using the scroll API for deep pagination. If
you need to preserve the index state while paging through more than 10,000 hits,
use the <<search-after,`search_after`>> parameter with a point in time (PIT).

Retrieves the next batch of results for a <<scroll-search-results,scrolling
search>>.

Expand Down
Loading