[DOCS] DSL downsampling docs #103148

Merged: 4 commits, Dec 8, 2023
36 changes: 36 additions & 0 deletions docs/reference/data-streams/lifecycle/apis/put-lifecycle.asciidoc
@@ -59,6 +59,16 @@ duration the document could be deleted. When empty, every document in this data
If defined, it turns data stream lifecycle on/off (`true`/`false`) for this data stream.
A data stream lifecycle that's disabled (`enabled: false`) will have no effect on the
data stream. Defaults to `true`.

`downsampling`::
(Optional, array)
An optional array of downsampling configuration objects. Each object defines an `after`
interval, which is the time after which the backing index is downsampled (measured
from when the index was rolled over, i.e. its generation time), and a `fixed_interval`,
which is the downsampling interval (the minimum `fixed_interval` value is `5m`).
A maximum of 10 downsampling rounds can be configured.
See <<data-streams-put-lifecycle-downsampling-example, configuration example>> below.

====

[[data-streams-put-lifecycle-example]]
@@ -84,3 +94,29 @@ When the lifecycle is successfully updated in all data streams, you receive the
"acknowledged": true
}
--------------------------------------------------

[[data-streams-put-lifecycle-downsampling-example]]
==== {api-examples-title}

The following example configures two downsampling rounds. The first round starts
one day after the backing index is rolled over (or later, if the index is still
within its write-accepting <<time-bound-indices, time bounds>>) and uses an interval
of `10m`. The second round starts seven days after rollover and uses an interval of `1d`:

[source,console]
--------------------------------------------------------------------
PUT _data_stream/my-weather-sensor-data-stream/_lifecycle
{
  "downsampling": [
    {
      "after": "1d",
      "fixed_interval": "10m"
    },
    {
      "after": "7d",
      "fixed_interval": "1d"
    }
  ]
}
--------------------------------------------------------------------
//TEST[skip:downsampling requires waiting for indices to be out of time bounds]
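
The documented constraints on the `downsampling` array (at most 10 rounds, a minimum `fixed_interval` of `5m`) can be checked client-side before sending the request. The following is a minimal Python sketch, not part of {es} or any client library: the helper names `parse_duration` and `validate_downsampling` are hypothetical, and the parser assumes only the `s`, `m`, `h` and `d` unit suffixes.

```python
import re

# Assumption: only these unit suffixes are handled in this sketch.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

MIN_FIXED_INTERVAL_SECONDS = 5 * 60  # documented minimum `fixed_interval` (5m)
MAX_ROUNDS = 10                      # documented maximum number of rounds


def parse_duration(value: str) -> int:
    """Convert a duration such as '10m' or '7d' to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def validate_downsampling(rounds: list[dict]) -> list[str]:
    """Return the problems found; an empty list means none were detected."""
    problems = []
    if len(rounds) > MAX_ROUNDS:
        problems.append(f"{len(rounds)} rounds configured, maximum is {MAX_ROUNDS}")
    for i, rnd in enumerate(rounds):
        for key in ("after", "fixed_interval"):
            if key not in rnd:
                problems.append(f"round {i}: missing `{key}`")
        if "fixed_interval" in rnd:
            if parse_duration(rnd["fixed_interval"]) < MIN_FIXED_INTERVAL_SECONDS:
                problems.append(f"round {i}: `fixed_interval` below 5m")
    return problems
```

Calling `validate_downsampling` with the two rounds from the request above returns an empty list. The server remains the authority on what is accepted; this check only mirrors the limits stated in the parameter description.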
8 changes: 7 additions & 1 deletion docs/reference/data-streams/lifecycle/index.asciidoc
@@ -18,6 +18,10 @@ and backwards incompatible mapping changes.
* Configurable retention, which allows you to configure the time period for which your data is guaranteed to be stored.
{es} is allowed at a later time to delete data older than this time period.

A data stream lifecycle also supports downsampling the data stream backing indices.
See <<data-streams-put-lifecycle-downsampling-example, the downsampling example>> for
more details.

[discrete]
[[data-streams-lifecycle-how-it-works]]
=== How does it work?
@@ -35,7 +39,9 @@ into tiers of exponential sizes, merging the long tail of small segments is only
fraction of the cost of force merging to a single segment. The small segments would usually
hold the most recent data so tail merging will focus the merging resources on the higher-value
data that is most likely to keep being queried.
4. If <<data-streams-put-lifecycle-downsampling-example, downsampling>> is configured, executes
all the configured downsampling rounds.
5. Applies retention to the remaining backing indices. This means deleting the backing indices whose
`generation_time` is longer than the configured retention period. The `generation_time` is only applicable to rolled over backing
indices and it is either the time since the backing index got rolled over, or the time optionally configured in the
<<index-data-stream-lifecycle-origination-date,`index.lifecycle.origination_date`>> setting.
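
The downsampling and retention steps both compare a backing index's `generation_time` against configured durations. The following is a rough Python sketch of those two comparisons, not the {es} implementation: the function names are hypothetical, durations are modelled as `timedelta` values, the write-accepting time bounds are not modelled, and treating a missing retention as "never delete" is an assumption.

```python
from datetime import timedelta


def due_rounds(generation_time: timedelta, rounds: list[dict]) -> list[dict]:
    """Return the rounds whose `after` interval has elapsed since rollover."""
    return [r for r in rounds if generation_time >= r["after"]]


def past_retention(generation_time: timedelta, retention) -> bool:
    """True when the index is older than the configured retention period.

    Assumption: `retention=None` means no retention is configured,
    so nothing is ever eligible for deletion.
    """
    return retention is not None and generation_time > retention


# Rounds mirroring the 1d/10m and 7d/1d configuration example.
rounds = [
    {"after": timedelta(days=1), "fixed_interval": "10m"},
    {"after": timedelta(days=7), "fixed_interval": "1d"},
]
```

With these rounds, an index rolled over three days ago has only the first round due, and an index rolled over two hours ago has none.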