support ilm phase timings based on index naming date_math rather than creation_date #42449

Closed
bczifra opened this issue May 23, 2019 · 3 comments · Fixed by #46755
Labels
:Data Management/ILM+SLM Index and Snapshot lifecycle management >enhancement

Comments

@bczifra
Member

bczifra commented May 23, 2019

Describe the feature:
Sometimes indices are named based on the dates that their documents pertain to, not based on the date the indices are created. As an example, on May 23 an index could be created with a name like foo-2019.05.20.

It would be helpful if the ILM phase timings could be based on the index naming date math rather than the index's creation_date, supporting options like:
"for this monthly index, keep 3 months of data"
"for this weekly index, keep 12 weeks of data"
"for this daily index, keep 20 days of data"

In each of these situations, the start date of that elapsed time period should be calculated based upon the date specified in the index name.
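
As a rough illustration of the calculation being requested (this is not something ILM does at this point in the thread), the age could be derived from the date in the index name rather than from the creation date. The helper name `age_from_index_name`, and the assumption that the date is the suffix after the last hyphen in `yyyy.MM.dd` form, are hypothetical and chosen only to match the `foo-2019.05.20` example:

```python
from datetime import datetime, timedelta, timezone


def age_from_index_name(index_name, now, date_format="%Y.%m.%d"):
    """Return the time elapsed since the date embedded in the index name.

    Assumption (hypothetical naming scheme): the name looks like
    "<prefix>-<date>" and the date is everything after the last hyphen.
    """
    date_part = index_name.rsplit("-", 1)[1]
    named_date = datetime.strptime(date_part, date_format).replace(tzinfo=timezone.utc)
    return now - named_date


# The index is created on May 23 but named for May 20.
now = datetime(2019, 5, 23, tzinfo=timezone.utc)
age = age_from_index_name("foo-2019.05.20", now)

# "For this daily index, keep 20 days of data"
retention = timedelta(days=20)
expired = age > retention
print(age.days, expired)
```

With this approach the retention window starts at the date in the name, so the index above is already three days old on the day it is created.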

The Rollover API does support index name date math, but ILM does not.

@bczifra bczifra added >enhancement :Data Management/ILM+SLM Index and Snapshot lifecycle management labels May 23, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-core-features

@dakrone
Member

dakrone commented May 28, 2019

> The Rollover API does support index name date math, but ILM does not.

This isn't right: ILM does support index name date math, just like rollover does. It requires that the index be created with a date math name, for instance:

PUT _ilm/policy/roll
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "10s"
          }
        }
      }
    }
  }
}

PUT _template/roll
{
  "index_patterns": ["roll-*"],
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "index.lifecycle.name": "roll",
    "index.lifecycle.rollover_alias": "roll-alias"
  }
}

PUT /_cluster/settings
{
  "transient": {
    "indices.lifecycle.poll_interval": "10s"
  }
}

PUT /%3Croll-%7Bnow%2Fs%7Byyyy-MM-dd-HH-mm-ss%7D%7D-1%3E
{
  "aliases": {
    "roll-alias":{
      "is_write_index": true
    }
  }
}

This creates indices that follow the date math as they are rolled over.
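
The percent-encoded path in the last request can be hard to read. A small sketch, using only the Python standard library, shows which date math expression it corresponds to and how the encoded form can be produced; the variable names are illustrative:

```python
from urllib.parse import quote

# Date math index name: prefix "roll-", the current time rounded down to the
# second and formatted as yyyy-MM-dd-HH-mm-ss, then the rollover counter "-1".
date_math_name = "<roll-{now/s{yyyy-MM-dd-HH-mm-ss}}-1>"

# safe="" also encodes "/", which would otherwise be read as a path separator.
encoded = quote(date_math_name, safe="")

print("PUT /" + encoded)
```

The encoded value matches the path used in the create-index request above.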

@dakrone
Member

dakrone commented Sep 6, 2019

We have a plan for addressing situations like this. I'll outline the general plan (which may shift as implementation proceeds).

Step one is to add a setting, index.lifecycle.origination_date, that a user can set on an index. If this is set, ILM will use this date to calculate the index age for its phase transitions. This allows a user to manually create an "old" index with an old origination date when indexing older data. This custom origination date should also be exposed in the ILM explain output for an index.
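
The intent of step one can be sketched as follows. This is only an illustration of the described fallback, not the Elasticsearch implementation: the function name `index_age_millis`, the use of a plain dict for index settings, and the assumption that both dates are epoch milliseconds are all hypothetical.

```python
DAY_MILLIS = 24 * 60 * 60 * 1000


def index_age_millis(settings, now_millis):
    """Compute the index age used for phase transitions.

    Assumption: `settings` is a plain dict and dates are epoch milliseconds.
    If an origination date is set it takes precedence; otherwise the age
    falls back to the index creation date.
    """
    origination = settings.get("index.lifecycle.origination_date")
    if origination is not None:
        start = origination
    else:
        start = settings["index.creation_date"]
    return now_millis - start


now = 12 * DAY_MILLIS

# Index created 2 days ago, no origination date: age is 2 days.
plain = {"index.creation_date": 10 * DAY_MILLIS}

# Same creation date, but holding data that originated 5 days ago.
backfilled = {
    "index.creation_date": 10 * DAY_MILLIS,
    "index.lifecycle.origination_date": 7 * DAY_MILLIS,
}

print(index_age_millis(plain, now) // DAY_MILLIS)
print(index_age_millis(backfilled, now) // DAY_MILLIS)
```

Because the date is looked up each time the age is computed, changing the origination date later would change the computed age on the next check.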

Step two is to add another index setting allowing the origination date setting to be automatically configured on index creation based on some pattern in the index name. We haven't yet decided how to parse the index name, or whether this should be template level or index level. This would allow us to automate setting the origination date for newly created indices containing older data.

@andreidan is going to start working on step one.

andreidan added a commit to andreidan/elasticsearch that referenced this issue Sep 10, 2019
Add the `index.lifecycle.origination_date` to allow users to configure a
custom date that'll be used to calculate the index age for the phase
transitions (as opposed to the default index creation date).

This could be useful for users to create an index with an "older"
origination date when indexing old data.

Relates to elastic#42449.
andreidan added a commit that referenced this issue Sep 12, 2019
* [ILM] Add date setting to calculate index age

Add the `index.lifecycle.origination_date` to allow users to configure a
custom date that'll be used to calculate the index age for the phase
transitions (as opposed to the default index creation date).

This could be useful for users to create an index with an "older"
origination date when indexing old data.

Relates to #42449.

* [ILM] Don't override creation date on policy init

The initial approach we took was to override the lifecycle creation date
if the `index.lifecycle.origination_date` setting was set. This had the
disadvantage of the user not being able to update the `origination_date`
anymore once set.

This commit changes the way we make use of the
`index.lifecycle.origination_date` setting by checking its value when
we calculate the index age (i.e. at "read time") and, in case it's not
set, defaulting to the index creation date.

* Make origination date setting index scope dynamic

* Document origination date setting in ilm settings
andreidan added a commit to andreidan/elasticsearch that referenced this issue Sep 13, 2019
(cherry picked from commit d5bd2bb)
Signed-off-by: Andrei Dan <[email protected]>
andreidan added a commit that referenced this issue Sep 16, 2019
(cherry picked from commit d5bd2bb)
Signed-off-by: Andrei Dan <[email protected]>
4 participants