Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow ILM to transition to implicit cached steps #91779

Merged
merged 12 commits into from
Mar 20, 2023

Conversation

andreidan
Copy link
Contributor

ILM tries to honour the cached phase however, some steps are implicit (e.g. injected actions or the terminal policy/phase step) Currently, ILM would throw an exception if the currently cached phase was removed:

step [{"phase":"warm","action":"complete","name":"complete"}] for index [index]
with policy [my-policy] does not exist

ILM would currently also throw an exception if the next step was an implicit one and the phase was removed from the underlying policy e.g. if the index is on the migrate step and looking to transition to the check-migration step, whilt the warm phase doesn't exist in the policy anymore

step [{"phase":"warm","action":"migrate","name":"check-migration"}] for index
[index] with policy [my-policy] does not exist

This fixes these scenarios by enhancing the
PolicyStepsRegistry#parseStepKeysFromPhase method to compute all the steps in the phase (including all implicit steps)

Fixes #91749

ILM tries to honour the cached phase however, some steps are implicit
(e.g. injected actions or the terminal policy/phase step)
Currently, ILM would throw an exception if the currently cached phase was removed:
```
step [{"phase":"warm","action":"complete","name":"complete"}] for index [index]
with policy [my-policy] does not exist
```
ILM would currently also throw an exception if the next step was an
implicit one and the phase was removed from the underlying policy e.g. if the
index is on the `migrate` step and looking to transition to the
`check-migration` step, whilt the `warm` phase doesn't exist in the
policy anymore
```
step [{"phase":"warm","action":"migrate","name":"check-migration"}] for index
[index] with policy [my-policy] does not exist
```

This fixes these scenarios by enhancing the
`PolicyStepsRegistry#parseStepKeysFromPhase` method to compute all the
steps in the phase (including all implicit steps)
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine elasticsearchmachine added Team:Data Management Meta label for data/management team v8.7.0 labels Nov 21, 2022
@elasticsearchmachine
Copy link
Collaborator

Hi @andreidan, I've created a changelog YAML for you.

@andreidan
Copy link
Contributor Author

Note: there might be a slight performance penalty here due to LifecyclePolicy#toSteps( https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/LifecyclePolicy.java#L215 )
now executing on every transition

@andreidan
Copy link
Contributor Author

andreidan commented Nov 21, 2022

Potentially superseding #91754 (pending discussion on the potential performance penalty)

@dakrone
Copy link
Member

dakrone commented Nov 28, 2022

@joegallo I think you introduced the PhaseCacheManagement piece, which was to increase the performance of this specific method if I recall correctly. Do you have thoughts on how this will affect performance?

@andreidan
Copy link
Contributor Author

andreidan commented Dec 8, 2022

@dakrone I introduced the PhaseCacheManagement to handle the phase manipulations we perform when migrating a deployment to data tiers (ie. we potentially remove the allocate action from phases). The reason this class exists is for encapsulating the interaction and management of the cached phase.

The change I propose here eludes the PhaseCacheManagement to keep the change as isolated as possible (ie. code affects only the validation of the transition to the next step) - hence the code is in PolicyStepsRegistry.

However, we'll continue to disregard implicit steps when parsing the cached phase (as we do today) when it comes to establishing if the current cached phase is safely updateable when updating a policy (ie. PhaseCacheManagement#isIndexPhaseDefinitionUpdatable will continue to work as it is today)

@joegallo what do you think ?

@andreidan

This comment was marked as resolved.

@joegallo

This comment was marked as resolved.

@joegallo joegallo self-assigned this Jan 5, 2023
@joegallo
Copy link
Contributor

joegallo commented Jan 5, 2023

I've assigned this to myself -- I want to do a little more looking at parseStepKeysFromPhase and parseStepsFromPhase. I'm not in love with the Stringly-typed phaseDef argument we're using for both. It can be null, a phase name like 'new' or 'complete' or a full blown json String that we parse. I'd like to see if I can tighten that up a bit.

@andreidan
Copy link
Contributor Author

@joegallo do you have an update here? Could we try to benchmark this approach and see if it's feasible?

@andreidan
Copy link
Contributor Author

@elasticmachine update branch

@andreidan andreidan requested a review from joegallo March 20, 2023 13:25
@joegallo
Copy link
Contributor

The plan is to merge this and give the benchmarks a little while to show an effect, then consult with Armin later this week to determine the path forward (if there's no effect, then that would imply a short meeting 😉).

@andreidan andreidan removed the v8.6.3 label Mar 20, 2023
@andreidan andreidan merged commit dc726e6 into elastic:main Mar 20, 2023
andreidan added a commit to andreidan/elasticsearch that referenced this pull request Mar 27, 2023
ILM tries to honour the cached phase however, some steps are implicit
(e.g. injected actions or the terminal policy/phase step)
Currently, ILM would throw an exception if the currently cached phase was removed:
```
step [{"phase":"warm","action":"complete","name":"complete"}] for index [index]
with policy [my-policy] does not exist
```
ILM would currently also throw an exception if the next step was an
implicit one and the phase was removed from the underlying policy e.g. if the
index is on the `migrate` step and looking to transition to the
`check-migration` step, whilt the `warm` phase doesn't exist in the
policy anymore
```
step [{"phase":"warm","action":"migrate","name":"check-migration"}] for index
[index] with policy [my-policy] does not exist
```

This fixes these scenarios by enhancing the
`PolicyStepsRegistry#parseStepKeysFromPhase` method to compute all the
steps in the phase (including all implicit steps)

(cherry picked from commit dc726e6)
Signed-off-by: Andrei Dan <[email protected]>
andreidan added a commit that referenced this pull request Mar 28, 2023
)

* Allow ILM to transition to implicit cached steps (#91779)

ILM tries to honour the cached phase however, some steps are implicit
(e.g. injected actions or the terminal policy/phase step)
Currently, ILM would throw an exception if the currently cached phase was removed:
```
step [{"phase":"warm","action":"complete","name":"complete"}] for index [index]
with policy [my-policy] does not exist
```
ILM would currently also throw an exception if the next step was an
implicit one and the phase was removed from the underlying policy e.g. if the
index is on the `migrate` step and looking to transition to the
`check-migration` step, whilt the `warm` phase doesn't exist in the
policy anymore
```
step [{"phase":"warm","action":"migrate","name":"check-migration"}] for index
[index] with policy [my-policy] does not exist
```

This fixes these scenarios by enhancing the
`PolicyStepsRegistry#parseStepKeysFromPhase` method to compute all the
steps in the phase (including all implicit steps)

(cherry picked from commit dc726e6)
Signed-off-by: Andrei Dan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Data Management/ILM+SLM Index and Snapshot lifecycle management Team:Data Management Meta label for data/management team v7.17.10 v8.8.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ILM] Error moving to the phase complete step when phase was deleted
10 participants