Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set the rollover action to idempotent #986

Merged

Conversation

bowenlan-amzn
Copy link
Member

@bowenlan-amzn bowenlan-amzn commented Oct 5, 2023

Issue #, if available:

Description of changes:

The rollover action could fail with message Previous action was not able to update IndexMetaData.
This failure was because ISM failed to save metadata at the end of a transaction run.

Background

ISM manages each index by using a background job running in certain frequency. For each background job run, ISM makes it transactional by setting the metadata to STARTING status before executing the action, and resets the STARTING status after execution. If the reset failed to be saved into ISM system index and the action to be executed is not idempotent, then at next run, ISM detected this as a transaction failure and disabled/stopped the background job for this managed index. After this point, users need to use ISM retry API to continue the ISM running after confirming whether the action has been executed or not. Rollover is currently treated as a non-idempotent action because calling rollover multiple times could create multiple new indexes.

Why change rollover to idempotent

We are looking at the situation when ISM cannot figure out what's the result of previous run. For rollover, there are 2 possibilities.

  1. rollover succeed
  2. rollover failed or didn't happen
    For case 1, the rolloverInfo would be saved in cluster state, while it wouldn't be saved for case 2.
    And since we already have the logic to check the rolloverInfo, the attempt rollover step already can decide what to do depending on that. That makes this step idempotent.
    if (clusterService.state().metadata.index(indexName).rolloverInfos.containsKey(rolloverTarget)) {
    stepStatus = StepStatus.COMPLETED
    info = mapOf("message" to getAlreadyRolledOverMessage(indexName, rolloverTarget))
    // If already rolled over, alias may not get copied over yet
    copyAlias(clusterService, indexName, context.client, rolloverTarget, context.metadata)
    return this
    }

Testing

Add one IT that re-produce the transaction failure state, and see ISM can continue the rollover action without failing.

CheckList:

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@codecov
Copy link

codecov bot commented Oct 5, 2023

Codecov Report

Merging #986 (be26e78) into main (bc3c3f3) will increase coverage by 0.09%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main     #986      +/-   ##
============================================
+ Coverage     74.61%   74.71%   +0.09%     
- Complexity     2749     2755       +6     
============================================
  Files           361      361              
  Lines         16077    16077              
  Branches       2314     2314              
============================================
+ Hits          11996    12012      +16     
+ Misses         2803     2785      -18     
- Partials       1278     1280       +2     
Files Coverage Δ
...atemanagement/step/rollover/AttemptRolloverStep.kt 75.00% <100.00%> (+0.92%) ⬆️

... and 11 files with indirect coverage changes

@bowenlan-amzn bowenlan-amzn merged commit f543d93 into opensearch-project:main Oct 5, 2023
23 of 24 checks passed
@bowenlan-amzn bowenlan-amzn deleted the rollover-idempotent branch October 5, 2023 23:48
opensearch-trigger-bot bot pushed a commit that referenced this pull request Oct 5, 2023
Signed-off-by: bowenlan-amzn <[email protected]>
(cherry picked from commit f543d93)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
bowenlan-amzn pushed a commit that referenced this pull request Oct 5, 2023
(cherry picked from commit f543d93)

Signed-off-by: bowenlan-amzn <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Joshua152 pushed a commit to Joshua152/index-management that referenced this pull request Oct 10, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Oct 14, 2023
(cherry picked from commit f543d93)

Signed-off-by: bowenlan-amzn <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
(cherry picked from commit 4dc7997)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants