-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Migrations] Update all aliases with a single updateAliases() when relocating SO documents #158940
[Migrations] Update all aliases with a single updateAliases() when relocating SO documents #158940
Conversation
@@ -242,6 +249,12 @@ export const nextActionMap = ( | |||
}), | |||
MARK_VERSION_INDEX_READY: (state: MarkVersionIndexReady) => | |||
Actions.updateAliases({ client, aliasActions: state.versionIndexReadyActions.value }), | |||
MARK_VERSION_INDEX_READY_SYNC: (state: MarkVersionIndexReady) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's where the synchronization magic happens:
- All migrators involved in a relocation will wait for the rest to reach this point.
- They will then provide the aliases that they intend to update (in the
payload
property). - The
updateRelocationAliases
synchronisation object willthen
call theclient.indices.updateAliases
one single time. - Each of the migrators will receive the same response for the global update.
@@ -1458,7 +1470,9 @@ export const model = (currentState: State, resW: ResponseType<AllActionStates>): | |||
// index. | |||
return { | |||
...stateP, | |||
controlState: 'MARK_VERSION_INDEX_READY', | |||
controlState: stateP.mustRelocateDocuments | |||
? 'MARK_VERSION_INDEX_READY_SYNC' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All migrators involved in a relocation will update the aliases simultaneously.
return new Defer<T>(); | ||
} | ||
|
||
export function createWaitGroupMap<T, U>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed all "defer" objects to "waitGroups" following Pierre's suggestion (less confusing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! However, I'm not a migration expert yet. So I'd leave the LGTM to someone else :)
throwDelayMillis: 1000, // another migrator has failed for a reason, let it take Kibana down and log its problem | ||
}; | ||
} else { | ||
throwBadResponse(stateP, left); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking at the line below, should this be return throwBadResponse(...)
(mind the return
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! I didn't notice, but in the code there's a few of each. I suppose it does not matter, cause the throwBadResponse
throws an error anyway. We can choose one and make it consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with the tight deadline, I think we should rather not try to decide on what form we prefer and make it consistent in this PR, typescript has us covered either way.
@@ -242,6 +249,12 @@ export const nextActionMap = ( | |||
}), | |||
MARK_VERSION_INDEX_READY: (state: MarkVersionIndexReady) => | |||
Actions.updateAliases({ client, aliasActions: state.versionIndexReadyActions.value }), | |||
MARK_VERSION_INDEX_READY_SYNC: (state: MarkVersionIndexReady) => | |||
Actions.synchronizeMigrators({ | |||
waitGroup: updateRelocationAliases, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: Could it happen that 1 alias creation fails (while the others succeed) and it triggers a whole re-run of the migrations on the next restart?
Should we create all aliases in the same call?
updateAliases({ | ||
client: options.elasticsearchClient, | ||
aliasActions: allAliasActions.flat(), | ||
})() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh! this is the response to my previous comment, right? 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's correct!
There's a single call to update them all.
In fact, they all await on something like:
(Promise.all([migrator1, migrator2, migratorN]).then(updateAliases))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the simplest solutions I came up with, but perhaps we can create a separate issue to refactor and make something more centralised, some sort of SynchronizationManager
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had the same "Oh!" moment 😅 I would not come here to look for an updateAliases
action call. I think it's fine to merge as-is and rather spend extra time on testing.
But I think it would be worth exploring if we could make this less "surprising". One way might be to let every migrator call the updateAliases action. Before we had 6 update aliases calls with each call doing one index/alias. Now we'd have 1 batch call and 5 no-ops.
Another option could be to use the synchronizeMigrators.then
hook but only let the .kibana
migrator do the updateAliases call, other migrators have a no-op then hook.
@@ -1474,9 +1488,19 @@ export const model = (currentState: State, resW: ResponseType<AllActionStates>): | |||
} else if (stateP.controlState === 'CREATE_NEW_TARGET') { | |||
const res = resW as ExcludeRetryableEsError<ResponseType<typeof stateP.controlState>>; | |||
if (Either.isRight(res)) { | |||
if (res.right === 'index_already_exists') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This the second fix of the PR:
If we are creating a new index but the index already exists:
- it probably belongs to a previous failed upgrade
- which managed to create the index
- but failed before it could create the aliases
In this scenario, instead of simply completing the migration, we will attempt to update mappings first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering why index_already_exists
was a right
response.
Strictly speaking, if the index exists, and it shouldn't, then the previous migration failure would be an error state, i.e. a left
response). But I see what we're doing here, migrations skip over index creation and move on. Hence, it's a recoverable 'unexpected' case.
It makes sense now, great!
() => { | ||
if (!aliasActions || !aliasActions.length) throw Error('updating NO aliases!'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, that was a debug statement, removing!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's get some more eyes on it, but the sync process and the handling for index_already_exists
all make sense to me. Thanks for jumping on this so quickly @gsoldevila ❤️
return { ...stateP, controlState: 'LEGACY_CREATE_REINDEX_TARGET' }; | ||
} else { | ||
// @ts-expect-error TS doesn't correctly narrow this type to never | ||
return throwBadResponse(stateP, res); | ||
throwBadResponse(stateP, left); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume it was just a mistake that we were previously throwing res
here instead of res.left
, and that we aren't losing anything important in our error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this branch of the conditional, res
should not contain anything apart from left
.
I looked at the code that can throw this exception:
packages/core/saved-objects/core-saved-objects-migration-server-internal/src/actions/set_write_block.ts
And I can confirm, the only possible Left responses are IndexNotFound | RetryableEsClientError
, none of them have anything else.
Finally, the throwBadResponse
only JSON.stringifies the received object, so I believe it's completely safe to delete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah we're just helping TS see what we already know by giving it the const left = res.left;
hint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM.
💚 Build Succeeded
Metrics [docs]Public APIs missing exports
Unknown metric groupsESLint disabled line counts
Total ESLint disabled count
History
To update your PR or re-run it, just comment with: |
This PR addresses remarks and feedback from #158940, which was part of an emergency release.
This PR adds #158733 to the list of known issues: * issue: #158733 * pull: #158940 --------- Co-authored-by: James Rodewig <[email protected]>
#159221) # Backport This will backport the following commits from `8.8` to `main`: - [[DOCS+] Add #158940 to the list of 8.8.0 known issues (#159197)](#159197) <!--- Backport version: 8.9.7 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Gerard Soldevila","email":"[email protected]"},"sourceCommit":{"committedDate":"2023-06-07T13:16:53Z","message":"[DOCS+] Add #158940 to the list of 8.8.0 known issues (#159197)\n\nThis PR adds #158733 to the list\r\nof known issues:\r\n* issue: https://github.com/elastic/kibana/issues/158733\r\n* pull: https://github.com/elastic/kibana/pull/158940\r\n\r\n---------\r\n\r\nCo-authored-by: James Rodewig <[email protected]>","sha":"528671e3bdcf65856c52cb48bbfaec231bdbaca3","branchLabelMapping":{"^v8.8.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["Team:Docs","release_note:skip","docs","Feature:Migrations"],"number":159197,"url":"https://github.com/elastic/kibana/pull/159197","mergeCommit":{"message":"[DOCS+] Add #158940 to the list of 8.8.0 known issues (#159197)\n\nThis PR adds #158733 to the list\r\nof known issues:\r\n* issue: https://github.com/elastic/kibana/issues/158733\r\n* pull: https://github.com/elastic/kibana/pull/158940\r\n\r\n---------\r\n\r\nCo-authored-by: James Rodewig <[email protected]>","sha":"528671e3bdcf65856c52cb48bbfaec231bdbaca3"}},"sourceBranch":"8.8","suggestedTargetBranches":[],"targetPullRequestStates":[]}] BACKPORT--> Co-authored-by: Gerard Soldevila <[email protected]>
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
15 similar comments
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Fixes #158733
The goal of this modification is to enforce migrators of all indices involved in a relocation (e.g. as part of the dot kibana split) to create the index aliases in the same
updateAliases()
call.This way, either: