Implement zero downtime migration algorithm #150309

Closed
pgayvallet opened this issue Feb 6, 2023 · 1 comment

pgayvallet commented Feb 6, 2023

Part of #150296
Somewhat depends on #150301 (we could maybe do both in parallel, but doing the doc migration part first would probably give us a clearer picture)

  • add a new migrations.managed configuration setting to switch between the two implementations
  • implement the base of the algorithm, especially the part storing / checking the model version in the index's meta
  • implement the whole migration algorithm

The new managed algorithm will be dissociated from the existing (v2) one, but ideally we would reuse as many low-level building blocks as possible (e.g. actions)
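
As a rough illustration of the first two items, here is a minimal sketch of switching on the `migrations.managed` setting and of reading / writing versions in the index's `_meta`. The type names, the `mappingVersions` key, and the helper functions are assumptions made for this example, not the actual implementation.

```typescript
// Hypothetical shapes for illustration only; not Kibana's actual types.
type ModelVersionMap = Record<string, number>;

interface IndexMapping {
  properties: Record<string, unknown>;
  // Assumption: versions are stored under a `mappingVersions` key of `_meta`.
  _meta?: { mappingVersions?: ModelVersionMap };
}

interface MigrationsConfig {
  // Mirrors the proposed `migrations.managed` setting.
  managed: boolean;
}

// Pick which implementation to run, based on the configuration flag.
function selectAlgorithm(config: MigrationsConfig): 'managed' | 'v2' {
  return config.managed ? 'managed' : 'v2';
}

// Read the versions stored in the index meta (empty map if none are stored).
function readStoredVersions(mapping: IndexMapping): ModelVersionMap {
  const stored = mapping._meta?.mappingVersions;
  return stored ? { ...stored } : {};
}

// Return a copy of the mapping with the given versions written to its meta.
function withStoredVersions(mapping: IndexMapping, versions: ModelVersionMap): IndexMapping {
  return {
    ...mapping,
    _meta: { ...mapping._meta, mappingVersions: { ...versions } },
  };
}
```

Keeping the helpers pure (returning a new mapping rather than mutating it) makes them easy to share between the two implementations.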

Follow-up / optimizations:

  • [req] support for upgrading docs based on coreMigrationVersion
  • [req] support for executing only the index-state part when running from non-migrator instances (to bootstrap the index)
  • [opt] during the reindex following the update of the mapping, only reindex documents whose mappings were updated
  • [opt] optimize documentsUpdateInit to perform a version check and skip most tasks if same version
  • [opt] create the aliases directly during CREATE_TARGET_INDEX to allow skipping UPDATE_ALIASES when creating the index

PRs

@pgayvallet pgayvallet added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Feature:Migrations Epic:ZDTmigrations Zero downtime migrations labels Feb 6, 2023
@pgayvallet pgayvallet self-assigned this Feb 15, 2023
pgayvallet added a commit that referenced this issue Feb 27, 2023
## Summary

Part of #150309

The purpose of this PR is to create the skeleton of the ZDT algorithm, to
make sure we're all aligned on how we'll be managing our codebase between
the two implementations (and to ease the review of the follow-up PRs by
not having the bootstrap of the algo to review at the same time).

---------

Co-authored-by: kibanamachine <[email protected]>
pgayvallet added a commit that referenced this issue Mar 8, 2023
## Summary

Part of #150309

This PR implements the first stage (mapping check / update) of the ZDT
algorithm, following the schema from the design document:

<img width="1114" alt="Screenshot 2023-02-28 at 09 23 07"
src="https://user-images.githubusercontent.com/1532934/221795647-4e3d8ad0-18a1-4e2a-8c0d-dd70e66a3c25.png">

Which translates to this:

<img width="700" alt="Screenshot 2023-03-01 at 14 30 50"
src="https://user-images.githubusercontent.com/1532934/222153028-8e2cc6e8-4da2-4ca6-b299-61db6fbb624e.png">
bmorelli25 pushed a commit to bmorelli25/kibana that referenced this issue Mar 10, 2023
bmorelli25 pushed a commit to bmorelli25/kibana that referenced this issue Mar 10, 2023
pgayvallet added a commit that referenced this issue Mar 28, 2023
## Summary

Part of #150309
Follow-up of #152219

Implement the second part of the zero-downtime migration algorithm: the
document conversion.

### Schema

Because a schema is worth a thousand words:

<img width="650" alt="Screenshot 2023-03-22 at 08 33 44"
src="https://user-images.githubusercontent.com/1532934/226832339-d74d8349-9969-4c51-a5fe-f77558f17b67.png">


### TODO / notepad

- ~check that all types have model versions in INIT~ will do later, once
we start having real types using MVs
- [x] Optimize to skip document migration when creating new index
- [x] documentsUpdateInit: extract remaining logic to utilities
- [x] outdatedDocumentsSearchRead: cleanup corrupted doc logic
- [x] outdatedDocumentsSearchTransform: cleanup corrupted doc logic
- [x] tests for /zdt/actions/wait_for_delay.ts ?
- ~support for coreMigrationVersion~ added as a follow-up in the parent
issue
- [x] init -> equal -> check if aliasActions is empty

---------

Co-authored-by: kibanamachine <[email protected]>
pgayvallet added a commit that referenced this issue May 4, 2023
…156345)

## Summary

Part of #150309
A few enhancements to the ZDT migration algorithm.

### 1. Run the 'expand' phase (and only this one) on non-migrator nodes

Given our latest changes to the way we want the algo to function, the
non-migrator nodes will have to run the 'expand' (schema expansion)
phase. However, the document migration phase will have to be run by the
migrator node exclusively.

Note: because it was required for integration tests, a new
`migration.zdt.runOnNonMigratorNodes` option was introduced to change
this behavior and have non-migrator nodes ignore this limitation.
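
A minimal sketch of the rule described above, assuming a hypothetical `NodeContext` shape and phase names chosen for this example (they are not taken from the codebase):

```typescript
// Hypothetical phase names and node context, for illustration only.
type Phase = 'expand' | 'document-migration';

interface NodeContext {
  // Whether this node is a migrator node.
  isMigrator: boolean;
  // Mirrors the `runOnNonMigratorNodes` option mentioned above.
  runOnNonMigratorNodes: boolean;
}

// Every node runs 'expand'; document migration is reserved for migrator
// nodes unless the test-oriented override is enabled.
function phasesToRun(node: NodeContext): Phase[] {
  const phases: Phase[] = ['expand'];
  if (node.isMigrator || node.runOnNonMigratorNodes) {
    phases.push('document-migration');
  }
  return phases;
}
```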

### 2. Don't terminate during `INIT` if higher mapping versions are
found

Any mapping changes are upward compatible, meaning that we can safely
no-op instead of failing if the mapping version check result is
`lesser`. This change is required now that mapping updates will be
performed before all nodes of the previous version are shut down (and is
also required for rollbacks).
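
To make the check concrete, here is a sketch of a version comparison that yields the four possible results, along with the relaxed `INIT` termination rule. The function names, and the choice to treat a type missing from one side as version 0, are assumptions of this example.

```typescript
type VersionCheckResult = 'greater' | 'equal' | 'lesser' | 'conflict';
type VersionMap = Record<string, number>;

// Compare the versions known to the running node (app) against the ones
// stored in the index. Assumption: a missing type counts as version 0.
function compareVersions(app: VersionMap, index: VersionMap): VersionCheckResult {
  let hasGreater = false;
  let hasLesser = false;
  // Duplicated keys are harmless here: they are simply compared twice.
  const types = Object.keys(app).concat(Object.keys(index));
  for (const type of types) {
    const appVersion = app[type] ?? 0;
    const indexVersion = index[type] ?? 0;
    if (appVersion > indexVersion) hasGreater = true;
    if (appVersion < indexVersion) hasLesser = true;
  }
  if (hasGreater && hasLesser) return 'conflict';
  if (hasGreater) return 'greater';
  if (hasLesser) return 'lesser';
  return 'equal';
}

// With this change, only 'conflict' terminates INIT; 'lesser' is a no-op.
function shouldTerminateInit(result: VersionCheckResult): boolean {
  return result === 'conflict';
}
```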

### 3. Perform a version check during `DOCUMENTS_UPDATE_INIT`

We were always executing the full doc update cycle when entering this
stage. We're now performing a version check similar to what was done
during `INIT`.

If the check result returns:
- `greater`: we perform the document migration (as it was done before
this change)
- `equal`: we skip the document migration
- `lesser`: we skip the document migration (**NOTE**: this may change
later depending on how we handle rollbacks)
- `conflict`: we terminate with a failure, as done during `INIT`
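
The mapping from check result to action listed above can be sketched as a single exhaustive switch. The decision labels and the function name are hypothetical, chosen for this example.

```typescript
type VersionCheckResult = 'greater' | 'equal' | 'lesser' | 'conflict';
// Hypothetical labels for what DOCUMENTS_UPDATE_INIT does next.
type DocumentsUpdateDecision = 'migrate-documents' | 'skip' | 'fail';

function decideDocumentsUpdate(result: VersionCheckResult): DocumentsUpdateDecision {
  switch (result) {
    case 'greater':
      // Outdated documents may exist: run the document migration.
      return 'migrate-documents';
    case 'equal':
    case 'lesser':
      // Nothing to do ('lesser' may change depending on rollback handling).
      return 'skip';
    case 'conflict':
      // Same behavior as during INIT: terminate with a failure.
      return 'fail';
  }
}
```

Because the switch covers every member of the union, the compiler will flag this function if a new check result is added later.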

---------

Co-authored-by: kibanamachine <[email protected]>
@pgayvallet
Contributor Author

The main algorithm has been implemented. I opened #161067 to track the follow-ups. I'll go ahead and close this one.
