-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ZDT migration] Don't run document migration on non-migrator nodes #156345
[ZDT migration] Don't run document migration on non-migrator nodes #156345
Conversation
117af1f
to
fd844be
Compare
…55981) ## Summary Adapt the `zdt` migration algorithm to run the v2 migrations in addition to the model version transformations. The intent is to be able to use the current migration system in a zero-downtime upgrade friendly-ish way ### Technicals elastic#148656 was a precondition, as the zdt algo requires that mapping changes are compatible (given we keep the same index). Technically, it means we're now storing the `virtualVersion` instead of the `modelVersion` in the index's meta (`mappingVersions` and `docVersions`), to allow keeping track of mixed stack and model versions per type. Note that switching to model versioning is still a one way, non-revertible action. Once using model versioning (by specifying `switchToModelVersionAt` on your type), there is no going back.
fd844be
to
487bc49
Compare
…gration-for-non-mig-nodes
…gration-for-non-mig-nodes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self review
if (logger.isLevelEnabled('debug')) { | ||
logger.debug<StateTransitionLogMeta>(logMessage, { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a small improvement, since we implemented Logger.isLevelEnabled
a while ago. Only log one message depending on the log level.
const toContainLogEntries: MatcherFunction<[entries: string[], options: { ordered?: boolean }]> = ( | ||
actual, | ||
entries, | ||
options = {} | ||
) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finally added it!
// app version is lower than the index mapping version. | ||
// Should only occur in case of rollback. | ||
// For now, we don't do anything. | ||
case 'lesser': | ||
return { | ||
...state, | ||
logs, | ||
controlState: 'DONE', | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, we just perform a noop for downgrades and go straight to DONE. It may change in the future, depending on how we want to manage that, but it probably makes sense to just have a defined behavior for now, to be able to add tests around it.
/** | ||
* When true, will fully skip document migration, and will transition directly to DONE | ||
* after the INDEX_STATE_UPDATE_DONE stage. | ||
* | ||
* This flag is set to `true` in the following scenarios: | ||
* - on nodes without the `migrator` role, the flag will always be `true`. | ||
* - if the migrator create the index, the workflow will set the flag to `true` given there is nothing to migrate. | ||
*/ | ||
readonly skipDocumentMigration: boolean; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As documented - this flag replaces newIndexCreation
const runDocumentMigration = | ||
context.nodeRoles.migrator || context.migrationConfig.zdt.runOnNonMigratorNodes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As explained on the PR's description: this zdt.runOnNonMigratorNodes
was added to force non-migrator nodes to run the document migration. The only usage for now is integration tests, but we may want to use this in single-node dev / test environments.
// app version is lower than the index mapping version. | ||
// likely a rollback scenario - unsupported for the initial implementation | ||
// either a rollback scenario, or an old node rebooting during the cohabitation period | ||
// in that case, we simply no-op the expand phase. | ||
case 'lesser': | ||
return { | ||
...state, | ||
controlState: 'FATAL', | ||
reason: 'Downgrading model version is currently unsupported', | ||
logs, | ||
controlState: 'INDEX_STATE_UPDATE_DONE', | ||
...commonState, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change of behavior - encountering higher versions of the mappings in the index's meta now just skip the schema expansion phase
describe('ZDT upgrades - basic downgrade', () => { | ||
let esServer: TestElasticsearchUtils['es']; | ||
|
||
const startElasticsearch = async () => { | ||
const { startES } = createTestServers({ | ||
adjustTimeout: (t: number) => jest.setTimeout(t), | ||
settings: { | ||
es: { | ||
license: 'basic', | ||
}, | ||
}, | ||
}); | ||
return await startES(); | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a basic test asserting of the current behavior of the algo in case of 'downgrade' (booting with a lower version of the SO types)
…gration-for-non-mig-nodes
Pinging @elastic/kibana-core (Team:Core) |
@@ -98,7 +99,7 @@ describe('migrationsStateActionMachine', () => { | |||
abort, | |||
}); | |||
const logs = loggingSystemMock.collect(mockLogger); | |||
const doneLog = logs.info.splice(8, 1)[0][0]; | |||
const doneLog = logs.info.splice(4, 1)[0][0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT could you add a little comment describing what you're doing with this obscure line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I
'm doing nothing, just adapting someone else
's hack
...saved-objects-migration-server-internal/src/zdt/model/stages/index_state_update_done.test.ts
Outdated
Show resolved
Hide resolved
describe('ZDT upgrades - basic downgrade', () => { | ||
let esServer: TestElasticsearchUtils['es']; | ||
|
||
const startElasticsearch = async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT Since this function is very common, I made it available directly in the kibana_migrator_test_kit.ts
(under the same name).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, great. Wasn't aware of that and TBH I thought about extracting it in this PR. Will adapt accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall lgtm. Left a q about node roles. Did not test locally but test coverage looks great!
elasticsearchClient: elasticsearchClientMock.createElasticsearchClient(), | ||
maxRetryAttempts: 15, | ||
migrationDocLinks: docLinksServiceMock.createSetupContract().links.kibanaUpgradeSavedObjects, | ||
typeRegistry, | ||
serializer: serializerMock.create(), | ||
deletedTypes: ['deleted-type'], | ||
discardCorruptObjects: false, | ||
nodeRoles: { migrator: true, ui: true, backgroundTasks: true }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our current config validation prohibits migrator being combined with other roles. I'm guessing we are just doing all roles true in this mock for convenience?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, pretty much. If we want to be more consistent, I can adapt that to { migrator: true, ui: false, backgroundTasks: false }
, it shouldn't cause any test to fail. Will try
|
||
// check that the mappingVersions are still v2 | ||
expect(mappingMeta.mappingVersions).toEqual({ | ||
sample_type: '10.2.0', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ignorant question: How / where is this mappingVersions 10.2.0
defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Model version 2 = virtual version 10.2.0
.
…gration-for-non-mig-nodes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with few nits and comments!
@@ -18,7 +18,7 @@ export const indexStateUpdateDone: ModelStage< | |||
throwBadResponse(state, res as never); | |||
} | |||
|
|||
if (state.newIndexCreation) { | |||
if (state.skipDocumentMigration) { | |||
// we created the index, so we can safely skip the whole document migration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT outdated comment
// this migrator is not configured to update documents (or its index has just been created), skip the whole document migration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to move this condition inside the documentsUpdateInit()
?
Otherwise, if in the future we have other paths that lead to the DOCUMENTS_UPDATE_INIT we'll have to add the if(state.skipDocumentMigration)
condition in each of them. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reasoning was that INDEX_STATE_UPDATE_DONE
was a mandatory step (as skipping the index update still leads to this stage), so we should be good keeping it there for now.
import { SavedObjectTypeRegistry } from '@kbn/core-saved-objects-base-server-internal'; | ||
import { | ||
SavedObjectTypeRegistry, | ||
SavedObjectsMigrationConfigType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT can be imported as a type
…gration-for-non-mig-nodes
💚 Build Succeeded
Metrics [docs]Public APIs missing comments
Unknown metric groupsAPI count
ESLint disabled line counts
Total ESLint disabled count
History
To update your PR or re-run it, just comment with: |
Summary
Part of #150309
A few enhancements to the ZDT migration algorithm.
1. Run the 'expand' phase (and only this one) on non-migrator nodes
Given our latests changes to the way we want the algo to function, the non-migrator nodes will have to run the 'expand' (schema expansion) phase. However, the document migration phase will have to be run by the migrator node exclusively.
Note: because it was required for integration tests, a new
migration.zdt.runOnNonMigratorNodes
option was introduced to change this behavor and have non-migrator nodes ignore this limitation.2. Don't terminate during
INIT
if higher mapping versions are foundAny mapping changes are upward compatible, meaning that we can safely no-op instead of failing of the mapping version check result is
lesser
. This change is required now that mapping updates will be performed before all nodes of the previous version are shut down (and is also required for rollbacks)3. Perform a version check during
DOCUMENTS_UPDATE_INIT
We were always executing the full doc update cycle when entering this stage. We're now performing a version check similar to what was done during
INIT
.If the check result returns:
greater
: we perform the document migration (as it was done before this change)equal
: we skip the document migrationlesser
: we skip the document migration (NOTE: this may change later depending on how we handle rollbacks)conflict
: we terminate with a failure, as done duringINIT