Implement the second part of the ZDT migration algorithm #153031

Merged (47 commits) on Mar 28, 2023

Contributor

@pgayvallet pgayvallet commented Mar 9, 2023

Summary

Part of #150309
Follow-up of #152219

Implement the second part of the zero-downtime migration algorithm: the document conversion.

Schema

Because a schema is worth a thousand words:

[Image: Screenshot 2023-03-22 at 08 33 44]

TODO / notepad

  • check that all types have model versions in INIT (will do later, once we start having real types using MVs)
  • Optimize to skip document migration when creating new index
  • documentsUpdateInit: extract remaining logic to utilities
  • outdatedDocumentsSearchRead: cleanup corrupted doc logic
  • outdatedDocumentsSearchTransform: cleanup corrupted doc logic
  • tests for /zdt/actions/wait_for_delay.ts ?
  • support for coreMigrationVersion (added as a follow-up in the parent issue)
  • init -> equal -> check if aliasActions is empty

@pgayvallet pgayvallet added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Feature:Migrations Epic:ZDTmigrations Zero downtime migrations labels Mar 9, 2023
@pgayvallet pgayvallet force-pushed the kbn-150309-zdt-algo-part-2 branch from e2d36fe to 1646817 Compare March 20, 2023 12:01
@pgayvallet pgayvallet added release_note:skip Skip the PR/issue when compiling release notes v8.8.0 labels Mar 22, 2023
@elasticmachine
Contributor

Pinging @elastic/kibana-core (Team:Core)

Contributor

@jloleysens jloleysens left a comment

Great job @pgayvallet! This is looking really solid. I left a few comments, but no blockers from my side. I am guessing more reviews are in flight 👀.

One thought: it would be nice to avoid deleting unknown documents, since keeping them should be safe when leveraging model versions and on-read migration. I understand that maintaining parity with the previous algo is probably still safest, but my gut feeling is that we should avoid discarding data unless absolutely necessary.

Comment on lines 70 to 72
// couldn't find a way to infer the type of the state depending on the state of the handler
// even if they are directly coupled, so had to force-cast to this ugly any instead.
return stageHandler(current as any, response as any, context);
Contributor

Yeah, that is really annoying. One idea, if you specifically want to avoid using any:

type AnyModelStageHandler = (
  state: State,
  response: Either.Either<unknown, unknown>,
  ctx: MigratorContext
) => State;

then later

  const stageHandler = modelStageMap[current.controlState] as AnyModelStageHandler;

Still uses type cast. Happy to stick with what you have + comment.
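
For illustration, here is a minimal, self-contained sketch of that single-cast idea. The state shapes, the `MigratorContext` fields, and the handler bodies are hypothetical stand-ins rather than the PR's real types, and `response` is typed as plain `unknown` instead of `Either.Either<unknown, unknown>` to avoid external imports.

```typescript
// Hypothetical, simplified stand-ins for the real migration types.
type InitState = { controlState: 'INIT'; retries: number };
type DoneState = { controlState: 'DONE' };
type State = InitState | DoneState;

interface MigratorContext {
  maxRetries: number;
}

// Each stage handler is precisely typed for its own state.
type StageHandler<S extends State> = (
  state: S,
  response: unknown,
  ctx: MigratorContext
) => State;

const modelStageMap: {
  [K in State['controlState']]: StageHandler<Extract<State, { controlState: K }>>;
} = {
  INIT: (state, _response, ctx) =>
    state.retries < ctx.maxRetries ? { controlState: 'DONE' } : state,
  DONE: (state) => state,
};

// The dispatcher only knows the union, so one cast to a loosely typed handler
// replaces the two `as any` casts at the call site.
type AnyModelStageHandler = (
  state: State,
  response: unknown,
  ctx: MigratorContext
) => State;

function model(current: State, response: unknown, ctx: MigratorContext): State {
  const stageHandler = modelStageMap[current.controlState] as AnyModelStageHandler;
  return stageHandler(current, response, ctx);
}
```

The cast is still unchecked: the compiler cannot prove that the handler looked up for `current.controlState` matches `current`. The sketch only confines that assumption to one line.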

Contributor

❤️ these small files

@@ -48,10 +47,11 @@ export const createTargetIndex: ModelStage<'CREATE_TARGET_INDEX', 'UPDATE_ALIASE

 return {
   ...state,
-  controlState: 'UPDATE_ALIASES',
+  controlState: aliasActions.length ? 'UPDATE_ALIASES' : 'INDEX_STATE_UPDATE_DONE',
Contributor

Forced optimisation :P
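
As a rough sketch of the branch in this hunk: the stage moves on to the alias update only when there is at least one alias action to apply. The `AliasAction` and state shapes below are trimmed-down assumptions, and whether `aliasActions` lives on the state or is computed locally in the real stage is not visible in the excerpt.

```typescript
// Hypothetical, trimmed-down shapes; the real state carries many more fields.
type AliasAction =
  | { add: { index: string; alias: string } }
  | { remove: { index: string; alias: string } };

interface CreateTargetIndexState {
  controlState: 'CREATE_TARGET_INDEX';
  aliasActions: AliasAction[];
}

type NextControlState = 'UPDATE_ALIASES' | 'INDEX_STATE_UPDATE_DONE';

// Skip the alias update stage entirely when there is nothing to update.
function nextControlState(state: CreateTargetIndexState): NextControlState {
  return state.aliasActions.length ? 'UPDATE_ALIASES' : 'INDEX_STATE_UPDATE_DONE';
}
```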

await runMigrations();
};

it('should perform a no-op upgrade', async () => {
Contributor

To check my understanding: does the ZDT algo actually re-run queries to check for outdated documents?

Contributor Author

It currently does. It's probably not necessary if we were to check the docVersions property, but I didn't want to optimize too soon (and checking for docs every time is safest anyway).

@afharo
Member

afharo commented Mar 24, 2023

ACK! I'll take a look on Monday

Member

@afharo afharo left a comment

LGTM! Just added a few minor comments

/**
* Indicates that the algorithm is currently converting the documents.
*/
convertingDocuments: boolean;
Member

++ to keeping it simple for now. In the future we might want to include a migration_stage field that indicates which step the migration is in. I think it might help with troubleshooting.

*/

import { BulkOperationContainer } from '@elastic/elasticsearch/lib/api/types';
import type { BulkOperation } from '../model/create_batches';
Member

nit: importing non-common types from the common dir. We should try to avoid that 😇

Contributor Author

Yeah... I'm planning on doing some 'move all the things' refactor, but I'd rather wait until the algorithm is more polished before doing so.

Comment on lines +11 to +12
const nextTick = () => new Promise<void>((resolve) => resolve());
const aFewTicks = () => nextTick().then(nextTick).then(nextTick);
Member

Hahaha! The .then ordering in Node.js is interesting...

I think the following works and completes immediately because of the fake timers:

it('resolves after the specified amount of time', async () => {
  const handler = jest.fn();

  const promise = waitForDelay({ delayInSec: 5 })().then(handler);

  expect(handler).not.toHaveBeenCalled();

  jest.advanceTimersByTime(5000);
  await promise;

  expect(handler).toHaveBeenCalledTimes(1);
});
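
To make the ordering behind this suggestion concrete, here is a small sketch that runs without Jest. The `waitForDelay` below is a hypothetical stand-in, not the action from the PR (which may well return a different result type). It only assumes the shape used in the test above: a function that takes `{ delayInSec }` and returns a thunk producing a promise. A real but very short delay replaces the fake timers.

```typescript
// Hypothetical stand-in for the action under test: returns a thunk that
// resolves once `delayInSec` seconds have elapsed.
const waitForDelay =
  ({ delayInSec }: { delayInSec: number }) =>
  (): Promise<void> =>
    new Promise((resolve) => setTimeout(resolve, delayInSec * 1000));

async function demo(): Promise<[boolean, boolean]> {
  let handled = false;

  // Chaining `.then` schedules the handler; it cannot run before the timer fires.
  const promise = waitForDelay({ delayInSec: 0.01 })().then(() => {
    handled = true;
  });

  const before = handled; // read synchronously, before the timer has fired
  await promise; // awaiting the chained promise guarantees the handler ran
  const after = handled;

  return [before, after];
}
```

Awaiting the chained promise, rather than flushing an arbitrary number of microtask ticks, ties the assertion directly to the completion of the handler.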

@pgayvallet
Contributor Author

@elasticmachine merge upstream

@pgayvallet
Contributor Author

@elasticmachine merge upstream

@@ -48,10 +47,11 @@ export const createTargetIndex: ModelStage<'CREATE_TARGET_INDEX', 'UPDATE_ALIASE

 return {
   ...state,
-  controlState: 'UPDATE_ALIASES',
+  controlState: aliasActions.length ? 'UPDATE_ALIASES' : 'INDEX_STATE_UPDATE_DONE',
Contributor

If we create a new index, shouldn't the alias actions always be empty?

But I wonder why we add aliases as a second step instead of creating the index with the aliases already set (createIndex accepts aliases param)?

Comment on lines 21 to 26
if (state.newIndexCreation) {
return {
...state,
controlState: 'DONE',
};
} else {
Contributor

@rudolf rudolf Mar 27, 2023

We have tsdocs on state.newIndexCreation, but it would be useful to add (another) comment here to explain why we can skip to done.

Comment on lines +101 to +103
meta: setMetaDocMigrationStarted({
meta: state.currentIndexMeta,
}),
Contributor

@rudolf rudolf Mar 28, 2023

This is sometimes awkward in the v2 migration model too :( It's not great that the previous stage needs to know how to prepare the state for the next stage. It almost feels like we want two callbacks for each stage in the model: one when entering that state and another when we get an action response. But even if we try this, it feels like the kind of thing we'd want to do in another PR anyway.

The downside of adding it here is that there's business logic outside of the model and outside the actions, making it harder to understand and test. I think we should either

  1. add more test coverage to next.ts
  2. Move this into a dedicated Actions.setDocMigrationStarted and test it there

I'd slightly lean towards (2), but this does also mean we're putting more business logic into the actions themselves 🤷
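
A minimal sketch of the "two callbacks per stage" idea floated above, to show what it could look like. Everything here is hypothetical: the stage names, the `docMigrationStarted` flag, and the `step` driver are illustrative only and do not correspond to the existing model API.

```typescript
// All names are hypothetical; this is not the existing model API.
interface State {
  controlState: 'STAGE_A' | 'STAGE_B' | 'DONE';
  docMigrationStarted: boolean;
}

interface Stage {
  // Prepares the state when the stage is entered, so that the previous
  // stage does not need to know how to do it.
  onEnter: (state: State) => State;
  // Computes the next state from the response of the stage's action.
  onResponse: (state: State, response: unknown) => State;
}

const stages: Record<State['controlState'], Stage> = {
  STAGE_A: {
    onEnter: (state) => state,
    // STAGE_A only names its successor; it does not prepare STAGE_B's state.
    onResponse: (state) => ({ ...state, controlState: 'STAGE_B' }),
  },
  STAGE_B: {
    // STAGE_B owns its own preparation logic.
    onEnter: (state) => ({ ...state, docMigrationStarted: true }),
    onResponse: (state) => ({ ...state, controlState: 'DONE' }),
  },
  DONE: {
    onEnter: (state) => state,
    onResponse: (state) => state,
  },
};

function step(state: State, response: unknown): State {
  const next = stages[state.controlState].onResponse(state, response);
  // Run the entry hook only when the control state actually changes.
  return next.controlState !== state.controlState
    ? stages[next.controlState].onEnter(next)
    : next;
}
```

With this split, the preparation logic can be unit-tested through the entering stage's `onEnter` alone, at the cost of a slightly larger model surface.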

@pgayvallet pgayvallet force-pushed the kbn-150309-zdt-algo-part-2 branch from 848bbd5 to 359c64c Compare March 28, 2023 13:13
@pgayvallet pgayvallet enabled auto-merge (squash) March 28, 2023 15:13
@kibana-ci
Collaborator

💚 Build Succeeded

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/core-saved-objects-base-server-internal | 56 | 52 | -4 |

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/core-saved-objects-base-server-internal | 7 | 8 | +1 |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/core-saved-objects-base-server-internal | 76 | 72 | -4 |

ESLint disabled line counts

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/core | 7 | 8 | +1 |
| securitySolution | 433 | 436 | +3 |
| total | | | +4 |

Total ESLint disabled count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/core | 8 | 9 | +1 |
| securitySolution | 513 | 516 | +3 |
| total | | | +4 |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@pgayvallet pgayvallet merged commit 3ff906d into elastic:main Mar 28, 2023
@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Mar 28, 2023
Labels
backport:skip This commit does not require backporting Epic:ZDTmigrations Zero downtime migrations Feature:Migrations release_note:skip Skip the PR/issue when compiling release notes Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v8.8.0
7 participants