Bootstrap ZDT migration algorithm #151282

pgayvallet · 2023-02-15T12:16:24Z

Summary

The purpose of this PR is to create the skeleton of the ZDT algorithm, to make sure we're all aligned on how we'll manage our codebase between the two implementations (and to ease the review of the follow-up PRs by not having the bootstrap of the algorithm to review at the same time).


pgayvallet

Self-review, trying to explain my reasoning:

pgayvallet · 2023-02-15T14:34:27Z

packages/core/saved-objects/core-saved-objects-base-server-internal/src/saved_objects_config.ts

+  algorithm: schema.oneOf([schema.literal('v2'), schema.literal('zdt')], {
+    defaultValue: 'v2',
+  }),


I wasn't sure how to name it, so I went with zdt. If anyone prefers managed, or anything else, please feel free to tell me. We could even change the algorithm name to something else.

I'm fine with a more descriptive algorithm name, managed is a little too obscure.

From someone using this API's perspective the values v2 and zdt seem to not have any relation. This is OK, but we need a good doc comment about what each means and where it came from.

IMO zero-downtime or operator (bc I'm guessing this algo is intended to work with the "operator pattern") are also good candidates.

my 2cents: I'd use a name that makes it obvious that there is an external agent required (operator seems like a good candidate).

At the same time, I tend to prefer options that resemble the functionality they enable: Zero-downtime...

Mixing both thoughts... I wonder if we should be more explicit about when we plan to use it (or anyone willing to change the defaults is expected to use this): with the k8s operator? Something like k8s-zdt?

Yeah, naming is hard.... we could also go with serverless and default/standard. OTOH at some point we may want to allow this algo to be officially supported on prem (with 'manual' orchestration of the workflow)... So I really don't know.
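Whatever name ends up being chosen, the request for a good doc comment could be addressed along these lines. This is only a minimal sketch: it does not use the real config schema, and `MigrationAlgorithm`, `MIGRATION_ALGORITHMS`, `DEFAULT_ALGORITHM` and `parseAlgorithm` are hypothetical names introduced for illustration.

```typescript
/**
 * Migration algorithm selector (illustrative sketch, not the actual Kibana schema).
 *
 * - 'v2':  the existing algorithm, kept as the default.
 * - 'zdt': the zero-downtime algorithm bootstrapped by this PR.
 */
type MigrationAlgorithm = 'v2' | 'zdt';

const MIGRATION_ALGORITHMS: readonly MigrationAlgorithm[] = ['v2', 'zdt'];
const DEFAULT_ALGORITHM: MigrationAlgorithm = 'v2';

// Hypothetical helper: validates a raw config value, falling back to the default.
function parseAlgorithm(raw: unknown): MigrationAlgorithm {
  if (raw === undefined) {
    return DEFAULT_ALGORITHM;
  }
  const match = MIGRATION_ALGORITHMS.find((algorithm) => algorithm === raw);
  if (match === undefined) {
    throw new Error(`Unknown migration algorithm: ${String(raw)}`);
  }
  return match;
}
```

Renaming the option later would then only touch the union type and its doc comment.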

pgayvallet · 2023-02-15T14:38:31Z

...ges/core/saved-objects/core-saved-objects-migration-server-internal/src/common/utils/logs.ts

+export const logStateTransition = (
+  logger: Logger,
+  logMessagePrefix: string,
+  prevState: LogAwareState,


This was extracted from a state machine file of the v2 algorithm, where it was inlined, so that it can be reused for zdt.

Please note the src/common folder of the package. This is where I'm planning to move code shared between the two algorithms.

Basically the ideal end structure for me would be

- src
  - common
    - actions
    - other_stuff
  - v2
    - model
    - other_v2_specific_stuff
  - zdt
    - folders_this_pr_introduced

Does that seem fine to you, or should we go with another folder structure?

> Does that seem fine to you, or should we go with another folder structure?

Seems good to me. It's not worth getting too tied up in where the code lives, as long as it's somewhat categorized.

I'm all good with src/common... just wondering about the differences between src/common and src/core... Should we pick one or are we good with having both?

If we are at risk of having circular references, I'm OK with having 2 common dirs 😇

src/core is a remnant of old history tbh 😅 . Ideally we would be moving src/core to subfolders of src/common soon. I just wanted to avoid doing it in the initial PR.

pgayvallet · 2023-02-15T14:39:46Z

.../saved-objects/core-saved-objects-migration-server-internal/src/core/build_types_mappings.ts

+/**
+ * Merge mappings from all registered saved object types.
+ */
+export const buildTypesMappings = (


extracted from v2. Moved to src/core because the other mapping-related helpers are there already. src/core may eventually move to src/common/core or something later (if we feel like it's worth the effort)

pgayvallet · 2023-02-15T14:45:08Z

packages/core/saved-objects/core-saved-objects-migration-server-internal/src/kibana_migrator.ts

+    if (migrationAlgorithm === 'zdt') {
+      return this.runMigrationZdt();
+    } else {
+      return this.runMigrationV2();
+    }


The branching

pgayvallet · 2023-02-15T14:45:52Z

...objects/core-saved-objects-migration-server-internal/src/migrations_state_machine_cleanup.ts


-export async function cleanup(client: ElasticsearchClient, state?: State) {
-  if (!state) return;
+type CleanableState = { sourceIndexPitId: string } | {};


Adapted the type to make it reusable with different State types (as the 2 algos have different shapes of states)
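A minimal sketch of how a cleanup function can accept such a state, assuming the only thing to release is the point-in-time (PIT) referenced by `sourceIndexPitId`. `PitClient` and `closePointInTime` are hypothetical stand-ins, not the Elasticsearch client API.

```typescript
// Minimal stand-in for the Elasticsearch client; only what this sketch needs.
interface PitClient {
  closePointInTime(pitId: string): Promise<void>;
}

// Same shape as the type introduced in the diff above.
type CleanableState = { sourceIndexPitId: string } | {};

// Hypothetical cleanup: closes the PIT only when the state carries one.
async function cleanup(client: PitClient, state?: CleanableState): Promise<void> {
  if (!state) return;
  // The `in` check narrows the union to the variant that has the PIT id.
  if ('sourceIndexPitId' in state) {
    await client.closePointInTime(state.sourceIndexPitId);
  }
}
```

Because the parameter is a structural union rather than a specific State type, states from either algorithm can be passed without a cast.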

pgayvallet · 2023-02-15T15:08:10Z

...core/saved-objects/core-saved-objects-migration-server-internal/src/zdt/model/stages/init.ts

+import type { State } from '../../state';
+import type { ModelStage } from '../types';
+
+export const init: ModelStage<'INIT'> = (state, res, context): State => {


This allows defining the stages just like that (parameter types are all properly inferred).

Separating these into individual functions is really great! What do you think of something like this as an iteration on this idea:

// just an example
export const init: ModelTransition<'INIT', 'FATAL' | 'DONE'> = (state, res, context) => {}

In this way we will:

- leverage TS sanity check on our expected return values
- replace State return value declarations from these functions

ModelTransition would be defined as:

/**
 * Defines a transition function for the model
 */
export type ModelTransition<T extends AllActionStates, R extends AllControlStates> = (
  state: StateFromActionState<T>,
  res: StateActionResponse<T>,
  context: MigratorContext
) => StateFromControlState<T | R>;
// T | R because we assume a state can go back to itself as QOL for user of this type...

Plus utility:

export type StateFromControlState<T extends AllControlStates> = ControlStateMap[T];

I think your improvements are great. Done in f7d6c7f
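To illustrate what the extra type parameter buys, here is a simplified, self-contained sketch. The three states and their fields are hypothetical, and the `res` and `context` parameters of the suggested type are left out to keep it short.

```typescript
// Hypothetical, simplified states; the real ones are defined by the zdt algorithm.
interface InitState { controlState: 'INIT'; retryCount: number }
interface DoneState { controlState: 'DONE' }
interface FatalState { controlState: 'FATAL'; reason: string }

interface ControlStateMap { INIT: InitState; DONE: DoneState; FATAL: FatalState }
type AllControlStates = keyof ControlStateMap;
type StateFromControlState<T extends AllControlStates> = ControlStateMap[T];

// Simplified: the response and context parameters of the suggested type are omitted.
type ModelTransition<T extends AllControlStates, R extends AllControlStates> = (
  state: StateFromControlState<T>
) => StateFromControlState<T | R>;

// Allowed targets are INIT (implicitly, via T | R), DONE and FATAL.
const init: ModelTransition<'INIT', 'DONE' | 'FATAL'> = (state) => {
  if (state.retryCount > 3) {
    return { controlState: 'FATAL', reason: 'too many retries' };
  }
  return { controlState: 'DONE' };
};
```

With this shape, declaring the stage as `ModelTransition<'INIT', 'DONE'>` would turn the `FATAL` return into a compile error, which is the sanity check mentioned above.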

pgayvallet · 2023-02-15T15:09:09Z

...core/saved-objects/core-saved-objects-migration-server-internal/src/zdt/model/stages/init.ts

+  // nothing implemented yet, just going to 'DONE'
+  return {
+    ...state,
+    controlState: 'DONE',
+  };


Yeah, not the purpose of this PR, implementation will be done in follow-up(s)

pgayvallet · 2023-02-15T15:09:25Z

...saved-objects/core-saved-objects-migration-server-internal/src/zdt/model/stages/init.test.ts

+describe('Action: init', () => {
+  let context: MockedMigratorContext;
+


Per model stage test file

pgayvallet · 2023-02-15T15:10:55Z

packages/core/saved-objects/core-saved-objects-migration-server-internal/src/zdt/next.ts

+export const next = (context: MigratorContext) => {
+  const map = nextActionMap(context);
+
+  return (state: State) => {
+    const delay = <F extends (...args: any) => any>(fn: F): (() => ReturnType<F>) => {
+      return () => {
+        return state.retryDelay > 0


Also encountered an issue trying to factorize this. nextActionMap cannot be factorized, and next depends on it (and most importantly, on the shape of the State).

We could try spending some time to find a proper TS way to make this generic, but for 50 lines, it didn't feel like a priority.
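For readers unfamiliar with the `delay` helper, the part that depends on the state is small. The sketch below is an assumption about its general shape, since the diff above is truncated: `RetryableState` and `delayFor` are hypothetical names, and the real helper closes over the full State instead of taking it as an argument.

```typescript
// Hypothetical minimal state: only the field the helper reads.
interface RetryableState {
  retryDelay: number;
}

// Wraps `fn` so that it runs immediately, or after `state.retryDelay` ms
// when a retry delay is pending.
function delayFor<R>(state: RetryableState, fn: () => R): () => R | Promise<R> {
  return () =>
    state.retryDelay > 0
      ? new Promise<void>((resolve) => setTimeout(resolve, state.retryDelay)).then(fn)
      : fn();
}
```

A version like this only needs `retryDelay` from the state, which hints at how the helper could be made generic later if it ever becomes worth it.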

pgayvallet · 2023-02-15T15:12:49Z

packages/core/saved-objects/core-saved-objects-migration-server-internal/src/zdt/state/types.ts

+export interface BaseState extends ControlState {
+  readonly retryCount: number;
+  readonly retryDelay: number;
+  readonly logs: MigrationLog[];
+}


State types are different between the two algorithms (both the BaseState and the composition of the State), so I don't think we can really reuse anything in this specific part of the code.

I'm not totally onboard with having logs as part of the state.
Stages won't require checking logs for anything, so it feels weird to have them as part of the state.
That being said, alternatives would probably require having stages as classes or objects that implement some sort of interface, so perhaps that's not ideal either.

Yeah. TBH I tried just having the logger in the context initially, and it brings two issues:

- incompatibilities between helper functions of the two algos
- making it harder to control when logs are output

So I went back to putting the logs in the state...

So if we were to change that, I think we would need to do it in both implementations at the same time, which kinda closed the question for me (given it's not the same amount of work at all)
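As an illustration of the trade-off, here is a minimal sketch of logs carried in the state. It assumes stages append entries and the state machine writes only the new ones after each transition. `withLog` and `flushNewLogs` are hypothetical helpers, and the field shapes are simplified.

```typescript
interface MigrationLog {
  level: 'info' | 'warning' | 'error';
  message: string;
}

interface LogAwareState {
  readonly controlState: string;
  readonly logs: MigrationLog[];
}

// A stage stays pure: it records a message by returning a new state.
function withLog<S extends LogAwareState>(state: S, log: MigrationLog): S {
  return { ...state, logs: [...state.logs, log] };
}

// The state machine decides when buffered logs are written:
// only the entries added since the previous state are passed to `write`.
function flushNewLogs(
  prev: LogAwareState,
  curr: LogAwareState,
  write: (log: MigrationLog) => void
): void {
  curr.logs.slice(prev.logs.length).forEach(write);
}
```

The stages never touch a logger, so the point at which logs are output remains under the control of the state machine.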

elasticmachine · 2023-02-15T15:41:29Z

Pinging @elastic/kibana-core (Team:Core)

TinaHeiligers

I've added a few comments and nits, none of which block merging.
Exciting times!
LGTM

TinaHeiligers · 2023-02-15T19:12:11Z

...ges/core/saved-objects/core-saved-objects-migration-server-internal/src/common/utils/logs.ts

+  }
+
+  logger.info(
+    logMessagePrefix + `${prevState.controlState} -> ${currState.controlState}. took: ${tookMs}ms.`


I can't see it here but do we add a migration log identifier? I mean, does the logMessagePrefix include some indication that the logs come from the zdt migration algorithm?

No, atm we're using the same prefix as the v2 algo:

kibana/packages/core/saved-objects/core-saved-objects-migration-server-internal/src/zdt/migration_state_action_machine.ts

Line 47 in 445058b

const logMessagePrefix = `[${context.indexPrefix}] `;

We should probably make it possible to distinguish between the two, but maybe we should use a dedicated logger child/context instead of this prefix, to make sure all our logs are properly identifiable.

> dedicated logger child/context instead of this prefix, to make sure all our logs are properly identifiable

++
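A rough sketch of what a dedicated child context could look like, as opposed to a string prefix. `SketchLogger` and `createLogger` are hypothetical stand-ins written for this example, not the platform logger, and the context names are made up.

```typescript
// Minimal logger stand-in (hypothetical); a real logger would come from the platform.
interface SketchLogger {
  info(message: string): void;
  get(...childContext: string[]): SketchLogger;
}

// Builds a logger that tags every line with its full context path.
function createLogger(context: string[], sink: string[]): SketchLogger {
  return {
    info: (message) => {
      sink.push(`[${context.join('.')}] ${message}`);
    },
    get: (...childContext) => createLogger([...context, ...childContext], sink),
  };
}

// Usage: each algorithm gets its own child context instead of sharing a prefix.
const lines: string[] = [];
const migrationsLogger = createLogger(['migrations'], lines);
migrationsLogger.get('zdt').info('INIT -> DONE');
migrationsLogger.get('v2').info('INIT -> DONE');
```

With this approach the algorithm name is carried by the logger context itself, so every log line is identifiable without each call site having to build a prefix.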

TinaHeiligers · 2023-02-15T20:08:32Z

packages/core/saved-objects/core-saved-objects-migration-server-internal/src/kibana_migrator.ts

+    });
+  }
+
+  private runMigrationV2(): Promise<MigrationResult[]> {


aside: The prep needed for runResilientMigrator looks clunky compared to runZeroDowntimeMigration.
Would it make sense to move that to a 'prepareResilientMigration' setup function? The cleanup isn't needed now, but possibly at some point.

Yeah, I agree. Ideally most of the 'architectural'/'code' improvements we do in this second algo should be ported to the existing one. Not sure our current priorities will allow us to take such chore tasks in the following months.

TinaHeiligers · 2023-02-15T20:23:52Z

...saved-objects/core-saved-objects-migration-server-internal/src/zdt/model/model.test.mocks.ts

+ */
+
+export const StageMocks = {
+  init: jest.fn().mockImplementation((state: unknown) => state),


It should be fine and it even gives us testing options to mock stages individually.

TinaHeiligers · 2023-02-15T20:45:04Z

...saved-objects/core-saved-objects-migration-server-internal/src/zdt/model/stages/init.test.ts

+    context = createContextMock();
+  });
+
+  test('INIT -> DONE because its not done yet', () => {


nit:

Suggested change

test('INIT -> DONE because its not done yet', () => {

test("INIT -> DONE because it's not implemented yet", () => {

"not done yet" could mean that the migration hasn't finished yet, making for a confused dev reading the code 😉


TinaHeiligers · 2023-02-15T20:50:05Z

...saved-objects/core-saved-objects-migration-server-internal/src/zdt/model/stages/init.test.ts

+  test('INIT -> INIT when cluster routing allocation is incompatible', () => {
+    const state = createState();
+    const res: StateActionResponse<'INIT'> = Either.left({
+      type: 'incompatible_cluster_routing_allocation',


Are we going to enforce routing allocation given that these will be managed instances?

Great question, and I'm not sure tbh. I mostly re-used the existing init action as a POC that the algorithm got executed. The zdt workflow may start with something different, and we may not need to check for such things in managed environments, yeah.

TinaHeiligers · 2023-02-15T20:51:17Z

packages/core/saved-objects/core-saved-objects-migration-server-internal/src/zdt/model/types.ts

+export type StateActionResponse<T extends AllActionStates> = ExcludeRetryableEsError<
+  ResponseType<T>
+>;
+
+/**
+ * Defines a stage delegation function for the model
+ */
+export type ModelStage<T extends AllActionStates> = (
+  state: StateFromActionState<T>,
+  res: StateActionResponse<T>,
+  context: MigratorContext
+) => State;


jloleysens

Left some initial comments, will finish rest of review soon! Let me know what you think!

jloleysens · 2023-02-20T14:34:20Z

packages/core/saved-objects/core-saved-objects-base-server-internal/src/saved_objects_config.ts

+  algorithm: schema.oneOf([schema.literal('v2'), schema.literal('zdt')], {
+    defaultValue: 'v2',
+  }),


From someone using this API's perspective the values v2 and zdt seem to not have any relation. This is OK, but we need a good doc comment about what each means and where it came from.

IMO zero-downtime or operator (bc I'm guessing this algo is intended to work with the "operator pattern") are also good candidates.

jloleysens · 2023-02-20T14:41:37Z

...ges/core/saved-objects/core-saved-objects-migration-server-internal/src/common/utils/logs.ts

+  }
+
+  logger.info(
+    logMessagePrefix + `${prevState.controlState} -> ${currState.controlState}. took: ${tookMs}ms.`


> dedicated logger child/context instead of this prefix, to make sure all our logs are properly identifiable

++

jloleysens · 2023-02-20T15:01:04Z

...core/saved-objects/core-saved-objects-migration-server-internal/src/zdt/model/stages/init.ts

+import type { State } from '../../state';
+import type { ModelStage } from '../types';
+
+export const init: ModelStage<'INIT'> = (state, res, context): State => {


Separating these into individual functions is really great! What do you think of something like this as an iteration on this idea:

// just an example
export const init: ModelTransition<'INIT', 'FATAL' | 'DONE'> = (state, res, context) => {}

In this way we will:

- leverage TS sanity check on our expected return values
- replace State return value declarations from these functions

ModelTransition would be defined as:

/**
 * Defines a transition function for the model
 */
export type ModelTransition<T extends AllActionStates, R extends AllControlStates> = (
  state: StateFromActionState<T>,
  res: StateActionResponse<T>,
  context: MigratorContext
) => StateFromControlState<T | R>;
// T | R because we assume a state can go back to itself as QOL for user of this type...

Plus utility:

export type StateFromControlState<T extends AllControlStates> = ControlStateMap[T];

afharo

So exciting to have this 🎉

afharo · 2023-02-21T14:53:16Z

packages/core/saved-objects/core-saved-objects-base-server-internal/src/saved_objects_config.ts

+  algorithm: schema.oneOf([schema.literal('v2'), schema.literal('zdt')], {
+    defaultValue: 'v2',
+  }),


my 2cents: I'd use a name that makes it obvious that there is an external agent required (operator seems like a good candidate).

At the same time, I tend to prefer options that resemble the functionality they enable: Zero-downtime...

Mixing both thoughts... I wonder if we should be more explicit about when we plan to use it (or anyone willing to change the defaults is expected to use this): with the k8s operator? Something like k8s-zdt?

afharo · 2023-02-21T14:57:29Z

...ges/core/saved-objects/core-saved-objects-migration-server-internal/src/common/utils/logs.ts

+export const logStateTransition = (
+  logger: Logger,
+  logMessagePrefix: string,
+  prevState: LogAwareState,


I'm all good with src/common... just wondering about the differences between src/common and src/core... Should we pick one or are we good with having both?

If we are at risk of having circular references, I'm OK with having 2 common dirs 😇

afharo · 2023-02-21T15:01:18Z

packages/core/saved-objects/core-saved-objects-migration-server-internal/src/model/helpers.ts

+export function throwBadResponse(state: { controlState: string }, p: never): never;
+export function throwBadResponse(state: { controlState: string }, res: any): never {


nit: does res: unknown work here?
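For context on the nit: only the overload with `p: never` is visible to callers, so the parameter type of the implementation signature can be `unknown` without loosening the exhaustiveness check. A minimal sketch, where `InitResponse` and `handleInit` are hypothetical and the error message is illustrative:

```typescript
type InitResponse = { type: 'ok' } | { type: 'retry' };

// Overload seen by callers: only compiles once every response variant was handled.
function throwBadResponse(state: { controlState: string }, p: never): never;
// Implementation signature: `unknown` instead of `any`, as the nit suggests.
function throwBadResponse(state: { controlState: string }, res: unknown): never {
  throw new Error(
    `${state.controlState} received unexpected action response: ${JSON.stringify(res)}`
  );
}

function handleInit(res: InitResponse): string {
  if (res.type === 'ok') return 'DONE';
  if (res.type === 'retry') return 'INIT';
  // `res` is narrowed to `never` here; adding a new variant to InitResponse
  // without handling it above would make this call fail to compile.
  return throwBadResponse({ controlState: 'INIT' }, res);
}
```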

afharo · 2023-02-21T19:42:44Z

...saved-objects/core-saved-objects-migration-server-internal/src/zdt/model/model.test.mocks.ts

+ */
+
+export const StageMocks = {
+  init: jest.fn().mockImplementation((state: unknown) => state),


Probably something like jest.requireActual('./stages') and loop over its keys would automate the creation of the mocks... although I don't think it's a big deal


jloleysens

Thanks for integrating my feedback @pgayvallet , overall LGTM!

kibana-ci · 2023-02-27T11:52:26Z

💚 Build Succeeded

Buildkite Build
Commit: d1f02da

Metrics [docs]

Unknown metric groups

ESLint disabled line counts

id	before	after	diff
`securitySolution`	428	430	+2

Total ESLint disabled count

id	before	after	diff
`securitySolution`	506	508	+2

History

💚 Build #108273 succeeded 07b7682
💔 Build #108234 failed ad4577a
💔 Build #108231 failed cf51bdd
💔 Build #108225 failed 81eb2b3
💔 Build #108180 failed 445058b

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

## Summary

Part of elastic#150309

Purpose of the PR is to create the skeleton of the ZDT algorithm, in order to make sure we're all aligned on the way we'll be managing our codebase between the 2 implementation (and to ease with the review of the follow-up PRs by not having the bootstrap of the algo to review at the same time)

---------

Co-authored-by: kibanamachine <[email protected]>

pgayvallet added 2 commits February 15, 2023 11:23

bootstrapping the ZDT migration algorithm

b28424a

dummy implementation done

445058b

pgayvallet mentioned this pull request Feb 15, 2023

Implement zero downtime migration algorithm #150309

Closed

pgayvallet and others added 4 commits February 15, 2023 14:24

add some tests

784eb5e

fix test type again

81eb2b3

[CI] Auto-commit changed files from 'node scripts/lint_ts_projects --fix'

cf51bdd

Merge remote-tracking branch 'upstream/main' into kbn-150309-zdt-algo-bootstrap

ad4577a

pgayvallet added the v8.8.0, Team:Core, release_note:skip, and Feature:Migrations labels Feb 15, 2023

pgayvallet added 2 commits February 15, 2023 14:57

fix cyclic dep

5dff658

nit export format

07b7682

pgayvallet commented Feb 15, 2023


pgayvallet marked this pull request as ready for review February 15, 2023 15:41

pgayvallet requested a review from a team as a code owner February 15, 2023 15:41

TinaHeiligers approved these changes Feb 15, 2023


jloleysens reviewed Feb 20, 2023


afharo reviewed Feb 21, 2023


pgayvallet added 4 commits February 27, 2023 10:16

Merge remote-tracking branch 'upstream/main' into kbn-150309-zdt-algo-bootstrap

2855d1e

review nits

8c7acbf

improve ModelStage utility type

f7d6c7f

Merge remote-tracking branch 'upstream/main' into kbn-150309-zdt-algo-bootstrap

d1f02da

jloleysens approved these changes Feb 27, 2023


pgayvallet merged commit bbbf8d1 into elastic:main Feb 27, 2023

kibanamachine added the backport:skip label Feb 27, 2023
