[SoMigV2] Fail fast if unknown document types are present in the source index #103341

Merged
17 changes: 9 additions & 8 deletions rfcs/text/0013_saved_object_migrations.md
@@ -253,26 +253,27 @@ Note:
4. Fail the migration if:
1. `.kibana` is pointing to an index that belongs to a later version of Kibana, e.g. `.kibana_7.12.0_001`
2. (Only in 8.x) The source index contains documents that belong to an unknown Saved Object type (from a disabled plugin). Log an error explaining that the plugin that created these documents needs to be enabled again or that these objects should be deleted. See section (4.2.1.4).
5. Set a write block on the source index. This prevents any further writes from outdated nodes.
6. Create a new temporary index `.kibana_7.10.0_reindex_temp` with `dynamic: false` on the top-level mappings so that any kind of document can be written to the index. This allows us to write untransformed documents to the index which might have fields which have been removed from the latest mappings defined by the plugin. Define minimal mappings for the `migrationVersion` and `type` fields so that we're still able to search for outdated documents that need to be transformed.
5. Search the source index for documents with types not registered within Kibana. Fail the migration if any document is found.
6. Set a write block on the source index. This prevents any further writes from outdated nodes.
7. Create a new temporary index `.kibana_7.10.0_reindex_temp` with `dynamic: false` on the top-level mappings so that any kind of document can be written to the index. This allows us to write untransformed documents to the index which might have fields which have been removed from the latest mappings defined by the plugin. Define minimal mappings for the `migrationVersion` and `type` fields so that we're still able to search for outdated documents that need to be transformed.
1. Ignore errors if the target index already exists.
7. Reindex the source index into the new temporary index.
8. Reindex the source index into the new temporary index.
1. Use `op_type=create` `conflicts=proceed` and `wait_for_completion=false` so that multiple instances can perform the reindex in parallel but only one write per document will succeed.
2. Wait for the reindex task to complete. If reindexing doesn’t complete within the 60s timeout, log a warning for visibility and poll again.
8. Clone the temporary index into the target index `.kibana_7.10.0_001`. Since any further writes will only happen against the cloned target index, this prevents a lost delete from occurring where one instance finishes the migration and deletes a document and another instance's reindex operation re-creates the deleted document.
9. Clone the temporary index into the target index `.kibana_7.10.0_001`. Since any further writes will only happen against the cloned target index, this prevents a lost delete from occurring where one instance finishes the migration and deletes a document and another instance's reindex operation re-creates the deleted document.
1. Set a write block on the temporary index
2. Clone the temporary index into the target index while specifying that the target index should have writes enabled.
3. If the clone operation fails because the target index already exists, ignore the error and wait for the target index to become green before proceeding.
4. (The `001` postfix in the target index name isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`.)
9. Transform documents by reading batches of outdated documents from the target index then transforming and updating them with optimistic concurrency control.
10. Transform documents by reading batches of outdated documents from the target index then transforming and updating them with optimistic concurrency control.
1. Ignore any version conflict errors.
2. If a document transform throws an exception, add the document to a failure list and continue trying to transform all other documents. If any failures occurred, log the complete list of documents that failed to transform. Fail the migration.
10. Update the mappings of the target index
11. Update the mappings of the target index
1. Retrieve the existing mappings including the `migrationMappingPropertyHashes` metadata.
2. Update the mappings with `PUT /.kibana_7.10.0_001/_mapping`. The API deeply merges any updates so this won't remove the mappings of any plugins that are disabled on this instance but have been enabled on another instance that also migrated this index.
3. Ensure that fields are correctly indexed using the target index's latest mappings `POST /.kibana_7.10.0_001/_update_by_query?conflicts=proceed`. In the future we could optimize this query by only targeting documents:
1. That belong to a known saved object type.
11. Mark the migration as complete. This is done as a single atomic
12. Mark the migration as complete. This is done as a single atomic
operation (requires https://github.com/elastic/elasticsearch/pull/58100)
to guarantee that when multiple versions of Kibana are performing the
migration in parallel, only one version will win. E.g. if 7.11 and 7.12
@@ -284,7 +285,7 @@ Note:
4. If this fails with a "required alias [.kibana] does not exist" error or "index_not_found_exception" for the temporary index, fetch `.kibana` again:
1. If `.kibana` is _not_ pointing to our target index fail the migration.
2. If `.kibana` is pointing to our target index the migration has succeeded and we can proceed to step (12).
12. Start serving traffic. All saved object reads/writes happen through the
13. Start serving traffic. All saved object reads/writes happen through the
version-specific alias `.kibana_7.10.0`.
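
The new step 5 above boils down to a single search against the source index for any document whose `type` is not registered. As a rough sketch (not the PR's exact code — the real implementation is the `checkForUnknownDocs` action added further down, which additionally excludes a set of known-unused types), the check looks something like this; `findUnknownDocs` is a hypothetical helper name:

```ts
import { Client } from '@elastic/elasticsearch';

// Sketch: list documents in the source index whose `type` matches none of the
// registered saved object types. Any hit means the migration must fail fast.
async function findUnknownDocs(client: Client, index: string, knownTypes: string[]) {
  const { body } = await client.search({
    index,
    body: {
      query: {
        bool: {
          // a document is "unknown" when it is not one of the registered types
          must_not: knownTypes.map((type) => ({ term: { type } })),
        },
      },
    },
  });
  return body.hits.hits.map((hit: any) => ({
    id: hit._id,
    type: hit._source?.type ?? 'unknown',
  }));
}
```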

Together with the limitations, this algorithm ensures that migrations are
@@ -192,6 +192,7 @@ export class KibanaMigrator {
migrationVersionPerType: this.documentMigrator.migrationVersion,
indexPrefix: index,
migrationsConfig: this.soMigrationsConfig,
typeRegistry: this.typeRegistry,
});
},
};
157 changes: 157 additions & 0 deletions src/core/server/saved_objects/migrationsv2/actions/check_for_unknown_docs.test.ts
@@ -0,0 +1,157 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0 and the Server Side Public License, v 1; you may not use this file except
* in compliance with, at your election, the Elastic License 2.0 or the Server
* Side Public License, v 1.
*/

import * as Either from 'fp-ts/lib/Either';
import { catchRetryableEsClientErrors } from './catch_retryable_es_client_errors';
import { errors as EsErrors, estypes } from '@elastic/elasticsearch';
import { elasticsearchClientMock } from '../../../elasticsearch/client/mocks';
import { checkForUnknownDocs } from './check_for_unknown_docs';

jest.mock('./catch_retryable_es_client_errors');

describe('checkForUnknownDocs', () => {
const unusedTypesQuery: estypes.QueryDslQueryContainer = {
bool: { must: [{ term: { hello: 'dolly' } }] },
};
const knownTypes = ['foo', 'bar'];

beforeEach(() => {
jest.clearAllMocks();
});

it('calls catchRetryableEsClientErrors when the promise rejects', async () => {
// Create a mock client that rejects all methods with a 503 status code response.
const retryableError = new EsErrors.ResponseError(
elasticsearchClientMock.createApiResponse({
statusCode: 503,
body: { error: { type: 'es_type', reason: 'es_reason' } },
})
);
const client = elasticsearchClientMock.createInternalClient(
elasticsearchClientMock.createErrorTransportRequestPromise(retryableError)
);

const task = checkForUnknownDocs({
client,
indexName: '.kibana_8.0.0',
knownTypes,
unusedTypesQuery,
});
try {
await task();
} catch (e) {
/** ignore */
}
expect(catchRetryableEsClientErrors).toHaveBeenCalledWith(retryableError);
});

it('calls `client.search` with the correct parameters', async () => {
const client = elasticsearchClientMock.createInternalClient(
elasticsearchClientMock.createSuccessTransportRequestPromise({ hits: { hits: [] } })
);

const task = checkForUnknownDocs({
client,
indexName: '.kibana_8.0.0',
knownTypes,
unusedTypesQuery,
});

await task();

expect(client.search).toHaveBeenCalledTimes(1);
expect(client.search).toHaveBeenCalledWith({
index: '.kibana_8.0.0',
body: {
query: {
bool: {
must: unusedTypesQuery,
must_not: knownTypes.map((type) => ({
term: {
type,
},
})),
},
},
},
});
});

it('resolves with `Either.right` when no unknown docs are found', async () => {
const client = elasticsearchClientMock.createInternalClient(
elasticsearchClientMock.createSuccessTransportRequestPromise({ hits: { hits: [] } })
);

const task = checkForUnknownDocs({
client,
indexName: '.kibana_8.0.0',
knownTypes,
unusedTypesQuery,
});

const result = await task();

expect(Either.isRight(result)).toBe(true);
});

it('resolves with `Either.left` when unknown docs are found', async () => {
const client = elasticsearchClientMock.createInternalClient(
elasticsearchClientMock.createSuccessTransportRequestPromise({
hits: {
hits: [
{ _id: '12', _source: { type: 'foo' } },
{ _id: '14', _source: { type: 'bar' } },
],
},
})
);

const task = checkForUnknownDocs({
client,
indexName: '.kibana_8.0.0',
knownTypes,
unusedTypesQuery,
});

const result = await task();

expect(Either.isLeft(result)).toBe(true);
expect((result as Either.Left<any>).left).toEqual({
type: 'unknown_docs_found',
unknownDocs: [
{ id: '12', type: 'foo' },
{ id: '14', type: 'bar' },
],
});
});

it('uses `unknown` as the type when the document does not contain a type field', async () => {
const client = elasticsearchClientMock.createInternalClient(
elasticsearchClientMock.createSuccessTransportRequestPromise({
hits: {
hits: [{ _id: '12', _source: {} }],
},
})
);

const task = checkForUnknownDocs({
client,
indexName: '.kibana_8.0.0',
knownTypes,
unusedTypesQuery,
});

const result = await task();

expect(Either.isLeft(result)).toBe(true);
expect((result as Either.Left<any>).left).toEqual({
type: 'unknown_docs_found',
unknownDocs: [{ id: '12', type: 'unknown' }],
});
});
});
85 changes: 85 additions & 0 deletions src/core/server/saved_objects/migrationsv2/actions/check_for_unknown_docs.ts
@@ -0,0 +1,85 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0 and the Server Side Public License, v 1; you may not use this file except
* in compliance with, at your election, the Elastic License 2.0 or the Server
* Side Public License, v 1.
*/

import * as Either from 'fp-ts/lib/Either';
import * as TaskEither from 'fp-ts/lib/TaskEither';
import { estypes } from '@elastic/elasticsearch';
import type { SavedObjectsRawDocSource } from '../../serialization';
import { ElasticsearchClient } from '../../../elasticsearch';
import {
catchRetryableEsClientErrors,
RetryableEsClientError,
} from './catch_retryable_es_client_errors';

/** @internal */
export interface CheckForUnknownDocsParams {
client: ElasticsearchClient;
indexName: string;
unusedTypesQuery: estypes.QueryDslQueryContainer;
knownTypes: string[];
}

/** @internal */
export interface CheckForUnknownDocsFoundDoc {
id: string;
type: string;
}

/** @internal */
export interface UnknownDocsFound {
type: 'unknown_docs_found';
unknownDocs: CheckForUnknownDocsFoundDoc[];
}

export const checkForUnknownDocs = ({

Contributor: Should this file get an explicit unit test?

client,
indexName,
unusedTypesQuery,
knownTypes,
}: CheckForUnknownDocsParams): TaskEither.TaskEither<
RetryableEsClientError | UnknownDocsFound,
{}
> => () => {
const query = createUnknownDocQuery(unusedTypesQuery, knownTypes);

return client
.search<SavedObjectsRawDocSource>({
index: indexName,
body: {
query,
},
})
.then((response) => {
const { hits } = response.body.hits;
if (hits.length) {
return Either.left({
type: 'unknown_docs_found' as const,
unknownDocs: hits.map((hit) => ({ id: hit._id, type: hit._source?.type ?? 'unknown' })),
});
} else {
return Either.right({});
}
})
.catch(catchRetryableEsClientErrors);
};

const createUnknownDocQuery = (
unusedTypesQuery: estypes.QueryDslQueryContainer,
knownTypes: string[]
): estypes.QueryDslQueryContainer => {
return {
bool: {
must: unusedTypesQuery,
must_not: knownTypes.map((type) => ({
term: {
type,
},
})),
},
};
Comment on lines +75 to +84

Contributor Author: I included the unusedTypesQuery in the unknown-doc query, as we're not migrating those documents to the temp index, and it allows 'removing' old types by unregistering them while adding them to the unused-types query.

Contributor: Does that mean users must remove them to continue the migration? If so, it might be considered a breaking change even though we just fixed a bug.

Contributor Author: No, I mean the opposite actually: I'm excluding the objects matching the unusedTypesQuery, to avoid failing if any are encountered, given that we don't migrate them anyway.

Contributor: OK, the `must` usage is a bit misleading here because unusedTypesQuery contains another `must_not` inside.

Contributor Author: Yeah, the whole unusedTypesQuery naming is misleading because it's actually an excludeUnusedTypesQuery.

};
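
To make the thread above concrete, here is roughly what `createUnknownDocQuery` produces for hypothetical inputs (an exclusion query for a made-up `some-unused-type`, with `config` and `dashboard` standing in for registered types): the whole exclusion query is nested under `must`, while every registered type is ruled out via the outer `must_not`, so only documents matching neither list are reported as unknown.

```ts
import { estypes } from '@elastic/elasticsearch';

// Hypothetical inputs, for illustration only.
const exampleUnknownDocQuery: estypes.QueryDslQueryContainer = {
  bool: {
    // the "exclude unused types" query becomes the `must` clause...
    must: { bool: { must_not: [{ term: { type: 'some-unused-type' } }] } },
    // ...and every registered type is excluded via `must_not`
    must_not: [{ term: { type: 'config' } }, { term: { type: 'dashboard' } }],
  },
};
```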
14 changes: 14 additions & 0 deletions src/core/server/saved_objects/migrationsv2/actions/index.ts
@@ -60,12 +60,14 @@ export type { ReindexResponse, ReindexParams } from './reindex';
export { reindex } from './reindex';

import type { IncompatibleMappingException } from './wait_for_reindex_task';

export { waitForReindexTask } from './wait_for_reindex_task';

export type { VerifyReindexParams } from './verify_reindex';
export { verifyReindex } from './verify_reindex';

import type { AliasNotFound, RemoveIndexNotAConcreteIndex } from './update_aliases';

export type { AliasAction, UpdateAliasesParams } from './update_aliases';
export { updateAliases } from './update_aliases';

@@ -78,6 +80,14 @@ export type {
} from './update_and_pickup_mappings';
export { updateAndPickupMappings } from './update_and_pickup_mappings';

import type { UnknownDocsFound } from './check_for_unknown_docs';
export type {
CheckForUnknownDocsParams,
UnknownDocsFound,
CheckForUnknownDocsFoundDoc,
} from './check_for_unknown_docs';
export { checkForUnknownDocs } from './check_for_unknown_docs';

export { waitForPickupUpdatedMappingsTask } from './wait_for_pickup_updated_mappings_task';

export type {
@@ -96,9 +106,11 @@ export interface IndexNotFound {
type: 'index_not_found_exception';
index: string;
}

export interface WaitForReindexTaskFailure {
readonly cause: { type: string; reason: string };
}

export interface TargetIndexHadWriteBlock {
type: 'target_index_had_write_block';
}
@@ -108,6 +120,7 @@ export interface AcknowledgeResponse {
acknowledged: boolean;
shardsAcknowledged: boolean;
}

// Map of left response 'type' string -> response interface
export interface ActionErrorTypeMap {
wait_for_task_completion_timeout: WaitForTaskCompletionTimeout;
@@ -118,6 +131,7 @@ export interface ActionErrorTypeMap {
alias_not_found_exception: AliasNotFound;
remove_index_not_a_concrete_index: RemoveIndexNotAConcreteIndex;
documents_transform_failed: DocumentsTransformFailed;
unknown_docs_found: UnknownDocsFound;
}

/**
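
The state-machine handling of the new `unknown_docs_found` left value isn't part of this excerpt (model.ts isn't shown), but per the RFC the intent is to fail the migration with an actionable message. Purely as an illustrative sketch, with a made-up helper name, that handling could format the error along these lines:

```ts
import type { UnknownDocsFound } from './check_for_unknown_docs';

// Sketch only: turn the left value into a fatal-error message telling the
// operator to re-enable the owning plugins or delete the offending documents.
function unknownDocsFatalReason(res: UnknownDocsFound, sourceIndex: string): string {
  const docList = res.unknownDocs.map((doc) => `- "${doc.id}" (type: "${doc.type}")`).join('\n');
  return (
    `Migration failed because documents were found for unknown saved object types in ${sourceIndex}. ` +
    `Re-enable the plugins that created these documents, or delete the documents:\n${docList}`
  );
}
```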
@@ -5,6 +5,7 @@
* in compliance with, at your election, the Elastic License 2.0 or the Server
* Side Public License, v 1.
*/

import * as Either from 'fp-ts/lib/Either';
import * as TaskEither from 'fp-ts/lib/TaskEither';
import { ElasticsearchClient } from '../../../elasticsearch';
7 changes: 6 additions & 1 deletion src/core/server/saved_objects/migrationsv2/index.ts
@@ -13,9 +13,11 @@ import type { SavedObjectsMigrationVersion } from '../types';
import type { TransformRawDocs } from './types';
import { MigrationResult } from '../migrations/core';
import { next } from './next';
import { createInitialState, model } from './model';
import { model } from './model';
import { createInitialState } from './initial_state';
import { migrationStateActionMachine } from './migrations_state_action_machine';
import { SavedObjectsMigrationConfigType } from '../saved_objects_config';
import type { ISavedObjectTypeRegistry } from '../saved_objects_type_registry';

/**
* Migrates the provided indexPrefix index using a resilient algorithm that is
@@ -32,6 +34,7 @@ export async function runResilientMigrator({
migrationVersionPerType,
indexPrefix,
migrationsConfig,
typeRegistry,
}: {
client: ElasticsearchClient;
kibanaVersion: string;
@@ -42,6 +45,7 @@ export async function runResilientMigrator({
migrationVersionPerType: SavedObjectsMigrationVersion;
indexPrefix: string;
migrationsConfig: SavedObjectsMigrationConfigType;
typeRegistry: ISavedObjectTypeRegistry;
}): Promise<MigrationResult> {
const initialState = createInitialState({
kibanaVersion,
@@ -50,6 +54,7 @@ export async function runResilientMigrator({
migrationVersionPerType,
indexPrefix,
migrationsConfig,
typeRegistry,
});
return migrationStateActionMachine({
initialState,
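
The `createInitialState` changes themselves aren't included in this excerpt, but presumably the new `typeRegistry` parameter is what the list of known types is derived from before being handed to `checkForUnknownDocs`. A minimal sketch of that assumption:

```ts
import type { ISavedObjectTypeRegistry } from '../saved_objects_type_registry';

// Sketch: every registered type name counts as "known"; anything else found in
// the source index will trip the fail-fast check.
const getKnownTypes = (typeRegistry: ISavedObjectTypeRegistry): string[] =>
  typeRegistry.getAllTypes().map((type) => type.name);
```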