From 24e8f7f562c28f090bed7cae718c26f834091f5d Mon Sep 17 00:00:00 2001 From: Rudolf Meijering Date: Wed, 25 Nov 2020 15:22:04 +0100 Subject: [PATCH 1/9] Instead of cloning, reindex legacy index --- rfcs/text/0013_saved_object_migrations.md | 50 ++++++++++++++--------- 1 file changed, 31 insertions(+), 19 deletions(-) diff --git a/rfcs/text/0013_saved_object_migrations.md b/rfcs/text/0013_saved_object_migrations.md index 6e125c28c04c0..b4dfa39dca003 100644 --- a/rfcs/text/0013_saved_object_migrations.md +++ b/rfcs/text/0013_saved_object_migrations.md @@ -214,31 +214,43 @@ Note: 2. If the source is a < v6.5 `.kibana` index or < 7.4 `.kibana_task_manager` index prepare the legacy index for a migration: 1. Mark the legacy index as read-only and wait for all in-flight operations to drain (requires https://github.com/elastic/elasticsearch/pull/58094). This prevents any further writes from outdated nodes. Assuming this API is similar to the existing `//_close` API, we expect to receive `"acknowledged" : true` and `"shards_acknowledged" : true`. If all shards don’t acknowledge within the timeout, retry the operation until it succeeds. - 2. Clone the legacy index into a new index which has writes enabled. Use a fixed index name i.e `.kibana_pre6.5.0_001` or `.kibana_task_manager_pre7.4.0_001`. `POST /.kibana/_clone/.kibana_pre6.5.0_001?wait_for_active_shards=all {"settings": {"index.blocks.write": false}}`. Ignore errors if the clone already exists. Ignore errors if the legacy source doesn't exist. - 3. Wait for the cloning to complete `GET /_cluster/health/.kibana_pre6.5.0_001?wait_for_status=green&timeout=60s` If cloning doesn’t complete within the 60s timeout, log a warning for visibility and poll again. - 4. Apply the `convertToAlias` script if defined `POST /.kibana_pre6.5.0_001/_update_by_query?conflicts=proceed {"script": {...}}`. The `convertToAlias` script will have to be idempotent, preferably setting `ctx.op="noop"` on subsequent runs to avoid unecessary writes. 
+ 2. Create a new index which will become the source index after the legacy + pre-migration is complete. This index should have the same mappings as + the legacy index. Use a fixed index name i.e `.kibana_pre6.5.0_001` or + `.kibana_task_manager_pre7.4.0_001`. Ignore index already exists errors. + 3. Reindex the legacy index into the new source index with the + `convertToAlias` script if specified. Use `wait_for_completion: false` + to run this as a task. Ignore errors if the legacy source doesn't exist. + 4. Wait for the reindex task to complete. If the task doesn’t complete + within the 60s timeout, log a warning for visibility and poll again. + Ignore errors if the legacy source doesn't exist. 5. Delete the legacy index and replace it with an alias of the same name ``` POST /_aliases { "actions" : [ - { "add": { "index": ".kibana_pre6.5.0_001", "alias": ".kibana" } }, { "remove_index": { "index": ".kibana" } } + { "add": { "index": ".kibana_pre6.5.0_001", "alias": ".kibana" } }, ] } ```. Unlike the delete index API, the `remove_index` action will fail if - provided with an _alias_. Ignore "The provided expression [.kibana] - matches an alias, specify the corresponding concrete indices instead." - or "index_not_found_exception" errors. These actions are applied - atomically so that other Kibana instances will always see either a - `.kibana` index or an alias, but never neither. - 6. Use the cloned `.kibana_pre6.5.0_001` as the source for the rest of the migration algorithm. + provided with an _alias_. Therefore, if another instance completed this + step, the `.kibana` alias won't be added to `.kibana_pre6.5.0_001` a + second time. This avoids a situation where `.kibana` could point to both + `.kibana_pre6.5.0_001` and `.kibana_7.10.0_001`. These actions are + applied atomically so that other Kibana instances will always see either + a `.kibana` index or an alias, but never neither. 
+ + Ignore "The provided expression [.kibana] matches an alias, specify the + corresponding concrete indices instead." or "index_not_found_exception" + errors as this means another instance has already completed this step. + 6. Use the reindexed legacy `.kibana_pre6.5.0_001` as the source for the rest of the migration algorithm. 3. If `.kibana` and `.kibana_7.10.0` both exists and are pointing to the same index this version's migration has already been completed. 1. Because the same version can have plugins enabled at any point in time, - perform the mappings update in step (6) and migrate outdated documents - with step (7). - 2. Skip to step (9) to start serving traffic. + perform the mappings update in step (7) and migrate outdated documents + with step (8). + 2. Skip to step (10) to start serving traffic. 4. Fail the migration if: 1. `.kibana` is pointing to an index that belongs to a later version of Kibana .e.g. `.kibana_7.12.0_001` 2. (Only in 8.x) The source index contains documents that belong to an unknown Saved Object type (from a disabled plugin). Log an error explaining that the plugin that created these documents needs to be enabled again or that these objects should be deleted. See section (4.2.1.4). @@ -256,12 +268,12 @@ Note: 8. Transform documents by reading batches of outdated documents from the target index then transforming and updating them with optimistic concurrency control. 1. Ignore any version conflict errors. 2. If a document transform throws an exception, add the document to a failure list and continue trying to transform all other documents. If any failures occured, log the complete list of documents that failed to transform. Fail the migration. -9. Mark the migration as complete. This is done as a single atomic - operation (requires https://github.com/elastic/elasticsearch/pull/58100) - to guarantees when multiple versions of Kibana are performing the - migration in parallel, only one version will win. E.g. 
if 7.11 and 7.12 - are started in parallel and migrate from a 7.9 index, either 7.11 or 7.12 - should succeed and accept writes, but not both. +9. Mark the migration as complete. This is done as a single atomic operation + (requires https://github.com/elastic/elasticsearch/pull/58100) to + guarantee when multiple versions of Kibana are performing the migration in + parallel, only one version will win. E.g. if 7.11 and 7.12 are started in + parallel and migrate from a 7.9 index, either 7.11 or 7.12 should succeed + and accept writes, but not both. 3. Checks that `.kibana` alias is still pointing to the source index 4. Points the `.kibana_7.10.0` and `.kibana` aliases to the target index. 5. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again: From edacd24783a68fc6632102dd55a50a4f087b0cae Mon Sep 17 00:00:00 2001 From: Rudolf Meijering Date: Thu, 10 Dec 2020 10:39:04 +0100 Subject: [PATCH 2/9] Reindex for every v2 migration --- rfcs/text/0013_saved_object_migrations.md | 45 ++++++++++++----------- 1 file changed, 23 insertions(+), 22 deletions(-) diff --git a/rfcs/text/0013_saved_object_migrations.md b/rfcs/text/0013_saved_object_migrations.md index b4dfa39dca003..cf1cc0ff71946 100644 --- a/rfcs/text/0013_saved_object_migrations.md +++ b/rfcs/text/0013_saved_object_migrations.md @@ -248,38 +248,39 @@ Note: 6. Use the reindexed legacy `.kibana_pre6.5.0_001` as the source for the rest of the migration algorithm. 3. If `.kibana` and `.kibana_7.10.0` both exists and are pointing to the same index this version's migration has already been completed. 1. Because the same version can have plugins enabled at any point in time, - perform the mappings update in step (7) and migrate outdated documents - with step (8). + migrate outdated documents with step (8) and perform the mappings update in step (9). 2. Skip to step (10) to start serving traffic. 4. Fail the migration if: 1. 
`.kibana` is pointing to an index that belongs to a later version of Kibana .e.g. `.kibana_7.12.0_001` 2. (Only in 8.x) The source index contains documents that belong to an unknown Saved Object type (from a disabled plugin). Log an error explaining that the plugin that created these documents needs to be enabled again or that these objects should be deleted. See section (4.2.1.4). 5. Mark the source index as read-only and wait for all in-flight operations to drain (requires https://github.com/elastic/elasticsearch/pull/58094). This prevents any further writes from outdated nodes. Assuming this API is similar to the existing `//_close` API, we expect to receive `"acknowledged" : true` and `"shards_acknowledged" : true`. If all shards don’t acknowledge within the timeout, retry the operation until it succeeds. -6. Clone the source index into a new target index which has writes enabled. All nodes on the same version will use the same fixed index name e.g. `.kibana_7.10.0_001`. The `001` postfix isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`. - 1. `POST /.kibana_n/_clone/.kibana_7.10.0_001?wait_for_active_shards=all {"settings": {"index.blocks.write": false}}`. Ignore errors if the clone already exists. - 2. Wait for the cloning to complete `GET /_cluster/health/.kibana_7.10.0_001?wait_for_status=green&timeout=60s` If cloning doesn’t complete within the 60s timeout, log a warning for visibility and poll again. -7. Update the mappings of the target index - 1. Retrieve the existing mappings including the `migrationMappingPropertyHashes` metadata. - 2. Update the mappings with `PUT /.kibana_7.10.0_001/_mapping`. The API deeply merges any updates so this won't remove the mappings of any plugins that were enabled in a previous version but are now disabled. - 3. 
Ensure that fields are correctly indexed using the target index's latest mappings `POST /.kibana_7.10.0_001/_update_by_query?conflicts=proceed`. In the future we could optimize this query by only targeting documents: - 1. That belong to a known saved object type. - 2. Which don't have outdated migrationVersion numbers since these will be transformed anyway. - 3. That belong to a type whose mappings were changed by comparing the `migrationMappingPropertyHashes`. (Metadata, unlike the mappings isn't commutative, so there is a small chance that the metadata hashes do not accurately reflect the latest mappings, however, this will just result in an less efficient query). +6. Create a target index with `dynamic: false` on the top-level mappings so that any kind of document can be written to the index. This allows us to write untransformed documents to the index which might have fields which have been removed from the latest mappings defined by the plugin. Define `dynamic:true` mappings for the `migrationVersion` field so that we're still able to search for outdated documents that need to be transformed. + 1. Ignore errors if the target index already exists. +7. Reindex the source index into a the new target index. All nodes on the same version will use the same fixed index name e.g. `.kibana_7.10.0_001`. The `001` postfix isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`. + 1. Use `op_type=create` `conflicts=proceed` and `wait_for_completion=false` + so that multiple instances can perform the reindex in parallel but only + one write per document will succeed. + 2. Wait for the reindex task to complete. If reindexing doesn’t complete within the 60s timeout, log a warning for visibility and poll again. 8. 
Transform documents by reading batches of outdated documents from the target index then transforming and updating them with optimistic concurrency control. 1. Ignore any version conflict errors. 2. If a document transform throws an exception, add the document to a failure list and continue trying to transform all other documents. If any failures occured, log the complete list of documents that failed to transform. Fail the migration. -9. Mark the migration as complete. This is done as a single atomic operation - (requires https://github.com/elastic/elasticsearch/pull/58100) to - guarantee when multiple versions of Kibana are performing the migration in - parallel, only one version will win. E.g. if 7.11 and 7.12 are started in - parallel and migrate from a 7.9 index, either 7.11 or 7.12 should succeed - and accept writes, but not both. - 3. Checks that `.kibana` alias is still pointing to the source index - 4. Points the `.kibana_7.10.0` and `.kibana` aliases to the target index. - 5. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again: +9. Update the mappings of the target index + 1. Retrieve the existing mappings including the `migrationMappingPropertyHashes` metadata. + 2. Update the mappings with `PUT /.kibana_7.10.0_001/_mapping`. The API deeply merges any updates so this won't remove the mappings of any plugins that are disabled on this instance but have been enabled on another instance that also migrated this index. + 3. Ensure that fields are correctly indexed using the target index's latest mappings `POST /.kibana_7.10.0_001/_update_by_query?conflicts=proceed`. In the future we could optimize this query by only targeting documents: + 1. That belong to a known saved object type. +10. Mark the migration as complete. 
This is done as a single atomic + operation (requires https://github.com/elastic/elasticsearch/pull/58100) + to guarantees when multiple versions of Kibana are performing the + migration in parallel, only one version will win. E.g. if 7.11 and 7.12 + are started in parallel and migrate from a 7.9 index, either 7.11 or 7.12 + should succeed and accept writes, but not both. + 4. Checks that `.kibana` alias is still pointing to the source index + 5. Points the `.kibana_7.10.0` and `.kibana` aliases to the target index. + 6. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again: 1. If `.kibana` is _not_ pointing to our target index fail the migration. 2. If `.kibana` is pointing to our target index the migration has succeeded and we can proceed to step (10). -10. Start serving traffic. All saved object reads/writes happen through the +11. Start serving traffic. All saved object reads/writes happen through the version-specific alias `.kibana_7.10.0`. Together with the limitations, this algorithm ensures that migrations are From d7237ca42c4167b6931fe8c544ac7c40e27afc6c Mon Sep 17 00:00:00 2001 From: Rudolf Meijering Date: Tue, 15 Dec 2020 01:12:12 +0100 Subject: [PATCH 3/9] Use _reindex?require_alias=true and a write block toggle to prevent lost deletes --- rfcs/text/0013_saved_object_migrations.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/rfcs/text/0013_saved_object_migrations.md b/rfcs/text/0013_saved_object_migrations.md index cf1cc0ff71946..cbadbf372dcfc 100644 --- a/rfcs/text/0013_saved_object_migrations.md +++ b/rfcs/text/0013_saved_object_migrations.md @@ -254,13 +254,14 @@ Note: 1. `.kibana` is pointing to an index that belongs to a later version of Kibana .e.g. `.kibana_7.12.0_001` 2. (Only in 8.x) The source index contains documents that belong to an unknown Saved Object type (from a disabled plugin). 
Log an error explaining that the plugin that created these documents needs to be enabled again or that these objects should be deleted. See section (4.2.1.4). 5. Mark the source index as read-only and wait for all in-flight operations to drain (requires https://github.com/elastic/elasticsearch/pull/58094). This prevents any further writes from outdated nodes. Assuming this API is similar to the existing `//_close` API, we expect to receive `"acknowledged" : true` and `"shards_acknowledged" : true`. If all shards don’t acknowledge within the timeout, retry the operation until it succeeds. -6. Create a target index with `dynamic: false` on the top-level mappings so that any kind of document can be written to the index. This allows us to write untransformed documents to the index which might have fields which have been removed from the latest mappings defined by the plugin. Define `dynamic:true` mappings for the `migrationVersion` field so that we're still able to search for outdated documents that need to be transformed. +6. Create a target index `.kibana_7.10.0_001` with an `.kibana_7.10.0` alias. Specify `dynamic: false` for the top-level mappings and the minimal mappings for the `migrationVersion` and `type` fields. This allows us to write untransformed documents to the index which might have fields which have been removed from the latest mappings defined by the plugin while still being able to search for outdated documents that need to be transformed. All nodes on the same version will use the same fixed index name. The `001` postfix isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`. 1. Ignore errors if the target index already exists. -7. Reindex the source index into a the new target index. All nodes on the same version will use the same fixed index name e.g. `.kibana_7.10.0_001`. 
The `001` postfix isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`. - 1. Use `op_type=create` `conflicts=proceed` and `wait_for_completion=false` +7. Reindex from the `.kibana` alias into the target alias `.kibana_7.10.0`. + 1. Specify `require_alias=true` so that once the migration is complete other instances are blocked from initiating a new reindex operation by an error like `"type":"action_request_validation_exception","reason":"Validation Failed: 1: reindex cannot write into an index it is reading from` (which can safely be ignored). + 2. Use `op_type=create` `conflicts=proceed` and `wait_for_completion=false` so that multiple instances can perform the reindex in parallel but only one write per document will succeed. - 2. Wait for the reindex task to complete. If reindexing doesn’t complete within the 60s timeout, log a warning for visibility and poll again. + 3. Wait for the reindex task to complete. If reindexing doesn’t complete within the 60s timeout, log a warning for visibility and poll again. 8. Transform documents by reading batches of outdated documents from the target index then transforming and updating them with optimistic concurrency control. 1. Ignore any version conflict errors. 2. If a document transform throws an exception, add the document to a failure list and continue trying to transform all other documents. If any failures occured, log the complete list of documents that failed to transform. Fail the migration. @@ -275,9 +276,10 @@ Note: migration in parallel, only one version will win. E.g. if 7.11 and 7.12 are started in parallel and migrate from a 7.9 index, either 7.11 or 7.12 should succeed and accept writes, but not both. - 4. Checks that `.kibana` alias is still pointing to the source index - 5. 
Points the `.kibana_7.10.0` and `.kibana` aliases to the target index. - 6. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again: + - Checks that `.kibana` alias is still pointing to the source index + - Points the `.kibana` alias to the target index. + 4. If this succeeds there is a chance that other instances initiated a reindex operation that is still in progress. Wait until all submitted reindex operations are complete by setting a write block and then removing the write block. + 5. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again: 1. If `.kibana` is _not_ pointing to our target index fail the migration. 2. If `.kibana` is pointing to our target index the migration has succeeded and we can proceed to step (10). 11. Start serving traffic. All saved object reads/writes happen through the From 8baf9b13dbbe50c1dec0d4844f170304ecc0b883 Mon Sep 17 00:00:00 2001 From: Rudolf Meijering Date: Tue, 15 Dec 2020 02:47:18 +0100 Subject: [PATCH 4/9] Use a ..._reindex_in_progress alias so that waiting for and preventing other reindex operations is idempotent The first version of the reindex block had only the instance which was able to mark the migration as complete set and remove the write block. This means other instances couldn't know if any reindex operations were in progress if the migration was already marked as complete. It also meant that a failure in this critical step could result in a permanent write block. --- rfcs/text/0013_saved_object_migrations.md | 46 +++++++++++++---------- 1 file changed, 26 insertions(+), 20 deletions(-) diff --git a/rfcs/text/0013_saved_object_migrations.md b/rfcs/text/0013_saved_object_migrations.md index cbadbf372dcfc..7efe6194ebad3 100644 --- a/rfcs/text/0013_saved_object_migrations.md +++ b/rfcs/text/0013_saved_object_migrations.md @@ -220,7 +220,8 @@ Note: `.kibana_task_manager_pre7.4.0_001`. Ignore index already exists errors. 3. 
Reindex the legacy index into the new source index with the `convertToAlias` script if specified. Use `wait_for_completion: false` - to run this as a task. Ignore errors if the legacy source doesn't exist. + to run this as a task. Ignore errors if the legacy index doesn't exist + or if there's a write block on the source index. 4. Wait for the reindex task to complete. If the task doesn’t complete within the 60s timeout, log a warning for visibility and poll again. Ignore errors if the legacy source doesn't exist. @@ -248,29 +249,35 @@ Note: 6. Use the reindexed legacy `.kibana_pre6.5.0_001` as the source for the rest of the migration algorithm. 3. If `.kibana` and `.kibana_7.10.0` both exists and are pointing to the same index this version's migration has already been completed. 1. Because the same version can have plugins enabled at any point in time, - migrate outdated documents with step (8) and perform the mappings update in step (9). - 2. Skip to step (10) to start serving traffic. + migrate outdated documents with step (9) and perform the mappings update in step (10). + 2. Skip to step (12) to start serving traffic. 4. Fail the migration if: 1. `.kibana` is pointing to an index that belongs to a later version of Kibana .e.g. `.kibana_7.12.0_001` 2. (Only in 8.x) The source index contains documents that belong to an unknown Saved Object type (from a disabled plugin). Log an error explaining that the plugin that created these documents needs to be enabled again or that these objects should be deleted. See section (4.2.1.4). 5. Mark the source index as read-only and wait for all in-flight operations to drain (requires https://github.com/elastic/elasticsearch/pull/58094). This prevents any further writes from outdated nodes. Assuming this API is similar to the existing `//_close` API, we expect to receive `"acknowledged" : true` and `"shards_acknowledged" : true`. If all shards don’t acknowledge within the timeout, retry the operation until it succeeds. -6. 
Create a target index `.kibana_7.10.0_001` with an `.kibana_7.10.0` alias. Specify `dynamic: false` for the top-level mappings and the minimal mappings for the `migrationVersion` and `type` fields. This allows us to write untransformed documents to the index which might have fields which have been removed from the latest mappings defined by the plugin while still being able to search for outdated documents that need to be transformed. All nodes on the same version will use the same fixed index name. The `001` postfix isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`. +6. Create a target index `.kibana_7.10.0_001` with the following aliases: `.kibana_7.10.0`, `.kibana_7.10.0_reindex_in_progress`. Specify `dynamic: false` for the top-level mappings and the minimal mappings for the `migrationVersion` and `type` fields. This allows us to write untransformed documents to the index which might have fields which have been removed from the latest mappings defined by the plugin while still being able to search for outdated documents that need to be transformed. All nodes on the same version will use the same fixed index name. The `001` postfix isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`. 1. Ignore errors if the target index already exists. -7. Reindex from the `.kibana` alias into the target alias `.kibana_7.10.0`. - 1. 
Specify `require_alias=true` so that once the migration is complete other instances are blocked from initiating a new reindex operation by an error like `"type":"action_request_validation_exception","reason":"Validation Failed: 1: reindex cannot write into an index it is reading from` (which can safely be ignored). - 2. Use `op_type=create` `conflicts=proceed` and `wait_for_completion=false` - so that multiple instances can perform the reindex in parallel but only - one write per document will succeed. +7. Reindex from the source alias `.kibana` into the target alias `.kibana_7.10.0_reindex_in_progress`. + 1. Specify `require_alias=true` so that once step (8) is complete other instances are blocked from initiating a new reindex operation. + 2. Use `op_type=create` `conflicts=proceed` and `wait_for_completion=false` so that multiple instances can perform the reindex in parallel but only one write per document will succeed. 3. Wait for the reindex task to complete. If reindexing doesn’t complete within the 60s timeout, log a warning for visibility and poll again. -8. Transform documents by reading batches of outdated documents from the target index then transforming and updating them with optimistic concurrency control. - 1. Ignore any version conflict errors. - 2. If a document transform throws an exception, add the document to a failure list and continue trying to transform all other documents. If any failures occured, log the complete list of documents that failed to transform. Fail the migration. -9. Update the mappings of the target index - 1. Retrieve the existing mappings including the `migrationMappingPropertyHashes` metadata. - 2. Update the mappings with `PUT /.kibana_7.10.0_001/_mapping`. The API deeply merges any updates so this won't remove the mappings of any plugins that are disabled on this instance but have been enabled on another instance that also migrated this index. - 3. 
Ensure that fields are correctly indexed using the target index's latest mappings `POST /.kibana_7.10.0_001/_update_by_query?conflicts=proceed`. In the future we could optimize this query by only targeting documents: + 4. Ignore the following errors which mean another instance has already completed the reindex operation and we can continue to step (8): + 1. `"type": "cluster_block_exception", "reason": "index [...] blocked by: [FORBIDDEN\/8/index write (api)]"` (another instance completed step 8.1) + 2. `"type": "index_not_found_exception","reason":"no such index [...] and [require_alias] request flag is [true] and [...] is not an alias` (another instance completed step 8.2) +8. Mark the reindex operation as complete: + 1. Set a write block on the target index using the `.kibana_7.10.0_reindex_in_progress` alias. This will wait until all reindex operations potentially started by other instances are complete. It will also block new reindex operations until step (8.2) is completed. Ignore `index_not_found_exception` as this means another node has already removed the alias. + 2. Remove the `.kibana_7.10.0_reindex_in_progress` alias. This prevents any other instances from starting a new reindex operation even if we remove the write block. Ignore if the alias doesn't exist. + 3. Remove the write block from the target index using the index name i.e. `.kibana_7.10.0_001`. + 4. (Together these operations prevent lost deletes when one instance completes the migration and deletes a document, while another instance still had an in-progress reindex that recreates the deleted document.) +9. Transform documents by reading batches of outdated documents from the target index then transforming and updating them with optimistic concurrency control. + 5. Ignore any version conflict errors. + 6. If a document transform throws an exception, add the document to a failure list and continue trying to transform all other documents. 
If any failures occured, log the complete list of documents that failed to transform. Fail the migration. +10. Update the mappings of the target index + 7. Retrieve the existing mappings including the `migrationMappingPropertyHashes` metadata. + 8. Update the mappings with `PUT /.kibana_7.10.0_001/_mapping`. The API deeply merges any updates so this won't remove the mappings of any plugins that are disabled on this instance but have been enabled on another instance that also migrated this index. + 9. Ensure that fields are correctly indexed using the target index's latest mappings `POST /.kibana_7.10.0_001/_update_by_query?conflicts=proceed`. In the future we could optimize this query by only targeting documents: 1. That belong to a known saved object type. -10. Mark the migration as complete. This is done as a single atomic +11. Mark the migration as complete. This is done as a single atomic operation (requires https://github.com/elastic/elasticsearch/pull/58100) to guarantees when multiple versions of Kibana are performing the migration in parallel, only one version will win. E.g. if 7.11 and 7.12 @@ -278,11 +285,10 @@ Note: should succeed and accept writes, but not both. - Checks that `.kibana` alias is still pointing to the source index - Points the `.kibana` alias to the target index. - 4. If this succeeds there is a chance that other instances initiated a reindex operation that is still in progress. Wait until all submitted reindex operations are complete by setting a write block and then removing the write block. - 5. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again: + 10. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again: 1. If `.kibana` is _not_ pointing to our target index fail the migration. 2. If `.kibana` is pointing to our target index the migration has succeeded and we can proceed to step (10). -11. Start serving traffic. 
All saved object reads/writes happen through the +12. Start serving traffic. All saved object reads/writes happen through the version-specific alias `.kibana_7.10.0`. Together with the limitations, this algorithm ensures that migrations are From 4ef8034bc0c4c2e326e30ac7ea3af3689eb40bde Mon Sep 17 00:00:00 2001 From: Rudolf Meijering Date: Tue, 15 Dec 2020 12:44:14 +0100 Subject: [PATCH 5/9] Revert "Use a ..._reindex_in_progress alias so that waiting for and preventing other reindex operations is idempotent" This reverts commit 8baf9b13dbbe50c1dec0d4844f170304ecc0b883. --- rfcs/text/0013_saved_object_migrations.md | 46 ++++++++++------------- 1 file changed, 20 insertions(+), 26 deletions(-) diff --git a/rfcs/text/0013_saved_object_migrations.md b/rfcs/text/0013_saved_object_migrations.md index 7efe6194ebad3..cbadbf372dcfc 100644 --- a/rfcs/text/0013_saved_object_migrations.md +++ b/rfcs/text/0013_saved_object_migrations.md @@ -220,8 +220,7 @@ Note: `.kibana_task_manager_pre7.4.0_001`. Ignore index already exists errors. 3. Reindex the legacy index into the new source index with the `convertToAlias` script if specified. Use `wait_for_completion: false` - to run this as a task. Ignore errors if the legacy index doesn't exist - or if there's a write block on the source index. + to run this as a task. Ignore errors if the legacy source doesn't exist. 4. Wait for the reindex task to complete. If the task doesn’t complete within the 60s timeout, log a warning for visibility and poll again. Ignore errors if the legacy source doesn't exist. @@ -249,35 +248,29 @@ Note: 6. Use the reindexed legacy `.kibana_pre6.5.0_001` as the source for the rest of the migration algorithm. 3. If `.kibana` and `.kibana_7.10.0` both exists and are pointing to the same index this version's migration has already been completed. 1. Because the same version can have plugins enabled at any point in time, - migrate outdated documents with step (9) and perform the mappings update in step (10). - 2. 
Skip to step (12) to start serving traffic. + migrate outdated documents with step (8) and perform the mappings update in step (9). + 2. Skip to step (10) to start serving traffic. 4. Fail the migration if: 1. `.kibana` is pointing to an index that belongs to a later version of Kibana .e.g. `.kibana_7.12.0_001` 2. (Only in 8.x) The source index contains documents that belong to an unknown Saved Object type (from a disabled plugin). Log an error explaining that the plugin that created these documents needs to be enabled again or that these objects should be deleted. See section (4.2.1.4). 5. Mark the source index as read-only and wait for all in-flight operations to drain (requires https://github.com/elastic/elasticsearch/pull/58094). This prevents any further writes from outdated nodes. Assuming this API is similar to the existing `//_close` API, we expect to receive `"acknowledged" : true` and `"shards_acknowledged" : true`. If all shards don’t acknowledge within the timeout, retry the operation until it succeeds. -6. Create a target index `.kibana_7.10.0_001` with the following aliases: `.kibana_7.10.0`, `.kibana_7.10.0_reindex_in_progress`. Specify `dynamic: false` for the top-level mappings and the minimal mappings for the `migrationVersion` and `type` fields. This allows us to write untransformed documents to the index which might have fields which have been removed from the latest mappings defined by the plugin while still being able to search for outdated documents that need to be transformed. All nodes on the same version will use the same fixed index name. The `001` postfix isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`. +6. Create a target index `.kibana_7.10.0_001` with an `.kibana_7.10.0` alias. 
Specify `dynamic: false` for the top-level mappings and the minimal mappings for the `migrationVersion` and `type` fields. This allows us to write untransformed documents to the index which might have fields which have been removed from the latest mappings defined by the plugin while still being able to search for outdated documents that need to be transformed. All nodes on the same version will use the same fixed index name. The `001` postfix isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`. 1. Ignore errors if the target index already exists. -7. Reindex from the source alias `.kibana` into the target alias `.kibana_7.10.0_reindex_in_progress`. - 1. Specify `require_alias=true` so that once step (8) is complete other instances are blocked from initiating a new reindex operation. - 2. Use `op_type=create` `conflicts=proceed` and `wait_for_completion=false` so that multiple instances can perform the reindex in parallel but only one write per document will succeed. +7. Reindex from the `.kibana` alias into the target alias `.kibana_7.10.0`. + 1. Specify `require_alias=true` so that once the migration is complete other instances are blocked from initiating a new reindex operation by an error like `"type":"action_request_validation_exception","reason":"Validation Failed: 1: reindex cannot write into an index it is reading from` (which can safely be ignored). + 2. Use `op_type=create` `conflicts=proceed` and `wait_for_completion=false` + so that multiple instances can perform the reindex in parallel but only + one write per document will succeed. 3. Wait for the reindex task to complete. If reindexing doesn’t complete within the 60s timeout, log a warning for visibility and poll again. - 4. 
Ignore the following errors which mean another instance has already completed the reindex operation and we can continue to step (8): - 1. `"type": "cluster_block_exception", "reason": "index [...] blocked by: [FORBIDDEN\/8/index write (api)]"` (another instance completed step 8.1) - 2. `"type": "index_not_found_exception","reason":"no such index [...] and [require_alias] request flag is [true] and [...] is not an alias` (another instance completed step 8.2) -8. Mark the reindex operation as complete: - 1. Set a write block on the target index using the `.kibana_7.10.0_reindex_in_progress` alias. This will wait until all reindex operations potentially started by other instances are complete. It will also block new reindex operations until step (8.2) is completed. Ignore `index_not_found_exception` as this means another node has already removed the alias. - 2. Remove the `.kibana_7.10.0_reindex_in_progress` alias. This prevents any other instances from starting a new reindex operation even if we remove the write block. Ignore if the alias doesn't exist. - 3. Remove the write block from the target index using the index name i.e. `.kibana_7.10.0_001`. - 4. (Together these operations prevent lost deletes when one instance completes the migration and deletes a document, while another instance still had an in-progress reindex that recreates the deleted document.) -9. Transform documents by reading batches of outdated documents from the target index then transforming and updating them with optimistic concurrency control. - 5. Ignore any version conflict errors. - 6. If a document transform throws an exception, add the document to a failure list and continue trying to transform all other documents. If any failures occured, log the complete list of documents that failed to transform. Fail the migration. -10. Update the mappings of the target index - 7. Retrieve the existing mappings including the `migrationMappingPropertyHashes` metadata. - 8. 
Update the mappings with `PUT /.kibana_7.10.0_001/_mapping`. The API deeply merges any updates so this won't remove the mappings of any plugins that are disabled on this instance but have been enabled on another instance that also migrated this index. - 9. Ensure that fields are correctly indexed using the target index's latest mappings `POST /.kibana_7.10.0_001/_update_by_query?conflicts=proceed`. In the future we could optimize this query by only targeting documents: +8. Transform documents by reading batches of outdated documents from the target index then transforming and updating them with optimistic concurrency control. + 1. Ignore any version conflict errors. + 2. If a document transform throws an exception, add the document to a failure list and continue trying to transform all other documents. If any failures occured, log the complete list of documents that failed to transform. Fail the migration. +9. Update the mappings of the target index + 1. Retrieve the existing mappings including the `migrationMappingPropertyHashes` metadata. + 2. Update the mappings with `PUT /.kibana_7.10.0_001/_mapping`. The API deeply merges any updates so this won't remove the mappings of any plugins that are disabled on this instance but have been enabled on another instance that also migrated this index. + 3. Ensure that fields are correctly indexed using the target index's latest mappings `POST /.kibana_7.10.0_001/_update_by_query?conflicts=proceed`. In the future we could optimize this query by only targeting documents: 1. That belong to a known saved object type. -11. Mark the migration as complete. This is done as a single atomic +10. Mark the migration as complete. This is done as a single atomic operation (requires https://github.com/elastic/elasticsearch/pull/58100) to guarantees when multiple versions of Kibana are performing the migration in parallel, only one version will win. E.g. 
if 7.11 and 7.12 @@ -285,10 +278,11 @@ Note: should succeed and accept writes, but not both. - Checks that `.kibana` alias is still pointing to the source index - Points the `.kibana` alias to the target index. - 10. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again: + 4. If this succeeds there is a chance that other instances initiated a reindex operation that is still in progress. Wait until all submitted reindex operations are complete by setting a write block and then removing the write block. + 5. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again: 1. If `.kibana` is _not_ pointing to our target index fail the migration. 2. If `.kibana` is pointing to our target index the migration has succeeded and we can proceed to step (10). -12. Start serving traffic. All saved object reads/writes happen through the +11. Start serving traffic. All saved object reads/writes happen through the version-specific alias `.kibana_7.10.0`. Together with the limitations, this algorithm ensures that migrations are From 8190b65b7aaff18287bb5b78726a4036396ac267 Mon Sep 17 00:00:00 2001 From: Rudolf Meijering Date: Tue, 15 Dec 2020 12:44:31 +0100 Subject: [PATCH 6/9] Revert "Use _reindex?require_alias=true and a write block toggle to prevent lost deletes" This reverts commit d7237ca42c4167b6931fe8c544ac7c40e27afc6c. --- rfcs/text/0013_saved_object_migrations.md | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/rfcs/text/0013_saved_object_migrations.md b/rfcs/text/0013_saved_object_migrations.md index cbadbf372dcfc..cf1cc0ff71946 100644 --- a/rfcs/text/0013_saved_object_migrations.md +++ b/rfcs/text/0013_saved_object_migrations.md @@ -254,14 +254,13 @@ Note: 1. `.kibana` is pointing to an index that belongs to a later version of Kibana .e.g. `.kibana_7.12.0_001` 2. 
(Only in 8.x) The source index contains documents that belong to an unknown Saved Object type (from a disabled plugin). Log an error explaining that the plugin that created these documents needs to be enabled again or that these objects should be deleted. See section (4.2.1.4). 5. Mark the source index as read-only and wait for all in-flight operations to drain (requires https://github.com/elastic/elasticsearch/pull/58094). This prevents any further writes from outdated nodes. Assuming this API is similar to the existing `//_close` API, we expect to receive `"acknowledged" : true` and `"shards_acknowledged" : true`. If all shards don’t acknowledge within the timeout, retry the operation until it succeeds. -6. Create a target index `.kibana_7.10.0_001` with an `.kibana_7.10.0` alias. Specify `dynamic: false` for the top-level mappings and the minimal mappings for the `migrationVersion` and `type` fields. This allows us to write untransformed documents to the index which might have fields which have been removed from the latest mappings defined by the plugin while still being able to search for outdated documents that need to be transformed. All nodes on the same version will use the same fixed index name. The `001` postfix isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`. +6. Create a target index with `dynamic: false` on the top-level mappings so that any kind of document can be written to the index. This allows us to write untransformed documents to the index which might have fields which have been removed from the latest mappings defined by the plugin. Define `dynamic:true` mappings for the `migrationVersion` field so that we're still able to search for outdated documents that need to be transformed. 1. Ignore errors if the target index already exists. -7. 
Reindex from the `.kibana` alias into the target alias `.kibana_7.10.0`. - 1. Specify `require_alias=true` so that once the migration is complete other instances are blocked from initiating a new reindex operation by an error like `"type":"action_request_validation_exception","reason":"Validation Failed: 1: reindex cannot write into an index it is reading from` (which can safely be ignored). - 2. Use `op_type=create` `conflicts=proceed` and `wait_for_completion=false` +7. Reindex the source index into a the new target index. All nodes on the same version will use the same fixed index name e.g. `.kibana_7.10.0_001`. The `001` postfix isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`. + 1. Use `op_type=create` `conflicts=proceed` and `wait_for_completion=false` so that multiple instances can perform the reindex in parallel but only one write per document will succeed. - 3. Wait for the reindex task to complete. If reindexing doesn’t complete within the 60s timeout, log a warning for visibility and poll again. + 2. Wait for the reindex task to complete. If reindexing doesn’t complete within the 60s timeout, log a warning for visibility and poll again. 8. Transform documents by reading batches of outdated documents from the target index then transforming and updating them with optimistic concurrency control. 1. Ignore any version conflict errors. 2. If a document transform throws an exception, add the document to a failure list and continue trying to transform all other documents. If any failures occured, log the complete list of documents that failed to transform. Fail the migration. @@ -276,10 +275,9 @@ Note: migration in parallel, only one version will win. E.g. 
if 7.11 and 7.12 are started in parallel and migrate from a 7.9 index, either 7.11 or 7.12 should succeed and accept writes, but not both. - - Checks that `.kibana` alias is still pointing to the source index - - Points the `.kibana` alias to the target index. - 4. If this succeeds there is a chance that other instances initiated a reindex operation that is still in progress. Wait until all submitted reindex operations are complete by setting a write block and then removing the write block. - 5. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again: + 4. Checks that `.kibana` alias is still pointing to the source index + 5. Points the `.kibana_7.10.0` and `.kibana` aliases to the target index. + 6. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again: 1. If `.kibana` is _not_ pointing to our target index fail the migration. 2. If `.kibana` is pointing to our target index the migration has succeeded and we can proceed to step (10). 11. Start serving traffic. All saved object reads/writes happen through the From d98faaab8f53cea4a639ffd611faf41848d98cb0 Mon Sep 17 00:00:00 2001 From: Rudolf Meijering Date: Tue, 15 Dec 2020 12:49:30 +0100 Subject: [PATCH 7/9] Use reindex + clone as a way to prevent lost deletes --- rfcs/text/0013_saved_object_migrations.md | 47 ++++++++++++----------- 1 file changed, 25 insertions(+), 22 deletions(-) diff --git a/rfcs/text/0013_saved_object_migrations.md b/rfcs/text/0013_saved_object_migrations.md index cf1cc0ff71946..80c4fa90ffe43 100644 --- a/rfcs/text/0013_saved_object_migrations.md +++ b/rfcs/text/0013_saved_object_migrations.md @@ -248,46 +248,49 @@ Note: 6. Use the reindexed legacy `.kibana_pre6.5.0_001` as the source for the rest of the migration algorithm. 3. If `.kibana` and `.kibana_7.10.0` both exists and are pointing to the same index this version's migration has already been completed. 1. 
Because the same version can have plugins enabled at any point in time, - migrate outdated documents with step (8) and perform the mappings update in step (9). - 2. Skip to step (10) to start serving traffic. + migrate outdated documents with step (9) and perform the mappings update in step (10). + 2. Skip to step (12) to start serving traffic. 4. Fail the migration if: 1. `.kibana` is pointing to an index that belongs to a later version of Kibana .e.g. `.kibana_7.12.0_001` 2. (Only in 8.x) The source index contains documents that belong to an unknown Saved Object type (from a disabled plugin). Log an error explaining that the plugin that created these documents needs to be enabled again or that these objects should be deleted. See section (4.2.1.4). -5. Mark the source index as read-only and wait for all in-flight operations to drain (requires https://github.com/elastic/elasticsearch/pull/58094). This prevents any further writes from outdated nodes. Assuming this API is similar to the existing `//_close` API, we expect to receive `"acknowledged" : true` and `"shards_acknowledged" : true`. If all shards don’t acknowledge within the timeout, retry the operation until it succeeds. -6. Create a target index with `dynamic: false` on the top-level mappings so that any kind of document can be written to the index. This allows us to write untransformed documents to the index which might have fields which have been removed from the latest mappings defined by the plugin. Define `dynamic:true` mappings for the `migrationVersion` field so that we're still able to search for outdated documents that need to be transformed. +5. Set a write block on the source index. This prevents any further writes from outdated nodes. +6. Create a new temporary index `.kibana_7.10.0_reindex_temp` with `dynamic: false` on the top-level mappings so that any kind of document can be written to the index. 
This allows us to write untransformed documents to the index which might have fields which have been removed from the latest mappings defined by the plugin. Define minimal mappings for the `migrationVersion` and `type` fields so that we're still able to search for outdated documents that need to be transformed. 1. Ignore errors if the target index already exists. -7. Reindex the source index into a the new target index. All nodes on the same version will use the same fixed index name e.g. `.kibana_7.10.0_001`. The `001` postfix isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`. - 1. Use `op_type=create` `conflicts=proceed` and `wait_for_completion=false` - so that multiple instances can perform the reindex in parallel but only - one write per document will succeed. +7. Reindex the source index into the new temporary index. + 1. Use `op_type=create` `conflicts=proceed` and `wait_for_completion=false` so that multiple instances can perform the reindex in parallel but only one write per document will succeed. 2. Wait for the reindex task to complete. If reindexing doesn’t complete within the 60s timeout, log a warning for visibility and poll again. -8. Transform documents by reading batches of outdated documents from the target index then transforming and updating them with optimistic concurrency control. - 1. Ignore any version conflict errors. - 2. If a document transform throws an exception, add the document to a failure list and continue trying to transform all other documents. If any failures occured, log the complete list of documents that failed to transform. Fail the migration. -9. Update the mappings of the target index - 1. Retrieve the existing mappings including the `migrationMappingPropertyHashes` metadata. - 2. Update the mappings with `PUT /.kibana_7.10.0_001/_mapping`. 
The API deeply merges any updates so this won't remove the mappings of any plugins that are disabled on this instance but have been enabled on another instance that also migrated this index. - 3. Ensure that fields are correctly indexed using the target index's latest mappings `POST /.kibana_7.10.0_001/_update_by_query?conflicts=proceed`. In the future we could optimize this query by only targeting documents: +8. Clone the temporary index into the target index `.kibana_7.10.0_001`. Since any further writes will only happen against the cloned target index this prevents a lost delete from occuring where one instance finishes the migration and deletes a document and another instance's reindex operation re-creates the deleted document. + 1. Set a write block on the temporary index + 2. Clone the temporary index into the target index while specifying that the target index should have writes enabled. + 3. If the clone operation fails because the target index already exist, ignore the error and wait for the target index to become green before proceeding. + 4. (The `001` postfix in the target index name isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`.) +9. Transform documents by reading batches of outdated documents from the target index then transforming and updating them with optimistic concurrency control. + 5. Ignore any version conflict errors. + 6. If a document transform throws an exception, add the document to a failure list and continue trying to transform all other documents. If any failures occured, log the complete list of documents that failed to transform. Fail the migration. +10. Update the mappings of the target index + 7. Retrieve the existing mappings including the `migrationMappingPropertyHashes` metadata. + 8. Update the mappings with `PUT /.kibana_7.10.0_001/_mapping`. 
The API deeply merges any updates so this won't remove the mappings of any plugins that are disabled on this instance but have been enabled on another instance that also migrated this index. + 9. Ensure that fields are correctly indexed using the target index's latest mappings `POST /.kibana_7.10.0_001/_update_by_query?conflicts=proceed`. In the future we could optimize this query by only targeting documents: 1. That belong to a known saved object type. -10. Mark the migration as complete. This is done as a single atomic +11. Mark the migration as complete. This is done as a single atomic operation (requires https://github.com/elastic/elasticsearch/pull/58100) - to guarantees when multiple versions of Kibana are performing the + to guarantee that when multiple versions of Kibana are performing the migration in parallel, only one version will win. E.g. if 7.11 and 7.12 are started in parallel and migrate from a 7.9 index, either 7.11 or 7.12 should succeed and accept writes, but not both. - 4. Checks that `.kibana` alias is still pointing to the source index - 5. Points the `.kibana_7.10.0` and `.kibana` aliases to the target index. - 6. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again: + 1. Checks that `.kibana` alias is still pointing to the source index + 2. Points the `.kibana_7.10.0` and `.kibana` aliases to the target index. + 3. Removes the temporary index `.kibana_7.10.0_reindex_temp` + 4. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again: 1. If `.kibana` is _not_ pointing to our target index fail the migration. 2. If `.kibana` is pointing to our target index the migration has succeeded and we can proceed to step (10). -11. Start serving traffic. All saved object reads/writes happen through the +12. Start serving traffic. All saved object reads/writes happen through the version-specific alias `.kibana_7.10.0`. 
Together with the limitations, this algorithm ensures that migrations are idempotent. If two nodes are started simultaneously, both of them will start transforming documents in that version's target index, but because migrations are idempotent, it doesn’t matter which node’s writes win. - #### Known weaknesses: (Also present in our existing migration algorithm since v7.4) When the task manager index gets reindexed a reindex script is applied. From 699d5304e532f0c8efbaf72a66452216f659021d Mon Sep 17 00:00:00 2001 From: Rudolf Meijering Date: Mon, 11 Jan 2021 15:49:15 +0100 Subject: [PATCH 8/9] Fix numbering and ignore index_not_found_exceptionfor temporary index --- rfcs/text/0013_saved_object_migrations.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/rfcs/text/0013_saved_object_migrations.md b/rfcs/text/0013_saved_object_migrations.md index c0ed6183e7c4e..ec27c82446b90 100644 --- a/rfcs/text/0013_saved_object_migrations.md +++ b/rfcs/text/0013_saved_object_migrations.md @@ -265,12 +265,12 @@ Note: 3. If the clone operation fails because the target index already exist, ignore the error and wait for the target index to become green before proceeding. 4. (The `001` postfix in the target index name isn't used by Kibana, but allows for re-indexing an index should this be required by an Elasticsearch upgrade. E.g. re-index `.kibana_7.10.0_001` into `.kibana_7.10.0_002` and point the `.kibana_7.10.0` alias to `.kibana_7.10.0_002`.) 9. Transform documents by reading batches of outdated documents from the target index then transforming and updating them with optimistic concurrency control. - 5. Ignore any version conflict errors. - 6. If a document transform throws an exception, add the document to a failure list and continue trying to transform all other documents. If any failures occured, log the complete list of documents that failed to transform. Fail the migration. + 1. Ignore any version conflict errors. + 2. 
If a document transform throws an exception, add the document to a failure list and continue trying to transform all other documents. If any failures occured, log the complete list of documents that failed to transform. Fail the migration. 10. Update the mappings of the target index - 7. Retrieve the existing mappings including the `migrationMappingPropertyHashes` metadata. - 8. Update the mappings with `PUT /.kibana_7.10.0_001/_mapping`. The API deeply merges any updates so this won't remove the mappings of any plugins that are disabled on this instance but have been enabled on another instance that also migrated this index. - 9. Ensure that fields are correctly indexed using the target index's latest mappings `POST /.kibana_7.10.0_001/_update_by_query?conflicts=proceed`. In the future we could optimize this query by only targeting documents: + 1. Retrieve the existing mappings including the `migrationMappingPropertyHashes` metadata. + 2. Update the mappings with `PUT /.kibana_7.10.0_001/_mapping`. The API deeply merges any updates so this won't remove the mappings of any plugins that are disabled on this instance but have been enabled on another instance that also migrated this index. + 3. Ensure that fields are correctly indexed using the target index's latest mappings `POST /.kibana_7.10.0_001/_update_by_query?conflicts=proceed`. In the future we could optimize this query by only targeting documents: 1. That belong to a known saved object type. 11. Mark the migration as complete. This is done as a single atomic operation (requires https://github.com/elastic/elasticsearch/pull/58100) @@ -278,10 +278,10 @@ Note: migration in parallel, only one version will win. E.g. if 7.11 and 7.12 are started in parallel and migrate from a 7.9 index, either 7.11 or 7.12 should succeed and accept writes, but not both. - 10. Checks that `.kibana` alias is still pointing to the source index - 11. Points the `.kibana_7.10.0` and `.kibana` aliases to the target index. - 12. 
Removes the temporary index `.kibana_7.10.0_reindex_temp`
-    13. If this fails with a "required alias [.kibana] does not exist" error fetch `.kibana` again:
+    1. Checks that `.kibana` alias is still pointing to the source index
+    2. Points the `.kibana_7.10.0` and `.kibana` aliases to the target index.
+    3. Removes the temporary index `.kibana_7.10.0_reindex_temp`
+    4. If this fails with a "required alias [.kibana] does not exist" error or "index_not_found_exception" for the temporary index, fetch `.kibana` again:
         1. If `.kibana` is _not_ pointing to our target index fail the migration.
         2. If `.kibana` is pointing to our target index the migration has succeeded and we can proceed to step (10).
 12. Start serving traffic. All saved object reads/writes happen through the

From 4639e4766326555ec6f4a5852002d1e532e27966 Mon Sep 17 00:00:00 2001
From: Rudolf Meijering
Date: Wed, 13 Jan 2021 12:32:22 +0100
Subject: [PATCH 9/9] Apply suggestions from code review

Co-authored-by: Josh Dover
---
 rfcs/text/0013_saved_object_migrations.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/rfcs/text/0013_saved_object_migrations.md b/rfcs/text/0013_saved_object_migrations.md
index ec27c82446b90..88879e5e706eb 100644
--- a/rfcs/text/0013_saved_object_migrations.md
+++ b/rfcs/text/0013_saved_object_migrations.md
@@ -278,12 +278,12 @@ Note:
     migration in parallel, only one version will win. E.g. if 7.11 and 7.12
     are started in parallel and migrate from a 7.9 index, either 7.11 or 7.12
     should succeed and accept writes, but not both.
-    1. Checks that `.kibana` alias is still pointing to the source index
-    2. Points the `.kibana_7.10.0` and `.kibana` aliases to the target index.
-    3. Removes the temporary index `.kibana_7.10.0_reindex_temp`
+    1. Check that `.kibana` alias is still pointing to the source index
+    2. Point the `.kibana_7.10.0` and `.kibana` aliases to the target index.
+    3. Remove the temporary index `.kibana_7.10.0_reindex_temp`
     4. If this fails with a "required alias [.kibana] does not exist" error or "index_not_found_exception" for the temporary index, fetch `.kibana` again:
         1. If `.kibana` is _not_ pointing to our target index fail the migration.
-        2. If `.kibana` is pointing to our target index the migration has succeeded and we can proceed to step (10).
+        2. If `.kibana` is pointing to our target index the migration has succeeded and we can proceed to step (12).
 12. Start serving traffic. All saved object reads/writes happen through the
     version-specific alias `.kibana_7.10.0`.
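
The "mark the migration as complete" step and its failure handling can be illustrated with a minimal sketch. This is not the RFC's implementation: the function names `build_alias_swap_actions` and `resolve_failed_swap` are hypothetical, and the request body assumes that the Elasticsearch `_aliases` API accepts `remove` (with the `must_exist` flag from the linked Elasticsearch PR), `add`, and `remove_index` actions in a single atomic request.

```python
from typing import Any, Dict, List, Optional


def build_alias_swap_actions(
    source_index: str,
    target_index: str,
    temp_index: str,
    current_alias: str = ".kibana",
    version_alias: str = ".kibana_7.10.0",
) -> List[Dict[str, Any]]:
    """Build the actions for one atomic `POST /_aliases` request (assumed shape).

    The `remove` action with `must_exist` acts as the check that the current
    alias still points to the source index; if another version already moved
    the alias, the whole request is expected to fail.
    """
    return [
        # Check + remove: fails if `.kibana` no longer points to the source.
        {"remove": {"index": source_index, "alias": current_alias, "must_exist": True}},
        # Point both aliases to the target index.
        {"add": {"index": target_index, "alias": current_alias}},
        {"add": {"index": target_index, "alias": version_alias}},
        # Remove the temporary reindex index.
        {"remove_index": {"index": temp_index}},
    ]


def resolve_failed_swap(alias_target: Optional[str], target_index: str) -> str:
    """Decide what to do after the swap failed and `.kibana` was fetched again.

    `alias_target` is the index that `.kibana` currently points to, or None
    if the alias does not exist.
    """
    if alias_target == target_index:
        # Another instance on the same version completed the swap first.
        return "proceed"
    # A different version won, or the alias is missing: fail the migration.
    return "fail"


actions = build_alias_swap_actions(
    source_index=".kibana_7.9.0_001",
    target_index=".kibana_7.10.0_001",
    temp_index=".kibana_7.10.0_reindex_temp",
)
```

Keeping the decision logic in a pure function separate from the HTTP call makes the "fetch `.kibana` again" branch easy to test without a running cluster. The source index name `.kibana_7.9.0_001` above is only an example value.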