Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only migrate an index if necessary #124946

Closed
rudolf opened this issue Feb 8, 2022 · 9 comments · Fixed by #147371
Closed

Only migrate an index if necessary #124946

rudolf opened this issue Feb 8, 2022 · 9 comments · Fixed by #147371
Labels
Feature:Migrations loe:x-large Extra Large Level of Effort project:ResilientSavedObjectMigrations Reduce Kibana upgrade failures by making saved object migrations more resilient Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@rudolf
Copy link
Contributor

rudolf commented Feb 8, 2022

In order to support different Kibana instances having different plugins enabled, the v2 migration algorithm had to perform a migration for every version, because even if instance 1 doesn't think any migrations are necessary, instance 2 might have a plugin which defines a migration for this version.

Given that we no longer support unknown saved object types #101351 I don't think there are any reasons that we need to run the migration on every upgrade. If the mappings have stayed the same, we could just add new version specific aliases to the existing index. This will improve the performance of migrations in the majority of cases,

but for some edge case configurations it could cause unexpected downtime once a plugin gets enabledIf a Kibana deployment has:

  • a disabled plugin
  • the plugin has never created any saved objects
  • the plugin has a migration function which hasn't been run

Then, as soon as the plugin is enabled, Kibana will see that the indices are no longer up to date and will therefore cause downtime while performing a full migration because of the migration function/mapping changes in the newly enabled plugin.

This means we could have an alias like .kibana_8.0.0 pointing to the index .kibana_7.16.0_001 which might seem wrong. We would have to educate users/support that this could be the expected behaviour. Alternatively we could clone the index into .kibana_8.0.0_001 but this seems expensive. Another alternative is to obfuscate the version by hashing the version number. Though at the end of the day this is internal implementation details that users shouldn't have to even know about so just keeping the existing index name with a version-specific alias seems simpler.

This would be the first step towards scaling migration performance by having an index per saved object type #104081 It would also give immediate benefits since we typically migrate the task manager index much less frequently than the .kibana index.

@rudolf rudolf added project:ResilientSavedObjectMigrations Reduce Kibana upgrade failures by making saved object migrations more resilient Feature:Migrations labels Feb 8, 2022
@botelastic botelastic bot added the needs-team Issues missing a team label label Feb 8, 2022
@rudolf rudolf added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Feb 8, 2022
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

@botelastic botelastic bot removed the needs-team Issues missing a team label label Feb 8, 2022
@planadecu planadecu added the loe:x-large Extra Large Level of Effort label Feb 8, 2022
@rudolf
Copy link
Contributor Author

rudolf commented Sep 6, 2022

An additional benefit of checking the mappings would be that we don't need to run the UPDATE_TARGET_MAPPINGS step on every restart. While this step is usually fairly fast, we have had some reports that clusters where the .kibana index is really large like > 10GB this could take a significant amount of time (> 7 minutes).

@rudolf
Copy link
Contributor Author

rudolf commented Oct 21, 2022

For reference...

Here is where we calculate the md5 hash of every plugins mapping fields: https://github.com/elastic/kibana/blob/main/packages/core/saved-objects/core-saved-objects-migration-server-internal/src/core/build_active_mappings.ts#L27

And we have an existing diffMappings function to test if mappings are different in an incompatible way. This was used in v1 migrations but we didn't yet plug it into v2 migrations
https://github.com/elastic/kibana/blob/main/packages/core/saved-objects/core-saved-objects-migration-server-internal/src/core/build_active_mappings.ts#L49

@exalate-issue-sync exalate-issue-sync bot changed the title migrations: only migrate an index if necessary only migrate an index if necessary Nov 11, 2022
@exalate-issue-sync exalate-issue-sync bot changed the title only migrate an index if necessary Only migrate an index if necessary Nov 15, 2022
@rudolf
Copy link
Contributor Author

rudolf commented Nov 28, 2022

We will have to remember that upgrades from 7.x to 8.x rewrite the _id of saved object types that become namespaceType: multiple-isolated. In theory, all saved objects that wanted to become multiple-isolated would have done so with the 8.0 release but there are no technical controls to guarantee this.

We would either have to add a check so that we still kick off a full migration if a type becomes multiple-isolated, but such a check would likely only work when upgrading to the next release. E.g. we can add the check in 8.8 and then when someone upgrades from 8.8 to 8.x we could safely skip the reindexing but this significantly increases the time to creating value for users.

Alternatively, we could audit all 8.x releases and confirm that there has been no changes to namespaceType: 'multiple-isolated'. We could then add a technical control to prevent further changes (or rely on our existing automatic code reviews for migration-related changes?). This would allow us to assume any upgrade from 8.0+ can safely skip the reindexing if the mappings didn't change.

@pgayvallet
Copy link
Contributor

We could then add a technical control to prevent further changes (or rely on our existing automatic code reviews for migration-related changes?). This would allow us to assume any upgrade from 8.0+ can safely skip the reindexing if the mappings didn't change.

We know we will need to prevent changing more existing types to be shareable for other features (as we won't be able to support id rewriting), so I'd say checking / blocking SavedObjectType.convertToMultiNamespaceTypeVersion from being higher than an arbitrary version would be acceptable?

@rudolf
Copy link
Contributor Author

rudolf commented Dec 8, 2022

so I'd say checking / blocking SavedObjectType.convertToMultiNamespaceTypeVersion from being higher than an arbitrary version would be acceptable?

Yes, that's a really nice way to add a technical control 😁

@pgayvallet
Copy link
Contributor

pgayvallet commented Dec 8, 2022

This means we could have an alias like .kibana_8.0.0 pointing to the index .kibana_7.16.0_001 which might seem wrong. We would have to educate users/support that this could be the expected behaviour.

I know this probably implies more changes, but given we kinda already know we're planning on decoupling model version from Kibana version, would it make sense to go further and change the index naming to something not including the version number? We will likely need to do it later anyway, so now may be the correct option. WDTY @rudolf?

(Hard not to troll here, but we could, idk, use a number-based versioning instead?)

@rudolf
Copy link
Contributor Author

rudolf commented Dec 8, 2022

Hard not to troll here, but we could, idk, use a number-based versioning instead?

That would just be absurd 👎 :trollface:

@rudolf
Copy link
Contributor Author

rudolf commented Dec 9, 2022

Unless we complete #147237 at the same time we will probably have to stop assuming there's just one index name per version to handle the edge cases. Because you could be on v8.8.0 so you have the index .kibana_8.8.0_001 but then you enable a plugin that defines new mappings so we need a new index but we're still on v8.8.0.

Because enabling a new plugin would always only add new fields it would always be a compatible mapping change so #147237 kinda solves this problem.

Edit: in this scenario we'd detect that a version migration was already completed and update the existing index with the mappings of the new plugin.

One useful feature of the version number in the index is that we block upgrades if that version number is newer. So if we remove it from the index we would have to put the Kibana version also into the mappings _meta.

gsoldevila added a commit that referenced this issue Dec 23, 2022
Fixes #124946

## Summary

Takes a step toward optimising our migration paths by only reindexing
(an expensive operation) when needed by checking whether the current SO
mappings have "changed".

By "changed" we mean that we have detected a new md5 checksum of any
registered saved object type relative to the hashes we have stored.

## How to test

These changes are constrained to the `model.ts`, a test was added for
correctly detecting that mappings are the same during the `INIT` phase
to send us down the correct migration path.

Additionally, we have a new Jest integration test `skip_reindex.test.ts`
that runs Kibana and inspects the logs for the expected model
transitions.

Everything else should remain the same.

## Happy path for skipping reindex

```
RUN INIT

IF !versionMigrationIsComplete AND
   !kibana indexBelongsToLaterVersion AND
   !waitForMigrationCompletion AND
   mappingsAreUnchanged
THEN
  the migration operations must target the existing .kibana_x.y.z_001 index

RUN PREPARE_COMPATIBLE_MIGRATION (new state)
    we remove old version aliases (prevent old reads/writes), and set the current version alias (prevent old migrations)

SKIP LEGACY_SET_WRITE_BLOCK
SKIP ...
SKIP SET_SOURCE_WRITE_BLOCK 
SKIP ...
SKIP REFRESH_TARGET

RUN OUTDATED_DOCUMENTS_SEARCH_OPEN_PIT
...
RUN MARK_VERSION_INDEX_READY
DONE
```

## Notes

* This optimisation will only be triggered when there are no mappings
changes AND we are upgrading to a new version. This does not cover all
cases. In future we will make this more sophisticated by checking for
incompatible changes to mappings and only reindexing when those occur.

## Related

* #147503

### Checklist

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Gerard Soldevila <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Migrations loe:x-large Extra Large Level of Effort project:ResilientSavedObjectMigrations Reduce Kibana upgrade failures by making saved object migrations more resilient Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants