Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

If user reindexes sample data from 6.8.x in 7.16 and then tries to install them in a new space Kibana displays internal server error #120137

Closed
bhavyarm opened this issue Dec 1, 2021 · 18 comments · Fixed by #122689
Assignees
Labels
bug Fixes for quality problems that affect the customer experience Feature:Add Data Add Data and sample data feature on Home Feature:Home Kibana home application impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. loe:small Small Level of Effort Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@bhavyarm
Copy link
Contributor

bhavyarm commented Dec 1, 2021

Kibana version: 7.16.0 snapshot Dec 1st

Elasticsearch version: 7.16.0 snapshot Dec 1st

Server OS version: darwin_x86_64

Browser version: chrome latest

Browser OS version: OS X

Original install method (e.g. download page, yum, from source, etc.): from snapshots

Describe the bug: If user has sample data installed in 6.8.x and upgrades to 7.16. and reindexes 6.8 indices and then creates a new space and tries to install sample data - Kibana displays internal server error.

Steps to reproduce:

  1. Create sample data in 6.8.20 - upgrade to 7.16
  2. Go to upgrade assistant - reindex sample data
  3. Create a new space and click on add integrations -> sample data -> try to install sample data - Kibana fails to install them with an internal server error because of
error: invalid_index_name_exception: [invalid_index_name_exception] Reason: Invalid index name [kibana_sample_data_ecommerce], already exists as alias

Screenshots (if relevant):

Screen Shot 2021-12-01 at 12 28 25 PM

Errors in browser console (if relevant):

Failed to load resource: the server responded with a status of 500 (Internal Server Error) 
:5601/s/space_x/api/sample_data/ecommerce:1 

Provide logs and/or server output (if relevant):

Kibana logs:

error  [11:22:18.074]  Error: Internal Server Error
    at HapiResponseAdapter.toInternalError (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/src/core/server/http/router/response_adapter.js:61:19)
    at Router.handle (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/src/core/server/http/router/router.js:172:34)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at handler (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/src/core/server/http/router/router.js:124:50)
    at exports.Manager.execute (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/node_modules/@hapi/hapi/lib/toolkit.js:60:28)
    at Object.internals.handler (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/node_modules/@hapi/hapi/lib/handler.js:46:20)
    at exports.execute (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/node_modules/@hapi/hapi/lib/handler.js:31:20)
    at Request._lifecycle (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/node_modules/@hapi/hapi/lib/request.js:371:32)
    at Request._execute (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/node_modules/@hapi/hapi/lib/request.js:281:9)
  log   [11:40:28.319] [info][0][1][endpoint:metadata-check-transforms-task:0][plugins][securitySolution] no endpoint metadata transforms found
  log   [12:28:23.154] [warning][home][plugins][sampleData] Unable to create sample data index "kibana_sample_data_ecommerce", error: invalid_index_name_exception: [invalid_index_name_exception] Reason: Invalid index name [kibana_sample_data_ecommerce], already exists as alias
  log   [12:28:23.154] [error][http] Error: options.statusCode is expected to be set. given options: undefined
    at Object.customError (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/src/core/server/http/router/response.js:129:13)
    at /Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/src/plugins/home/server/services/sample_data/routes/install.js:117:20
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at Router.handle (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/src/core/server/http/router/router.js:163:30)
    at handler (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/src/core/server/http/router/router.js:124:50)
    at exports.Manager.execute (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/node_modules/@hapi/hapi/lib/toolkit.js:60:28)
    at Object.internals.handler (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/node_modules/@hapi/hapi/lib/handler.js:46:20)
    at exports.execute (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/node_modules/@hapi/hapi/lib/handler.js:31:20)
    at Request._lifecycle (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/node_modules/@hapi/hapi/lib/request.js:371:32)
    at Request._execute (/Users/bhavyarajumandya/Desktop/7.16_snapshot_upgrade/No_error_upgrade/kibana-7.16.0-SNAPSHOT-darwin-x86_64/node_modules/@hapi/hapi/lib/request.js:281:9)
 error  [12:28:23.140]  Error: Internal Server Error
@bhavyarm bhavyarm added the bug Fixes for quality problems that affect the customer experience label Dec 1, 2021
@botelastic botelastic bot added the needs-team Issues missing a team label label Dec 1, 2021
@bhavyarm bhavyarm added Feature:Add Data Add Data and sample data feature on Home Feature:Home Kibana home application Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc labels Dec 1, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

@botelastic botelastic bot removed the needs-team Issues missing a team label label Dec 1, 2021
@pgayvallet
Copy link
Contributor

  1. Go to upgrade assistant - reindex sample data

@bhavyarm is this step required to see the sample data on the 7.x instance? What happens if going directly from step 1 to step 3?

@elastic/kibana-stack-management what exactly does this action do?

@cjcenizal
Copy link
Contributor

@pgayvallet Reindexing indices from 6.x will remove these deprecated settings from them:

const removeUnsettableSettings = (settings: FlatSettings['settings']) =>
omit(settings, [
// Private ES settings
'index.allocation.existing_shards_allocator',
'index.blocks.write',
'index.creation_date',
'index.frozen',
'index.history.uuid',
'index.merge.enabled',
'index.provided_name',
'index.resize.source.name',
'index.resize.source.uuid',
'index.routing.allocation.initial_recovery._id',
'index.search.throttled',
'index.source_only',
'index.shrink.source.name',
'index.shrink.source.uuid',
'index.store.snapshot.repository_name',
'index.store.snapshot.snapshot_name',
'index.store.snapshot.snapshot_uuid',
'index.store.snapshot.index_name',
'index.store.snapshot.index_uuid',
'index.uuid',
'index.verified_before_close',
'index.version.created',
// Ignored since 6.x and forbidden in 7.x
'index.mapper.dynamic',
// Deprecated in 9.0
'index.version.upgraded',
]);
const removeDeprecatedSettings = (settings: FlatSettings['settings']) => {
const updatedSettings = { ...settings };
// Translog settings are only marked as deprecated if soft deletes is enabled
// @ts-expect-error @elastic/elasticsearch doesn't declare such a setting
if (updatedSettings['index.soft_deletes.enabled'] === 'true') {
// @ts-expect-error @elastic/elasticsearch doesn't declare such a setting
if (updatedSettings['index.translog.retention.size']) {
// @ts-expect-error @elastic/elasticsearch doesn't declare such a setting
delete updatedSettings['index.translog.retention.size'];
}
// @ts-expect-error @elastic/elasticsearch doesn't declare such a setting
if (settings['index.translog.retention.age']) {
// @ts-expect-error @elastic/elasticsearch doesn't declare such a setting
delete updatedSettings['index.translog.retention.age'];
}
}
return omit(updatedSettings, deprecatedSettings);
};
// Use `flow` to pipe the settings through each function.
const transformSettings = flow(removeUnsettableSettings, removeDeprecatedSettings);

@cjcenizal
Copy link
Contributor

Who owns the sample data? It seems like there's a bug in the logic because in @bhavyarm's screenshot, the "Add data" buttons are active:

Screen Shot 2021-12-01 at 12 28 25 PM

If the sample data was installed in 6.8, then these buttons should be replaced with "Remove data" buttons. I think an engineer from the team that owns the sample data needs to take a look at this logic and identify the root cause of this unexpected behavior.

@bhavyarm
Copy link
Contributor Author

bhavyarm commented Dec 6, 2021

@cjcenizal we are allowed to install sample data in every space. Please note this is a new space I created in 7.16.

@pgayvallet if user upgrades to 7.16 from 6.8.x and creates a new space and installs sample data without touching upgrade assistant - everything works fine. But if user goes to upgrade assistant reindexes sample data and then creates a new space and tries to install sample data things break

These log lines are the problem.

log   [12:28:23.154] [warning][home][plugins][sampleData] Unable to create sample data index "kibana_sample_data_ecommerce", error: invalid_index_name_exception: [invalid_index_name_exception] Reason: Invalid index name [kibana_sample_data_ecommerce], already exists as alias

@cjcenizal
Copy link
Contributor

Oh I missed the part about creating a new space! Thanks for pointing that out.

@pgayvallet
Copy link
Contributor

if user upgrades to 7.16 from 6.8.x and creates a new space and installs sample data without touching upgrade assistant - everything works fine. But if user goes to upgrade assistant reindexes sample data and then creates a new space and tries to install sample data things break

@bhavyarm In the scenario where the user did not use UA to reindex in 7.16, he still did install the sample data in the default space in 6.8, right? Just want to be sure, because given @cjcenizal's answer, if the UA reindex action just performs an update of the index's setting, the behavior should be the same with or without reindex from UA.

Given the error Reason: Invalid index name [kibana_sample_data_ecommerce], already exists as alias, it makes me wonder if we weren't using an alias for these sample data in 6.x. @cjcenizal, can you please just (double) confirm that the UA's reindex action does not do fancy things on moving/creating aliases?

Also may be related to #116677

@yuliacech
Copy link
Contributor

@pgayvallet @cjcenizal The reindexing action creates an alias to point from the old index to the new index. Here is an example of the process for an index test_data:

  1. Set index test_data to read only
  2. Create new index reindexed-v6.8-test_data
  3. Reindex documents from test_data to reindexed-v6.8-test_data
  4. Create alias test_data that points to reindexed-v6.8-test_data
  5. Delete index test_data

@bhavyarm
Copy link
Contributor Author

bhavyarm commented Dec 7, 2021

@bhavyarm In the scenario where the user did not use UA to reindex in 7.16, he still did install the sample data in the default space in 6.8, right?

@pgayvallet yes.

@exalate-issue-sync exalate-issue-sync bot added impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. loe:small Small Level of Effort labels Dec 7, 2021
@pgayvallet
Copy link
Contributor

pgayvallet commented Dec 8, 2021

The reindexing action creates an alias to point from the old index to the new index. Here is an example of the process for an index test_data

So at the end of the reindex operation, the test_data index is deleted, and there is a new test_data alias pointing to reindexed-v6.8-test_data. This is some disruptive action, that any consumer of the index has to be aware of and take into account when performing operations.

This causes the uninstall/reinstall logic for the sample data sets to fail: the indices.delete operation fails (silently, sigh..) because the target is an alias, and then the indices.create operation fails because an alias already exists with the same name.

// clean up any old installation of dataset
try {
await context.core.elasticsearch.client.asCurrentUser.indices.delete({
index,
});
} catch (err) {
// ignore delete errors
}
try {
await context.core.elasticsearch.client.asCurrentUser.indices.create({
index,
body: {
settings: { index: { number_of_shards: 1, auto_expand_replicas: '0-1' } },
mappings: { properties: dataIndexConfig.fields },
},
});

We could eventually add additional logic to check if the target index is an alias first, and then delete both the alias and its target, but to be honest, that doesn't feel right. I don't think any upgrade-related operation should alter the state of indices in such a significant way. After a reindex, I would expect to have an index with the same name, not an alias pointing to another index. This is dangerous for any index-based operations (such as deletes).

@cjcenizal
Copy link
Contributor

After a reindex, I would expect to have an index with the same name, not an alias pointing to another index

I tend to agree with you, though this runs up against the current Elasticsearch model. Renaming an index in place isn't possible today, AFAIK. From what I can tell the prescribed method of renaming is to use an alias, or to do a full reindex.

We should verify that a second full reindex wouldn't be feasible for UA, but assuming that it isn't, perhaps the optimal path is to add that additional logic to Sample Data. This is the first time I believe we've encountered this problem, which I do find a bit surprising but there it is.

@bhavyarm
Copy link
Contributor Author

bhavyarm commented Dec 9, 2021

cc @LeeDr

@pgayvallet
Copy link
Contributor

pgayvallet commented Dec 13, 2021

We should verify that a second full reindex wouldn't be feasible for UA, but assuming that it isn't, perhaps the optimal path is to add that additional logic to Sample Data

If this is too much work, or just isn't possible to do from UA, I agree that we'll 'just' have to do it from the home plugin. I just hate the idea to add additional bullet-proofing logic to protect against changes in the indices/aliases that aren't performed from the owning plugin. I mean, what proof do we have that these sample data indices are the only ones impacted by this problem? I doubt these are the only data indices that can be created/deleted 'internally' by Kibana?

This is the first time I believe we've encountered this problem, which I do find a bit surprising but there it is.

I agree, this is surprising 😅

@cjcenizal
Copy link
Contributor

@sebelga Could you please investigate this? I see two main questions we want to answer.

Is a second full reindex feasible?

To conclude Upgrade Assistant's reindex process with an updated index that has the same name as the original, we'd need to reindex the update index into a new index. We wouldn't need to make any changes to the destination index other than giving it a new name. What's the performance impact, and how does this performance change as the size of the index grows? If the performance impact is O(1) or O(log N) then a second full reindex is feasible. Otherwise, I'd say it isn't.

Where else might this problem manifest?

If Elastic has more logic that's similar to the Sample Data, in which datasets are installed and later detected through the presence of indices, then we'll probably trip over this again. For example, do any Integrations install datasets? If Upgrade Assistant reindexes those datasets, will that break their logic for detecting installed Integrations, uninstalling Integrations, or migrating them during version upgrades?

@sebelga
Copy link
Contributor

sebelga commented Dec 14, 2021

Is a second full reindex feasible?

Talking with @martijnvg it does seem that a bigger index will not linearly increase the time of reindex and is not an O(1) op. As per his comment:

"in the hot threads a significant time is spent on looking up the id/version during reindexing. This is something that reindex does to ensure that no duplicates are indexed into the target index. But as the destination index grows so does the cost/time it takes to lookup this _id/version".

Where else might this problem manifest?

It seems that the issue we are discovering could occur on any dataset, not only the ones from our products/integrations, but also the ones from our users.

@cjcenizal
Copy link
Contributor

cjcenizal commented Dec 14, 2021

Seb and I chatted and we agree that the ideal solution would be to rename the index in place (elastic/elasticsearch#37880). Unfortunately, the ES Data Management team tells me it's not possible to implement this for 7.17.

It's also not feasible to add a second reindexing phase as I originally proposed, because we're still stuck with the problem of having to create an index with a name that collides with an index alias. The ES Data Management team also says it's not feasible to provide an API option to force this.

As an outcome, we're left with three steps:

  1. We'll update the Upgrade Assistant UI to clarify the changes being made to indices (Inform user about changes to reindexed data in Upgrade Assistant #121220)
  2. We'll patch the sample data logic to check for the original index or an alias that points to the original index.
  3. We'll notify other Kibana contributors that deletion of indices isn't proxied by their aliases, so anywhere that our code deletes an index we should also check to see if it's aliased. This is the guidance provided by the ES Data Management team.

@sebelga
Copy link
Contributor

sebelga commented Dec 15, 2021

It's also not feasible to add a second reindexing phase as I originally proposed, because we're still stuck with the problem of having to create an index with a name that collides with an index alias.

If we add a second reindexing phase we don't need to create an alias, right?

Adding a second reindexing phase and keeping an index (instead of using an alias) seems the least intrusive change for our users. It would initially be less performant than a "rename in place API" but we'll update and use that API once it is ready. Thoughts @cjcenizal ?

@cjcenizal
Copy link
Contributor

We discussed this sync, so I'll record what we discussed here for visibility.

If we add a second reindexing phase we don't need to create an alias, right?

Right.

Adding a second reindexing phase and keeping an index (instead of using an alias) seems the least intrusive change for our users. It would initially be less performant than a "rename in place API" but we'll update and use that API once it is ready. Thoughts @cjcenizal ?

Agreed. The main concern is cost, since reindexing incurs data transfer costs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Fixes for quality problems that affect the customer experience Feature:Add Data Add Data and sample data feature on Home Feature:Home Kibana home application impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. loe:small Small Level of Effort Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants