Need performant method of determining whether there are indices #112307

mattkime · 2024-08-28T19:13:57Z

Description

Kibana needs to determine whether it should display a "Getting started" flow in a number of places, which is done with this utility - https://github.com/elastic/kibana/blob/main/src/plugins/data_views/public/services/has_data.ts

The resolve/index api doesn't scale since it always returns ALL the indices. This can be extremely slow for large and complicated deployments - https://github.com/elastic/sdh-kibana/issues/4877

In my tests, using the search API for this is also relatively slow. This functionality should have a sub-second response time.

The resolve/cluster API works well for this, BUT it's not available on serverless. As a compromise, we're using the resolve/cluster API, falling back to resolve/index as needed.

Ideally, the API would take an index pattern with exclusions and simply return a boolean. We really don't need the list of matched indices. In theory, determining if there is user data should be easy, but in practice we have a list of indices that are created by Kibana that we don't consider to be user data. I don't entirely understand why this is the case, but it would take an organizational effort to resolve it.
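To make the compromise above concrete, here is a minimal TypeScript sketch of how a caller might reduce either response to a single boolean. It is not the code in `has_data.ts`. The helper names (`buildIndexExpression`, `hasDataFromResolveCluster`, `hasDataFromResolveIndex`) are hypothetical, the response interfaces declare only the fields the sketch reads, and the exclusion list contains only the two index names mentioned in the linked Kibana PR. It assumes exclusions are expressed with the `-name` form of Elasticsearch multi-target syntax.

```typescript
// Per-cluster entry of a `_resolve/cluster` response. Only the fields this
// sketch reads are declared.
interface ResolveClusterEntry {
  connected: boolean;
  matching_indices?: boolean; // assumed absent when the cluster is not connected
}
type ResolveClusterResponse = { [clusterAlias: string]: ResolveClusterEntry };

// Subset of a `_resolve/index` response: the full lists the issue wants to avoid.
interface ResolveIndexResponse {
  indices: Array<{ name: string }>;
  aliases: Array<{ name: string }>;
  data_streams: Array<{ name: string }>;
}

// Index names the linked Kibana PR says should not count as user data.
const EXCLUDED_FROM_USER_DATA: string[] = [
  'logs-enterprise_search.api-default',
  'logs-enterprise_search.audit-default',
];

// Hypothetical helper: builds an expression such as "*,-name1,-name2".
function buildIndexExpression(include: string[], exclude: string[]): string {
  return include.concat(exclude.map((name) => '-' + name)).join(',');
}

// Preferred path: one boolean per cluster, no index list in the response.
function hasDataFromResolveCluster(response: ResolveClusterResponse): boolean {
  return Object.keys(response).some(
    (alias) => response[alias].connected && response[alias].matching_indices === true
  );
}

// Fallback path: the response carries every match, so just check for any entry.
function hasDataFromResolveIndex(response: ResolveIndexResponse): boolean {
  return (
    response.indices.length > 0 ||
    response.aliases.length > 0 ||
    response.data_streams.length > 0
  );
}
```

A caller would send the expression from `buildIndexExpression(['*'], EXCLUDED_FROM_USER_DATA)` to resolve/cluster where it is available and to resolve/index otherwise. The fallback still pays the cost of the full list, which is why a dedicated boolean API is being requested.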

… for improved performance (#191566)

## Summary

The `resolve/cluster` API is MUCH more efficient than the `resolve/index` API for determining whether there are user-created indices. The `resolve/index` API returns the FULL list of indices, which can be very large. Unfortunately, the `resolve/cluster` API isn't available on serverless, so we rely on the existing `hasESData` behavior when it's not available.

Closes #190554

Created elastic/elasticsearch#112307 in hopes of getting an API that's performant in serverless and classic environments.

Additional detail - `logs-enterprise_search.api-default` and `logs-enterprise_search.audit-default` should be ignored for the purposes of user-created data.

---

Testing - verify the loading data flows display as appropriate for Discover and data view management. Create and delete indices.

## Release notes

In deployments with thousands of indices and index aliases, browser calls to `/internal/index-pattern-management/resolve_index` can be very slow (more than 10s). It's been replaced with `/internal/data_views/has_es_data`, which is much faster (<1s).

---------

Co-authored-by: Davis McPhee

… for improved performance (elastic#191566) (cherry picked from commit 86cfcab)

…e index for improved performance (#191566) (#192143)

# Backport

This will backport the following commits from `main` to `8.15`:

- [[data views / hasData] Check resolve cluster instead of resolve index for improved performance (#191566)](#191566)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Matthew Kime

benwtrent · 2024-09-11T19:41:21Z

@mattkime this doesn't work? https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-exists.html

elasticsearchmachine · 2024-09-13T18:29:20Z

Pinging @elastic/es-data-management (Team:Data Management)

javanna · 2024-09-13T18:46:16Z

I labeled this data management because that's the team that looks after index-level APIs, like resolve index. It turns out that this issue has been worked around by using resolve cluster against Elasticsearch instead. That API is built to resolve clusters though, and I wonder whether there is a different way that I am not aware of, or whether we need a specific API for this purpose. It seems like a bug that none of our APIs serves this purpose besides one that was built for an entirely different scenario :)

…a to hang (#200476)

## Summary

This PR mitigates an issue where the `has_es_data` check can hang when some remote clusters are unresponsive, leaving users stuck in a loading state in some apps (e.g. Discover and Dashboard) until the request times out. There are two main changes that help mitigate this issue:

- The `resolve/cluster` request in the `has_es_data` endpoint has been split into two requests -- one for local data first, then another for remote data second. In cases where remote clusters are unresponsive but there is data available in the local cluster, the remote check is never performed and the check completes quickly. This likely resolves the majority of cases and is also likely faster in general than checking both local and remote clusters in a single request.
- In cases where there is no local data and the remote `resolve/cluster` request hangs, a new `data_views.hasEsDataTimeout` config has been added to `kibana.yml` (defaults to 5 seconds) to abort the request after a short delay. This scenario is handled in the front end by displaying an error toast to the user informing them of the issue, and assuming there is data available to avoid blocking them. When this occurs, a warning is also logged to the Kibana server logs.

![CleanShot 2024-11-18 at 23 47 34@2x](https://github.com/user-attachments/assets/6ea14869-b6b6-4d89-a90c-8150d6e6b043)

Fixes #200280.

### Notes

- Modifying the existing version of the `has_es_data` endpoint in this way should be backward compatible since the behaviour should remain unchanged from before when the client and server versions don't match (please validate if this seems accurate during review).
- For a long term fix, the ES team is investigating the issue with `resolve/cluster` and will aim to have it behave like `resolve/index`, which fails quickly when remote clusters are unresponsive. They may also implement other mitigations like a configurable timeout in ES: elastic/elasticsearch#114020. The purpose of this PR is to provide an immediate solution in Kibana that mitigates the issue as much as possible.
- If ES ends up providing another performant method for checking if indices exist instead of `resolve/cluster`, Kibana should migrate to that. More details in elastic/elasticsearch#112307.

### Testing notes

To reproduce the issue locally, follow these steps:

- Follow [these instructions](https://gist.github.com/lukasolson/d0861aa3e6ee476ac8dd7189ed476756) to set up a local CCS environment.
- Stop the remote cluster process.
- Use Netcat on the remote cluster port to listen to requests but not respond (e.g. on macOS: `nc -l 9600`), simulating an unresponsive cluster. See elastic/elasticsearch#32678 for more context.
- Navigate to Discover and observe that the `has_es_data` request hangs. When testing in this PR branch, the request will only wait for 5 seconds before assuming data exists and displaying a toast.

### Checklist

- [x] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [ ] If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the [docker list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [x] This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The `release_note:breaking` label should be applied in these situations.
- [ ] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed
- [x] The PR description includes the appropriate Release Notes section, and the correct `release_note:*` label is applied per the [guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: kibanamachine
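The two changes described in the summary of this commit message (check local data first, then check remote clusters under a timeout and assume data exists if the timeout fires) can be sketched roughly as follows. This is an illustration under stated assumptions, not the Kibana implementation: `hasEsDataWithTimeout`, `Check`, and `HasDataResult` are hypothetical names, the two checks are injected as plain functions instead of real `resolve/cluster` calls, and the sketch only stops waiting on the remote check without aborting the underlying request.

```typescript
// A check resolves to true when matching user data was found.
type Check = () => Promise<boolean>;

interface HasDataResult {
  hasData: boolean;
  timedOut: boolean; // true when the remote check did not answer in time
}

async function hasEsDataWithTimeout(
  checkLocal: Check,
  checkRemote: Check,
  timeoutMs: number
): Promise<HasDataResult> {
  // Step 1: local data first. If it exists, the remote check never runs,
  // so unresponsive remote clusters cannot delay the answer.
  if (await checkLocal()) {
    return { hasData: true, timedOut: false };
  }

  // Step 2: race the remote check against a timer.
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<'timeout'>((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });

  try {
    const outcome = await Promise.race([checkRemote(), timeout]);
    if (outcome === 'timeout') {
      // Assume data exists so the user is not blocked; the caller can surface
      // a warning based on the timedOut flag.
      return { hasData: true, timedOut: true };
    }
    return { hasData: outcome, timedOut: false };
  } finally {
    clearTimeout(timer);
  }
}
```

Returning `timedOut` separately from `hasData` lets the caller distinguish a confirmed answer from the optimistic default, which is what makes the error toast and server-side warning described above possible.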

…a to hang (elastic#200476) (cherry picked from commit 96fd4b6)

… can cause Kibana to hang (#200476) (#201025)

# Backport

This will backport the following commits from `main` to `8.x`:

- [[Data Views] Mitigate issue where `has_es_data` check can cause Kibana to hang (#200476)](#200476)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Davis McPhee

…k can cause Kibana to hang (#200476) (#201024)

# Backport

This will backport the following commits from `main` to `8.16`:

- [[Data Views] Mitigate issue where `has_es_data` check can cause Kibana to hang (#200476)](#200476)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Davis McPhee

…k can cause Kibana to hang (#200476) (#201023)

# Backport

This will backport the following commits from `main` to `8.15`:

- [[Data Views] Mitigate issue where `has_es_data` check can cause Kibana to hang (#200476)](#200476)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

---------

Co-authored-by: Davis McPhee

mattkime added the >enhancement and needs:triage labels Aug 28, 2024

mattkime mentioned this issue Aug 30, 2024

[data views / hasData] Check resolve cluster instead of resolve index for improved performance elastic/kibana#191566

Merged

elastic deleted a comment from seang-es Sep 13, 2024

javanna added the :Data Management/Indices APIs label and removed the needs:triage label Sep 13, 2024

elasticsearchmachine added the Team:Data Management label Sep 13, 2024

This was referenced Nov 14, 2024

[Data Views] has_es_data request hangs when remote clusters are unresponsive elastic/kibana#200280

Closed

[Data Views] Mitigate issue where has_es_data check can cause Kibana to hang elastic/kibana#200476

Merged
