[Monitoring] Only look at ES for the missing data alert for now #83659

Closed

chrisronline
Contributor

Relates to #83309

Until we have a strategy for distinguishing upgrades from stack products that are legitimately down, this change restricts the missing monitoring data alert to Elasticsearch only, because Elasticsearch upgrades typically persist node IDs.

To test, I'd recommend creating a cloud deployment on a version older than the current one. Configure your local Kibana's monitoring settings to read monitoring data from that deployment. Then upgrade the cloud deployment and confirm that no missing monitoring data alerts fire.
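
As a rough illustration of the restriction described above, here is a minimal sketch that drops every non-Elasticsearch candidate before the alert is evaluated. The type names, field names, and the `filterAlertable` helper are hypothetical and are not taken from the Kibana source.

```typescript
// Hypothetical shapes for illustration only; not the actual Kibana types.
type StackProduct = 'elasticsearch' | 'kibana' | 'logstash' | 'beats' | 'apm';

interface MissingDataCandidate {
  stackProduct: StackProduct;
  stackProductUuid: string;
  gapDurationMs: number;
}

// Assumption: only products whose ids survive an upgrade are safe to alert on.
// Per the description above, that is Elasticsearch for now.
const ALERTABLE_PRODUCTS: ReadonlySet<StackProduct> = new Set<StackProduct>([
  'elasticsearch',
]);

// Keep only the candidates that belong to an alertable product.
function filterAlertable(
  candidates: MissingDataCandidate[]
): MissingDataCandidate[] {
  return candidates.filter((c) => ALERTABLE_PRODUCTS.has(c.stackProduct));
}
```

Keeping the allowed products in a set means other products could be added back later without touching the filtering logic.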

@elasticmachine
Contributor

Pinging @elastic/stack-monitoring (Team:Monitoring)

@chrisronline
Contributor Author

@elasticmachine merge upstream

@ravikesarwani
Contributor

@chrisronline I think we should change the look-back period for this alert from 1 day to 6 hours.
This will reduce noise while we are still learning how the alert behaves in different scenarios.
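
To make the proposed change concrete, here is a minimal sketch of how the look-back window could be turned into a time range for the query. The constant names, the `lookbackRange` helper, and the use of epoch milliseconds are assumptions for illustration, not the alert's actual code.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// Current and proposed look-back windows, per the comment above.
const LOOKBACK_CURRENT_MS = 24 * HOUR_MS; // 1 day
const LOOKBACK_PROPOSED_MS = 6 * HOUR_MS; // 6 hours

interface TimeRange {
  gte: number;
  lte: number;
  format: 'epoch_millis';
}

// Hypothetical helper: build the bounds of a range filter that ends at `nowMs`
// and reaches `lookbackMs` into the past.
function lookbackRange(nowMs: number, lookbackMs: number): TimeRange {
  return { gte: nowMs - lookbackMs, lte: nowMs, format: 'epoch_millis' };
}
```

A shorter window means fewer historical gaps are considered, which is where the reduction in noise would come from.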

@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Contributor

@igoristic left a comment

@chrisronline I think we should also optimize x-pack/plugins/monitoring/server/lib/alerts/fetch_missing_monitoring_data.ts by removing the non-ES logic and making it look only in .monitoring-es-* indices, which should make it significantly faster.
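
A minimal sketch of the idea, assuming the search target is a comma-separated list of per-product index patterns. The `MONITORING_INDEX_PATTERNS` map and the `indexPatternFor` helper are hypothetical and do not reflect the contents of fetch_missing_monitoring_data.ts.

```typescript
// Assumed per-product monitoring index patterns, for illustration.
const MONITORING_INDEX_PATTERNS = {
  elasticsearch: '.monitoring-es-*',
  kibana: '.monitoring-kibana-*',
  logstash: '.monitoring-logstash-*',
  beats: '.monitoring-beats-*',
} as const;

type Product = keyof typeof MONITORING_INDEX_PATTERNS;

// Hypothetical helper: build the comma-separated index target for a search.
function indexPatternFor(products: Product[]): string {
  return products.map((p) => MONITORING_INDEX_PATTERNS[p]).join(',');
}

// Searching every product touches four index families.
const allProducts = indexPatternFor(['elasticsearch', 'kibana', 'logstash', 'beats']);

// Restricting the search to Elasticsearch touches only one.
const esOnly = indexPatternFor(['elasticsearch']);
```

With the narrower target, the search has fewer indices to open and scan, which is the expected source of the speedup.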

qn895 and others added 14 commits November 19, 2020 13:01
* update deps

* update rules

use type-aware @typescript-eslint/no-shadow instead of no-shadow. do not use no-undef, rely on TypeScript instead

* fix or mute all lint errors

* react-hooks eslint plugin fails on ? syntax

* fix wrong typings in viz

* remove React as a global type

* fix eslint errors

* update version to 4.8.1

* fix a new error
* [APM] Improve router types

* Pass processorEvent param to useDynamicIndexPattern
* Initial copy/paste of source logic

Only changed lodash imports and import order for linting

* Add types and route

* Update paths and typings

Renamed IMeta -> Meta
Used object instead of IObject

* Remove internal flash messages in favor of globals

- All instances of flashAPIErrors(e) are only placeholders until the later commit removing axios.

- buttonLoading was set to false when the error flash messages were set. For now I added a `setButtonNotLoading` action to do this manually in a finally block. This will be refactored once axios is removed.

- SourcesLogic is no longer needed because we set a queued flash message instead of trying to set it in SourcesLogic, which no longer has local flash messages

* Add return types to callback definitions

* Update routes

According to the API info getSourceReConnectData is supposed to send the source ID and not the service type. In the template, we are actually sending the ID but the logic file parameterizes it as serviceType. This is fixed here.

Usage: https://github.com/elastic/ent-search/blob/master/app/javascript/workplace_search/ContentSources/components/AddSource/ReAuthenticate.tsx#L38

* Replace axios with HttpLogic

Also removes using history in favor of KibanaLogic’s navigateToUrl

* Fix incorrect type

This selector is actually an array of strings

* Create GenericObject to satisfy TypeScript

Previously in `ent-search`, we had a generic `IObject` interface that we could use on keyed objects. It was not migrated over since it uses `any` and Kibana has a generic `object` type we can use in most situations. However, when we are checking for keys in our code, `object` does not work. This commit is an attempt at making a generic interface we can use.

* More strict object typing

Removes GenericObject from last commit and adds stricter local typing

* Add i18n

Also added for already-merged SourcesLogic

* Move button loading action to finally block

* Move route strings to inline
… edit (elastic#83578)

* Adding alert.updatedAt field that only updates on user edit

* Updating unit tests

* Functional tests

* Updating alert attributes excluded from AAD

* Fixing test

* PR comments
@chrisronline requested review from a team as code owners November 19, 2020 18:01
@botelastic added the Feature:Embedding, Team:APM, Team:Fleet, and Team:Uptime - DEPRECATED labels Nov 19, 2020
@elasticmachine
Contributor

Pinging @elastic/apm-ui (Team:apm)

@elasticmachine
Contributor

Pinging @elastic/ingest-management (Team:Ingest Management)

@elasticmachine
Contributor

Pinging @elastic/uptime (Team:uptime)

@chrisronline removed the Feature:Embedding, Team:Fleet, Team:Monitoring, Team:APM, Team:Uptime - DEPRECATED, release_note:fix, review, v7.10.1, v7.11.0, and v8.0.0 labels Nov 19, 2020
@chrisronline
Contributor Author

Sorry folks, bad merge. Replaced by #83839
