Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase alerting test stability and reduce flakiness #50246

Merged
merged 12 commits into from
Nov 18, 2019

Conversation

mikecote
Copy link
Contributor

@mikecote mikecote commented Nov 12, 2019

Fixes #50078
Fixes #48709
Fixes #50074
Fixes #50079

In this PR, I'm fixing tests that started failing on master due to Task Manager changes. By being lucky, there wasn't much of a delay between scheduling a task and executing previously. With the new performance improvements and method of claiming tasks, I had to adjust our tests to not always assume tasks run fairly quickly.

Also while stabilizing the tests I found a bug with throttling which is fixed in this PR.

@mikecote mikecote self-assigned this Nov 12, 2019
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-stack-services (Team:Stack Services)

@mikecote
Copy link
Contributor Author

@elasticmachine merge upstream

@elasticmachine

This comment has been minimized.

Copy link
Contributor

@bmcconaghy bmcconaghy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me. Thanks for prioritizing re-enabling these tests!

@elasticmachine

This comment has been minimized.

@mikecote mikecote requested a review from a team November 14, 2019 20:33
@elasticmachine

This comment has been minimized.

@elasticmachine

This comment has been minimized.

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@mikecote
Copy link
Contributor Author

@elasticmachine merge upstream

@mikecote
Copy link
Contributor Author

mikecote commented Nov 18, 2019

Changed the logic for alert tests to be a bit more consistent. Where it makes sense, the tests will use the following pattern:

  1. Allow alert to execute the necessary amount of times
  2. Disable the alert (this removes the task from task manager and prevent further actions from happening)
  3. Wait for task manager tasks to finish running
  4. Do assertions without retry logic

@mikecote mikecote requested a review from pmuellr November 18, 2019 14:11
@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Copy link
Member

@pmuellr pmuellr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice to see the CI passed!

@@ -9,7 +9,7 @@ export function getTestAlertData(overwrites = {}) {
enabled: true,
name: 'abc',
alertTypeId: 'test.noop',
interval: '10s',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about the change in interval from 10s to 1m. Does the increased latency here help cut down on "noisy" alerts generating more test es docs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I mostly bumped it for protection if ever it would execute the alert again. I haven't seen that scenario happen myself but could see it if we're waiting > 10s for es docs. There was no reason behind the 10s so I figured I might as well bump it to something less likely to run.

Copy link
Contributor

@bmcconaghy bmcconaghy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@mikecote
Copy link
Contributor Author

Will merge once the flaky suite runner is finished: https://kibana-ci.elastic.co/job/kibana+flaky-test-suite-runner/10/. This will run the ciGroup1 for xpack 42 times to ensure no flaky failures.

@mikecote mikecote merged commit 10c158b into elastic:master Nov 18, 2019
mikecote added a commit to mikecote/kibana that referenced this pull request Nov 18, 2019
* Increase alerting test stability

* More changes to test methodology + bug fix in throttling

* Fix comments

* Cleanup

* Typo

* Fix broken tests

* Fix integration tests

* Fix typo
mikecote added a commit that referenced this pull request Nov 18, 2019
* Increase alerting test stability

* More changes to test methodology + bug fix in throttling

* Fix comments

* Cleanup

* Typo

* Fix broken tests

* Fix integration tests

* Fix typo
jloleysens added a commit to jloleysens/kibana that referenced this pull request Nov 18, 2019
…-fallback

* 'master' of github.com:elastic/kibana: (116 commits)
  [Maps] move apply global filter settting from layer to source (elastic#50523)
  [SIEM] Fix: Empty `Source` / `Destination` shown when only ports are populated (elastic#50843)
  [Maps] Delay vector tile layer syncing until spritesheet is loaded (elastic#48955)
  [Maps] prevent users from overflowing URL when filtering by shape (elastic#50747)
  [DOCS] Mark Beats central management as discontinued (elastic#49423)
  [page_objects/common_page] convert to ts (elastic#50771)
  [NP Kibana Migrations ] kibana plugin home (elastic#50444)
  [DOCS] Shareables naming convention (elastic#50497)
  [ML] DF Analytics - auto-populate model_memory_limit (elastic#50714)
  Increase alerting test stability and reduce flakiness (elastic#50246)
  [ML] Remaning new_job_new folder (elastic#50917)
  [Telemetry] Show opt-in changes for OSS users (elastic#50831)
  [ML] Fix lat_long anomalies table links menu and value formatting (elastic#50916)
  [Dev] Fix serialising a really big string (elastic#50915)
  Better explanation about the Prettier recommendation (extension vs. NPM module) (elastic#50629)
  [Monitoring] Use a basic monitoring user for tests (elastic#47865)
  [Monitoring] Gracefully handle issue with filebeat indices (elastic#48929)
  [Monitoring] Improve permissions required around setup mode (elastic#50421)
  Additional validation for elasticsearch username (elastic#48247)
  Revert changes to use_kibana_ui_setting (elastic#50877)
  ...

# Conflicts:
#	src/legacy/core_plugins/console/server/request.test.ts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment