
[Alerting] Fixes Failing test: X-Pack Alerting API Integration Tests.x-pack/test/alerting_api_integration/security_and_spaces/tests/alerting - alerting api integration security and spaces enabled Alerts do stuff when AAD is broken #84707

Merged
9 commits merged on Dec 7, 2020

Conversation

@ymao1 (Contributor) commented Dec 1, 2020

Resolves #84007
Resolves #84216
Resolves #84324

Summary

For all of the tests that exercise broken AAD, this PR adds a delay between creating the alert and updating it, in order to avoid occasional 409 conflict errors.

Flaky test runner results: https://kibana-ci.elastic.co/job/kibana+flaky-test-suite-runner/1035/

The other issue is that the 409 errors come back as 500 "Internal Server Error" responses. I traced this back to how they are handled in the HTTP router, which only forwards 401 errors from the ES client and throws an internal error for any other status code.

https://github.com/elastic/kibana/blob/master/src/core/server/http/router/router.ts#L252-289
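For readers unfamiliar with that behavior, here is a minimal illustrative sketch (not the actual Kibana router code linked above) of the pattern being described: only a 401 from the Elasticsearch client keeps its original status, while any other error, including the 409 conflicts hit by these tests, surfaces as a 500.

```ts
// Illustrative sketch only -- not the actual Kibana router implementation.
interface EsClientError extends Error {
  statusCode?: number;
}

function toHttpResponse(error: EsClientError): { status: number; body: string } {
  if (error.statusCode === 401) {
    // Unauthorized errors are forwarded so the client can re-authenticate.
    return { status: 401, body: error.message };
  }
  // Every other status code (403, 404, 409, ...) is collapsed into a 500.
  return { status: 500, body: 'An internal server error occurred.' };
}
```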


@ymao1 added labels Feature:Alerting, release_note:skip, Team:ResponseOps, v7.11.0, v8.0.0 on Dec 2, 2020
@ymao1 ymao1 marked this pull request as ready for review December 2, 2020 00:16
@ymao1 ymao1 requested a review from a team as a code owner December 2, 2020 00:16
@elasticmachine (Contributor):

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@ymao1 ymao1 self-assigned this Dec 2, 2020
@YulNaumenko (Contributor) left a comment:

LGTM

@mikecote (Contributor) left a comment:

I have one question on the approach, as it would add ~50 seconds to the test run time.

@@ -301,6 +301,9 @@ export default function createDeleteTests({ getService }: FtrProviderContext) {
.send(getTestAlertData())
.expect(200);

// Delay before performing update to avoid 409 errors
await new Promise((resolve) => setTimeout(resolve, 1000));
Contributor comment on the diff:

I wonder if we should wrap the call below in a retry.try instead of adding timeouts in the tests? That way it can keep trying for a certain amount of time but in most cases finish the test at normal speed. Unless this was a consistent issue?

@ymao1 (Contributor, Author) replied:

Good call. The failures don't happen very often. I've updated the PR to use retry.try. Here are the flaky test runner results after that update: https://kibana-ci.elastic.co/job/kibana+flaky-test-suite-runner/1039/
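A minimal sketch of what the retry.try version might look like in the FTR test, replacing the fixed delay shown in the diff above. The updateAlert() helper is hypothetical and stands in for the existing supertest PUT call; retry.try re-runs the callback until it resolves without throwing or the retry timeout elapses.

```ts
// Inside the existing test, where getService comes from FtrProviderContext (see the diff above).
const retry = getService('retry');

// Replace the fixed 1-second delay with a retry: if the update still hits a
// transient 409 conflict, .expect(200) throws and the block runs again; in the
// common case it passes on the first attempt, so the test keeps normal speed.
await retry.try(async () => {
  await updateAlert().expect(200); // updateAlert() is a hypothetical stand-in for the existing supertest PUT request
});
```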

@ymao1 (Contributor, Author) commented Dec 4, 2020:

@elasticmachine merge upstream

@ymao1 ymao1 requested a review from mikecote December 4, 2020 22:10
@ymao1 (Contributor, Author) commented Dec 7, 2020:

@elasticmachine merge upstream

@mikecote (Contributor) left a comment:

Changes LGTM once the retry.try wrapping is changed (based on our offline discussion, see comment below) 👍

@mikecote (Contributor) left a comment:

Changes LGTM now! Great work

@kibanamachine (Contributor):

💚 Build Succeeded

Metrics [docs]

Distributable file count

id before after diff
default 46922 47682 +760

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@ymao1 ymao1 merged commit 46e8c54 into elastic:master Dec 7, 2020
ymao1 added a commit to ymao1/kibana that referenced this pull request Dec 7, 2020
…x-pack/test/alerting_api_integration/security_and_spaces/tests/alerting - alerting api integration security and spaces enabled Alerts do stuff when AAD is broken (elastic#84707)

* Adding delay between creating and updating alert to avoid 409 conflicts

* Unskipping update test

* Using retry.try instead of delay

* PR fixes

Co-authored-by: Kibana Machine <[email protected]>