-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Security Solution] Unskip flaky tests in Prebuilt Rules FTR Integration tests #173998
Conversation
@banderror I'd like your opinion on this, whenever you can take a look. I wasn't able to reproduce the flake reported in either of the two linked tickets - I've run the flaky test runner with 500 runs for each, and all were green. So the issue has been tough to investigate, but I have two theories/proposed solutions. 1. Cache for ES reading requests needs to be cleared This is what's currently implemented in this PR. I noticed that the issue in both tickets pops up in test suites where we attempt to read the status of the prebuilt rules before and after installing our prebuilt rules. In both cases, we first check that no rules are available for install, then install prebuilt rule assets, and then read the status again to assert that rules are now available for install. I'm not completely sure how cacheing in ES works, but I suspect that in very specific cases (one in thousands) the cache for the request is not completely cleared after the first read, and even though we correctly install the prebuilt rule assets and refresh the indices, the subsequent read operations simply responds the cached value - which is still If that is the case, we can use 2. Implement retry logic for There's a retry tool that is used extensively in the FTR Integration tests that we can make use of in the helpers that install the package. In all cases for these utils, we want to install at least one asset. If no assets are present, that means that the installation failed, and that we should retry it. So we can modify our helper to something like: export const installPrebuiltRulesPackageByVersion = async (
es: Client,
supertest: SuperTest.SuperTest<SuperTest.Test>,
version: string,
retry: RetryService
): Promise<InstallPackageResponse> => {
return retry.try(async () => {
const fleetResponse = await supertest
.post(epmRouteService.getInstallPath('security_detection_engine', version))
.set('kbn-xsrf', 'xxxx')
.set('elastic-api-version', '2023-10-31')
.type('application/json')
.send({ force: true })
.expect(200);
// This new expect clause makes sure that if no assets were installed, then
// then we should retry - as this assertion will fail and cause the whole block to be retried until it passes
expect((fleetResponse.body as InstallPackageResponse).items.length).toBeGreaterThan(0);
await es.indices.refresh({ index: ALL_SAVED_OBJECT_INDICES });
return fleetResponse.body as InstallPackageResponse;
});
}; This is not implemented in this PR; I'd like your opinion first. But my suggestion is: go with option 1 (as the PR stands) and monitor these tests. If they reappear, then the cache was not the problem, and we can try with the retry logic. I can have a PR ready for that case, so we'd only need to review it and merge it as soon (and if) the flake reappears. |
Pinging @elastic/security-detections-response (Team:Detections and Resp) |
@jpdjere Yea, it seems to be a tough one. Flaky test runsFirst, about that.
In I assume this is the same error as in #171428 -- have you got a response from @MadameSheema? Have you reached out to her elsewhere? Secondly, why in 1. Cache for ES reading requests needs to be cleared
This is a good hypothesis and we should take it into account, but it doesn't feel convincing to me from the probabilities perspective. Here there was a second test failure that occurred on the same day as the first one. The probability of that would be 1/1000000 in this case. Another reason why it's not very convincing is that we don't seem to have the same issue with all our other requests and endpoints, e.g. rule management APIs. When we create custom rules in our tests, wouldn't we have the same issue?..
I would be ok to try this if we didn't get the version conflict error in that flaky test run. The tests are still flaky. I think we need to continue investigating this and doing more flaky test runs before we unskip them. 2. Implement retry logic for security_detection_engine package installationThis idea looks more robust to me and I'd suggest to implement it in this PR. A few suggestions:
|
...es/detections_response/utils/rules/prebuilt_rules/get_prebuilt_rules_and_timelines_status.ts
Outdated
Show resolved
Hide resolved
Pinging @elastic/security-solution (Team: SecuritySolution) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Detection engine area LGTM
+++ on exploring option with retries
040880b
to
7de37c5
Compare
@jpdjere The flaky runner doesn't seem to be happy... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM, thanks 👍
Approving the PR to not being a blocker for it. Please look into the failed flaky test runs.
Ah, seems I hadn't pushed before running them. I'll rerun them and merge if they are green. Thanks for the review, feedback and discussion. |
💛 Build succeeded, but was flaky
Failed CI StepsTest Failures
Metrics [docs]Public APIs missing comments
History
To update your PR or re-run it, just comment with: cc @jpdjere |
💔 All backports failed
Manual backportTo create the backport manually run:
Questions ?Please refer to the Backport tool documentation |
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions ?Please refer to the Backport tool documentation |
…ion tests (elastic#173998) **Addresses:** elastic#172107 elastic#171380 ## Summary Unskip skipped tests in: 1. `x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/bundled_prebuilt_rules_package/install_latest_bundled_prebuilt_rules.ts` 2. `x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/management/fleet_integration.ts` - Makes the `retryDelay` in the **RetryService** in `packages/kbn-ftr-common-functional-services/services/retry/retry.ts` a configurable parameter - used in our `retry` util to shorten the wait period to 200ms. - Creates `retry` wrapper util for our FTR Integration tests, that wraps `retry.try` from the **RetryService**, to implement maximum attempts. - Uses this `retry` wrapper in test utils that install the `security_detection_engine` Fleet package, and asserts that rules have been succesfully installed - and retries if they haven't. - Creates `refreshSavedObjectIndices` reusable util that cleans cache and refreshes indices. Centralizes comment spread around file into this single file. - Use this new util to clean the cache in util that install the Fleet package, and utils that read the prebuilt rule status. ## Flaky test runner **Before changes:** - For both `bundled_prebuilt_rules_package` and `management`: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4688 🟢 (250 and 250 runs) - `bundled_prebuilt_rules_package`: - ESS: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4805 (500 runs) - Serverless: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4807 (500 runs) - `management` - ESS: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4806 (500 runs) - Serverless: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4808 (500 runs) **After changes:** - `bundled_prebuilt_rules_package`: - ESS: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4825 🟢 (500 runs) - Serverless: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4826 🟢 (500 runs) - `management` - ESS: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4827 🟢 (500 runs) - Serverless: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4828 🟢 (500 runs) ### For maintainers - [ ] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process) --------- Co-authored-by: kibanamachine <[email protected]> (cherry picked from commit 81d6478) # Conflicts: # x-pack/test/security_solution_api_integration/tsconfig.json
…ntegration tests (#173998) (#174761) # Backport This will backport the following commits from `main` to `8.12`: - [[Security Solution] Unskip flaky tests in Prebuilt Rules FTR Integration tests (#173998)](#173998) <!--- Backport version: 8.9.8 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Juan Pablo Djeredjian","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-01-11T12:35:55Z","message":"[Security Solution] Unskip flaky tests in Prebuilt Rules FTR Integration tests (#173998)\n\n**Addresses:**\r\nhttps://github.com//issues/172107\r\nhttps://github.com//issues/171380\r\n\r\n## Summary\r\n\r\nUnskip skipped tests in:\r\n\r\n1.\r\n`x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/bundled_prebuilt_rules_package/install_latest_bundled_prebuilt_rules.ts`\r\n2.\r\n`x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/management/fleet_integration.ts`\r\n\r\n- Makes the `retryDelay` in the **RetryService** in\r\n`packages/kbn-ftr-common-functional-services/services/retry/retry.ts` a\r\nconfigurable parameter - used in our `retry` util to shorten the wait\r\nperiod to 200ms.\r\n- Creates `retry` wrapper util for our FTR Integration tests, that wraps\r\n`retry.try` from the **RetryService**, to implement maximum attempts.\r\n- Uses this `retry` wrapper in test utils that install the\r\n`security_detection_engine` Fleet package, and asserts that rules have\r\nbeen succesfully installed - and retries if they haven't.\r\n- Creates `refreshSavedObjectIndices` reusable util that cleans cache\r\nand refreshes indices. Centralizes comment spread around file into this\r\nsingle file.\r\n- Use this new util to clean the cache in util that install the Fleet\r\npackage, and utils that read the prebuilt rule status.\r\n\r\n## Flaky test runner\r\n\r\n**Before changes:** \r\n- For both `bundled_prebuilt_rules_package` and `management`:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4688\r\n🟢 (250 and 250 runs)\r\n- `bundled_prebuilt_rules_package`:\r\n- ESS:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4805\r\n(500 runs)\r\n- Serverless:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4807\r\n(500 runs)\r\n- `management` \r\n- ESS:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4806\r\n(500 runs)\r\n- Serverless:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4808\r\n(500 runs)\r\n \r\n**After changes:** \r\n- `bundled_prebuilt_rules_package`:\r\n- ESS:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4825\r\n🟢 (500 runs)\r\n- Serverless:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4826\r\n🟢 (500 runs)\r\n- `management` \r\n- ESS:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4827\r\n🟢 (500 runs)\r\n- Serverless:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4828\r\n🟢 (500 runs)\r\n\r\n### For maintainers\r\n\r\n- [ ] This was checked for breaking API changes and was [labeled\r\nappropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <[email protected]>","sha":"81d6478beedeceedd4ae193c7a5ba0ee874cbf12","branchLabelMapping":{"^v8.13.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["test","release_note:skip","Team:Detections and Resp","Team: SecuritySolution","Team:Detection Rule Management","Feature:Prebuilt Detection Rules","v8.12.0","v8.12.1","v8.13.0"],"number":173998,"url":"https://github.com/elastic/kibana/pull/173998","mergeCommit":{"message":"[Security Solution] Unskip flaky tests in Prebuilt Rules FTR Integration tests (#173998)\n\n**Addresses:**\r\nhttps://github.com//issues/172107\r\nhttps://github.com//issues/171380\r\n\r\n## Summary\r\n\r\nUnskip skipped tests in:\r\n\r\n1.\r\n`x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/bundled_prebuilt_rules_package/install_latest_bundled_prebuilt_rules.ts`\r\n2.\r\n`x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/management/fleet_integration.ts`\r\n\r\n- Makes the `retryDelay` in the **RetryService** in\r\n`packages/kbn-ftr-common-functional-services/services/retry/retry.ts` a\r\nconfigurable parameter - used in our `retry` util to shorten the wait\r\nperiod to 200ms.\r\n- Creates `retry` wrapper util for our FTR Integration tests, that wraps\r\n`retry.try` from the **RetryService**, to implement maximum attempts.\r\n- Uses this `retry` wrapper in test utils that install the\r\n`security_detection_engine` Fleet package, and asserts that rules have\r\nbeen succesfully installed - and retries if they haven't.\r\n- Creates `refreshSavedObjectIndices` reusable util that cleans cache\r\nand refreshes indices. Centralizes comment spread around file into this\r\nsingle file.\r\n- Use this new util to clean the cache in util that install the Fleet\r\npackage, and utils that read the prebuilt rule status.\r\n\r\n## Flaky test runner\r\n\r\n**Before changes:** \r\n- For both `bundled_prebuilt_rules_package` and `management`:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4688\r\n🟢 (250 and 250 runs)\r\n- `bundled_prebuilt_rules_package`:\r\n- ESS:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4805\r\n(500 runs)\r\n- Serverless:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4807\r\n(500 runs)\r\n- `management` \r\n- ESS:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4806\r\n(500 runs)\r\n- Serverless:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4808\r\n(500 runs)\r\n \r\n**After changes:** \r\n- `bundled_prebuilt_rules_package`:\r\n- ESS:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4825\r\n🟢 (500 runs)\r\n- Serverless:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4826\r\n🟢 (500 runs)\r\n- `management` \r\n- ESS:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4827\r\n🟢 (500 runs)\r\n- Serverless:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4828\r\n🟢 (500 runs)\r\n\r\n### For maintainers\r\n\r\n- [ ] This was checked for breaking API changes and was [labeled\r\nappropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <[email protected]>","sha":"81d6478beedeceedd4ae193c7a5ba0ee874cbf12"}},"sourceBranch":"main","suggestedTargetBranches":["8.12"],"targetPullRequestStates":[{"branch":"8.12","label":"v8.12.0","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.13.0","labelRegex":"^v8.13.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/173998","number":173998,"mergeCommit":{"message":"[Security Solution] Unskip flaky tests in Prebuilt Rules FTR Integration tests (#173998)\n\n**Addresses:**\r\nhttps://github.com//issues/172107\r\nhttps://github.com//issues/171380\r\n\r\n## Summary\r\n\r\nUnskip skipped tests in:\r\n\r\n1.\r\n`x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/bundled_prebuilt_rules_package/install_latest_bundled_prebuilt_rules.ts`\r\n2.\r\n`x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/management/fleet_integration.ts`\r\n\r\n- Makes the `retryDelay` in the **RetryService** in\r\n`packages/kbn-ftr-common-functional-services/services/retry/retry.ts` a\r\nconfigurable parameter - used in our `retry` util to shorten the wait\r\nperiod to 200ms.\r\n- Creates `retry` wrapper util for our FTR Integration tests, that wraps\r\n`retry.try` from the **RetryService**, to implement maximum attempts.\r\n- Uses this `retry` wrapper in test utils that install the\r\n`security_detection_engine` Fleet package, and asserts that rules have\r\nbeen succesfully installed - and retries if they haven't.\r\n- Creates `refreshSavedObjectIndices` reusable util that cleans cache\r\nand refreshes indices. Centralizes comment spread around file into this\r\nsingle file.\r\n- Use this new util to clean the cache in util that install the Fleet\r\npackage, and utils that read the prebuilt rule status.\r\n\r\n## Flaky test runner\r\n\r\n**Before changes:** \r\n- For both `bundled_prebuilt_rules_package` and `management`:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4688\r\n🟢 (250 and 250 runs)\r\n- `bundled_prebuilt_rules_package`:\r\n- ESS:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4805\r\n(500 runs)\r\n- Serverless:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4807\r\n(500 runs)\r\n- `management` \r\n- ESS:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4806\r\n(500 runs)\r\n- Serverless:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4808\r\n(500 runs)\r\n \r\n**After changes:** \r\n- `bundled_prebuilt_rules_package`:\r\n- ESS:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4825\r\n🟢 (500 runs)\r\n- Serverless:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4826\r\n🟢 (500 runs)\r\n- `management` \r\n- ESS:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4827\r\n🟢 (500 runs)\r\n- Serverless:\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4828\r\n🟢 (500 runs)\r\n\r\n### For maintainers\r\n\r\n- [ ] This was checked for breaking API changes and was [labeled\r\nappropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <[email protected]>","sha":"81d6478beedeceedd4ae193c7a5ba0ee874cbf12"}}]}] BACKPORT--> --------- Co-authored-by: kibanamachine <[email protected]>
…hod from retryService (#178515) Related to #176401, #175776 ## Summary This PR: - Improves logging (I've added debug logs to the helpers that does an API request such as creating a data view) - Uses retryService instead of pRetry - In case of throwing an error in pRetry, when we have 10 retries, it does not log the retry attempts and we end up in the situation that is mentioned in this [comment, item 3](#176401 (comment)) |Before|After| |---|---| |![image](https://github.com/elastic/kibana/assets/12370520/576146f2-09da-4221-a570-6d47e047f229)|![image](https://github.com/elastic/kibana/assets/12370520/0a0897a3-0bd3-4d44-9b79-8f99fb580b4a)| - Attempts to fix flakiness in rate reason message due to having different data ![image](https://github.com/elastic/kibana/assets/12370520/dff48ac1-a9bf-4b93-addb-fd40acae382e) ### Flaky test runner #### Current (after adding refresh index and adjusting timeout) - [25] https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/5463 ✅ - [200] https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/5465 ✅ #### Old - [25] https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/5452 ✅ - [200] https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/5454 [1 Failed : 25 Canceled: 174 Passed ] ##### After checking data is generated in metric threshold - [25] https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/5460 ✅ - [200] https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/5462 [1 Failed : 199 Canceled ] Inspired by #173998, special thanks to @jpdjere and @dmlemeshko for their support and knowledge sharing.
## Summary I took the retry wrapper created by [jpdjere](https://github.com/jpdjere) in #173998 and extended `retryForSuccess<T>` with required capabilities to provide the same functionality. This PR: 1) extends retry service with new function `tryWithRetries<T> => : Promise<T>` to retry block `options.retryCount` times within `options.timeout` period and return block result ```ts const response = await retry.tryWithRetries<SearchResponse>( 'search request', async () => { const response = await supertest .post(`/internal/search/es`) .set(ELASTIC_HTTP_VERSION_HEADER, '1') .send({ params: { body: { query: { match_all: {}, }, }, }, }) .expect(200); return response.body as SearchResponse; }, { retryCount: 4, retryDelay: 100, // optional timeout: 30000, // optional } ``` 2) removes `utils/retry.ts` wrapper and migrate tests to FTR Retry service 3) Adds descriptions to Retry service functions explaining the default use case How the failures look like: - when reached timeout before retry count limit ``` Error: retry.tryWithRetries reached timeout 11000 ms waiting for 'run console request' Error: expected '# Click the Variables button, above, to create your own variables.\nGET ${exampleVariable1} // _search\n{\n "query": {\n "${exampleVariable2}": {} // match_all\n }\n}' to sort of equal '5' at Assertion.assert (expect.js:100:11) at Assertion.eql (expect.js:244:8) at _console.ts:47:41 at processTicksAndRejections (node:internal/process/task_queues:95:5) at runAttempt (retry_with_retries.ts:29:15) at retryWithRetries (retry_with_retries.ts:97:21) at RetryService.tryForTime (retry.ts:38:12) at Context.<anonymous> (_console.ts:44:7) at Object.apply (wrap_function.js:73:16) ``` - when reached retry count limit before timeout ``` Error: retry.tryWithRetries reached the limit of attempts waiting for 'run console request': 2 out of 2 Error: expected '# Click the Variables button, above, to create your own variables.\nGET ${exampleVariable1} // _search\n{\n "query": {\n "${exampleVariable2}": {} // match_all\n }\n}' to sort of equal '5' at Assertion.assert (expect.js:100:11) at Assertion.eql (expect.js:244:8) at _console.ts:47:41 at processTicksAndRejections (node:internal/process/task_queues:95:5) at runAttempt (retry_for_success.ts:29:15) at retryForSuccess (retry_for_success.ts:97:21) at RetryService.tryWithRetries (retry.ts:115:12) at Context.<anonymous> (_console.ts:44:7) at Object.apply (wrap_function.js:73:16) ```
…8660) ## Summary I took the retry wrapper created by [jpdjere](https://github.com/jpdjere) in elastic#173998 and extended `retryForSuccess<T>` with required capabilities to provide the same functionality. This PR: 1) extends retry service with new function `tryWithRetries<T> => : Promise<T>` to retry block `options.retryCount` times within `options.timeout` period and return block result ```ts const response = await retry.tryWithRetries<SearchResponse>( 'search request', async () => { const response = await supertest .post(`/internal/search/es`) .set(ELASTIC_HTTP_VERSION_HEADER, '1') .send({ params: { body: { query: { match_all: {}, }, }, }, }) .expect(200); return response.body as SearchResponse; }, { retryCount: 4, retryDelay: 100, // optional timeout: 30000, // optional } ``` 2) removes `utils/retry.ts` wrapper and migrate tests to FTR Retry service 3) Adds descriptions to Retry service functions explaining the default use case How the failures look like: - when reached timeout before retry count limit ``` Error: retry.tryWithRetries reached timeout 11000 ms waiting for 'run console request' Error: expected '# Click the Variables button, above, to create your own variables.\nGET ${exampleVariable1} // _search\n{\n "query": {\n "${exampleVariable2}": {} // match_all\n }\n}' to sort of equal '5' at Assertion.assert (expect.js:100:11) at Assertion.eql (expect.js:244:8) at _console.ts:47:41 at processTicksAndRejections (node:internal/process/task_queues:95:5) at runAttempt (retry_with_retries.ts:29:15) at retryWithRetries (retry_with_retries.ts:97:21) at RetryService.tryForTime (retry.ts:38:12) at Context.<anonymous> (_console.ts:44:7) at Object.apply (wrap_function.js:73:16) ``` - when reached retry count limit before timeout ``` Error: retry.tryWithRetries reached the limit of attempts waiting for 'run console request': 2 out of 2 Error: expected '# Click the Variables button, above, to create your own variables.\nGET ${exampleVariable1} // _search\n{\n "query": {\n "${exampleVariable2}": {} // match_all\n }\n}' to sort of equal '5' at Assertion.assert (expect.js:100:11) at Assertion.eql (expect.js:244:8) at _console.ts:47:41 at processTicksAndRejections (node:internal/process/task_queues:95:5) at runAttempt (retry_for_success.ts:29:15) at retryForSuccess (retry_for_success.ts:97:21) at RetryService.tryWithRetries (retry.ts:115:12) at Context.<anonymous> (_console.ts:44:7) at Object.apply (wrap_function.js:73:16) ``` (cherry picked from commit 277b3fb)
…8660) (#178978) # Backport This will backport the following commits from `main` to `8.13`: - [[kbn-ftr-common-functional-services] extend retry service (#178660)](#178660) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Dzmitry Lemechko","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-03-19T15:32:59Z","message":"[kbn-ftr-common-functional-services] extend retry service (#178660)\n\n## Summary\r\n\r\nI took the retry wrapper created by\r\n[jpdjere](https://github.com/jpdjere) in #173998 and extended\r\n`retryForSuccess<T>` with required capabilities to provide the same\r\nfunctionality.\r\n\r\nThis PR:\r\n1) extends retry service with new function `tryWithRetries<T> => :\r\nPromise<T>` to retry block `options.retryCount` times within\r\n`options.timeout` period and return block result\r\n\r\n```ts\r\n const response = await retry.tryWithRetries<SearchResponse>(\r\n 'search request',\r\n async () => {\r\n const response = await supertest\r\n .post(`/internal/search/es`)\r\n .set(ELASTIC_HTTP_VERSION_HEADER, '1')\r\n .send({\r\n params: {\r\n body: {\r\n query: {\r\n match_all: {},\r\n },\r\n },\r\n },\r\n })\r\n .expect(200);\r\n return response.body as SearchResponse;\r\n },\r\n {\r\n retryCount: 4,\r\n retryDelay: 100, // optional\r\n timeout: 30000, // optional\r\n }\r\n```\r\n\r\n2) removes `utils/retry.ts` wrapper and migrate tests to FTR Retry\r\nservice\r\n3) Adds descriptions to Retry service functions explaining the default\r\nuse case\r\n\r\nHow the failures look like:\r\n\r\n- when reached timeout before retry count limit \r\n```\r\n Error: retry.tryWithRetries reached timeout 11000 ms waiting for 'run console request'\r\n Error: expected '# Click the Variables button, above, to create your own variables.\\nGET ${exampleVariable1} // _search\\n{\\n \"query\": {\\n \"${exampleVariable2}\": {} // match_all\\n }\\n}' to sort of equal '5'\r\n at Assertion.assert (expect.js:100:11)\r\n at Assertion.eql (expect.js:244:8)\r\n at _console.ts:47:41\r\n at processTicksAndRejections (node:internal/process/task_queues:95:5)\r\n at runAttempt (retry_with_retries.ts:29:15)\r\n at retryWithRetries (retry_with_retries.ts:97:21)\r\n at RetryService.tryForTime (retry.ts:38:12)\r\n at Context.<anonymous> (_console.ts:44:7)\r\n at Object.apply (wrap_function.js:73:16)\r\n```\r\n- when reached retry count limit before timeout\r\n```\r\n Error: retry.tryWithRetries reached the limit of attempts waiting for 'run console request': 2 out of 2\r\n Error: expected '# Click the Variables button, above, to create your own variables.\\nGET ${exampleVariable1} // _search\\n{\\n \"query\": {\\n \"${exampleVariable2}\": {} // match_all\\n }\\n}' to sort of equal '5'\r\n at Assertion.assert (expect.js:100:11)\r\n at Assertion.eql (expect.js:244:8)\r\n at _console.ts:47:41\r\n at processTicksAndRejections (node:internal/process/task_queues:95:5)\r\n at runAttempt (retry_for_success.ts:29:15)\r\n at retryForSuccess (retry_for_success.ts:97:21)\r\n at RetryService.tryWithRetries (retry.ts:115:12)\r\n at Context.<anonymous> (_console.ts:44:7)\r\n at Object.apply (wrap_function.js:73:16)\r\n```","sha":"277b3fbc24889bedd512f23674e768d2f4c43294","branchLabelMapping":{"^v8.14.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","FTR","v8.14.0","v8.13.1"],"title":"[kbn-ftr-common-functional-services] extend retry service","number":178660,"url":"https://github.com/elastic/kibana/pull/178660","mergeCommit":{"message":"[kbn-ftr-common-functional-services] extend retry service (#178660)\n\n## Summary\r\n\r\nI took the retry wrapper created by\r\n[jpdjere](https://github.com/jpdjere) in #173998 and extended\r\n`retryForSuccess<T>` with required capabilities to provide the same\r\nfunctionality.\r\n\r\nThis PR:\r\n1) extends retry service with new function `tryWithRetries<T> => :\r\nPromise<T>` to retry block `options.retryCount` times within\r\n`options.timeout` period and return block result\r\n\r\n```ts\r\n const response = await retry.tryWithRetries<SearchResponse>(\r\n 'search request',\r\n async () => {\r\n const response = await supertest\r\n .post(`/internal/search/es`)\r\n .set(ELASTIC_HTTP_VERSION_HEADER, '1')\r\n .send({\r\n params: {\r\n body: {\r\n query: {\r\n match_all: {},\r\n },\r\n },\r\n },\r\n })\r\n .expect(200);\r\n return response.body as SearchResponse;\r\n },\r\n {\r\n retryCount: 4,\r\n retryDelay: 100, // optional\r\n timeout: 30000, // optional\r\n }\r\n```\r\n\r\n2) removes `utils/retry.ts` wrapper and migrate tests to FTR Retry\r\nservice\r\n3) Adds descriptions to Retry service functions explaining the default\r\nuse case\r\n\r\nHow the failures look like:\r\n\r\n- when reached timeout before retry count limit \r\n```\r\n Error: retry.tryWithRetries reached timeout 11000 ms waiting for 'run console request'\r\n Error: expected '# Click the Variables button, above, to create your own variables.\\nGET ${exampleVariable1} // _search\\n{\\n \"query\": {\\n \"${exampleVariable2}\": {} // match_all\\n }\\n}' to sort of equal '5'\r\n at Assertion.assert (expect.js:100:11)\r\n at Assertion.eql (expect.js:244:8)\r\n at _console.ts:47:41\r\n at processTicksAndRejections (node:internal/process/task_queues:95:5)\r\n at runAttempt (retry_with_retries.ts:29:15)\r\n at retryWithRetries (retry_with_retries.ts:97:21)\r\n at RetryService.tryForTime (retry.ts:38:12)\r\n at Context.<anonymous> (_console.ts:44:7)\r\n at Object.apply (wrap_function.js:73:16)\r\n```\r\n- when reached retry count limit before timeout\r\n```\r\n Error: retry.tryWithRetries reached the limit of attempts waiting for 'run console request': 2 out of 2\r\n Error: expected '# Click the Variables button, above, to create your own variables.\\nGET ${exampleVariable1} // _search\\n{\\n \"query\": {\\n \"${exampleVariable2}\": {} // match_all\\n }\\n}' to sort of equal '5'\r\n at Assertion.assert (expect.js:100:11)\r\n at Assertion.eql (expect.js:244:8)\r\n at _console.ts:47:41\r\n at processTicksAndRejections (node:internal/process/task_queues:95:5)\r\n at runAttempt (retry_for_success.ts:29:15)\r\n at retryForSuccess (retry_for_success.ts:97:21)\r\n at RetryService.tryWithRetries (retry.ts:115:12)\r\n at Context.<anonymous> (_console.ts:44:7)\r\n at Object.apply (wrap_function.js:73:16)\r\n```","sha":"277b3fbc24889bedd512f23674e768d2f4c43294"}},"sourceBranch":"main","suggestedTargetBranches":["8.13"],"targetPullRequestStates":[{"branch":"main","label":"v8.14.0","branchLabelMappingKey":"^v8.14.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/178660","number":178660,"mergeCommit":{"message":"[kbn-ftr-common-functional-services] extend retry service (#178660)\n\n## Summary\r\n\r\nI took the retry wrapper created by\r\n[jpdjere](https://github.com/jpdjere) in #173998 and extended\r\n`retryForSuccess<T>` with required capabilities to provide the same\r\nfunctionality.\r\n\r\nThis PR:\r\n1) extends retry service with new function `tryWithRetries<T> => :\r\nPromise<T>` to retry block `options.retryCount` times within\r\n`options.timeout` period and return block result\r\n\r\n```ts\r\n const response = await retry.tryWithRetries<SearchResponse>(\r\n 'search request',\r\n async () => {\r\n const response = await supertest\r\n .post(`/internal/search/es`)\r\n .set(ELASTIC_HTTP_VERSION_HEADER, '1')\r\n .send({\r\n params: {\r\n body: {\r\n query: {\r\n match_all: {},\r\n },\r\n },\r\n },\r\n })\r\n .expect(200);\r\n return response.body as SearchResponse;\r\n },\r\n {\r\n retryCount: 4,\r\n retryDelay: 100, // optional\r\n timeout: 30000, // optional\r\n }\r\n```\r\n\r\n2) removes `utils/retry.ts` wrapper and migrate tests to FTR Retry\r\nservice\r\n3) Adds descriptions to Retry service functions explaining the default\r\nuse case\r\n\r\nHow the failures look like:\r\n\r\n- when reached timeout before retry count limit \r\n```\r\n Error: retry.tryWithRetries reached timeout 11000 ms waiting for 'run console request'\r\n Error: expected '# Click the Variables button, above, to create your own variables.\\nGET ${exampleVariable1} // _search\\n{\\n \"query\": {\\n \"${exampleVariable2}\": {} // match_all\\n }\\n}' to sort of equal '5'\r\n at Assertion.assert (expect.js:100:11)\r\n at Assertion.eql (expect.js:244:8)\r\n at _console.ts:47:41\r\n at processTicksAndRejections (node:internal/process/task_queues:95:5)\r\n at runAttempt (retry_with_retries.ts:29:15)\r\n at retryWithRetries (retry_with_retries.ts:97:21)\r\n at RetryService.tryForTime (retry.ts:38:12)\r\n at Context.<anonymous> (_console.ts:44:7)\r\n at Object.apply (wrap_function.js:73:16)\r\n```\r\n- when reached retry count limit before timeout\r\n```\r\n Error: retry.tryWithRetries reached the limit of attempts waiting for 'run console request': 2 out of 2\r\n Error: expected '# Click the Variables button, above, to create your own variables.\\nGET ${exampleVariable1} // _search\\n{\\n \"query\": {\\n \"${exampleVariable2}\": {} // match_all\\n }\\n}' to sort of equal '5'\r\n at Assertion.assert (expect.js:100:11)\r\n at Assertion.eql (expect.js:244:8)\r\n at _console.ts:47:41\r\n at processTicksAndRejections (node:internal/process/task_queues:95:5)\r\n at runAttempt (retry_for_success.ts:29:15)\r\n at retryForSuccess (retry_for_success.ts:97:21)\r\n at RetryService.tryWithRetries (retry.ts:115:12)\r\n at Context.<anonymous> (_console.ts:44:7)\r\n at Object.apply (wrap_function.js:73:16)\r\n```","sha":"277b3fbc24889bedd512f23674e768d2f4c43294"}},{"branch":"8.13","label":"v8.13.1","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"}]}] BACKPORT--> Co-authored-by: Dzmitry Lemechko <[email protected]>
Addresses:
#172107
#171380
Summary
Unskip skipped tests in:
x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/bundled_prebuilt_rules_package/install_latest_bundled_prebuilt_rules.ts
x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/management/fleet_integration.ts
retryDelay
in the RetryService inpackages/kbn-ftr-common-functional-services/services/retry/retry.ts
a configurable parameter - used in ourretry
util to shorten the wait period to 200ms.retry
wrapper util for our FTR Integration tests, that wrapsretry.try
from the RetryService, to implement maximum attempts.retry
wrapper in test utils that install thesecurity_detection_engine
Fleet package, and asserts that rules have been succesfully installed - and retries if they haven't.refreshSavedObjectIndices
reusable util that cleans cache and refreshes indices. Centralizes comment spread around file into this single file.Flaky test runner
Before changes:
bundled_prebuilt_rules_package
andmanagement
: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4688 🟢 (250 and 250 runs)bundled_prebuilt_rules_package
:management
After changes:
bundled_prebuilt_rules_package
:management
For maintainers