[Security Solution] Improve logging for FTR test `retry` function #176316
Pinging @elastic/security-detections-response (Team:Detections and Resp)
Pinging @elastic/security-solution (Team: SecuritySolution)
Pinging @elastic/security-detection-rule-management (Team:Detection Rule Management)
buildkite test this
@jpdjere Is this a real error you managed to get from running these tests, or just a simulated one for checking how the code works now?
It's a simulated error; I wanted to be 100% sure that the errors thrown in the actual test get logged in CI builds. It's from this run.
💚 Build Succeeded
cc @jpdjere
LGTM
Flaky test runs were not able to reproduce the error reported. Merging this PR to add additional logging, and we will keep an eye on these tests to understand why they fail, if they do.
…astic#176316)

## Summary

**Fixes:**
- elastic#175481
- elastic#175250

### Description

Improves logging for the `retry` FTR integration testing utility, which is used to wrap helpers that make endpoint calls or perform direct Elasticsearch operations.

The previous logging only reported that the maximum number of retries had been reached; the actual error thrown in the test was lost, which made failures hard to debug.

These changes catch the error and log it, allowing us to understand why a retried test failed.

The error is now reported as:

```
[00:00:19] │ERROR Retrying installPrebuiltRulesPackageByVersion: Error: expected 500 "Internal Server Error", got 200 "OK"
[00:00:19] │ debg --- retry.tryForTime failed again with the same message...
[00:00:19] │ERROR Reached maximum number of retries for test: 2/2
[00:00:19] └- ✖ fail: Rules Management - Prebuilt Rules - Update Prebuilt Rules Package @ess @serverless @skipInQA update_prebuilt_rules_package should allow user to install prebuilt rules from scratch, then install new rules and upgrade existing rules from the new package
[00:00:19] │ Error: "Reached maximum number of retries for test: 2/2"
[00:00:19] │ at block (retry.ts:72:16)
[00:00:19] │ at runAttempt (retry_for_success.ts:29:21)
[00:00:19] │ at retryForSuccess (retry_for_success.ts:79:27)
[00:00:19] │ at RetryService.tryForTime (retry.ts:23:12)
[00:00:19] │ at retry (retry.ts:62:20)
[00:00:19] │ at installPrebuiltRulesPackageByVersion (install_fleet_package_by_url.ts:77:25)
[00:00:19] │ at Context.<anonymous> (update_prebuilt_rules_package.ts:106:46)
[00:00:19] │ at Object.apply (wrap_function.js:73:16)
```

The main error is still `"Reached maximum number of retries for test: 2/2"`, but exactly **what failed in the test** is now also logged at error level, as seen above: `ERROR Retrying installPrebuiltRulesPackageByVersion: Error: expected 500 "Internal Server Error", got 200 "OK"`

**Flaky test run:**
- Shared 50 and 50: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/5068
- Ess 100 runs: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/5091
- Serverless 100 runs: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/5092

### For maintainers

- [ ] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

(cherry picked from commit 479a022)
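The catch-and-log behaviour described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Logger` interface, the `RetryOptions` shape and the `retryWithLogging` name are all hypothetical, and the sketch uses a plain loop rather than delegating to `RetryService.tryForTime` as the stack trace above suggests the real utility does. It only shows the idea of logging each underlying error before the final "maximum number of retries" error is thrown.

```typescript
// Hypothetical minimal logger; the real FTR logger is assumed to offer more than this.
interface Logger {
  error(message: string): void;
}

// Hypothetical options shape, named here for illustration only.
interface RetryOptions<T> {
  test: () => Promise<T>; // the operation to attempt (e.g. an endpoint call)
  testName: string;       // label used in the log lines
  retries: number;        // maximum number of attempts
  log: Logger;
}

// Runs `test` up to `retries` times. Every failure is logged with the
// underlying error, so the cause is not lost when the retries run out.
async function retryWithLogging<T>({
  test,
  testName,
  retries,
  log,
}: RetryOptions<T>): Promise<T> {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await test();
    } catch (err) {
      // Surface the error from the wrapped operation instead of swallowing it.
      log.error(`Retrying ${testName}: ${String(err)}`);
    }
  }

  // All attempts failed: the final error stays generic, as in the PR description,
  // but the log above it now explains what actually went wrong.
  const message = `Reached maximum number of retries for test: ${retries}/${retries}`;
  log.error(message);
  throw new Error(message);
}
```

With a logger that writes to the console, a wrapped operation that always throws `new Error('expected 500 "Internal Server Error", got 200 "OK"')` would produce one `Retrying <testName>: Error: ...` line per attempt, followed by the final "Reached maximum number of retries" message.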
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI.
Questions? Please refer to the Backport tool documentation.
…x60; function (#176316) (#176497)

# Backport

This will backport the following commits from `main` to `8.12`:
- [[Security Solution] Improve logging for FTR test `retry` function (#176316)](#176316)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Source commit: 479a022bd3a8ae79ca9af1eb12a90a26cb53efdf, committed 2024-02-08T12:35:54Z by Juan Pablo Djeredjian. Source PR: https://github.com/elastic/kibana/pull/176316 (merged to `main`, label `v8.13.0`; target branch `8.12`, label `v8.12.2`).

Co-authored-by: Juan Pablo Djeredjian <[email protected]>