-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ResponseOps] use Data Streams for AAD indices in serverless #160572
[ResponseOps] use Data Streams for AAD indices in serverless #160572
Conversation
1722158
to
1591879
Compare
64625ec
to
986774c
Compare
695a5a2
to
0060185
Compare
0060185
to
259f517
Compare
4df6539
to
11d2667
Compare
74421c2
to
c7c0886
Compare
Changes the way the alerts-as-data (AAD) indices are created and written to, to allow them to be built as they have been in the past (alias and backing indices created manually) OR as an ES Data Stream. Serverless will use Data Streams, other environments will use the existing alias and backing indices. The configuration `xpack.alerting.useDataStreamForAlerts` determines which to use, and is false by default. This configuration is not intended to be used by customers, as it when used in the wrong environment, will cause failures creating and writing to the alert indices. PR elastic#160572 - fix jest test - start integrating bits of the dataStreamAdapter - fix tests - fix function tests - fix FT - change the rule registry writer - fix FT - start using the new ds adapter to create alias indices - add create data stream logic - use alias by default createConcreteWriteIndex() as it's being called by other plugins now - start adapting jest tests to test datastream alerts as well - extend create_concrete_write_index jest tests for data stream - rebase, fix merge conflicts - fix jest test after merge conflict - convert the RR trial test to use data streams - fix jest test, remove ILM for datastream, run serverless o11y test with ds, add rr test with ds - fix jest and function tests - fix jest tests
8ad645e
to
fa31bf0
Compare
4007f39
to
b7e0206
Compare
d361034
to
bf59bb3
Compare
@elasticmachine merge upstream I think this should clear the failing FTR test via PR 165103 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
security-engineering-productivity changes LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for the Threat Hunting Investigations team
elastic#164988) ## Summary Needed for elastic#160572 (comment) (cherry picked from commit f8ad13a) # Conflicts: # x-pack/test/security_solution_cypress/cypress/e2e/exceptions/rule_details_flow/add_edit_exception.cy.ts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DE changes LGTM
// https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update-by-query.html#_refreshing_shards_2 | ||
// Note: Before we tried to use "refresh: wait_for" but I do not think that was available and instead it defaulted to "refresh: true" | ||
// but the tests do not pass with "refresh: false". If at some point a "refresh: wait_for" is implemented, we should use that instead. | ||
refresh: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is allowing refresh:true
something we want to do?
I remember we did some work to avoid explicitly refreshing indices: https://github.com/elastic/security-team/issues/6561
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In similar work, done in #162508, I had the same issue, so had add retry calls to wait until item reindexed and only after this finish request.
It worth to note though, updates in lists/items are not available in UI, only through API calls, so we do not expect them to be called frequently, especially in Serverless. Only list delete available in UI, which is also not expected to be frequent operation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review @vitaliidm, I took it from here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tested it locally and it seems false
works fine, let's see if CI agrees 🤞
// https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update-by-query.html#_refreshing_shards_2 | ||
// Note: Before we tried to use "refresh: wait_for" but I do not think that was available and instead it defaulted to "refresh: true" | ||
// but the tests do not pass with "refresh: false". If at some point a "refresh: wait_for" is implemented, we should use that instead. | ||
refresh: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
is allowing refresh:true something we want to do?
I remember we did some work to avoid explicitly refreshing indices: https://github.com/elastic/security-team/issues/6561
💔 Build Failed
Failed CI StepsMetrics [docs]Public APIs missing comments
Async chunks
Canvas Sharable Runtime
Public APIs missing exports
Unknown metric groupsAPI count
ESLint disabled line counts
Total ESLint disabled count
History
To update your PR or re-run it, just comment with: |
…ver.load (#164988) (#165203) # Backport This will backport the following commits from `main` to `8.10`: - [[security_solution_cypress] Add support for options in EsArchiver.load (#164988)](#164988) <!--- Backport version: 8.9.8 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Patryk Kopyciński","email":"[email protected]"},"sourceCommit":{"committedDate":"2023-08-28T18:21:43Z","message":"[security_solution_cypress] Add support for options in EsArchiver.load (#164988)\n\n## Summary\r\n\r\nNeeded for\r\nhttps://github.com//pull/160572#issuecomment-1695775316","sha":"f8ad13ae1f078aa5f0cb6e1190e3c167dec168c8","branchLabelMapping":{"^v8.11.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","backport:prev-minor","v8.11.0"],"number":164988,"url":"https://github.com/elastic/kibana/pull/164988","mergeCommit":{"message":"[security_solution_cypress] Add support for options in EsArchiver.load (#164988)\n\n## Summary\r\n\r\nNeeded for\r\nhttps://github.com//pull/160572#issuecomment-1695775316","sha":"f8ad13ae1f078aa5f0cb6e1190e3c167dec168c8"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v8.11.0","labelRegex":"^v8.11.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/164988","number":164988,"mergeCommit":{"message":"[security_solution_cypress] Add support for options in EsArchiver.load (#164988)\n\n## Summary\r\n\r\nNeeded for\r\nhttps://github.com//pull/160572#issuecomment-1695775316","sha":"f8ad13ae1f078aa5f0cb6e1190e3c167dec168c8"}}]}] BACKPORT-->
…#160572) resolves elastic#154266 Changes the way the alerts-as-data (AAD) indices are created and written to, to allow them to be built as they have been in the past (alias and backing indices created manually) OR as an ES Data Stream. Serverless will use Data Streams, other environments will use the existing alias and backing indices. The determination is made by optionally including the `serverless` plugin, and determining if it's available. The implementation is organized around a `DataStreamAdapter` object, which is instantiated with a "data stream" or "alias" flavor, and then it handles the specialized behavior. Currently, a lot of the smaller implementation bits, like setting property values in ES calls, is done via in-line boolean checks of that object, as to whether data streams or aliases are being used. This could probably be cleaned up some. Existing non-serverless function tests are largely unchanged, as they can't test the new data stream path. Some tests have been added to the serverless function tests, to test basic reading / writing via updated alert documents. ## DEFER - more serverless AaD tests - elastic#158403 - this issue is more noticeable now that we HAVE to do OCC with data streams, so we get errors instead of simply overwriting documents (which is also bad) Co-authored-by: Patryk Kopycinski <[email protected]>
resolves: #190376 In PR #160572, we changed from using just the bulk op `index` to using `create` when new alerts are being created. Unfortunately, the code to handle the bulk responses didn't take into account that the bulk responses for `create`s need different handling than `index`s. Specifically, conflicts for `create` were being treated as errors. This PR changes the processing to consider additional ops besides just `index`.
resolves: elastic#190376 In PR elastic#160572, we changed from using just the bulk op `index` to using `create` when new alerts are being created. Unfortunately, the code to handle the bulk responses didn't take into account that the bulk responses for `create`s need different handling than `index`s. Specifically, conflicts for `create` were being treated as errors. This PR changes the processing to consider additional ops besides just `index`.
resolves: elastic#190376 In PR elastic#160572, we changed from using just the bulk op `index` to using `create` when new alerts are being created. Unfortunately, the code to handle the bulk responses didn't take into account that the bulk responses for `create`s need different handling than `index`s. Specifically, conflicts for `create` were being treated as errors. This PR changes the processing to consider additional ops besides just `index`.
resolves #154266
Changes the way the alerts-as-data (AAD) indices are created and written to, to allow them to be built as they have been in the past (alias and backing indices created manually) OR as an ES Data Stream.
Serverless will use Data Streams, other environments will use the existing alias and backing indices. The determination is made by optionally including the
serverless
plugin, and determining if it's available.The implementation is organized around a
DataStreamAdapter
object, which is instantiated with a "data stream" or "alias" flavor, and then it handles the specialized behavior. Currently, a lot of the smaller implementation bits, like setting property values in ES calls, is done via in-line boolean checks of that object, as to whether data streams or aliases are being used. This could probably be cleaned up some.Existing non-serverless function tests are largely unchanged, as they can't test the new data stream path. Some tests have been added to the serverless function tests, to test basic reading / writing via updated alert documents.
DEFER
more serverless AaD tests
potentially re-org
alerts_service
tests; currently this module makes no direct ES calls, but it's jest tests are quite large and mock a LOT of ES calls. The calls are now indirect in helpers. Almost feels like the code was re-org'd to add the helpers, but the tests weren't. Would likely make the tests more understandable with a re-org - spent lots of time in this test module[Response Ops][Alerting] Discuss how to handle possible conflicts when updating alerts-as-data documents #158403 - this issue is more noticeable now that we HAVE to do OCC with data streams, so we get errors instead of simply overwriting documents (which is also bad)
Checklist
Delete any items that are not applicable to this PR.