[RAC] Alerts as Data Bulk Insert #93730
Comments
Pinging @elastic/security-detections-response (Team:Detections and Resp)
Pinging @elastic/kibana-alerting-services (Team:Alerting Services)
Pinging @elastic/security-solution (Team: SecuritySolution)
This may not be relevant, but thought I'd mention it.

For the current event log, we queue up events to write, in bulk, async. However, the public interface is a simple synchronous call. We've always treated the event log data as "not critical": for the cases where we query it in alerting (to show alert details), we handle cases where data is missing for some reason. Beyond bugs, timing issues, etc., that reason could be that the data got ILM'd away in a delete phase.

I'm guessing we want to make the "alerts as data" a little more "critical" :-), but I'm not quite sure what that means, because if we buffer the data and then Kibana crashes with an OOM or SIGSEGV, you're going to lose that last set of buffered data.

All that code (there's not much) to deal with the event log buffering is here: https://github.com/elastic/kibana/blob/master/x-pack/plugins/event_log/server/es/cluster_client_adapter.ts#L52-L106
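To make the trade-off above concrete, here is a minimal sketch of a synchronous `log()` call in front of an async, buffered bulk write. It is not the event log's actual code: `BufferedWriter`, `BulkSink`, `maxBatchSize`, `flushIntervalMs` and `onError` are all hypothetical names, and the sink stands in for whatever performs the Elasticsearch bulk request.

```typescript
// Hypothetical sink: in practice this would wrap an Elasticsearch bulk request.
type BulkSink<T> = (docs: T[]) => Promise<void>;

class BufferedWriter<T> {
  private buffer: T[] = [];
  private timer: ReturnType<typeof setTimeout> | undefined;

  constructor(
    private readonly sink: BulkSink<T>,
    private readonly maxBatchSize = 100,
    private readonly flushIntervalMs = 1000,
    // Called with the batch that could not be written, so callers can decide what to do.
    private readonly onError: (err: unknown, lost: T[]) => void = () => {}
  ) {}

  // Synchronous public interface: callers never wait on the write.
  log(doc: T): void {
    this.buffer.push(doc);
    if (this.buffer.length >= this.maxBatchSize) {
      void this.flush();
    } else if (this.timer === undefined) {
      this.timer = setTimeout(() => void this.flush(), this.flushIntervalMs);
    }
  }

  // Number of documents held only in memory. These are lost if the process dies.
  get pending(): number {
    return this.buffer.length;
  }

  // Hands the current buffer to the sink as one batch.
  async flush(): Promise<void> {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    try {
      await this.sink(batch);
    } catch (err) {
      this.onError(err, batch);
    }
  }
}
```

Whatever `pending` reports at the moment of an OOM or SIGSEGV is the data that goes missing, so making alerts "more critical" in this shape mostly means shrinking the batch size or interval, or not buffering at all.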
**Needed for:** rule execution log for Security #94143

**Related to:**

- alerts-as-data: #93728, #93729, #93730
- RFC for index naming #98912

## Summary

This PR adds a mechanism for writing to / reading from / bootstrapping indices for RAC project into the `rule_registry` plugin. Particularly, indices for alerts-as-data and rule execution events.

This implementation is similar to existing implementations like `event_log` plugin (see #98353 (comment) for historical perspective), but we're going to converge all of them into 1 or 2 implementations. At least we should have a single one in `rule_registry` itself.

In this PR I tried to incorporate most of the feedback received in the RFC (#98912), but if you notice I missed/forgot something, please let me know in the comments.

Done in this PR:

- [x] Schema-agnostic APIs for working with Elasticsearch.
- [x] Schema-aware log definition and bootstrapping API (creating hierarchical logs).
- [x] Schema-aware write API (logging events).
- [x] Schema-aware read API (searching logs, filtering, sorting, pagination, aggregation).
- [x] Support for Kibana spaces, space-aware index bootstrapping (either at rule creation or rule execution time).

As for reviewing this PR, perhaps it might be easier to start with:

- checking description of #98912
- checking usage examples https://github.com/elastic/kibana/pull/98353/files#diff-c049ff2198cc69bd50a69e92d29e88da7e10b9a152bdaceaf3d41826e712c12b
- checking public api https://github.com/elastic/kibana/pull/98353/files#diff-8e9ef0dbcbc60b1861d492a03865b2ae76a56ec38ada61898c991d3a74bd6268

## Next steps

Next steps towards rule execution log in Security (#94143):

- define actual schema for rule execution events
- inject instance of rule execution log into Security rule executors and route handlers
- implement actual execution logging in rule executors
- update route handlers to start fetching execution events and metrics from the log instead of custom saved objects

Next steps in the context of RAC and unified implementation:

- converge this implementation with `RuleDataService` implementation
- implement robust index bootstrapping
- reconsider using FieldMap as a generic type parameter
- implement validation for documents being indexed
- cover the final implementation with tests
- write comprehensive docs: update plugin README, add JSDoc comments to all public interfaces
Closing in favor of the Alerts as Data RFC doc.
This issue is for discussing the architecture/implementation for writing out Alerts as Data.
Relevant implementations from within detections include:
And a plethora of other utilities and helpers found in:
kibana/x-pack/plugins/security_solution/server/lib/detection_engine/signals/
The above implementations are spread out because they cover many complex use cases, like creating alerts for aggregations, EQL sequences, alerts on alerts, and logic for preventing the creation of duplicate alerts (among others :). Some patterns have started forming as we've been adding new rule types and features over the last year, but until now we haven't had the opportunity to abstract further and clean up the control flow.
As discussed with @mikecote, @sqren and @tsg, a library implementation that provides a hook for alert creation, for which each rule supplies its own implementation (and which can call into other library utils like exceptions, deduplication, etc.), would go a long way in extending the initial implementation within Detections.
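As a rough sketch of what such a hook could look like (not an existing Kibana API; `AlertDoc`, `AlertHooks`, `createAlerts` and `existingIds` are hypothetical names), the library owns the control flow and deduplication while each rule type only supplies how an event becomes an alert:

```typescript
// Hypothetical alert document shape, for illustration only.
interface AlertDoc {
  id: string;
  ruleId: string;
  fields: Record<string, unknown>;
}

// What each rule type would provide to the shared library.
interface AlertHooks<TEvent> {
  // Rule-specific mapping from a source event to an alert (without the rule id).
  buildAlert(event: TEvent): Omit<AlertDoc, "ruleId">;
  // Optional rule-specific filter, e.g. a place to apply exceptions.
  shouldAlert?(event: TEvent): boolean;
}

// Shared library code: runs the hooks and drops duplicates by alert id.
function createAlerts<TEvent>(
  ruleId: string,
  events: TEvent[],
  hooks: AlertHooks<TEvent>,
  existingIds: ReadonlySet<string> = new Set<string>()
): AlertDoc[] {
  const seen = new Set<string>(existingIds);
  const alerts: AlertDoc[] = [];
  for (const event of events) {
    if (hooks.shouldAlert && !hooks.shouldAlert(event)) continue;
    const built = hooks.buildAlert(event);
    if (seen.has(built.id)) continue; // already written, or produced earlier in this run
    seen.add(built.id);
    alerts.push({ ...built, ruleId });
  }
  return alerts;
}
```

Deduplication here assumes each rule can derive a deterministic alert id from its source event; cases like EQL sequences or alerts on alerts would need a richer hook than this.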
Open questions
A `CPU Threshold Exceeded` rule may write an alert, then later update that alert if it is triggered again. Will we follow this same paradigm, or are alerts immutable except for assignment/status, as in the security use case? If the latter, how will what is now many alerts for one incident be displayed in the triage workflow, since we don't support groupings/aggregations? Does each new trigger create a new alert with aggregate information from its predecessors and automatically close them, so there's only ever one active alert? Or do we write building block alerts for each trigger, with a mutable shell alert?
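To illustrate just one of these options, here is a minimal sketch of the "new alert supersedes its predecessor" idea. It is not a proposal for the final schema: `Alert`, `supersede`, `triggerCount` and `supersedes` are hypothetical names. Only the status of the predecessor changes, which matches the "immutable except for assignment/status" constraint.

```typescript
type AlertStatus = "open" | "closed";

// Hypothetical alert shape carrying aggregate information across triggers.
interface Alert {
  id: string;
  ruleId: string;
  status: AlertStatus;
  triggerCount: number;
  firstTriggeredAt: string;
  lastTriggeredAt: string;
  supersedes: string | undefined; // id of the predecessor alert, if any
}

// Creates the new active alert and returns a closed copy of its predecessor.
// The input is not mutated; the caller persists both returned documents.
function supersede(
  ruleId: string,
  triggeredAt: string,
  newId: string,
  previous?: Alert
): { active: Alert; closed: Alert | undefined } {
  const active: Alert = {
    id: newId,
    ruleId,
    status: "open",
    triggerCount: (previous?.triggerCount ?? 0) + 1,
    firstTriggeredAt: previous?.firstTriggeredAt ?? triggeredAt,
    lastTriggeredAt: triggeredAt,
    supersedes: previous?.id,
  };
  const closed: Alert | undefined = previous ? { ...previous, status: "closed" } : undefined;
  return { active, closed };
}
```

With this shape there is only ever one open alert per incident, and the chain of `supersedes` ids keeps the history reachable without needing groupings or aggregations in the triage view.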