[RAC] Alerts as Data Bootstrapping Index Creation #93729

Closed
spong opened this issue Mar 5, 2021 · 12 comments
Assignees
Labels
discuss Team:Detections and Resp Security Detection Response Team Team:Observability Team label for Observability Team (for things that are handled across all of observability) Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Theme: rac label obsolete

Comments

@spong
Member

spong commented Mar 5, 2021

This issue is for finalizing the architecture/implementation for bootstrapping the Alerts as Data index creation process.

Existing implementations include the bootstrapping of the .siem-signals index, and the bootstrapping of the .kibana-event-log index.

.siem-signals

Most of the logic lives in:
kibana/x-pack/plugins/security_solution/server/lib/detection_engine/index/

With routes and index/ilm definitions residing in:
kibana/x-pack/plugins/security_solution/server/lib/detection_engine/routes/index/

.kibana-event-log

Exists as a dedicated plugin named eventLog and can be found in: kibana/x-pack/plugins/event_log/
(Fantastic docs @pmuellr :)

Reference docs:
@spong spong added discuss Team:Observability Team label for Observability Team (for things that are handled across all of observability) Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Theme: rac label obsolete labels Mar 5, 2021
@elasticmachine
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@elasticmachine
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@pmuellr
Member

pmuellr commented Mar 8, 2021

FWIW, the bootstrapping of the current event log takes place here - https://github.com/elastic/kibana/blob/master/x-pack/plugins/event_log/server/es/init.ts

There's one thing we haven't done (we keep forgetting): because we're always writing to an alias, we should use the require_alias option when indexing documents (I'm going to open an issue for this for the event log now). This ensures that we don't accidentally create an index with the alias name when we index documents. That shouldn't happen, since we create the alias before doing any indexing, but it has happened in the past when the bootstrapping code was not robust enough.
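
For illustration, here is a minimal sketch of what such a write request could look like. Only `require_alias` is the Elasticsearch index API parameter mentioned above; the `IndexRequest` shape, the `buildAliasWrite` helper, and the alias name are hypothetical and not taken from the event log code.

```typescript
// Hypothetical request shape for this sketch. `require_alias` is the real
// Elasticsearch parameter; everything else is illustrative.
interface IndexRequest {
  index: string;
  require_alias: boolean;
  body: Record<string, unknown>;
}

// Build a write that targets an alias and asks Elasticsearch to reject the
// request if the target is not an alias, instead of auto-creating a concrete
// index with the alias name.
function buildAliasWrite(alias: string, doc: Record<string, unknown>): IndexRequest {
  return { index: alias, require_alias: true, body: doc };
}

// "my-event-log-alias" is a made-up alias name.
const writeRequest = buildAliasWrite('my-event-log-alias', { message: 'rule executed' });
console.log(JSON.stringify(writeRequest));
```

With this flag set, a write issued before bootstrapping has finished should fail loudly rather than silently create a wrongly mapped index.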

@mikecote
Contributor

There have been a few ideas bounced around for using the event log for alerts as data (now or later). I wanted to lay out the issues that the event log has today for future reference.

  1. The event log uses a single index. This won't work with the requirement of creating an index per space and rule type.
  2. The event log is designed to be secured behind saved objects. Users who want direct access to the data indices to build visualizations would get access to all the data captured by the event log, which they shouldn't see.
  3. The event log is append-only. I've seen requirements for updating alert documents to support workflows like acknowledging, changing status, assigning, etc. Moving away from append-only may prevent leveraging data streams.
  4. The event log deletes data older than 90 days by default. This configuration is global for all events captured. I'm assuming alert data will want its own customizable data retention policy?
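
To make point 4 concrete, a per-index retention policy could be expressed as a separate ILM policy for each alert index family. The sketch below only builds the policy body; the helper name `buildRetentionPolicy`, the `retentionDays` parameter, and the rollover thresholds are assumptions for illustration, not values from the event log.

```typescript
// Build an ILM policy body that rolls over in the hot phase and deletes
// indices once they are older than `retentionDays`.
// The rollover thresholds here are illustrative, not the event log's values.
function buildRetentionPolicy(retentionDays: number) {
  return {
    policy: {
      phases: {
        hot: { actions: { rollover: { max_age: '30d', max_size: '50gb' } } },
        delete: { min_age: `${retentionDays}d`, actions: { delete: {} } },
      },
    },
  };
}

// One policy per consumer instead of a single global 90-day policy.
const securityPolicy = buildRetentionPolicy(365);
const observabilityPolicy = buildRetentionPolicy(30);
console.log(securityPolicy.policy.phases.delete.min_age);
console.log(observabilityPolicy.policy.phases.delete.min_age);
```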

cc @dgieselaar @sqren @pmuellr

@sorenlouv
Member

This has probably been discussed elsewhere already but: Would it make sense to write alert events to multiple destinations: the event log, and another append-only index (eg alerting-log-*)? This way the alerting framework is still in control of persisting alerts, which ensures consistency across consumers.

  1. The event log uses a single index

We can decide to write to an index per rule type, eg .alerting-log-observability-apm-transaction-latency

  2. [...] users [...] would get access to all the data captured by the event log, which they shouldn't see.

If we write to an index per alert type it will be possible to give granular permissions:
.alerting-log-* vs .alerting-log-observability-* vs .alerting-log-observability-apm-*
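
A rough sketch of how this naming scheme lines up with wildcard index patterns follows. The `indexNameFor` and `matchesPattern` helpers are hypothetical, and the matcher only approximates how a `*` wildcard in an index pattern behaves; the security index name in the usage lines is made up.

```typescript
// Compose an index name from the hierarchy discussed above:
// .alerting-log-<solution>-<app>-<rule type>
function indexNameFor(solution: string, app: string, ruleType: string): string {
  return ['.alerting-log', solution, app, ruleType].join('-');
}

// Minimal wildcard matcher: escapes regex metacharacters, then treats `*`
// as "any sequence of characters".
function matchesPattern(pattern: string, indexName: string): boolean {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '.*');
  return new RegExp(`^${escaped}$`).test(indexName);
}

const apmLatency = indexNameFor('observability', 'apm', 'transaction-latency');
console.log(apmLatency);
console.log(matchesPattern('.alerting-log-observability-*', apmLatency));
console.log(matchesPattern('.alerting-log-observability-*', '.alerting-log-security-detections-query'));
```

A role granted read access to `.alerting-log-observability-*` would then see observability alerts but not the hypothetical security index.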

  3. The event log is append-only. I've seen requirements for updating alert documents to support workflows like acknowledging, changing status, assigning, etc.

All those actions could be events instead of mutations on existing events

@mikecote
Contributor

This has probably been discussed elsewhere already but: Would it make sense to write alert events to multiple destinations: the event log, and another append-only index (eg alerting-log-*)? This way the alerting framework is still in control of persisting alerts, which ensures consistency across consumers.

It depends. There have been talks about mutating this data as well; would each index need to be updated in this scenario?

All those actions could be events instead of mutations on existing events

Would this work for filtering and sorting? Like list alerts that have recovered, are assigned, are open, etc.

@mikecote
Contributor

We can decide to write to an index per rule type, eg .alerting-log-observability-apm-transaction-latency

If we write to an index per alert type it will be possible to give granular permissions:
.alerting-log-* vs .alerting-log-observability-* vs .alerting-log-observability-apm-*

I agree that both of those points are needed, and they confirm that the event log can't be used to solve that problem today.

@dgieselaar
Member

Would this work for filtering and sorting? Like list alerts that have recovered, are assigned, are open, etc.

Yes, if the whole state is captured on the last event, which in my head it would be. But it might be a little premature to dive into all the technical details of this specific approach?
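
As a small in-memory sketch of that idea: if every event carries the full alert state, the current view is just the latest event per alert, and filtering works on that view. The `AlertEvent` fields and the `latestStatePerAlert` helper are hypothetical; in Elasticsearch this would likely be done with something like field collapsing or a top-hits aggregation rather than in application code.

```typescript
// Hypothetical event shape: each event carries the full state of the alert.
interface AlertEvent {
  alertId: string;
  timestamp: number;
  status: 'open' | 'acknowledged' | 'recovered';
  assignee?: string;
}

// Keep only the most recent event per alert; later events win on ties.
function latestStatePerAlert(events: AlertEvent[]): Map<string, AlertEvent> {
  const latest = new Map<string, AlertEvent>();
  for (const event of events) {
    const current = latest.get(event.alertId);
    if (!current || event.timestamp >= current.timestamp) {
      latest.set(event.alertId, event);
    }
  }
  return latest;
}

const sampleEvents: AlertEvent[] = [
  { alertId: 'a', timestamp: 1, status: 'open' },
  { alertId: 'b', timestamp: 2, status: 'open' },
  { alertId: 'a', timestamp: 3, status: 'acknowledged', assignee: 'sam' },
  { alertId: 'b', timestamp: 4, status: 'recovered' },
];

// "List alerts that are acknowledged" becomes a filter over the latest state.
const acknowledged = Array.from(latestStatePerAlert(sampleEvents).values())
  .filter((e) => e.status === 'acknowledged')
  .map((e) => e.alertId);
console.log(acknowledged);
```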

@dgieselaar
Member

I agree that both of those points are needed, and they confirm that the event log can't be used to solve that problem today.

agree 👍 (although the event log can be namespaced too, but I don't see a compelling reason right now to store everything there).

@mikecote
Contributor

Yes, if the whole state is captured on the last event, which in my head it would be.

Ah, I didn't think of that, nice!

@peluja1012 peluja1012 assigned yctercero and spong and unassigned yctercero Mar 23, 2021
@spong spong mentioned this issue Mar 30, 2021
7 tasks
banderror added a commit that referenced this issue May 27, 2021
**Needed for:** rule execution log for Security #94143
**Related to:**

- alerts-as-data: #93728, #93729, #93730
- RFC for index naming #98912

## Summary

This PR adds a mechanism for writing to / reading from / bootstrapping indices for the RAC project into the `rule_registry` plugin, in particular indices for alerts-as-data and rule execution events. This implementation is similar to existing implementations like the `event_log` plugin (see #98353 (comment) for historical perspective), but we're going to converge all of them into 1 or 2 implementations. At least we should have a single one in `rule_registry` itself.

In this PR I tried to incorporate most of the feedback received in the RFC (#98912), but if you notice I missed/forgot something, please let me know in the comments.

Done in this PR:

- [x] Schema-agnostic APIs for working with Elasticsearch.
- [x] Schema-aware log definition and bootstrapping API (creating hierarchical logs).
- [x] Schema-aware write API (logging events).
- [x] Schema-aware read API (searching logs, filtering, sorting, pagination, aggregation).
- [x] Support for Kibana spaces, space-aware index bootstrapping (either at rule creation or rule execution time).
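
As a rough illustration of the last item, space-aware bootstrapping at rule execution time can be thought of as "create the index for this space on first use, then skip". The sketch below is not the `rule_registry` API: the `SpaceAwareBootstrapper` class, its `createIndex` callback, and the `<base>-<spaceId>` naming are all hypothetical, and a real implementation would be asynchronous and would also have to handle concurrent Kibana instances.

```typescript
// Hypothetical helper: bootstraps one index per Kibana space, at most once
// per process, the first time a rule in that space needs it.
class SpaceAwareBootstrapper {
  private readonly bootstrapped = new Set<string>();
  private readonly baseName: string;
  private readonly createIndex: (indexName: string) => void;

  constructor(baseName: string, createIndex: (indexName: string) => void) {
    this.baseName = baseName;
    this.createIndex = createIndex;
  }

  // Returns the index name for the space, creating the index on first use.
  ensure(spaceId: string): string {
    const indexName = `${this.baseName}-${spaceId}`;
    if (!this.bootstrapped.has(indexName)) {
      this.createIndex(indexName);
      this.bootstrapped.add(indexName);
    }
    return indexName;
  }
}

// Usage: record which indices would be created instead of calling Elasticsearch.
const createdIndices: string[] = [];
const bootstrapper = new SpaceAwareBootstrapper('.my-alerts', (name) => {
  createdIndices.push(name);
});
bootstrapper.ensure('default');
bootstrapper.ensure('default'); // second call is a no-op
bootstrapper.ensure('marketing');
console.log(createdIndices);
```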

As for reviewing this PR, perhaps it might be easier to start with:

- checking description of #98912
- checking usage examples https://github.com/elastic/kibana/pull/98353/files#diff-c049ff2198cc69bd50a69e92d29e88da7e10b9a152bdaceaf3d41826e712c12b
- checking public api https://github.com/elastic/kibana/pull/98353/files#diff-8e9ef0dbcbc60b1861d492a03865b2ae76a56ec38ada61898c991d3a74bd6268

## Next steps

Next steps towards rule execution log in Security (#94143):

- define actual schema for rule execution events
- inject instance of rule execution log into Security rule executors and route handlers
- implement actual execution logging in rule executors
- update route handlers to start fetching execution events and metrics from the log instead of custom saved objects

Next steps in the context of RAC and unified implementation:

- converge this implementation with `RuleDataService` implementation
  - implement robust index bootstrapping
  - reconsider using FieldMap as a generic type parameter
  - implement validation for documents being indexed
- cover the final implementation with tests
- write comprehensive docs: update plugin README, add JSDoc comments to all public interfaces
@peluja1012
Contributor

Closing in favor of #98912

@spong spong closed this as completed Nov 10, 2021
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022