[Security Solution] Domain-based architecture of the Detection Engine #138600

banderror · 2022-08-11T11:20:02Z

Summary

TL;DR: This ticket proposes to split the Detection Engine into subdomains, each of which would be represented by its own folder and owned by one of the Detection & Response area teams.

Problem

The Detection Engine was built by engineers of the @elastic/security-detections-response team a few years ago, and a lot has changed since then. Now when we have more features, more code to maintain, some people left, some people joined, and 3 area teams, we face some challenges:

It's not entirely clear who owns what: what code, features, and folders are owned by which area.
We have a lot of files and folders such as "common", "utils", "types" that are owned by everybody (and so, in fact, nobody).
CODEOWNERS file is written in such a way that it requires code review from many teams even if changes are irrelevant to those teams. One of the main reasons for that seems to be the folder structure.
When you explore folders in security_solution, it's not immediately clear which ones are related to the Detection Engine.
Folder structure does not reflect all the features of the Detection Engine.
It's not always clear where to put what code, e.g. when you implement a new feature.
There's no clear place for domain models and business logic, especially on the FE side.

Solution

The idea is to split the Detection Engine, which is a subdomain of a large domain "Security Solution", into its own smaller subdomains. For example, our subdomains could be alerts, rule_schema, rule_execution, rule_management, rule_monitoring, rule_exceptions, etc.

Details

Folder structure

Every subdomain would be represented by its own top-level folder under detection_engine at every "technical" level: common, public, and server. For example, the alerts subdomain would live in:

security_solution/common/detection_engine/alerts
security_solution/public/detection_engine/alerts
security_solution/server/detection_engine/alerts

Inside of those 3 folders, there would be a standard subfolder structure that would:

accommodate all kinds of code, such as domain models, business logic, API routes and clients, components, pages, cross-cutting concerns, utilities, etc
make it clear for everyone where to put which kind of code

An example of such a subdomain has already been implemented in #126063, please check the following folders in the main branch for any details:

security_solution/common/detection_engine/rule_monitoring
security_solution/public/detection_engine/rule_monitoring
security_solution/server/lib/detection_engine/rule_monitoring

Ownership

Every subdomain of the Detection Engine would be owned by one of the following teams:

@elastic/security-detections-response-alerts
@elastic/security-detections-response-rules
@elastic/security-solution-platform

Every area team would be the owner of one or a few subdomains.

This ownership would be fixed in the CODEOWNERS file.

Dependencies

This domain-based architecture should optimize for low coupling and high cohesion:

Low coupling between subdomains: there should be a few dependencies between them.
High cohesion within each subdomain: it's good if modules inside of a subdomain have a lot of dependencies on each other.

We should follow this principle when splitting the DE into subdomains, as well as when maintaining and evolving it after that.

Every subdomain should expose a clear and minimal contract for other subdomains.

Sub-tasks

These will need to be extracted into their own tickets:

Design the subdomains. Come up with a list of subdomains to extract. Think about the responsibilities that each of them would have. Think about the dependencies between them. Draw a diagram of subdomains and dependencies between them. (PR)
Design the internal folder structure for a subdomain. (PR)
TBD

The text was updated successfully, but these errors were encountered:

…r UX (#138770) ## Summary **API changes** - Adds API for determining the list-rule references. - Updates the exception items find api to include the `search` param which allows for simple search queries - used with the EUI search bar **UI updates** - Moved the exception components into new `rule_exceptions` folder per suggested folder structure updates listed [here](#138600) - Updates the rule details tabs to split endpoint and rule exceptions into their own tabs - Updates the viewer utilities header now that these different exception types are split - Updates exception item UI to match new designs - Updates the UI for when there are no items - Removes `use_exception_list_items` hook as it is no longer in use - Flyouts (add/edit) remain untouched

…r UX (elastic#138770) ## Summary **API changes** - Adds API for determining the list-rule references. - Updates the exception items find api to include the `search` param which allows for simple search queries - used with the EUI search bar **UI updates** - Moved the exception components into new `rule_exceptions` folder per suggested folder structure updates listed [here](elastic#138600) - Updates the rule details tabs to split endpoint and rule exceptions into their own tabs - Updates the viewer utilities header now that these different exception types are split - Updates exception item UI to match new designs - Updates the UI for when there are no items - Removes `use_exception_list_items` hook as it is no longer in use - Flyouts (add/edit) remain untouched

…toring Rule Management (#142950) **Partially addresses:** #138600, #92169, #138606 **Addresses:** #136957, #136962, #138614 ## Summary In this PR we are: - Splitting the Detection Engine into subdomains ([ticket](#138600)). Every subdomain got its own folder under `detection_engine`, and we moved some (not all) code into them. More on that is below. New subdomains introduced: - `fleet_integrations` - `prebuilt_rules` - `rule_actions_legacy` - `rule_exceptions` - `rule_management` - `rule_preview` - `rule_schema` - `rule_creation_ui` - `rule_details_ui` - `rule_management_ui` - `rule_exceptions_ui` - Updating the CODEOWNERS file accordingly. - Refactoring the Rule Management page and the Rules table. Our main focus was on the way how we communicate with the API endpoints, how we cache and invalidate the fetched data, and how this code is organized in the codebase. More on that is below. - Increasing the bundle size limit. This is going to be decreased back in a follow-up PR ([ticket](#143532)) ## Restructuring folders into subdomains For the background and problem statement, please refer to #138600 We were focusing on code that is closely related to the Rules area: either owned by us de facto (we work on it) or owned by us de jure (according to the CODEOWNERS file). Or goal was to explicitly extract code that we don't own de facto into separate subdomains, transfer ownership to other area teams, and reflect this in the CODEOWNERS file. On the other hand, we wanted the code that we own to also be organized in clear subdomains that we could easily own via CODEOWNERS. We didn't touch the code that is already explicitly owned by other area teams, e.g. `x-pack/plugins/security_solution/server/lib/detection_engine/rule_types`. This is a draft "domain map" - an architectural diagram that shows how the Detection Engine _could_ be split into subdomains. It's more a TO-BE idea/aspiration rather than an AS-IS statement. Any feedback, critiques, and suggestions would be extremely appreciated! <img width="2592" alt="Screenshot 2022-10-18 at 16 08 40" src="https://user-images.githubusercontent.com/7359339/196453965-b65f5b49-9a33-4d90-bb48-1347e9576223.png"> It shows the flow of dependencies between subdomains and proposes some rules: - The whole graph of dependencies between all subdomains should be a DAG. There should not be bi-directional or circular dependencies between them. - **Generic subdomains** represent some general knowledge that can be used/applied outside of the Detection Engine. - Can depend on some generic kbn packages, npm packages or utils. - Can't depend on any other Detection Engine subdomains. - **Crosscutting subdomains** represent some code that can be common to / shared between many other subdomains. This could be some very common domain models and API schemas. - Can depend on generic subdomains. - Can depend on other crosscutting subdomains (dependencies between them must form a DAG). - Can't depend on core or UI subdomains. - **Core subdomains** contain most of the "meat" of the Detection Engine: domain models, server-side and client-side business logic, server-side API endpoints, client-side UI (potentially shareable between several pages). - Can depend on crosscutting and generic subdomains. - Can depend on other core subdomains (dependencies between them must form a DAG). - Can't depend on UI subdomains. - **UI subdomains** contain the implementation of pages related to the Detection Engine. Every page can easily depend on several core subdomains, so these subdomain are on top of everything. - Can depend on any other subdomains. Dependencies must form a DAG. Dashed lines show some existing dependencies that we think should be eliminated. Ownership TO-BE is color-coded. We updated the CODEOWNERS file according to the new folders. The folder restructuring is not 100% finished but we did a big part of it. Most of the FE code continues to live in legacy folders, e.g. see `x-pack/plugins/security_solution/public/detections`. So this work is to be continued... ## Refactoring of Rule Management FE - [x] #136957 For effective HTTP requests caching and deduplication, we've migrated all data fetching logic to `useQuery` and `useMutation` hooks from `react-query`. That allowed us to introduce the following improvements to our codebase: * All outgoing HTTP requests are now automatically deduplicated. That means that data fetching hooks like `useRule` could be used on any level in the component tree to access response data directly. So, no need to put the hook on the top level anymore and use prop-drilling to make the response data available to all children components that require it. * All HTTP responses are now cached with the default TTL of 5 minutes—no more redundant requests. With a hot cache, transitions to some pages now happen immediately. - [x] #136962 Data fetching hooks of the Rules Area are now organized in one place. `security_solution/public/detection_engine/rule_management/api/hooks` contains abstraction layer on top of the Kibana's HTTP client. All data fetching should happen exclusively through that layer to ensure that: * Mutation queries automatically invalidate associated cache entries. * Optimistic updates or updates from mutation responses could be implemented centrally where possible. - [x] #92169 From some of the Rule Management components, logic was extracted to hooks located in `security_solution/public/detection_engine/rule_management/logic`. ### Checklist - [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

botelastic bot added the needs-team Issues missing a team label label Aug 11, 2022

yctercero mentioned this issue Aug 25, 2022

[Security Solution][Exceptions] - Initial updates to exceptions viewer UX #138770

Merged

10 tasks

banderror mentioned this issue Sep 19, 2022

[Osquery] Add Detection action #133279

Merged

This was referenced Oct 4, 2022

[Security Solution] Adjust detection engine routes codeowners #142570

Merged

[Security Solution] Restructuring folders of Detection Engine + refactoring Rule Management #142950

Merged

banderror mentioned this issue Dec 2, 2022

[Security Solution] Decompose Timelines TGrid component and moved to security_solution #140151

Merged

yctercero added Team:Detection Engine Security Solution Detection Engine Area and removed Team:Detection Alerts Security Detection Alerts Area Team Team:Security Solution Platform Security Solution Platform Team labels May 13, 2023

yctercero mentioned this issue Sep 6, 2023

[DE] - Detection Engine backlog overview #165878

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Security Solution] Domain-based architecture of the Detection Engine #138600

[Security Solution] Domain-based architecture of the Detection Engine #138600

banderror commented Aug 11, 2022 •

edited

Loading

[Security Solution] Domain-based architecture of the Detection Engine #138600

[Security Solution] Domain-based architecture of the Detection Engine #138600

Comments

banderror commented Aug 11, 2022 • edited Loading

Summary

Problem

Solution

Details

Folder structure

Ownership

Dependencies

Sub-tasks

banderror commented Aug 11, 2022 •

edited

Loading