[Fleet] [Security Solution] Install prebuilt rules package using stream-based approach #195888

Merged
merged 1 commit into from
Nov 5, 2024

@xcrzx xcrzx commented Oct 11, 2024

Resolves: #192350

Summary

Implemented stream-based installation of the detection rules package.

Background: The installation of the detection rules package was causing OOM (Out of Memory) errors in Serverless environments where the available memory is limited to 1GB. The root cause of the errors was that during installation, the package was being read and unzipped entirely into memory. Given the large package size, this led to OOMs. To address these memory issues, the following changes were made:

  1. Added branching logic to the installPackageFromRegistry and installPackageByUpload methods that decides, based on the package name, whether to use streaming. Currently only the security_detection_engine package is hardcoded to use streaming.
  2. Defined a separate set of state machine steps for the stream-based package installation. At this stage it is reduced to cover only Kibana assets installation.
  3. A new stepInstallKibanaAssetsWithStreaming step handles assets installation. This step still reads the package archive into memory, since unzipping from a readable stream is not possible due to the design of the .zip format (see https://github.com/thejoshwolfe/yauzl?tab=readme-ov-file#no-streaming-unzip-api). However, once the archive has been read into a buffer, it is unzipped using streams. This way only a small portion of the archive (100 saved objects at a time) is unpacked into memory, which reduces memory usage.
  4. The new step also includes several optimizations, such as removing previously installed assets only if they are missing from the new package and using savedObjectClient.bulkCreate instead of the less efficient savedObjectClient.import.
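
The batching from item 3 and the cleanup from item 4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `SavedObjectLike`, `installInBatches`, `findStaleAssetIds`, and the shape of the `bulkCreate` callback are hypothetical stand-ins, and only the batch size of 100 comes from the description above.

```typescript
// Hypothetical minimal shape of a saved object; not the Kibana type.
interface SavedObjectLike {
  id: string;
  type: string;
}

const BATCH_SIZE = 100; // batch size mentioned in the PR description

// Pulls objects from an async source and hands them to `bulkCreate` in
// batches, so at most `batchSize` unpacked objects are buffered at once.
async function installInBatches(
  source: AsyncIterable<SavedObjectLike>,
  bulkCreate: (batch: SavedObjectLike[]) => Promise<void>,
  batchSize: number = BATCH_SIZE
): Promise<string[]> {
  const installedIds: string[] = [];
  let batch: SavedObjectLike[] = [];

  for await (const obj of source) {
    batch.push(obj);
    if (batch.length >= batchSize) {
      await bulkCreate(batch);
      installedIds.push(...batch.map((o) => o.id));
      batch = []; // drop the reference so the batch can be garbage collected
    }
  }

  // Flush the final, possibly smaller batch.
  if (batch.length > 0) {
    await bulkCreate(batch);
    installedIds.push(...batch.map((o) => o.id));
  }
  return installedIds;
}

// Returns ids that were installed previously but are absent from the new
// package; only these would need to be removed.
function findStaleAssetIds(previousIds: string[], newIds: string[]): string[] {
  const next = new Set(newIds);
  return previousIds.filter((id) => !next.has(id));
}
```

With a source of 250 objects, `bulkCreate` would be called with batches of 100, 100, and 50 objects, and `findStaleAssetIds(["a", "b", "c"], ["b", "c", "d"])` would return only `"a"`.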

Test environment

  1. Prebuilt detection rules package with ~20k saved objects; 118MB zipped.
  2. Local package registry.
  3. Production build of Kibana running locally with a 700MB max old space limit, pointed to that registry.

Setting up a test environment is not completely straightforward. Here's a rough outline of the steps:

How to test this PR
  1. Create a package containing a large number of prebuilt rules.
    1. I used the package-storage repository to find one of the previously released prebuilt rules packages.
    2. Duplicated the assets in the package until it contained about 20k historical versions.
    3. Built the package using elastic-package build.
  2. Start a local package registry serving the built package using elastic-package stack up --services package-registry.
  3. Create a production build of Kibana. To speed up the process, unnecessary artifacts can be skipped:
    node scripts/build --skip-cdn-assets --skip-docker-ubi --skip-docker-ubuntu --skip-docker-wolfi --skip-docker-fips
    
  4. Provide the built Kibana with a config pointing to the local registry. The config is located in build/default/kibana-9.0.0-SNAPSHOT-darwin-aarch64/config/kibana.yml. You can use the following config:
    csp.strict: false
    xpack.security.encryptionKey: 've4Vohnu oa0Fu9ae Eethee8c oDieg4do Nohrah1u ao9Hu2oh Aeb4Ieyi Aew1aegi'
    xpack.encryptedSavedObjects.encryptionKey: 'Shah7nai Eew6izai Eir7OoW0 Gewi2ief eiSh8woo shoogh7E Quae6hal ce6Oumah'
    
    xpack.fleet.internal.registry.kibanaVersionCheckEnabled: false
    xpack.fleet.registryUrl: https://localhost:8080
    
    elasticsearch:
      username: 'kibana_system'
      password: 'changeme'
      hosts: 'http://localhost:9200'
    
  5. Override the Node options Kibana starts with to allow it to connect to the local registry and set the memory limit. For this, you need to edit the build/default/kibana-9.0.0-SNAPSHOT-darwin-aarch64/bin/kibana file:
    NODE_OPTIONS="--no-warnings --max-http-header-size=65536 --unhandled-rejections=warn --dns-result-order=ipv4first --openssl-legacy-provider --max_old_space_size=700 --inspect" NODE_ENV=production NODE_EXTRA_CA_CERTS=~/.elastic-package/profiles/default/certs/ca-cert.pem exec "${NODE}" "${DIR}/src/cli/dist" "${@}"
    
  6. Navigate to the build folder: build/default/kibana-9.0.0-SNAPSHOT-darwin-aarch64.
  7. Start Kibana using ./bin/kibana.
  8. Kibana is now running in debug mode, with the debugger started on port 9229. You can connect to it using VS Code's debug config or Chrome's DevTools.
  9. Now you can install prebuilt detection rules by calling the POST /internal/detection_engine/prebuilt_rules/_bootstrap endpoint, which uses the new streaming installation under the hood.
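
The last step can be scripted. The sketch below only builds the request and does not send it. The endpoint path comes from the step above; the base URL (Kibana's default port 5601), the `elastic`/`changeme` credentials, and the `elastic-api-version` value are assumptions about a local setup and may need adjusting. `buildBootstrapRequest` and `RequestSpec` are hypothetical names.

```typescript
interface RequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
}

// Builds the request for the prebuilt rules bootstrap endpoint.
// `basicAuthToken` is the base64-encoded "username:password" pair.
function buildBootstrapRequest(baseUrl: string, basicAuthToken: string): RequestSpec {
  // Strip trailing slashes so the path is not doubled up.
  const normalizedBase = baseUrl.replace(/\/+$/, "");
  return {
    url: `${normalizedBase}/internal/detection_engine/prebuilt_rules/_bootstrap`,
    method: "POST",
    headers: {
      Authorization: `Basic ${basicAuthToken}`,
      // Kibana rejects non-GET requests without an XSRF header.
      "kbn-xsrf": "true",
      // Marks the call as an internal API request.
      "x-elastic-internal-origin": "kibana",
      // Assumed version of this internal route; adjust if Kibana rejects it.
      "elastic-api-version": "1",
    },
  };
}

// "ZWxhc3RpYzpjaGFuZ2VtZQ==" is base64 for "elastic:changeme" (assumed local credentials).
const request = buildBootstrapRequest("http://localhost:5601/", "ZWxhc3RpYzpjaGFuZ2VtZQ==");
console.log(request.method, request.url);
```

To send it, pass `request.url` together with `{ method: request.method, headers: request.headers }` to `fetch`, or translate the same headers into a `curl` call.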

Test results locally

Without the streaming approach

Guaranteed OOM. Even smaller packages, up to 10k rules, caused sporadic OOM errors. So, for comparison, I tested the package installation without memory limits.

Screenshot 2024-10-14 at 14 15 26

  1. Heap memory usage spikes up to 2.5GB
  2. External memory reaches up to 450MB, which is roughly four times the archive size
  3. RSS (Resident Set Size) exceeds 4.5GB

With the streaming approach

No OOM errors observed. The memory consumption chart looks like the following:

Screenshot 2024-10-14 at 11 15 21

  1. Heap memory remains stable, around 450MB, without any spikes.
  2. External memory jumps to around 250MB at the beginning of the installation, then drops to around 120MB, which is roughly equal to the package archive size. I couldn't determine why the external memory consumption exceeds the package size by 2x when the installation starts. I checked the code for places where the package might be loaded into memory twice but found nothing suspicious. This might be worth investigating further.
  3. RSS remains stable, peaking slightly above 1GB. I believe this is the upper limit for a package that can be handled without errors in a Serverless environment, where the memory limit is dictated by pod-level settings rather than Node settings and is set to 1GB. I'll verify this on a real Serverless instance to confirm.
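
To compare runs like the ones above, the same three metrics can be logged while the installation is in progress. A minimal sketch, assuming a Node.js process: the field names match those returned by `process.memoryUsage()` (`heapUsed`, `external`, `rss`, all in bytes), while `MemorySample` and `formatMemoryUsage` are hypothetical helpers. The sample values only approximate the peaks reported for the streaming run and are not new measurements.

```typescript
// Subset of the fields returned by Node's process.memoryUsage(), in bytes.
interface MemorySample {
  heapUsed: number;
  external: number;
  rss: number;
}

const BYTES_PER_MB = 1024 * 1024;

function toMb(bytes: number): number {
  return Math.round(bytes / BYTES_PER_MB);
}

// Renders one sample as a single log line.
function formatMemoryUsage(sample: MemorySample): string {
  return `heap=${toMb(sample.heapUsed)}MB external=${toMb(sample.external)}MB rss=${toMb(sample.rss)}MB`;
}

// Illustrative values only, approximating the streaming-run peaks described above.
const streamingPeak: MemorySample = {
  heapUsed: 450 * BYTES_PER_MB,
  external: 250 * BYTES_PER_MB,
  rss: 1024 * BYTES_PER_MB,
};
console.log(formatMemoryUsage(streamingPeak));
```

In a running Node.js process, calling `formatMemoryUsage(process.memoryUsage())` from a timer would produce one such line per sample.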

Test results on Serverless

Screenshot 2024-10-31 at 12 31 34

@xcrzx xcrzx self-assigned this Oct 11, 2024
@xcrzx xcrzx force-pushed the stream-based-installation branch 5 times, most recently from 0beda4f to 8e14eac Compare October 18, 2024 15:30
@xcrzx xcrzx force-pushed the stream-based-installation branch 7 times, most recently from eb331d5 to 1b97750 Compare October 25, 2024 15:46
@xcrzx xcrzx marked this pull request as ready for review October 25, 2024 15:47
@xcrzx xcrzx requested a review from a team as a code owner October 25, 2024 15:47
@xcrzx xcrzx added the performance, release_note:skip, Team:Fleet, Team:Detections and Resp, Team: SecuritySolution, Team:Detection Rule Management, Feature:Prebuilt Detection Rules, and v8.17.0 labels Oct 25, 2024
@elasticmachine

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@elasticmachine

Pinging @elastic/fleet (Team:Fleet)

@elasticmachine

Pinging @elastic/security-solution (Team: SecuritySolution)

@elasticmachine

Pinging @elastic/security-detection-rule-management (Team:Detection Rule Management)

@xcrzx xcrzx added the backport:version label Oct 25, 2024
@xcrzx xcrzx requested a review from nchaulet October 28, 2024 14:14
@xcrzx xcrzx added the ci:project-deploy-security label Oct 28, 2024
@xcrzx xcrzx force-pushed the stream-based-installation branch 2 times, most recently from 94a3e50 to 75eea3e Compare October 28, 2024 18:52
@@ -46,7 +45,7 @@ export async function unzipBuffer(
 if (!filter({ path })) return zipfile.readEntry();

 const entryBuffer = await getZipReadStream(zipfile, entry).then(streamToBuffer);
-onEntry({ buffer: entryBuffer, path });
+await onEntry({ buffer: entryBuffer, path });
Member

What happens if an error occurs during the onEntry call here? Should we handle it, and do we have to manually close the zipFile?

Contributor Author

Good catch! Added a try ... finally block so that iteration continues after an error. That matches the behavior of untarBuffer.
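
The pattern discussed in this thread can be sketched as follows. This is not the actual unzipBuffer implementation: `LazyArchive` is a hypothetical stand-in for a pull-based archive reader that emits one entry per `readEntry()` call, and `processArchive` is an illustrative name. The sketch also records failed paths in a `catch` block, which the thread does not mention; the point is that `readEntry()` sits in `finally`, so a failing `onEntry` does not stall the iteration.

```typescript
interface Entry {
  path: string;
  content: string;
}

// Hypothetical pull-based archive: emits one entry per readEntry() call.
class LazyArchive {
  private position = 0;

  constructor(
    private readonly entries: Entry[],
    private readonly handleEntry: (entry: Entry, archive: LazyArchive) => void,
    private readonly handleEnd: () => void
  ) {}

  readEntry(): void {
    if (this.position < this.entries.length) {
      this.handleEntry(this.entries[this.position++], this);
    } else {
      this.handleEnd();
    }
  }
}

// Visits every entry, awaiting onEntry for each one. Resolves with the paths
// of entries whose handler threw.
function processArchive(
  entries: Entry[],
  onEntry: (entry: Entry) => Promise<void>
): Promise<string[]> {
  return new Promise<string[]>((resolve) => {
    const failedPaths: string[] = [];
    const archive = new LazyArchive(
      entries,
      async (entry, self) => {
        try {
          await onEntry(entry);
        } catch {
          failedPaths.push(entry.path);
        } finally {
          // Always request the next entry, even if onEntry failed.
          self.readEntry();
        }
      },
      () => resolve(failedPaths)
    );
    archive.readEntry(); // start reading
  });
}
```

Without the `finally`, an error thrown by `onEntry` would skip the `readEntry()` call, and the promise returned by `processArchive` would never settle.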

Member

@nchaulet nchaulet left a comment

Tested locally and it seems to work well. LGTM 🚀

It would be great to add an API integration test that installs a streaming package. Maybe we can add a security_detection_engine package with a few rules in x-pack/test/fleet_api_integration/apis/fixtures/test_packages and try to install it? It could also come as a follow-up issue/PR.

@xcrzx xcrzx force-pushed the stream-based-installation branch from 75eea3e to e374721 Compare November 4, 2024 10:12
@xcrzx xcrzx force-pushed the stream-based-installation branch from e374721 to d6a3036 Compare November 5, 2024 10:03
xcrzx commented Nov 5, 2024

It would be great to add an API integration test that installs a streaming package. Maybe we can add a security_detection_engine package with a few rules in x-pack/test/fleet_api_integration/apis/fixtures/test_packages and try to install it?

Added an integration test: x-pack/test/fleet_api_integration/apis/epm/install_with_streaming.ts 👍

@xcrzx xcrzx merged commit 67cdb93 into elastic:main Nov 5, 2024
26 checks passed
@kibanamachine

Starting backport for target branches: 8.x

https://github.com/elastic/kibana/actions/runs/11683891974

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Nov 5, 2024
…am-based approach (elastic#195888)

(cherry picked from commit 67cdb93)
@kibanamachine

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

@xcrzx xcrzx deleted the stream-based-installation branch November 5, 2024 13:35
kibanamachine added a commit that referenced this pull request Nov 5, 2024
…g stream-based approach (#195888) (#198936)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Fleet] [Security Solution] Install prebuilt rules package using
stream-based approach
(#195888)](#195888)

<!--- Backport version: 9.4.3 -->


<!--BACKPORT [{"author":{"name":"Dmitrii
Shevchenko","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-11-05T12:11:47Z","message":"[Fleet]
objects; 118MB\r\nzipped.\r\n5. Local package registry.\r\n6. Production
build of Kibana running locally with a 700MB max old space\r\nlimit,
pointed to that registry.\r\n\r\nSetting up a test environment is not
completely straightforward. Here's\r\na rough outline of the
steps:\r\n<details>\r\n<summary>\r\nHow to test this
PR\r\n</summary>\r\n\r\n1. Create a package containing a large number of
prebuilt rules.\r\n1. I used the `package-storage` repository to find
one of the previously\r\nreleased prebuilt rules packages.\r\n2.
Multiplied the number of assets in the package to 20k
historical\r\nversions.\r\n 4. Built the package using `elastic-package
build`.\r\n2. Start a local package registry serving the built package
using\r\n`elastic-package stack up --services package-registry`.\r\n4.
Create a production build of Kibana. To speed up the
process,\r\nunnecessary artifacts can be skipped:\r\n ```\r\nnode
scripts/build --skip-cdn-assets
--skip-docker-ubi\r\n--skip-docker-ubuntu --skip-docker-wolfi
--skip-docker-fips\r\n ```\r\n7. Provide the built Kibana with a config
pointing to the local\r\nregistry. The config is located
in\r\n`build/default/kibana-9.0.0-SNAPSHOT-darwin-aarch64/config/kibana.yml`.\r\nYou
can use the following config:\r\n ```\r\n csp.strict:
false\r\nxpack.security.encryptionKey: 've4Vohnu oa0Fu9ae Eethee8c
oDieg4do\r\nNohrah1u ao9Hu2oh Aeb4Ieyi
Aew1aegi'\r\nxpack.encryptedSavedObjects.encryptionKey: 'Shah7nai
Eew6izai Eir7OoW0\r\nGewi2ief eiSh8woo shoogh7E Quae6hal
ce6Oumah'\r\n\r\n
xpack.fleet.internal.registry.kibanaVersionCheckEnabled: false\r\n
xpack.fleet.registryUrl: https://localhost:8080\r\n\r\n
elasticsearch:\r\n username: 'kibana_system'\r\n password:
'changeme'\r\n hosts: 'http://localhost:9200'\r\n ```\r\n8. Override the
Node options Kibana starts with to allow it to connect\r\nto the local
registry and set the memory limit. For this, you need to\r\nedit the
`build/default/kibana-9.0.0-SNAPSHOT-darwin-aarch64/bin/kibana`\r\nfile:\r\n
```\r\nNODE_OPTIONS=\"--no-warnings
--max-http-header-size=65536\r\n--unhandled-rejections=warn
--dns-result-order=ipv4first\r\n--openssl-legacy-provider
--max_old_space_size=700
--inspect\"\r\nNODE_ENV=production\r\nNODE_EXTRA_CA_CERTS=~/.elastic-package/profiles/default/certs/ca-cert.pem\r\nexec
\"${NODE}\" \"${DIR}/src/cli/dist\" \"${@}\"\r\n ```\r\n9. Navigate to
the build
folder:\r\n`build/default/kibana-9.0.0-SNAPSHOT-darwin-aarch64`.\r\n10.
Start Kibana using `./bin/kibana`.\r\n11. Kibana is now running in debug
mode, with the debugger started on\r\nport 9229. You can connect to it
using VS Code's debug config or\r\nChrome's DevTools.\r\n12. Now you can
install prebuilt detection rules by calling the
`POST\r\n/internal/detection_engine/prebuilt_rules/_bootstrap` endpoint,
which\r\nuses the new streaming installation under the
hood.\r\n\r\n</details>\r\n\r\n### Test results locally\r\n\r\n**Without
the streaming approach**\r\n\r\nGuaranteed OOM. Even smaller packages,
up to 10k rules, caused sporadic\r\nOOM errors. So for comparison,
tested the package installation without\r\nmemory
limits.\r\n\r\n![Screenshot 2024-10-14 at 14
15\r\n26](https://github.com/user-attachments/assets/131cb877-2404-4638-b619-b1370a53659f)\r\n\r\n1.
Heap memory usage spikes up to 2.5GB\r\n5. External memory consumes up
to 450 Mb, which is four times the\r\narchive size\r\n13. RSS (Resident
Set Size) exceeds 4.5GB\r\n\r\n**With the streaming approach**\r\n\r\nNo
OOM errors observed. The memory consumption chart looks like
the\r\nfollowing:\r\n\r\n![Screenshot 2024-10-14 at 11
15\r\n21](https://github.com/user-attachments/assets/b47ba8c9-2ba7-42de-b921-c33104d4481e)\r\n\r\n1.
Heap memory remains stable, around 450MB, without any spikes.\r\n2.
External memory jumps to around 250MB at the beginning of
the\r\ninstallation, then drops to around 120MB, which is roughly equal
to the\r\npackage archive size. I couldn't determine why the external
memory\r\nconsumption exceeds the package size by 2x when the
installation starts.\r\nI checked the code for places where the package
might be loaded into\r\nmemory twice but found nothing suspicious. This
might be worth\r\ninvestigating further.\r\n3. RSS remains stable,
peaking slightly above 1GB. I believe this is the\r\nupper limit for a
package that can be handled without errors in a\r\nServerless
environment, where the memory limit is dictated by pod-level\r\nsettings
rather than Node settings and is set to 1GB. I'll verify this\r\non a
real Serverless instance to confirm.\r\n\r\n### Test results on
Serverless\r\n\r\n![Screenshot 2024-10-31 at 12
31\r\n34](https://github.com/user-attachments/assets/d20d2860-fa96-4e56-be2b-7b3c0b5c7b77)","sha":"67cdb93f5b800caac80672c942d04afe4d7aa4d8"}},{"branch":"8.x","label":"v8.17.0","branchLabelMappingKey":"^v8.17.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->

Co-authored-by: Dmitrii Shevchenko <[email protected]>
xcrzx added a commit that referenced this pull request Nov 7, 2024
…tion (#199122)

**Related to: #195888**

## Summary

Add default index pattern creation to the new stream-based package
installation method to match the behavior of standard package
installation.

Switching to stream-based package installation meant the default index
patterns were no longer created, even after the rules package was
installed. This likely doesn't affect production, since multiple
integrations are usually installed in Kibana and any of them creates the
default index patterns, but it broke some tests: #199030. This change
restores the original behavior.
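
A minimal sketch of the idea, not the actual Fleet code: the stream-based path writes assets in batches and then runs the same "ensure default index patterns" step as the standard path. `InMemoryStore`, `ensureDefaultIndexPatterns`, `installAssetsInBatches`, the `security-rule` type string, and the pattern names in `DEFAULT_INDEX_PATTERNS` are all assumptions made for illustration; none of them is taken from this PR.

```typescript
// Hypothetical saved-object shape and store; stand-ins for illustration only,
// not the Kibana saved objects client.
interface SavedObject {
  type: string;
  id: string;
}

class InMemoryStore {
  private readonly objects = new Map<string, SavedObject>();

  has(type: string, id: string): boolean {
    return this.objects.has(`${type}:${id}`);
  }

  create(obj: SavedObject): void {
    this.objects.set(`${obj.type}:${obj.id}`, obj);
  }

  count(): number {
    return this.objects.size;
  }
}

// Assumed default patterns; the PR text does not list them.
const DEFAULT_INDEX_PATTERNS: string[] = ["logs-*", "metrics-*"];

// Creates any default index pattern that is not already present and
// returns the ids created by this call (empty if nothing was missing).
function ensureDefaultIndexPatterns(store: InMemoryStore): string[] {
  const created: string[] = [];
  for (const id of DEFAULT_INDEX_PATTERNS) {
    if (!store.has("index-pattern", id)) {
      store.create({ type: "index-pattern", id });
      created.push(id);
    }
  }
  return created;
}

// Streaming-style install: write assets one batch at a time, then run the
// index pattern step so the result matches the standard installation path.
function installAssetsInBatches(
  store: InMemoryStore,
  assets: SavedObject[],
  batchSize: number = 100
): string[] {
  for (let i = 0; i < assets.length; i += batchSize) {
    for (const asset of assets.slice(i, i + batchSize)) {
      store.create(asset);
    }
  }
  return ensureDefaultIndexPatterns(store);
}
```

Because the index pattern step only creates what is missing, running it after every installation is safe whether or not another integration has already created the patterns.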
mbondyra pushed a commit to mbondyra/kibana that referenced this pull request Nov 8, 2024
…tion (elastic#199122)

xcrzx added a commit to xcrzx/kibana that referenced this pull request Nov 8, 2024
…tion (elastic#199122)


(cherry picked from commit 22d3e62)

# Conflicts:
#	x-pack/test/security_solution_cypress/cypress/e2e/investigations/sourcerer/sourcerer_timeline.cy.ts
Labels
- backport:version (Backport to applied version labels)
- ci:project-deploy-security (Create a Security Serverless Project)
- Feature:Prebuilt Detection Rules (Security Solution Prebuilt Detection Rules area)
- performance
- release_note:skip (Skip the PR/issue when compiling release notes)
- Team:Detection Rule Management (Security Detection Rule Management Team)
- Team:Detections and Resp (Security Detection Response Team)
- Team:Fleet (Team label for Observability Data Collection Fleet team)
- Team: SecuritySolution (Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc.)
- v8.17.0
- v9.0.0
Development

Successfully merging this pull request may close these issues.

[Security Solution] Stream-based installation of the package with prebuilt rules
4 participants