Commit
[8.x] [Fleet] [Security Solution] Install prebuilt rules package using stream-based approach (#195888) (#198936)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Fleet] [Security Solution] Install prebuilt rules package using stream-based approach (#195888)](https://github.com/elastic/kibana/pull/195888)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

### Backport metadata

- Author: Dmitrii Shevchenko <[email protected]>
- Source branch: `main`; suggested target branches: `8.x`
- Source commit: `67cdb93f5b800caac80672c942d04afe4d7aa4d8`, committed 2024-11-05T12:11:47Z
- Source pull request: [#195888](https://github.com/elastic/kibana/pull/195888), "[Fleet] [Security Solution] Install prebuilt rules package using stream-based approach"
- Labels: `performance`, `release_note:skip`, `Team:Fleet`, `v9.0.0`, `Team:Detections and Resp`, `Team: SecuritySolution`, `Team:Detection Rule Management`, `Feature:Prebuilt Detection Rules`, `ci:project-deploy-security`, `backport:version`, `v8.17.0`
- Branch label mapping: `^v9.0.0$` maps to `main`, `^v8.17.0$` maps to `8.x`, `^v(\d+).(\d+).\d+$` maps to `$1.$2`
- Target pull request states: `main` (label `v9.0.0`, source branch) is MERGED as [#195888](https://github.com/elastic/kibana/pull/195888); `8.x` (label `v8.17.0`) is NOT_CREATED

The metadata embeds the source commit message, reproduced once below.

[Fleet] [Security Solution] Install prebuilt rules package using stream-based approach (#195888)

**Resolves: https://github.com/elastic/kibana/issues/192350**

## Summary

Implemented stream-based installation of the detection rules package.

**Background**: Installing the detection rules package was causing OOM (Out of Memory) errors in Serverless environments, where the available memory is limited to 1GB. The root cause was that, during installation, the package was read and unzipped entirely into memory. Given the large package size, this led to OOMs. To address these memory issues, the following changes were made:

1. Added branching logic to the `installPackageFromRegistry` and `installPackageByUpload` methods that decides, based on the package name, whether to use streaming. Currently only the `security_detection_engine` package is hardcoded to use streaming.
2. Defined a separate set of steps in the state machine for stream-based package installation. At this stage it covers only Kibana assets installation.
3. Added a new `stepInstallKibanaAssetsWithStreaming` step to handle assets installation. This method still reads the package archive into memory (since unzipping from a readable stream is [not possible due to the design of the .zip format](https://github.com/thejoshwolfe/yauzl?tab=readme-ov-file#no-streaming-unzip-api)), but the package is unzipped using streams after being read into a buffer. This allows only a small portion of the archive (100 saved objects at a time) to be unpacked into memory, reducing memory usage.
4. The new method also includes several optimizations, such as removing previously installed assets only if they are missing from the new package and using `savedObjectClient.bulkCreate` instead of the less efficient `savedObjectClient.import`.

### Test environment

1. Prebuilt detection rules package with ~20k saved objects; 118MB zipped.
2. Local package registry.
3. Production build of Kibana running locally with a 700MB max old space limit, pointed to that registry.

Setting up a test environment is not completely straightforward. Here's a rough outline of the steps:
<details>
<summary>
How to test this PR
</summary>

1. Create a package containing a large number of prebuilt rules.
   1. I used the `package-storage` repository to find one of the previously released prebuilt rules packages.
   2. Multiplied the assets in the package to reach 20k historical versions.
   3. Built the package using `elastic-package build`.
2. Start a local package registry serving the built package using `elastic-package stack up --services package-registry`.
3. Create a production build of Kibana. To speed up the process, unnecessary artifacts can be skipped:
   ```
   node scripts/build --skip-cdn-assets --skip-docker-ubi --skip-docker-ubuntu --skip-docker-wolfi --skip-docker-fips
   ```
4. Provide the built Kibana with a config pointing to the local registry. The config is located in `build/default/kibana-9.0.0-SNAPSHOT-darwin-aarch64/config/kibana.yml`. You can use the following config:
   ```
   csp.strict: false
   xpack.security.encryptionKey: 've4Vohnu oa0Fu9ae Eethee8c oDieg4do Nohrah1u ao9Hu2oh Aeb4Ieyi Aew1aegi'
   xpack.encryptedSavedObjects.encryptionKey: 'Shah7nai Eew6izai Eir7OoW0 Gewi2ief eiSh8woo shoogh7E Quae6hal ce6Oumah'

   xpack.fleet.internal.registry.kibanaVersionCheckEnabled: false
   xpack.fleet.registryUrl: https://localhost:8080

   elasticsearch:
     username: 'kibana_system'
     password: 'changeme'
     hosts: 'http://localhost:9200'
   ```
5. Override the Node options Kibana starts with to allow it to connect to the local registry and to set the memory limit. For this, you need to edit the `build/default/kibana-9.0.0-SNAPSHOT-darwin-aarch64/bin/kibana` file:
   ```
   NODE_OPTIONS="--no-warnings --max-http-header-size=65536 --unhandled-rejections=warn --dns-result-order=ipv4first --openssl-legacy-provider --max_old_space_size=700 --inspect" \
     NODE_ENV=production \
     NODE_EXTRA_CA_CERTS=~/.elastic-package/profiles/default/certs/ca-cert.pem \
     exec "${NODE}" "${DIR}/src/cli/dist" "${@}"
   ```
6. Navigate to the build folder: `build/default/kibana-9.0.0-SNAPSHOT-darwin-aarch64`.
7. Start Kibana using `./bin/kibana`.
8. Kibana is now running in debug mode, with the debugger started on port 9229. You can connect to it using VS Code's debug config or Chrome's DevTools.
9. Now you can install prebuilt detection rules by calling the `POST /internal/detection_engine/prebuilt_rules/_bootstrap` endpoint, which uses the new streaming installation under the hood.

</details>

### Test results locally

**Without the streaming approach**

Guaranteed OOM. Even smaller packages, up to 10k rules, caused sporadic OOM errors. So, for comparison, I tested the package installation without memory limits.

![Screenshot 2024-10-14 at 14 15 26](https://github.com/user-attachments/assets/131cb877-2404-4638-b619-b1370a53659f)

1. Heap memory usage spikes up to 2.5GB.
2. External memory consumes up to 450MB, which is roughly four times the archive size.
3. RSS (Resident Set Size) exceeds 4.5GB.

**With the streaming approach**

No OOM errors observed. The memory consumption chart looks like the following:

![Screenshot 2024-10-14 at 11 15 21](https://github.com/user-attachments/assets/b47ba8c9-2ba7-42de-b921-c33104d4481e)

1. Heap memory remains stable, around 450MB, without any spikes.
2. External memory jumps to around 250MB at the beginning of the installation, then drops to around 120MB, which is roughly equal to the package archive size. I couldn't determine why external memory consumption is about twice the package size when the installation starts. I checked the code for places where the package might be loaded into memory twice but found nothing suspicious. This might be worth investigating further.
3. RSS remains stable, peaking slightly above 1GB. I believe this is the upper limit for a package that can be handled without errors in a Serverless environment, where the memory limit is dictated by pod-level settings rather than Node settings and is set to 1GB. I'll verify this on a real Serverless instance to confirm.

### Test results on Serverless

![Screenshot 2024-10-31 at 12 31 34](https://github.com/user-attachments/assets/d20d2860-fa96-4e56-be2b-7b3c0b5c7b77)

Co-authored-by: Dmitrii Shevchenko <[email protected]>
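
The batching idea from the summary (unpack roughly 100 saved objects at a time, and remove only the previously installed assets that are missing from the new package) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `ArchiveAsset`, `inBatches`, `fakeArchiveEntries`, `findStaleAssetIds`, and `installInBatches` are hypothetical names, and the injected `bulkCreate` callback only stands in for a call such as `savedObjectClient.bulkCreate`.

```typescript
// Hypothetical shape of one Kibana asset read from the package archive.
interface ArchiveAsset {
  id: string;
  type: string;
  attributes: Record<string, unknown>;
}

// Groups items from an async source into arrays of at most `size` items,
// so that only one batch of full objects is held in memory at a time.
async function* inBatches<T>(source: AsyncIterable<T>, size: number): AsyncGenerator<T[]> {
  if (size < 1) throw new RangeError('size must be at least 1');
  let batch: T[] = [];
  for await (const item of source) {
    batch.push(item);
    if (batch.length === size) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // flush the final, possibly smaller batch
}

// Stand-in for entries produced one by one while unzipping an archive.
async function* fakeArchiveEntries(count: number): AsyncGenerator<ArchiveAsset> {
  for (let i = 0; i < count; i++) {
    yield { id: `rule-${i}`, type: 'security-rule', attributes: { version: 1 } };
  }
}

// Returns the ids of previously installed assets that the new package no longer contains.
function findStaleAssetIds(previouslyInstalledIds: string[], newIds: Set<string>): string[] {
  return previouslyInstalledIds.filter((id) => !newIds.has(id));
}

// Writes assets batch by batch and records only their ids (not the full objects).
async function installInBatches(
  entries: AsyncIterable<ArchiveAsset>,
  bulkCreate: (batch: ArchiveAsset[]) => Promise<void>,
  batchSize = 100
): Promise<{ batches: number; installedIds: Set<string> }> {
  const installedIds = new Set<string>();
  let batches = 0;
  for await (const batch of inBatches(entries, batchSize)) {
    await bulkCreate(batch);
    for (const asset of batch) installedIds.add(asset.id);
    batches++;
  }
  return { batches, installedIds };
}
```

In this sketch, calling `installInBatches(fakeArchiveEntries(250), write)` invokes `write` with batches of 100, 100, and 50 assets, and `findStaleAssetIds` can then be applied to the returned `installedIds` to decide what to delete. Keeping only ids between batches is one way to keep the heap footprint independent of the package size.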
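
The local test results compare three figures: heap, external memory, and RSS. A hedged sketch of how such peaks could be recorded during an installation is shown below. The names `MemorySample`, `toMB`, and `createPeakTracker` are hypothetical and not part of Kibana; the reader function is injected so the block stays self-contained, and in Node.js it could be `() => process.memoryUsage()`, whose result includes `heapUsed`, `external`, and `rss` in bytes.

```typescript
// Subset of the fields reported by Node's process.memoryUsage(), in bytes.
interface MemorySample {
  heapUsed: number;
  external: number;
  rss: number;
}

// Converts bytes to whole megabytes (1 MB = 1024 * 1024 bytes here).
const toMB = (bytes: number): number => Math.round(bytes / (1024 * 1024));

// Tracks the highest value seen for each metric across repeated samples.
function createPeakTracker(read: () => MemorySample) {
  const peak: MemorySample = { heapUsed: 0, external: 0, rss: 0 };
  return {
    // Takes one reading and raises the stored peaks where the reading is higher.
    sample(): void {
      const now = read();
      peak.heapUsed = Math.max(peak.heapUsed, now.heapUsed);
      peak.external = Math.max(peak.external, now.external);
      peak.rss = Math.max(peak.rss, now.rss);
    },
    // Reports the peaks in megabytes.
    peakInMB(): MemorySample {
      return {
        heapUsed: toMB(peak.heapUsed),
        external: toMB(peak.external),
        rss: toMB(peak.rss),
      };
    },
  };
}
```

A caller would typically invoke `sample()` on a timer or after each batch while the installation runs, then log `peakInMB()` at the end. This gives only coarse peaks; the charts in the PR description were presumably taken with a profiler, which this sketch does not replace.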