Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fleet] Cache call to getBundledPackages #172640

Merged
merged 10 commits into from
Dec 8, 2023

Conversation

nchaulet
Copy link
Member

@nchaulet nchaulet commented Dec 5, 2023

Summary

getBundledPackages read all the bundled package from disk every time it's called, if that function is called concurrently it will lead to OOM errors, as reading all packages is quite memory consuming.

That draft PR propose a simple in memory cache implementation, and open the discussion on how we can fix/improve that.

I tested this locally with a --max_old_space_size=717 and there is no OOM memory anymore. @kpollich if it's the way we want to go I can fix the remaining tests

I also changed the interface of BundledPackage to lazily get the buffer only when we need.

@nchaulet nchaulet added bug Fixes for quality problems that affect the customer experience Team:Fleet Team label for Observability Data Collection Fleet team labels Dec 5, 2023
@nchaulet nchaulet self-assigned this Dec 5, 2023
@apmmachine
Copy link
Contributor

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • /oblt-deploy-serverless : Deploy a serverless Kibana instance using the Observability test environments.
  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@nchaulet nchaulet force-pushed the feature-cache-bundled-packags branch from d0e950a to 50e617d Compare December 5, 2023 21:47
@kpollich
Copy link
Member

kpollich commented Dec 6, 2023

I think this is a good path forward, and I'd be happy to see the remaining tests fixed. Do you have any concerns with this approach based on what we've been discussing in Slack?

@nchaulet
Copy link
Member Author

nchaulet commented Dec 6, 2023

I think this is a good path forward, and I'd be happy to see the remaining tests fixed. Do you have any concerns with this approach based on what we've been discussing in Slack?

This will add a little memory overhead, but for bundled packages, I think we maybe able to do something a little smarter by changing the BundledPackage to something like this

interface BundledPackage
{
  pkgName: string,
  pkgVersion: string,
  buffer: async () => Buffer // Change here to read buffer only when we need it,
}

What do you think? the only part that will be store in memory will the pkgName, version and the buffer will only be retrieved when needed

@kpollich
Copy link
Member

kpollich commented Dec 6, 2023

Yeah that change to bundled package objects makes sense to me.

@nchaulet nchaulet marked this pull request as ready for review December 6, 2023 19:30
@nchaulet nchaulet requested review from a team as code owners December 6, 2023 19:30
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

@nchaulet
Copy link
Member Author

nchaulet commented Dec 6, 2023

@elasticmachine merge upstream

@botelastic botelastic bot added the Team:APM All issues that need APM UI Team support label Dec 6, 2023
@nchaulet nchaulet added release_note:skip Skip the PR/issue when compiling release notes and removed Team:APM All issues that need APM UI Team support labels Dec 6, 2023
@elasticmachine
Copy link
Contributor

Pinging @elastic/apm-ui (Team:APM)

Copy link
Member

@kpollich kpollich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚀

@botelastic botelastic bot added the Team:APM All issues that need APM UI Team support label Dec 7, 2023
@nchaulet
Copy link
Member Author

nchaulet commented Dec 7, 2023

@kpollich do you think we should backport this one? I think we should

@kpollich
Copy link
Member

kpollich commented Dec 7, 2023

Yes let's backport this - it will fix a substantial memory issue with Fleet so worth the backport.

@nchaulet nchaulet added the backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) label Dec 7, 2023
@nchaulet
Copy link
Member Author

nchaulet commented Dec 7, 2023

@elasticmachine merge upstream

@smith smith added Team:obs-ux-infra_services Observability Infrastructure & Services User Experience Team and removed Team:APM All issues that need APM UI Team support labels Dec 7, 2023
@elasticmachine
Copy link
Contributor

Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)

const { pkgName, pkgVersion } = splitPkgKey(zipFile.replace(/\.zip$/, ''));

const getBuffer = () => fs.readFile(path.join(bundledPackageLocation, zipFile));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drive-by comment: maybe this could be wrapped in once()? Behaviour should be similar as before (this change), but no risk in reading from disk multiple times if the function is called multiple times for whatever reason as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dgieselaar with the once this will cache the file in memory forever right? it's probably something we want to avoid, we are trying to reduce the fleet memory footprint in kibana too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what we try to achieve with that PR is to avoid to load all the bundled package in memory when we call getBundledPackage (that is called in a lot of place in fleet code)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with the once this will cache the file in memory forever right?
No, only in the scope of this call to getBundledPackages. Which would be the same as before. Wrapping this in once simply means that if downstream getBuffer is called twice, it is only read from disk once. Previously it was also read from disk once - this change introduces the possibility that it is read from disk multiple times for each getBundledPackages call.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dgieselaar that PR changed the getBundledPackages to cache the results, so those bundled package object we are returning are now global, I made a change in d2c575e to make a copy for each getBundledPackages and use a once there does it make sense to you?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, yes, makes sense!

@botelastic botelastic bot added the Team:APM All issues that need APM UI Team support label Dec 7, 2023
@nchaulet nchaulet requested a review from dgieselaar December 8, 2023 00:02
@nchaulet
Copy link
Member Author

nchaulet commented Dec 8, 2023

That PR seems to drastically help with an issue I currently have with bumping the revision of a large number of agent policies

Trigger a policy change with 500 agent policies:
Before that change (look at the external memory(buffer))

Screenshot 2023-12-08 at 8 46 57 AM

With that change

Screenshot 2023-12-08 at 8 54 48 AM

Copy link
Contributor

@cauemarcondes cauemarcondes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nchaulet nchaulet enabled auto-merge (squash) December 8, 2023 16:07
@kibana-ci
Copy link
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] Detection Engine - Security Solution Cypress Tests #6 / Alert user assignment - ESS & Serverless Basic rendering alert with some assignees in alerts table alert with some assignees in alerts table

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @nchaulet

@nchaulet nchaulet merged commit 2c0e988 into elastic:main Dec 8, 2023
40 checks passed
@nchaulet nchaulet deleted the feature-cache-bundled-packags branch December 8, 2023 17:11
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Dec 8, 2023
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.12

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Dec 8, 2023
# Backport

This will backport the following commits from `main` to `8.12`:
- [[Fleet] Cache call to getBundledPackages
(#172640)](#172640)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Nicolas
Chaulet","email":"[email protected]"},"sourceCommit":{"committedDate":"2023-12-08T17:06:06Z","message":"[Fleet]
Cache call to getBundledPackages
(#172640)","sha":"2c0e98818729fe5988e17ba53c5e4752f3364039","branchLabelMapping":{"^v8.13.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["bug","Team:APM","release_note:skip","Team:Fleet","backport:prev-minor","Team:obs-ux-infra_services","apm:review","v8.13.0"],"number":172640,"url":"https://github.com/elastic/kibana/pull/172640","mergeCommit":{"message":"[Fleet]
Cache call to getBundledPackages
(#172640)","sha":"2c0e98818729fe5988e17ba53c5e4752f3364039"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v8.13.0","labelRegex":"^v8.13.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/172640","number":172640,"mergeCommit":{"message":"[Fleet]
Cache call to getBundledPackages
(#172640)","sha":"2c0e98818729fe5988e17ba53c5e4752f3364039"}}]}]
BACKPORT-->

Co-authored-by: Nicolas Chaulet <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
apm:review backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) bug Fixes for quality problems that affect the customer experience release_note:skip Skip the PR/issue when compiling release notes Team:APM All issues that need APM UI Team support Team:Fleet Team label for Observability Data Collection Fleet team Team:obs-ux-infra_services Observability Infrastructure & Services User Experience Team v8.12.0 v8.13.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants