Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Heartbeat] Better defaults for browser concurrency #32082

Closed
andrewvc opened this issue Jun 23, 2022 · 5 comments · Fixed by #32564
Closed

[Heartbeat] Better defaults for browser concurrency #32082

andrewvc opened this issue Jun 23, 2022 · 5 comments · Fixed by #32564
Assignees
Labels
enhancement Heartbeat Team:obs-ds-hosted-services Label for the Observability Hosted Services team

Comments

@andrewvc
Copy link
Contributor

When running browser monitors we should automatically limit the number of concurrent browser monitors based on available system memory, requiring 750MiB free per monitor.

This should limit problems when running monitors via fleet, where concurrently running monitors can quickly trigger an OOM.

See also #23687 .

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jun 23, 2022
@andrewvc andrewvc added enhancement Heartbeat Team:obs-ds-hosted-services Label for the Observability Hosted Services team labels Jun 23, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/uptime (Team:Uptime)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jun 23, 2022
@andrewvc
Copy link
Contributor Author

andrewvc commented Jul 5, 2022

@emilioalvap pointed out that in containers this is tricky to impossible, so, we should probably just default this number to 1 and let users override it

@andrewvc andrewvc changed the title [Heartbeat] Automatically set browser concurrency based on memory [Heartbeat] Better defaults for browser concurrency Jul 5, 2022
@andrewvc andrewvc self-assigned this Jul 19, 2022
@andrewvc
Copy link
Contributor Author

andrewvc commented Jul 19, 2022

Looking at docker on my system /sys/fs/cgroup/memory/memory.limit_in_bytes does seem to set the right amount.

I propose that we:

  1. Check for this file, and set our limit based on 1.5GiB per monitor
  2. If not present default to 2 monitors concurrently and warn about this.

Any objection? Pulled from this SO answer: https://stackoverflow.com/questions/42187085/check-mem-limit-within-a-docker-container

@emilioalvap
Copy link
Collaborator

Not an objection but adding an example as to why I'm wary of this feature.

  • /sys/fs/cgroup/memory/memory.limit_in_bytes is only available for cgroup v1, if the kernel is using cgroup v2, the correct file to check would be /sys/fs/cgroup/memory.max. And we'd need to compute available memory as cgroups can be shared, in theory.

For reference, this tool implements a similar check: https://github.com/freeipa/freeipa/blob/3237ade3d2df20c3aeba4405f46a45a2130fbc7e/ipaserver/install/installutils.py#L1084

andrewvc added a commit to andrewvc/beats that referenced this issue Aug 1, 2022
andrewvc added a commit that referenced this issue Aug 1, 2022
Fixes #32082 by limiting browser jobs to 2. Users can still override this with the global heartbeat.jobs.browser.limit: 42.

This also fixes a previously unknown bug, where changing the heartbeat.jobs.*.limit values would cause heartbeat to crash with a panic. This appears to be a bug ingo-ucfg. The workaround this PR adds is moving from map[string]JobLimit to map[string]*JobLimit, which doesn't trigger the same reflection issues.
mergify bot pushed a commit that referenced this issue Aug 1, 2022
Fixes #32082 by limiting browser jobs to 2. Users can still override this with the global heartbeat.jobs.browser.limit: 42.

This also fixes a previously unknown bug, where changing the heartbeat.jobs.*.limit values would cause heartbeat to crash with a panic. This appears to be a bug ingo-ucfg. The workaround this PR adds is moving from map[string]JobLimit to map[string]*JobLimit, which doesn't trigger the same reflection issues.

(cherry picked from commit cdd37ca)
andrewvc added a commit to andrewvc/beats that referenced this issue Aug 1, 2022
@paulb-elastic paulb-elastic assigned lucasfcosta and unassigned andrewvc Aug 2, 2022
andrewvc added a commit that referenced this issue Aug 2, 2022
Fixes #32082 by limiting browser jobs to 2. Users can still override this with the global heartbeat.jobs.browser.limit: 42.

This also fixes a previously unknown bug, where changing the heartbeat.jobs.*.limit values would cause heartbeat to crash with a panic. This appears to be a bug ingo-ucfg. The workaround this PR adds is moving from map[string]JobLimit to map[string]*JobLimit, which doesn't trigger the same reflection issues.

(cherry picked from commit cdd37ca)

Co-authored-by: Andrew Cholakian <[email protected]>
@lucasfcosta
Copy link
Contributor

I've just tested this PR by:

  1. Creating 12 inline browser monitors in my Heartbeat config
  2. Starting Heartbeat itself
  3. Check that there were only 2 monitors as child processes of Heartbeat
    Screenshot 2022-08-04 at 12 51 40
  4. Change the heartbeat.jobs.browser.limit setting to 4
  5. Check that there were actually 4 child processes
    Screenshot 2022-08-04 at 12 53 27

chrisberkhout pushed a commit that referenced this issue Jun 1, 2023
Fixes #32082 by limiting browser jobs to 2. Users can still override this with the global heartbeat.jobs.browser.limit: 42.

This also fixes a previously unknown bug, where changing the heartbeat.jobs.*.limit values would cause heartbeat to crash with a panic. This appears to be a bug ingo-ucfg. The workaround this PR adds is moving from map[string]JobLimit to map[string]*JobLimit, which doesn't trigger the same reflection issues.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Heartbeat Team:obs-ds-hosted-services Label for the Observability Hosted Services team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants