Disclaimer about chromium and electron apps being premature victims on larger systems #261

Open
JPvRiel opened this issue Oct 7, 2021 · 0 comments


JPvRiel commented Oct 7, 2021

earlyoom works reasonably well, but doesn't fully mitigate the issues caused by the inappropriate hard-coded OOM score adjustments in chrome and Electron apps (vscode). Maybe we can amend the README with a clearer warning about this complication?

#230 already covered this to some extent and was closed, but I found the following part of the README a bit misleading:

Why not trigger the kernel oom killer?

I tried out earlyoom because the README made it sound like it's somehow smarter and better aimed at the memory hogs. However, it doesn't avoid the issue of aggressive oom_score_adj values. In my own testing with 6 parallel memory-hogging python processes, earlyoom selects and terminates really small vscode (Electron) processes (with oom_score_adj 200) instead of the several big memory-hogging python processes.

As shown below, it's unfortunate that an innocent 5MB RSS code process gets a higher badness score and is killed before a python3 process that has a slightly lower badness score but 5380MB RSS. That's a 1000x bigger memory hog being skipped! Also, killing a half-dozen small processes before a much bigger one doesn't do much to relieve the memory pressure...

Oct 07 19:16:10 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:10 <hostname> earlyoom[26328]: sending SIGTERM to process 57834 uid 1000 "code": badness 866, VmRSS 5 MiB
Oct 07 19:16:10 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:10 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:10 <hostname> earlyoom[26328]: sending SIGTERM to process 63745 uid 1000 "python3": badness 748, VmRSS 5380 MiB
Oct 07 19:16:11 <hostname> earlyoom[26328]: process exited after 1.2 seconds

The current earlyoom logic does put in some effort to compare RSS, but only when the badness scores are identical.

earlyoom/kill.c

Line 237 in 788fe49

if (cur->badness == victim->badness && cur->VmRSSkiB <= victim->VmRSSkiB) {

That's not enough to mitigate the bad hard-coded oom_score_adj defaults of 300 (chrome) and 200 (Electron, vscode).
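To make the selection order concrete, here is a minimal Python sketch of the comparison described above. It is a simplified model, not earlyoom's actual code: `Proc` and `pick_victim` are hypothetical names, and only the rule "higher badness wins, RSS breaks exact ties" is taken from the quoted line of kill.c. The input values come from the log above.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    name: str
    badness: int   # badness value as logged by earlyoom
    rss_mib: int   # VmRSS as logged by earlyoom


def pick_victim(procs):
    """Simplified model: highest badness wins; RSS only breaks exact ties."""
    victim = None
    for cur in procs:
        if victim is not None:
            if cur.badness < victim.badness:
                continue  # lower badness never wins, however large the RSS
            if cur.badness == victim.badness and cur.rss_mib <= victim.rss_mib:
                continue  # same badness: keep the candidate with the larger RSS
        victim = cur
    return victim


procs = [
    Proc("python3", 748, 5380),
    Proc("code", 866, 5),
]
print(pick_victim(procs).name)  # prints "code"
```

In this model the 5 MiB code process is chosen over the 5380 MiB python3 process, because RSS is never consulted unless the two badness values are equal.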

Tested on my system with hibernation enabled:

  • Ubuntu 20.04 LTS, earlyoom 1.6.2
  • 80GB virtual memory: 32GB RAM + 48GB swap (since 1.5x swap is recommended for hibernation-enabled systems)

For the logic in earlyoom to take effect and treat two processes as competing victims, I've roughly worked out (python math at the end) that oom_score_adj values skew the score on my 80GB of virtual memory as follows (oom_score_adj => memory penalty):

  • 200 => 16GB
  • 300 => 24GB

I.e. the way the badness scores work out, a 1GB vscode process (oom_score_adj 200) is treated as if it's abusing memory as much as a 17GB process.
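A quick way to check those numbers, as a sketch under the same assumption the issue makes (the penalty is oom_score_adj/1000 of RAM + swap). The helper names `penalty_gb` and `equivalent_size_gb` are made up for illustration:

```python
def penalty_gb(oom_score_adj, total_vmem_gb):
    """Memory-equivalent of an oom_score_adj value: adj/1000 of RAM + swap."""
    return oom_score_adj * total_vmem_gb / 1000


def equivalent_size_gb(actual_gb, oom_score_adj, total_vmem_gb):
    """Size of an unadjusted process that would score the same."""
    return actual_gb + penalty_gb(oom_score_adj, total_vmem_gb)


TOTAL_GB = 32 + 48  # RAM + swap on the test system

for adj in (200, 300):
    print(f"{adj} => {penalty_gb(adj, TOTAL_GB):.0f}GB")

# A 1GB vscode process with oom_score_adj 200:
print(f"{equivalent_size_gb(1, 200, TOTAL_GB):.0f}GB")
```

The penalty grows linearly with RAM + swap, which is why the same hard-coded value hurts more on larger systems.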

I wonder if there's any sensible way to add an arg that can weight badness scores slightly less and RSS more so that it scales better for larger systems.
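Purely as an illustration of what such a weighting could look like (this is not an existing earlyoom option; `weighted_score` and `rss_weight` are hypothetical), one could add a weighted RSS share to the badness value. The inputs are the two processes and the RAM and swap totals from the log in this issue:

```python
def weighted_score(badness, rss_mib, total_mib, rss_weight=2.0):
    """Hypothetical blend: badness plus a weighted RSS share (per mille of RAM + swap)."""
    rss_permille = rss_mib * 1000 / total_mib
    return badness + rss_weight * rss_permille


TOTAL_MIB = 31880 + 49151  # RAM + swap totals reported by earlyoom

# name -> (badness, VmRSS in MiB), both taken from the log
candidates = {"code": (866, 5), "python3": (748, 5380)}

ranked = sorted(
    candidates,
    key=lambda name: weighted_score(*candidates[name], TOTAL_MIB),
    reverse=True,
)
print(ranked)  # prints ['python3', 'code']
```

With these two processes the weight has to be above roughly 1.8 before python3 outranks code; at a weight of 1.0 the small code process would still be picked first, so any real option would need careful tuning.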

Yes, --avoid is a way to mitigate this, but with Electron (vscode), which can hog memory, the effective oom_score_adj becomes 200 - 300 = -100? So using --avoid possibly swings the issue out of balance in the opposite direction? Rock and a hard place... E.g. if avoided, code with an effective -100 gets to eat up to 8GB of RAM before it's a level victim with other non-adjusted processes.

Putting chrome into avoid cancels out nicely (300 - 300).
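The two cases can be compared with a small sketch. It assumes, as the arithmetic above implies, that --avoid shifts the score by -300; `AVOID_OFFSET`, `effective_adj` and `headroom_gb` are illustrative names, not part of earlyoom:

```python
AVOID_OFFSET = -300  # assumption: --avoid lowers the score by 300


def effective_adj(app_adj, avoided):
    """App's own oom_score_adj combined with the --avoid offset."""
    return app_adj + (AVOID_OFFSET if avoided else 0)


def headroom_gb(adj, total_vmem_gb):
    """GB a process with a negative effective adjustment can use before it
    is level with an unadjusted process that uses nothing."""
    return max(0.0, -adj * total_vmem_gb / 1000)


TOTAL_GB = 32 + 48  # RAM + swap on the test system

for name, app_adj in (("code", 200), ("chrome", 300)):
    adj = effective_adj(app_adj, avoided=True)
    print(f"{name}: effective adj {adj}, headroom {headroom_gb(adj, TOTAL_GB):.0f}GB")
```

Under these assumptions chrome ends up neutral, while code gets 8GB of headroom on an 80GB system.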

I appreciate this isn't earlyoom's fault/bug/issue, but rather a combo of how cumbersome OOM management and scoring on Linux are, plus chromium being tweaked and hard-coded to cater to 2011 chromebooks with only 2GB RAM and no swap. The way chrome adjusts its score is still broken a decade later!

More complete log of all processes killed: multiple small Visual Studio Code processes got wiped out first, freeing up very little RAM...

Oct 07 19:14:32 <hostname> earlyoom[26328]: mem avail: 27949 of 31880 MiB (87.67%), swap free: 44345 of 49151 MiB (90.22%)
Oct 07 19:16:10 <hostname> earlyoom[26328]: mem avail:   165 of 31880 MiB ( 0.52%), swap free: 19581 of 49151 MiB (39.84%)
Oct 07 19:16:10 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:10 <hostname> earlyoom[26328]: sending SIGTERM to process 57834 uid 1000 "code": badness 866, VmRSS 5 MiB
Oct 07 19:16:10 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:10 <hostname> earlyoom[26328]: mem avail:   207 of 31880 MiB ( 0.65%), swap free: 19327 of 49151 MiB (39.32%)
Oct 07 19:16:10 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:10 <hostname> earlyoom[26328]: sending SIGTERM to process 57192 uid 1000 "code": badness 801, VmRSS 54 MiB
Oct 07 19:16:10 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:10 <hostname> earlyoom[26328]: mem avail:   243 of 31880 MiB ( 0.77%), swap free: 19169 of 49151 MiB (39.00%)
Oct 07 19:16:10 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:10 <hostname> earlyoom[26328]: sending SIGTERM to process 63745 uid 1000 "python3": badness 748, VmRSS 5380 MiB
Oct 07 19:16:11 <hostname> earlyoom[26328]: process exited after 1.2 seconds
Oct 07 19:16:20 <hostname> earlyoom[26328]: mem avail:   191 of 31880 MiB ( 0.60%), swap free: 19555 of 49151 MiB (39.79%)
Oct 07 19:16:20 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:20 <hostname> earlyoom[26328]: sending SIGTERM to process 63900 uid 1000 "code": badness 800, VmRSS 78 MiB
Oct 07 19:16:20 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:20 <hostname> earlyoom[26328]: mem avail:   346 of 31880 MiB ( 1.09%), swap free: 19332 of 49151 MiB (39.33%)
Oct 07 19:16:20 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:20 <hostname> earlyoom[26328]: sending SIGTERM to process 63963 uid 1000 "code": badness 800, VmRSS 2 MiB
Oct 07 19:16:20 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:20 <hostname> earlyoom[26328]: mem avail:   323 of 31880 MiB ( 1.02%), swap free: 19324 of 49151 MiB (39.32%)
Oct 07 19:16:20 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:20 <hostname> earlyoom[26328]: sending SIGTERM to process 63964 uid 1000 "code": badness 800, VmRSS 4 MiB
Oct 07 19:16:20 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:20 <hostname> earlyoom[26328]: mem avail:   335 of 31880 MiB ( 1.05%), swap free: 19154 of 49151 MiB (38.97%)
Oct 07 19:16:20 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:20 <hostname> earlyoom[26328]: sending SIGTERM to process 63966 uid 1000 "code": badness 800, VmRSS 6 MiB
Oct 07 19:16:20 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:20 <hostname> earlyoom[26328]: mem avail:   336 of 31880 MiB ( 1.06%), swap free: 19035 of 49151 MiB (38.73%)
Oct 07 19:16:20 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:20 <hostname> earlyoom[26328]: sending SIGTERM to process 63968 uid 1000 "code": badness 800, VmRSS 6 MiB
Oct 07 19:16:21 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:21 <hostname> earlyoom[26328]: mem avail:   395 of 31880 MiB ( 1.24%), swap free: 18898 of 49151 MiB (38.45%)
Oct 07 19:16:21 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:21 <hostname> earlyoom[26328]: sending SIGTERM to process 63973 uid 1000 "code": badness 800, VmRSS 10 MiB
Oct 07 19:16:21 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:21 <hostname> earlyoom[26328]: mem avail:   419 of 31880 MiB ( 1.31%), swap free: 18814 of 49151 MiB (38.28%)
Oct 07 19:16:21 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:21 <hostname> earlyoom[26328]: sending SIGTERM to process 63975 uid 1000 "code": badness 800, VmRSS 75 MiB
Oct 07 19:16:21 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:21 <hostname> earlyoom[26328]: mem avail:   417 of 31880 MiB ( 1.31%), swap free: 18814 of 49151 MiB (38.28%)
Oct 07 19:16:21 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:21 <hostname> earlyoom[26328]: sending SIGTERM to process 63984 uid 1000 "code": badness 800, VmRSS 76 MiB
Oct 07 19:16:21 <hostname> earlyoom[26328]: mem avail:   356 of 31880 MiB ( 1.12%), swap free: 18685 of 49151 MiB (38.02%)
Oct 07 19:16:21 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:21 <hostname> earlyoom[26328]: sending SIGTERM to process 63740 uid 1000 "python3": badness 764, VmRSS 6354 MiB
Oct 07 19:16:22 <hostname> earlyoom[26328]: process exited after 1.2 seconds
Oct 07 19:16:37 <hostname> earlyoom[26328]: mem avail:   285 of 31880 MiB ( 0.90%), swap free: 19646 of 49151 MiB (39.97%)
Oct 07 19:16:37 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:16:37 <hostname> earlyoom[26328]: sending SIGTERM to process 63743 uid 1000 "python3": badness 791, VmRSS 8055 MiB
Oct 07 19:16:38 <hostname> earlyoom[26328]: process exited after 1.2 seconds
Oct 07 19:17:03 <hostname> earlyoom[26328]: mem avail:   256 of 31880 MiB ( 0.80%), swap free: 19606 of 49151 MiB (39.89%)
Oct 07 19:17:03 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem  5.00%, swap 40.00%
Oct 07 19:17:03 <hostname> earlyoom[26328]: sending SIGTERM to process 63744 uid 1000 "python3": badness 824, VmRSS 10209 MiB
Oct 07 19:17:05 <hostname> earlyoom[26328]: process exited after 1.5 seconds

P.S. I read the kernel source at https://github.com/torvalds/linux/blob/7cca308cfdc0725363ac5943dca9dcd49cc1d2d5/mm/oom_kill.c#L204 and calculated the penalty in python:

def effective_penalty(oom_score_adj=0, ram_g=16, swap_g=24):
    """Return the oom_score_adj penalty expressed in 4 KB memory pages."""
    total_vmem = ram_g + swap_g  # RAM + swap in GB
    total_pages = (total_vmem * 2 ** 30) / (4 * 2 ** 10)  # GB -> number of 4 KB pages
    adj = oom_score_adj * total_pages / 1000  # copied from kernel source logic
    return int(adj)  # penalty in 4 KB pages

# 32GB RAM + 48GB swap, oom_score_adj 200
print(f"{effective_penalty(200, 32, 48) * 4 * 2 ** 10 / 2 ** 30:.3f} GB")  # convert from pages back to GB