earlyoom works reasonably well, but it doesn't fully mitigate the inappropriate hard-coded OOM score adjustments set by Chrome and Electron apps (VS Code). Maybe we can amend the README with a clearer warning about this complication?
#230 already covered this to some extent and was closed, but I found this part of the README a bit misleading: earlyoom/README.md, line 65 in 788fe49.
I tried out earlyoom because the README made it sound like it's somehow smarter and better aimed at the memory hogs. However, it doesn't avoid the issue of aggressive oom_score_adj values. In my own testing with 6 parallel memory-hogging python processes, earlyoom selects and terminates really small VS Code (Electron) processes (with oom_score_adj 200) instead of the several big memory-hogging python processes.
As shown below, an innocent 5MB RSS code process gets a higher badness score and is killed before a python3 process that has only a slightly lower badness score but 5380MB RSS. That's a 1000x bigger memory hog being skipped! Also, killing a half-dozen small processes before a much bigger one does little to relieve memory pressure...
Oct 07 19:16:10 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:10 <hostname> earlyoom[26328]: sending SIGTERM to process 57834 uid 1000 "code": badness 866, VmRSS 5 MiB
Oct 07 19:16:10 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:10 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:10 <hostname> earlyoom[26328]: sending SIGTERM to process 63745 uid 1000 "python3": badness 748, VmRSS 5380 MiB
Oct 07 19:16:11 <hostname> earlyoom[26328]: process exited after 1.2 seconds
The current earlyoom logic does put in some effort to compare RSS, but only when the badness scores are identical (earlyoom/kill.c, line 237 in 788fe49):
if (cur->badness == victim->badness && cur->VmRSSkiB <= victim->VmRSSkiB) {
That's not enough to mitigate the bad hard-coded defaults of 300 (Chrome) and 200 (Electron, i.e. VS Code).
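To make the effect concrete, here is a minimal Python model of that selection order: highest badness wins, and RSS is consulted only on an exact tie. It is a simplified sketch and not earlyoom's actual code; the `Proc` class and `pick_victim` function are hypothetical names, and the sample values are taken from the log in this issue.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    name: str
    badness: int
    rss_mib: int


def pick_victim(procs):
    """Pick the process with the highest badness; RSS only breaks exact ties."""
    victim = None
    for cur in procs:
        if victim is None:
            victim = cur
            continue
        if cur.badness < victim.badness:
            continue  # lower badness never wins, however large its RSS
        if cur.badness == victim.badness and cur.rss_mib <= victim.rss_mib:
            continue  # same badness: keep the larger-RSS candidate
        victim = cur
    return victim


# Values taken from the log above (hypothetical process list for illustration).
procs = [
    Proc("python3", 748, 5380),
    Proc("code", 866, 5),
    Proc("code", 801, 54),
]
victim = pick_victim(procs)
print(victim)
```

In this model the 5 MiB `code` process is chosen even though the `python3` process holds about 1000x more resident memory, because RSS never gets compared unless the badness values are equal.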
Tested on my system with hibernation enabled:
Ubuntu 20.04 LTS, earlyoom 1.6.2
80GB virtual memory: 32GB RAM + 48GB swap (since 1.5x swap is recommended for hibernation-enabled systems)
To work out when earlyoom's logic would take effect and treat two processes as competing victims, I've estimated (python math at the end) how oom_score_adj values skew the score with my 80GB of virtual memory (score => penalty memory):
200 => 16GB
300 => 24GB
I.e. the way the badness scores work out, a 1GB VS Code process (adj score 200) is treated as if it were abusing memory as much as a 17GB process.
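A minimal sketch of that arithmetic, assuming the kernel's oom_badness() adds oom_score_adj * totalpages / 1000 to a process's points, with totalpages covering RAM plus swap. The function name `adj_penalty_gib` is hypothetical, and sizes are in GiB rather than pages to keep it readable.

```python
def adj_penalty_gib(oom_score_adj, total_gib):
    """Memory-equivalent penalty of an oom_score_adj value.

    Assumption: each adj point is worth 1/1000 of RAM + swap.
    """
    return oom_score_adj * total_gib / 1000


TOTAL_GIB = 32 + 48  # RAM + swap on the test system described above

for adj in (200, 300):
    print(f"{adj} => {adj_penalty_gib(adj, TOTAL_GIB):.0f}GB")

# A 1GB process with adj 200 competes like a process of this size:
equivalent_gib = 1 + adj_penalty_gib(200, TOTAL_GIB)
print(equivalent_gib)
```

Because the penalty scales with total RAM plus swap, the same hard-coded adj value costs more in absolute terms on larger systems.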
I wonder if there's any sensible way to add an arg that can weight badness scores slightly less and RSS more so that it scales better for larger systems.
Yes, --avoid is a way to mitigate this, but for Electron (VS Code), which can itself hog memory, the net adjustment becomes 200 - 300 = -100. So using --avoid possibly swings things out of balance in the opposite direction. Rock and a hard place... E.g. if put on the --avoid list, code with -100 gets to eat up to 8GB of RAM before it's a level victim with other non-adjusted processes.
Putting chrome into avoid cancels out nicely (300 - 300).
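The --avoid trade-off can be sketched the same way. This assumes, as the text above does, that --avoid shifts a process's score by 300 and that the shift is on the same 1/1000-of-total-memory scale as oom_score_adj. The names `AVOID_SHIFT`, `net_adjustment` and `head_start_gib` are hypothetical.

```python
TOTAL_GIB = 32 + 48   # RAM + swap on the test system
AVOID_SHIFT = -300    # assumption: score shift applied by --avoid


def net_adjustment(app_adj, avoided):
    """Hard-coded app adjustment plus the --avoid shift, if any."""
    return app_adj + (AVOID_SHIFT if avoided else 0)


def head_start_gib(net_adj, total_gib=TOTAL_GIB):
    """Memory a process may use before it is level with a non-adjusted one.

    Negative results mean a penalty rather than a head start.
    """
    return -net_adj * total_gib / 1000


vscode_net = net_adjustment(200, avoided=True)   # 200 - 300
chrome_net = net_adjustment(300, avoided=True)   # 300 - 300
print("vscode:", vscode_net, head_start_gib(vscode_net))
print("chrome:", chrome_net, head_start_gib(chrome_net))
```

Under these assumptions an avoided VS Code ends up with a net -100 and an 8GB head start, while an avoided Chrome lands at a neutral 0.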
I appreciate this isn't earlyoom's fault/bug/issue, but rather a combination of how cumbersome OOM management and scores on Linux are, plus Chromium's tweaked, hard-coded values catering to 2011 Chromebooks with only 2GB RAM and no swap. The way Chrome adjusts its score is still broken a decade later!
More complete log of all processes killed. Multiple small Visual Studio Code processes got wiped out first, freeing up very little RAM...
Oct 07 19:14:32 <hostname> earlyoom[26328]: mem avail: 27949 of 31880 MiB (87.67%), swap free: 44345 of 49151 MiB (90.22%)
Oct 07 19:16:10 <hostname> earlyoom[26328]: mem avail: 165 of 31880 MiB ( 0.52%), swap free: 19581 of 49151 MiB (39.84%)
Oct 07 19:16:10 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:10 <hostname> earlyoom[26328]: sending SIGTERM to process 57834 uid 1000 "code": badness 866, VmRSS 5 MiB
Oct 07 19:16:10 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:10 <hostname> earlyoom[26328]: mem avail: 207 of 31880 MiB ( 0.65%), swap free: 19327 of 49151 MiB (39.32%)
Oct 07 19:16:10 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:10 <hostname> earlyoom[26328]: sending SIGTERM to process 57192 uid 1000 "code": badness 801, VmRSS 54 MiB
Oct 07 19:16:10 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:10 <hostname> earlyoom[26328]: mem avail: 243 of 31880 MiB ( 0.77%), swap free: 19169 of 49151 MiB (39.00%)
Oct 07 19:16:10 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:10 <hostname> earlyoom[26328]: sending SIGTERM to process 63745 uid 1000 "python3": badness 748, VmRSS 5380 MiB
Oct 07 19:16:11 <hostname> earlyoom[26328]: process exited after 1.2 seconds
Oct 07 19:16:20 <hostname> earlyoom[26328]: mem avail: 191 of 31880 MiB ( 0.60%), swap free: 19555 of 49151 MiB (39.79%)
Oct 07 19:16:20 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:20 <hostname> earlyoom[26328]: sending SIGTERM to process 63900 uid 1000 "code": badness 800, VmRSS 78 MiB
Oct 07 19:16:20 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:20 <hostname> earlyoom[26328]: mem avail: 346 of 31880 MiB ( 1.09%), swap free: 19332 of 49151 MiB (39.33%)
Oct 07 19:16:20 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:20 <hostname> earlyoom[26328]: sending SIGTERM to process 63963 uid 1000 "code": badness 800, VmRSS 2 MiB
Oct 07 19:16:20 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:20 <hostname> earlyoom[26328]: mem avail: 323 of 31880 MiB ( 1.02%), swap free: 19324 of 49151 MiB (39.32%)
Oct 07 19:16:20 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:20 <hostname> earlyoom[26328]: sending SIGTERM to process 63964 uid 1000 "code": badness 800, VmRSS 4 MiB
Oct 07 19:16:20 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:20 <hostname> earlyoom[26328]: mem avail: 335 of 31880 MiB ( 1.05%), swap free: 19154 of 49151 MiB (38.97%)
Oct 07 19:16:20 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:20 <hostname> earlyoom[26328]: sending SIGTERM to process 63966 uid 1000 "code": badness 800, VmRSS 6 MiB
Oct 07 19:16:20 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:20 <hostname> earlyoom[26328]: mem avail: 336 of 31880 MiB ( 1.06%), swap free: 19035 of 49151 MiB (38.73%)
Oct 07 19:16:20 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:20 <hostname> earlyoom[26328]: sending SIGTERM to process 63968 uid 1000 "code": badness 800, VmRSS 6 MiB
Oct 07 19:16:21 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:21 <hostname> earlyoom[26328]: mem avail: 395 of 31880 MiB ( 1.24%), swap free: 18898 of 49151 MiB (38.45%)
Oct 07 19:16:21 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:21 <hostname> earlyoom[26328]: sending SIGTERM to process 63973 uid 1000 "code": badness 800, VmRSS 10 MiB
Oct 07 19:16:21 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:21 <hostname> earlyoom[26328]: mem avail: 419 of 31880 MiB ( 1.31%), swap free: 18814 of 49151 MiB (38.28%)
Oct 07 19:16:21 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:21 <hostname> earlyoom[26328]: sending SIGTERM to process 63975 uid 1000 "code": badness 800, VmRSS 75 MiB
Oct 07 19:16:21 <hostname> earlyoom[26328]: process exited after 0.1 seconds
Oct 07 19:16:21 <hostname> earlyoom[26328]: mem avail: 417 of 31880 MiB ( 1.31%), swap free: 18814 of 49151 MiB (38.28%)
Oct 07 19:16:21 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:21 <hostname> earlyoom[26328]: sending SIGTERM to process 63984 uid 1000 "code": badness 800, VmRSS 76 MiB
Oct 07 19:16:21 <hostname> earlyoom[26328]: mem avail: 356 of 31880 MiB ( 1.12%), swap free: 18685 of 49151 MiB (38.02%)
Oct 07 19:16:21 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:21 <hostname> earlyoom[26328]: sending SIGTERM to process 63740 uid 1000 "python3": badness 764, VmRSS 6354 MiB
Oct 07 19:16:22 <hostname> earlyoom[26328]: process exited after 1.2 seconds
Oct 07 19:16:37 <hostname> earlyoom[26328]: mem avail: 285 of 31880 MiB ( 0.90%), swap free: 19646 of 49151 MiB (39.97%)
Oct 07 19:16:37 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:16:37 <hostname> earlyoom[26328]: sending SIGTERM to process 63743 uid 1000 "python3": badness 791, VmRSS 8055 MiB
Oct 07 19:16:38 <hostname> earlyoom[26328]: process exited after 1.2 seconds
Oct 07 19:17:03 <hostname> earlyoom[26328]: mem avail: 256 of 31880 MiB ( 0.80%), swap free: 19606 of 49151 MiB (39.89%)
Oct 07 19:17:03 <hostname> earlyoom[26328]: low memory! at or below SIGTERM limits: mem 5.00%, swap 40.00%
Oct 07 19:17:03 <hostname> earlyoom[26328]: sending SIGTERM to process 63744 uid 1000 "python3": badness 824, VmRSS 10209 MiB
Oct 07 19:17:05 <hostname> earlyoom[26328]: process exited after 1.5 seconds
P.S. I read the source code at https://github.com/torvalds/linux/blob/7cca308cfdc0725363ac5943dca9dcd49cc1d2d5/mm/oom_kill.c#L204 and calculated in python: