I stopped using earlyoom because I got frustrated with it always seeming to be e...

__turbobrew__ · on Oct 15, 2022

In theory this should be possible with cgroups. The mechanism is there but I don’t know of a way to easily set up a policy that does what you want.

It should be possible to allocate, say, 95% of the system resources to the default cgroup and then create a secondary cgroup (call it recovery) which has access to the last 5%. You could use that one to run commands such as “kill” or “top” to recover the system state.

Additionally, you could run a second ssh server in the recovery cgroup, which you could ssh into if the system locks up.
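A rough sketch of the arithmetic behind that split, assuming cgroup v2 (where a group's hard memory cap is the `memory.max` file) and covering memory only. The cgroup path and the 5% figure are illustrative assumptions, and on a systemd machine you would normally let systemd manage the hierarchy rather than writing these files by hand.

```python
from pathlib import Path


def mem_total_bytes(meminfo_text: str) -> int:
    """Return MemTotal in bytes from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line looks like: "MemTotal:       16314520 kB"
            return int(line.split()[1]) * 1024
    raise ValueError("MemTotal not found")


def main_group_limit(total_bytes: int, reserve_percent: int = 5) -> int:
    """Bytes the main cgroup may use, leaving reserve_percent for recovery."""
    if not 0 < reserve_percent < 100:
        raise ValueError("reserve_percent must be between 1 and 99")
    return total_bytes * (100 - reserve_percent) // 100


def apply_limit(cgroup_dir: Path, limit_bytes: int) -> None:
    """Write the cap to the cgroup v2 memory.max file (needs root)."""
    (cgroup_dir / "memory.max").write_text(f"{limit_bytes}\n")


if __name__ == "__main__":
    total = mem_total_bytes(Path("/proc/meminfo").read_text())
    limit = main_group_limit(total)
    print(f"main cgroup cap: {limit} of {total} bytes")
    # Hypothetical cgroup name; uncomment only on a box you can afford to break.
    # apply_limit(Path("/sys/fs/cgroup/main"), limit)
```

Everything not in the capped group, including the recovery shell or second sshd, would then have the remaining memory to itself.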

In reality it is probably easier to just reboot in most cases or, if you are dealing with servers, use IPMI.

jaggirs · on Oct 15, 2022

I don't understand why this issue still persists on Linux. As far as I can tell, all earlyoom needs to do is kill the process that has eaten the most memory in the last minute or so. On Windows this issue is non-existent.

viraptor · on Oct 15, 2022

It's not that simple. Imagine something leaking memory running in parallel with something bursty. For example, your browser leaks, but you run a big grep|sort in the background. Or you have some GC runtime which allocates in batches and has just decided it needs another chunk to manage.
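To make the browser-vs-sort case concrete, here is a toy version of the "kill whatever grew most in the last minute" rule. The process names and numbers are made up, and this is not how earlyoom actually chooses a victim.

```python
def biggest_grower(before: dict[str, int], after: dict[str, int]) -> str:
    """Name of the process whose memory grew most between two snapshots."""
    growth = {name: rss - before.get(name, 0) for name, rss in after.items()}
    return max(growth, key=growth.get)


# Hypothetical resident memory in MiB, sampled one minute apart.
before = {"browser": 6000, "sort": 0, "editor": 300}
after = {"browser": 6200, "sort": 1500, "editor": 300}

victim = biggest_grower(before, after)
print(victim)  # sort
```

The slow leak in the browser is the real problem, but the short-lived sort is what gets picked.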

On Windows you don't have an OOM killer at all, because it trades that solution for just swapping forever until either you can't do anything or you manage to kill the right app yourself.

gmokki · on Oct 16, 2022

I think it has finally been fixed in Linux 6.1: https://www.phoronix.com/news/Linux-MGLRU-v9-Promising

People have reported that their machines with a small amount of RAM are now fully usable, where previously the system became completely unresponsive when swapping started.

FeepingCreature · on Oct 15, 2022

You can configure earlyoom to tell it which binaries to preferentially kill.
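Roughly, earlyoom's `--prefer` and `--avoid` options take a regex and shift a matching process's score up or down before the victim is picked. A sketch of that idea, where the 300-point shift is my recollection of the documented behaviour (treat it as an assumption) and the process table is made up:

```python
import re


def adjusted_score(name, oom_score, prefer=None, avoid=None, shift=300):
    """Raise the score of preferred names, lower it for avoided ones."""
    if prefer and re.search(prefer, name):
        oom_score += shift
    if avoid and re.search(avoid, name):
        oom_score -= shift
    return oom_score


def pick_victim(procs, prefer=None, avoid=None):
    """Return the name with the highest adjusted score."""
    return max(procs, key=lambda n: adjusted_score(n, procs[n], prefer, avoid))


# Hypothetical process names mapped to their kernel oom_score values.
procs = {"firefox": 700, "slack": 450, "sshd": 20}

print(pick_victim(procs))                                        # firefox
print(pick_victim(procs, prefer=r"^slack$", avoid=r"^firefox$"))  # slack
```

Note the rule is keyed on the binary name only, so it cannot tell a misbehaving Firefox from a well-behaved one.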

OJFord · on Oct 15, 2022

I know; I did; I just didn't find that the answer to 'what would I ideally like/not like killed' was the same on a per-binary basis - it varied, and it'd always, by Sod's law, be wrong. e.g. if Firefox was set not to be killed, it would be a tab misbehaving; if Slack was allowed to be killed, it would be while I was mid-message, and so on.

If it had some concept of 'in-use', for which you could define rules like 'has an active window' or 'is playing media', that might work better for me.

jhgb · on Oct 15, 2022

Could the window manager be configured to communicate with this mechanism? The window manager knows what windows you're manipulating at the moment. (I imagine that terminal processes would be somewhat more complicated to handle.)
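One way this could look: the kernel already exposes a per-process knob, `/proc/<pid>/oom_score_adj` (range -1000 to 1000, lower means less likely to be killed), so a focus-change hook could nudge it. A sketch only. The hook, the -500 value, and the idea that the window manager knows the right PID are all assumptions, and lowering the value needs CAP_SYS_RESOURCE.

```python
from pathlib import Path

OOM_ADJ_MIN, OOM_ADJ_MAX = -1000, 1000  # range the kernel accepts


def clamp_adj(value: int) -> int:
    """Clamp a value into the valid oom_score_adj range."""
    return max(OOM_ADJ_MIN, min(OOM_ADJ_MAX, value))


def set_oom_score_adj(pid: int, value: int, proc_root: Path = Path("/proc")) -> None:
    """Write the adjustment for one process."""
    (proc_root / str(pid) / "oom_score_adj").write_text(f"{clamp_adj(value)}\n")


def on_focus_change(old_pid, new_pid, proc_root: Path = Path("/proc")) -> None:
    """Hypothetical window-manager hook: protect the focused app, reset the old one."""
    if old_pid is not None:
        set_oom_score_adj(old_pid, 0, proc_root)
    set_oom_score_adj(new_pid, -500, proc_root)
```

Whether a userspace killer honours the adjustment depends on whether it reads the kernel's score or computes its own.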

charcircuit · on Oct 15, 2022

Yes, Android already has this functionality: foregrounded apps have a lower priority to be killed by lmk.

tmtvl · on Oct 15, 2022

Funny that, Linux's internal OOM killer has a setting for that (vm.oom_kill_allocating_task in sysctl); silly that earlyoom doesn't have that.

I have switched to systemd-oomd, but haven't yet gotten into any notable scrapes, so I can't comment on how it fares.