OOMing: I just don't get it.
I don't understand what happens, or why it is such an armageddon for my Linode.
For example, right now free -m shows 900 MB used and 100 MB free, and everything runs perfectly.
Then I get to 983 MB used and 17 MB free and nothing moves: none of my sites can be accessed, SSH is barely responding, and I need to reboot to fix it.
I just don't get this on/off behavior of the Linode when it OOMs.
Also, most of my pages are static and should need very few system resources, maybe hardly any memory at all (not sure about this), but they stop working too.
3 Replies
This is not unique to Linode; it applies to all computers and operating systems. If you overload them so that they need to rely on swap, they get uselessly slow.
The solution is to do one of two things: either reduce your memory load, or increase your memory supply.
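To see which side of that line the box is on, a minimal sketch like the one below can help. It reads /proc/meminfo (the same source free -m uses) and reports how much swap is in use. The function names are made up for illustration, and it assumes a Linux system; MemAvailable is missing on older kernels, so the sketch falls back to MemFree.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_mb(meminfo):
    """Swap currently in use, in MB (0 if no swap is configured)."""
    return (meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)) // 1024


def available_mb(meminfo):
    """Memory usable without swapping, in MB."""
    # MemAvailable is absent on older kernels; MemFree is a cruder fallback.
    return meminfo.get("MemAvailable", meminfo.get("MemFree", 0)) // 1024


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print("available MB:", available_mb(info))
    print("swap used MB:", swap_used_mb(info))
```

If swap used keeps climbing while available memory sits near zero, the box is thrashing and one of the two fixes above is needed.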
Question 1. Why even bother with swapping? New requests will most likely keep arriving, so the server will never get out of swapping and, as you say, will be useless all day.
Q2. Rebooting gets me out of this mess, until the same situation happens again of course. Why is there no fail-safe system that removes processes when memory gets critically low? Let's say it removed only the processes handling web page requests, so no important process is stopped. Some people would get no response from a web page, but that would still be better than now: when OOMing, the server is just useless, nobody can open any page, and every request times out according to the timeout setting.
Just wondering why. There are probably reasons, and in the end proper settings or more memory will help, but I like to know why things are the way they are.
A2: There is a failsafe system: it's called the OOM Killer. Check your logs: if you hit the OOM condition, the OOM killer probably kicked in to try to free up memory. Unfortunately, it doesn't kick in if you're "just" swap-thrashing, and it tends to be rather indiscriminate about what gets killed. You can tweak the priorities of each process so that certain things get killed before other things, but it's usually better to spend time avoiding the problem in the first place.
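As a rough sketch of both steps (checking the logs and tweaking priorities), something like the following could work. The function names are made up for illustration. The regex assumes the kernel logs a line containing "Killed process <pid> (<name>)", and the exact wording varies between kernel versions. Which log file to read (dmesg output, /var/log/syslog, /var/log/kern.log) depends on the distribution, so the path is taken from the command line. The priority knob used here is /proc/<pid>/oom_score_adj, which takes values from -1000 (never kill) to 1000 (kill first); lowering it requires root.

```python
import re
import sys

# Assumed log format: "... Killed process 1234 (apache2) ..."; wording varies by kernel.
KILL_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)", re.IGNORECASE)


def find_oom_kills(log_text):
    """Return a list of (pid, process name) for each OOM kill found in the log text."""
    return [(int(m.group(1)), m.group(2)) for m in KILL_RE.finditer(log_text)]


def set_oom_score_adj(pid, value, proc_root="/proc"):
    """Make a process more (positive) or less (negative) likely to be OOM-killed."""
    if not -1000 <= value <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    with open(f"{proc_root}/{pid}/oom_score_adj", "w") as f:
        f.write(str(value))


if __name__ == "__main__":
    # Usage: python oom_check.py /path/to/kernel.log
    with open(sys.argv[1], errors="replace") as f:
        for pid, name in find_oom_kills(f.read()):
            print(f"OOM killer hit pid {pid} ({name})")
```

For example, set_oom_score_adj(sshd_pid, -500) would make sshd a less likely victim, where sshd_pid is whatever PID your SSH daemon has.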
(Avoiding it is especially worthwhile since approx. 99% of out-of-memory conditions are caused by folks using Apache and mod_php with the default settings, which is a well-documented 30-second fix.)
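Assuming the fix meant here is capping the number of Apache prefork workers (the MaxClients directive in Apache 2.2, renamed MaxRequestWorkers in 2.4), a back-of-the-envelope estimate could look like the sketch below. The function name and all the numbers are hypothetical; measure the real per-process memory on your own box, for example with ps or top.

```python
def estimate_max_clients(total_mb, reserved_mb, per_process_mb):
    """Estimate how many Apache workers fit in RAM without touching swap.

    total_mb:       RAM on the box
    reserved_mb:    RAM set aside for everything else (database, sshd, kernel, cache)
    per_process_mb: typical resident size of one Apache + mod_php worker
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    usable_mb = total_mb - reserved_mb
    # Always allow at least one worker, even if the estimate says zero.
    return max(1, usable_mb // per_process_mb)


if __name__ == "__main__":
    # Hypothetical figures: 1000 MB box, 300 MB reserved, 35 MB per worker.
    print(estimate_max_clients(1000, 300, 35))  # 20
```

The result would then go into the Apache config as the MaxClients value, so that a burst of requests queues up instead of pushing the box into swap.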