My Linode 360 is not responding and all websites are down
It has happened several times, so apparently something is wrong with my system.
After a reboot everything is back to normal and the websites are fast and responsive. But approximately 24 hours later it suddenly runs into the same trouble again: the server is so busy that it does not respond to anything, and it takes dozens of minutes just to reboot.
I have no idea what the trigger may be. Misconfiguration? My PHP code? But all my websites were fine on the previous host.
My distro is Debian 5.0. Things I've installed:
1. apache, mysql, php
2. rsnapshot (some cron jobs)
3. postfix
4. vsftpd (though automatically stopped)
5. chkrootkit
6. fail2ban
Does anyone have a clue? I can provide logs for analysis. Thanks a lot! It's really annoying.
P.S. Which logs should I look at to find out which PHP script may have caused the problem?
9 Replies
This helped me track down a similar problem that I was having, which turned out to be an incompatibility between two specific versions of Apache and PHP (that was known to the respective developers).
@mwalling:
Sounds like an OOM, as described here:
http://www.linode.com/forums/viewtopic.php?t=4460
Thanks mwalling. It sounds like it. I've made the changes according to the thread. Also I've modified the swappiness to 25:
Do you think it's a good move?
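For anyone following along, here is a minimal sketch (Python 3 assumed) for reading the value back, to confirm that the kernel is really using the new setting. The function name `read_swappiness` is made up for this example. `/proc/sys/vm/swappiness` is the standard location on Linux.

```python
def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the kernel's current vm.swappiness value as an int."""
    # The file holds a single number followed by a newline.
    with open(path) as fh:
        return int(fh.read().strip())
```

Calling `read_swappiness()` on the server should return 25 if the change is live. A value written only to /etc/sysctl.conf does not take effect until `sysctl -p` is run or the machine reboots.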
Fingers crossed. But the performance graphs in the control panel show no spike in I/O rate before the server became unresponsive. So I'm still not sure whether it's disk I/O that has slowed things down, though I think it's very likely.
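One way to check whether the OOM killer really fired is to search the kernel log for its messages after the next incident. Below is a minimal sketch (Python 3 assumed). The function names are made up for this example, `/var/log/kern.log` is assumed to be where the kernel log lands on a default Debian install, and the two search phrases are an assumption too, since the exact wording varies between kernel versions.

```python
import re

# Phrases the kernel commonly logs when the OOM killer runs. This is an
# assumption, not an exhaustive list: wording differs between kernels.
OOM_PATTERN = re.compile(r"oom-killer|out of memory", re.IGNORECASE)


def find_oom_lines(lines):
    """Return the lines (newline stripped) that look like OOM messages."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_log(path="/var/log/kern.log"):
    """Scan one log file; the default path is an assumption for Debian."""
    # errors="replace" keeps stray bytes in the log from aborting the scan.
    with open(path, errors="replace") as fh:
        return find_oom_lines(fh)
```

Running `for hit in scan_log(): print(hit)` as a user allowed to read the log lists the matches. The process names in those lines show what was killed, but not which PHP script used up the memory.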
Are you sure that the box is becoming unresponsive to everyone? I wondered if you might somehow be tripping fail2ban and end up having it lock you out for a while. Perhaps a cron job on your local machine that is trying to rsync or otherwise ssh into your linode?
@eas:
You should really run munin or something that regularly logs key resource consumption and performance metrics so you can go back and see what's getting driven over the edge. guspaz's suggestion to leave top running is good too, but if you have munin running all the time then you already have useful information the next time something like this happens.
The problem is that I can't keep an SSH session open (I use PuTTY) all day long. The problem occurs every one or two days, unexpectedly.
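A stopgap that needs no open SSH session is a small script that appends one line of memory figures to a file each time it runs, started from cron every minute or so. Below is a minimal sketch (Python 3 assumed). The function names and the output format are made up for this example; `/proc/meminfo` and its `MemFree`, `Cached`, `SwapTotal` and `SwapFree` fields are standard on Linux.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number} (mostly kB)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def format_sample(info, now=None):
    """Build one log line: timestamp, free RAM, page cache, swap in use."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return "%s MemFree=%d Cached=%d SwapUsed=%d" % (
        stamp, info.get("MemFree", 0), info.get("Cached", 0), swap_used)


def log_sample(out_path, meminfo_path="/proc/meminfo"):
    """Append one sample to out_path; meant to be called from cron."""
    with open(meminfo_path) as fh:
        info = parse_meminfo(fh.read())
    with open(out_path, "a") as out:
        out.write(format_sample(info) + "\n")
```

After the next hang, the last lines of the file show whether free memory ran out and swap use climbed just before the server stopped responding. It is no substitute for munin, but it takes no configuration.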
@mwalling:
Yes.
http://you.dontlike.us/munin/dontlike.us/you.dontlike.us.html
Thank you, it helped a lot. My server has been working happily for the last week. Seems it's indeed OOM that's causing the trouble.
So Munin. Will it automatically start recording and graphing the performance data of my machine after I install it? I don't know how to configure it.
> So Munin. Will it automatically start recording and graphing the performance data of my machine after I install it? I don't know how to configure it.
No, you'll need to edit /etc/munin/munin*.conf first, create a virtual host (apache in your case) profile for it, and most likely select a few plugins to load.
See any one of the hundreds of Debian/Munin tutorials on the web, and post back if you're still having problems. :)
@mjrich:
> So Munin. Will it automatically start recording and graphing the performance data of my machine after I install it? I don't know how to configure it.
No, you'll need to edit /etc/munin/munin*.conf first, create a virtual host (apache in your case) profile for it, and most likely select a few plugins to load.
See any one of the hundreds of Debian/Munin tutorials on the web, and post back if you're still having problems. :)
Thank you too, mjrich, I'll give it a try and let you guys know.
Linode is a great place for learning to become a server admin!