My server goes down at the same time every day, help!
Here's the traffic chart from my linode control panel:
![](http://www.panda-greatwall.com/downtime_linode.jpg)
In this picture, you can see the downtime as a gap with no traffic, which corresponds to high CPU usage (~140%). I have checked almost everything on my VPS: the system log, cron jobs, etc.
What could cause a problem that recurs at exactly the same time every day?
Thanks
15 Replies
![](http://www.panda-greatwall.com/io_down.jpg)
I've checked my cron jobs and there's nothing set around 20:00.
I found this thread
I use Lighttpd as the web server on CentOS 5.2, by the way.
Thanks!
@Internat:
When you said you've checked your cron scripts, I assume you checked what is returned by crontab -e, as well as what's listed in /etc/cron.d/ and /etc/cron.daily?
Yes, I've checked all of these. thanks
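For anyone repeating this check, here is a rough sketch that lists every file in the usual system cron locations in one go. The function name `list_cron_files` and its optional ROOT argument are made up for illustration, and the directory list is an assumption about a typical Linux layout, not something confirmed for this particular VPS.

```shell
#!/bin/sh
# list_cron_files [ROOT]: print every regular file found in the usual
# system cron locations under ROOT (default: the real filesystem root).
# ROOT is a hypothetical convenience for checking a copy of a filesystem.
list_cron_files() {
    root="${1:-}"
    for path in /etc/crontab /etc/cron.d /etc/cron.hourly /etc/cron.daily \
                /etc/cron.weekly /etc/cron.monthly; do
        target="$root$path"
        if [ -f "$target" ]; then
            # A single crontab file
            echo "$target"
        elif [ -d "$target" ]; then
            # A directory of job files or scripts
            find "$target" -type f | sort
        fi
    done
}
```

Running `list_cron_files` with no argument covers the system-wide jobs. Per-user crontabs are separate: `crontab -l` shows the current user's, and as root `crontab -u USER -l` shows another user's.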
Just now, my VPS went down again and I had to restart it to bring it back online. The downtime starts at exactly the same time every day: 2pm Mountain Standard Time (MST).
What could be causing this? Thanks
Try to run "apt-get clean" as root, and see if that helps.
Otherwise try to remove some old logs from /var/log.
One of the cron jobs, which runs the Google sitemap generator by reading the access log, consumes too much CPU and memory, which more or less freezes the entire system.
How can I make my access logs smaller and easier to read? I have started using logrotate for the access logs of my Lighttpd server. Does anyone happen to know?
thanks!!
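As a starting point for the logrotate question, here is a minimal sketch of a daily rotation policy. It only writes an example file to a scratch location for review. The log path `/var/log/lighttpd/*.log`, the PID file `/var/run/lighttpd.pid`, and the seven-day retention are all assumptions, so adjust them to match the actual setup.

```shell
#!/bin/sh
# Write an example logrotate policy to a scratch file for review.
# Assumptions (not from the thread): logs live in /var/log/lighttpd/ and
# the PID file is /var/run/lighttpd.pid. Adjust both before installing.
out="${TMPDIR:-/tmp}/lighttpd.logrotate.example"
cat > "$out" <<'EOF'
/var/log/lighttpd/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    sharedscripts
    postrotate
        # SIGHUP asks lighttpd to reopen its log files
        kill -HUP "$(cat /var/run/lighttpd.pid 2>/dev/null)" 2>/dev/null || true
    endscript
}
EOF
echo "wrote $out"
```

Before copying the file into /etc/logrotate.d/, a dry run with `logrotate -d` on it shows what would be rotated without touching anything. Smaller daily files also mean the sitemap job has less to read each time it runs.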
Run Munin. You seem to be doing this now, which is a good start.
Start checking Munin on a daily basis, so you get an idea of what normal "baseline" performance for your machine is.
When you have problems, start checking the logs under /var/log/. You can type "ls -ltr" to get a list of files sorted by time updated. The most recently updated log files will appear at the end of the list.
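The "ls -ltr" tip can be wrapped up as a small helper. This is just a sketch; the function name `recent_logs` and its defaults are invented for illustration.

```shell
#!/bin/sh
# recent_logs [DIR] [COUNT]: show the COUNT most recently modified entries
# in DIR (defaults: /var/log and 10). "ls -ltr" sorts oldest first, so the
# newest entries are the last lines and "tail" keeps just those.
recent_logs() {
    dir="${1:-/var/log}"
    count="${2:-10}"
    ls -ltr "$dir" | tail -n "$count"
}
```

For example, `recent_logs /var/log 5` prints the five log files that were written to most recently, which is usually where to look first after an outage.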
Log rotation. You should be doing this on a daily basis.
In your case, where the disk is full, you need to delete/remove the biggest offenders. Type "du -hs *" while in /var/log to get the disk usage of each file and directory underneath it. You can use this technique to quickly figure out if you have one or two large files (or directories full of lots of files) that are taking up lots of space.
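To see the biggest offenders first, the du output can be sorted. Here is a sketch, with a made-up function name `biggest_entries`; it uses "du -sk" (sizes in KB) instead of "-h" so that a plain numeric sort works even on older systems.

```shell
#!/bin/sh
# biggest_entries [DIR] [COUNT]: list the COUNT largest entries directly
# under DIR (defaults: current directory and 10), largest first, printed
# as "SIZE_IN_KB<TAB>PATH".
# Hidden files are skipped because the * glob does not match them.
biggest_entries() {
    dir="${1:-.}"
    count="${2:-10}"
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, `biggest_entries /var/log 5` shows the five largest files or directories under /var/log. Running it again on the largest directory narrows things down step by step.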
Good luck!
– Doug
So I wonder what command I should use to find which directories eat up most of the space?
thanks!!
du -hs *
"man du" for more information on how the command works.
I always slice up my drives, very baaaaaad idea to have everything in one slice.
@marcus0263:
I'd bet it's something in /var/log being a likely candidate.
I always slice up my drives, very baaaaaad idea to have everything in one slice.
You are right! One of the error.log files is 4.6 GB, and I don't know what happened to that website. I deleted the gigantic error.log file by running 'rm -rf error.log'; however, the system still shows 100% of the space used when I run 'df -hT'.
Filesystem  Type   Size  Used  Avail  Use%  Mounted on
/dev/xvda   ext3   12G   12G   43M    100%  /
tmpfs       tmpfs  181M  0     181M   0%    /dev/shm
Did the deleted file go to some kind of recycle bin?
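One possibility worth checking, though nothing in this thread confirms it is what happened here: on Linux, rm only removes the file's name, and the disk space is not released while some process (for example the web server writing the log) still holds the file open. The sketch below looks through /proc for such deleted-but-still-open files. The function name `list_deleted_open` is made up, and it relies on the Linux /proc layout.

```shell
#!/bin/sh
# list_deleted_open: print "/proc/PID/fd/N -> TARGET" for every file
# descriptor whose target has been deleted but is still held open.
# Linux only (reads /proc); run as root to see other users' processes.
list_deleted_open() {
    for fd in /proc/[0-9]*/fd/*; do
        # readlink fails for descriptors we cannot inspect; skip those
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *" (deleted)") echo "$fd -> $target" ;;
        esac
    done
}
```

If error.log shows up in the output, restarting or reloading the process that owns it should make it close the file, after which df should show the space as free again.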