Why are my Linode's CPU and disk I/O spiking?

Hi Linode, I own a Linode 1440 and I am wondering why it is not enough for my server. I run vBulletin + vBSEO forum software and only get around 1500-2000 unique visitors a day, with around 60-100 users online at any point in time. So why are my Linode's CPU and disk I/O spiking? I tried caches and added APC to the mix, but I do not know what is wrong. I have friends on Linode who also run vBulletin. They get around the same number of visitors as I do, and they are only running a 360! Please help me and tell me what is wrong.

Here is my vmstat output…

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0 122780 187652   9476 277784    2    2    14    25  182  217  4  1 94  0
 0  0 122780 192216   9752 279180    0    0    12    72  741  848  4  1 94  0
 0  0 122780 193948   9848 279652    0    0     4    20  589  767  1  1 98  0
 0  0 122780 207408   9964 280076    0    0     6    17  489  738  1  0 98  0
 1  0 122780 177968  10088 280660    0    0     8    33  464  737  3  1 96  0
 0  0 122780 183028  10232 281056    0    0     3    33  501  730  2  1 97  0
 0  0 122780 185124  10400 281380    0    0     4    45  560  791  3  1 96  0
 0  0 122780 185940  10708 283280    0    0    23    79  479  746  3  1 96  0
 0  0 122780 181284  10868 284088    0    0     7    45  481  732  3  1 96  0
 0  0 122780 191896  10944 284660    0    0     8    22  405  688  2  1 97  0
 1  0 122780 184984  11348 286320    0    0    23    97  707  867  3  1 95  1
 3  0 122780 173064  11740 287936    0    0    11    88  721  879  4  1 94  0
 1  0 122780 176856  12012 288816    0    0    10    70  818  945  5  2 93  0

10 Replies

That looks pretty decent, with good memory usage, I/O, and CPU usage. If possible, capturing that information along with "ps auxwww" when a spike is occurring might help trace it down to something specific.
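Catching a spike by hand is difficult once the box stops responding, so one option is to let a small watcher record the data automatically. The following is only a sketch under assumptions: the load threshold, polling interval, and log path are hypothetical values chosen for illustration, and the function names are made up here rather than taken from any tool.

```python
import subprocess
import time
from datetime import datetime


def read_load1(text):
    """Return the 1-minute load average from the contents of /proc/loadavg."""
    return float(text.split()[0])


def snapshot(path):
    """Append a timestamp plus `vmstat 1 5` and `ps auxwww` output to path."""
    with open(path, "a") as out:
        out.write("==== %s ====\n" % datetime.now().isoformat())
        for cmd in (["vmstat", "1", "5"], ["ps", "auxwww"]):
            result = subprocess.run(cmd, capture_output=True, text=True)
            out.write(result.stdout)


def watch(threshold=4.0, interval=30, path="/root/spike.log"):
    """Poll the load average forever and snapshot whenever it is high.

    threshold, interval and path are illustrative assumptions; tune them
    for the machine being watched.
    """
    while True:
        with open("/proc/loadavg") as f:
            if read_load1(f.read()) >= threshold:
                snapshot(path)
        time.sleep(interval)
```

Calling `watch()` from a screen session or an init script would leave a log to read after the next reboot, which is roughly what the reply above asks for.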

You mean like this?

root        29  0.0  0.0      0     0 ?        S<   Aug12   0:00 [kblockd/2]
root        30  0.0  0.0      0     0 ?        S<   Aug12   0:00 [kblockd/3]
root        32  0.0  0.0      0     0 ?        S<   Aug12   0:00 [kseriod]
root       116  0.0  0.0      0     0 ?        S    Aug12   0:00 [pdflush]
root       117  0.0  0.0      0     0 ?        S    Aug12   0:00 [pdflush]
root       118  0.0  0.0      0     0 ?        S<   Aug12   0:07 [kswapd0]
root       119  0.0  0.0      0     0 ?        S<   Aug12   0:00 [aio/0]
root       120  0.0  0.0      0     0 ?        S<   Aug12   0:00 [aio/1]
root       121  0.0  0.0      0     0 ?        S<   Aug12   0:00 [aio/2]
root       122  0.0  0.0      0     0 ?        S<   Aug12   0:00 [aio/3]
root       124  0.0  0.0      0     0 ?        S<   Aug12   0:00 [jfsIO]
root       125  0.0  0.0      0     0 ?        S<   Aug12   0:00 [jfsCommit]
root       126  0.0  0.0      0     0 ?        S<   Aug12   0:00 [jfsCommit]
root       127  0.0  0.0      0     0 ?        S<   Aug12   0:00 [jfsCommit]
root       128  0.0  0.0      0     0 ?        S<   Aug12   0:00 [jfsCommit]
root       129  0.0  0.0      0     0 ?        S<   Aug12   0:00 [jfsSync]
root       130  0.0  0.0      0     0 ?        S<   Aug12   0:00 [xfslogd/0]
root       131  0.0  0.0      0     0 ?        S<   Aug12   0:00 [xfslogd/1]
root       132  0.0  0.0      0     0 ?        S<   Aug12   0:00 [xfslogd/2]
root       133  0.0  0.0      0     0 ?        S<   Aug12   0:00 [xfslogd/3]
root       134  0.0  0.0      0     0 ?        S<   Aug12   0:00 [xfsdatad/0]
root       135  0.0  0.0      0     0 ?        S<   Aug12   0:00 [xfsdatad/1]
root       136  0.0  0.0      0     0 ?        S<   Aug12   0:00 [xfsdatad/2]
root       137  0.0  0.0      0     0 ?        S<   Aug12   0:00 [xfsdatad/3]
root       746  0.0  0.0      0     0 ?        S<   Aug12   0:00 [net_accel/0]
root       747  0.0  0.0      0     0 ?        S<   Aug12   0:00 [net_accel/1]
root       748  0.0  0.0      0     0 ?        S<   Aug12   0:00 [net_accel/2]
root       749  0.0  0.0      0     0 ?        S<   Aug12   0:00 [net_accel/3]
root       757  0.0  0.0      0     0 ?        S<   Aug12   0:00 [kpsmoused]
root       760  0.0  0.0      0     0 ?        S<   Aug12   0:00 [kcryptd/0]
root       761  0.0  0.0      0     0 ?        S<   Aug12   0:00 [kcryptd/1]
root       762  0.0  0.0      0     0 ?        S<   Aug12   0:00 [kcryptd/2]
root       763  0.0  0.0      0     0 ?        S<   Aug12   0:00 [kcryptd/3]
root       764  0.0  0.0      0     0 ?        S<   Aug12   0:00 [kmirrord]
root       774  0.0  0.0      0     0 ?        S<   Aug12   0:01 [kjournald]
root      1006  0.0  0.0   6288   192 ?        Ss   Aug12   0:00 dhclient3 -pf /var/run/dhclient.eth0.pid -lf /var/lib/dhcp3/dhclient.eth0.leases eth0
root      1096  0.0  0.0 121572  1180 ?        Sl   Aug12   0:00 /usr/sbin/rsyslogd -c3
bind      1114  0.0  0.0 140616  1460 ?        Ssl  Aug12   0:00 /usr/sbin/named -u bind
root      1131  0.0  0.0  48856   900 ?        Ss   Aug12   0:00 /usr/sbin/sshd
amavis    1159  0.0  0.0 124060  1392 ?        Ss   Aug12   0:00 amavisd (master)
amavis    1171  0.0  0.0 125260   912 ?        S    Aug12   0:00 amavisd (virgin child)
amavis    1172  0.0  0.0 125260   896 ?        S    Aug12   0:00 amavisd (virgin child)
root      1302  0.0  0.0  73740   432 ?        Ss   Aug12   0:00 /usr/sbin/citserver -d -x3 -lmail -t/dev/null
citadel   1303  0.2  0.2 224796  4312 ?        Sl   Aug12   1:21 /usr/sbin/citserver -d -x3 -lmail -t/dev/null
clamav    1514  0.0  8.7 167772 129056 ?       Ssl  Aug12   0:05 /usr/sbin/clamd
clamav    1612  0.0  0.0  21704  1216 ?        Ss   Aug12   0:00 /usr/bin/freshclam -d --quiet
clamav    1819  0.0 14.4 277628 213636 ?       Ssl  Aug12   0:06 /usr/sbin/clamav-milter --max-children=2 -ol --pidfile /var/run/clamav/clamav-milter.pid local:/var/run/clamav/clamav-milter.ctl
root      1837  0.0  0.0  21820   284 ?        Ss   Aug12   0:00 /usr/sbin/webcit -D/var/run/webcit/webcit.pid.8888 -p8888 127.0.0.1 504 -i0.0.0.0 -f -t/var/log/webcit//access.8888.log
root      1838  0.0  0.0  37256   720 ?        Sl   Aug12   0:01 /usr/sbin/webcit -D/var/run/webcit/webcit.pid.8888 -p8888 127.0.0.1 504 -i0.0.0.0 -f -t/var/log/webcit//access.8888.log
root      1840  0.0  0.0  21820   284 ?        Ss   Aug12   0:00 /usr/sbin/webcit -D/var/run/webcit/webcit.pid.4444 -p4444 127.0.0.1 504 -s -i0.0.0.0 -f -t/var/log/webcit//access.4444.log -s
root      1841  0.0  0.0  37256  1044 ?        Sl   Aug12   0:01 /usr/sbin/webcit -D/var/run/webcit/webcit.pid.4444 -p4444 127.0.0.1 504 -s -i0.0.0.0 -f -t/var/log/webcit//access.4444.log -s
root      1853  0.0  0.0  12552   492 ?        Ss   Aug12   0:00 ftpd: accepting connections on port 21
root      1873  0.0  0.0  18544   756 ?        Ss   Aug12   0:00 /usr/sbin/cron
root      1887  0.0  0.3 224792  5096 ?        Ss   Aug12   0:00 /usr/sbin/apache2 -k start
root      1904  0.0  0.0   3788   460 tty1     Ss+  Aug12   0:00 /sbin/getty 38400 tty1
www-data  4967  0.1  2.9 233812 43036 ?        S    Aug12   0:34 /usr/sbin/apache2 -k start
www-data  5216  0.0  2.7 232072 41240 ?        S    Aug12   0:19 /usr/sbin/apache2 -k start
www-data  5524  0.1  3.2 238072 48344 ?        S    00:47   0:20 /usr/sbin/apache2 -k start
www-data  5617  0.1  3.1 236784 46888 ?        S    01:08   0:14 /usr/sbin/apache2 -k start
www-data  5642  0.1  2.7 229672 40436 ?        S    01:08   0:19 /usr/sbin/apache2 -k start
www-data  5659  0.1  3.0 235964 45348 ?        S    01:08   0:14 /usr/sbin/apache2 -k start
www-data  5685  0.1  2.9 232960 43900 ?        S    01:08   0:19 /usr/sbin/apache2 -k start
www-data  5697  0.1  3.2 239892 48180 ?        S    01:08   0:21 /usr/sbin/apache2 -k start
www-data  5704  0.1  2.8 232608 41324 ?        S    01:08   0:24 /usr/sbin/apache2 -k start
www-data  5779  0.1  2.9 232280 42868 ?        S    01:20   0:17 /usr/sbin/apache2 -k start
www-data  5780  0.1  2.7 234404 40628 ?        S    01:21   0:17 /usr/sbin/apache2 -k start
www-data  5800  0.1  2.8 232028 41884 ?        S    01:24   0:17 /usr/sbin/apache2 -k start
root      6159  0.0  0.0   8888  1276 ?        S    01:53   0:00 /bin/sh /usr/bin/mysqld_safe
mysql     6198  7.8  4.5 326128 67364 ?        Sl   01:53  15:29 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
root      6199  0.0  0.0   3776   628 ?        S    01:53   0:00 logger -p daemon.err -t mysqld_safe -i -t mysqld
www-data 14909  0.2  2.9 232132 42920 ?        S    03:27   0:14 /usr/sbin/apache2 -k start
www-data 18987  0.1  2.9 232276 42912 ?        S    03:50   0:06 /usr/sbin/apache2 -k start
www-data 18988  0.0  3.0 237028 44732 ?        S    03:50   0:03 /usr/sbin/apache2 -k start
www-data 19128  0.0  2.5 229308 38132 ?        S    04:27   0:02 /usr/sbin/apache2 -k start
www-data 19129  0.0  2.3 232072 34940 ?        S    04:27   0:02 /usr/sbin/apache2 -k start
www-data 19134  0.0  2.3 231376 34160 ?        S    04:27   0:02 /usr/sbin/apache2 -k start
www-data 19135  0.1  2.4 231024 36832 ?        S    04:27   0:04 /usr/sbin/apache2 -k start
www-data 19140  0.3  2.9 232064 43264 ?        S    04:27   0:08 /usr/sbin/apache2 -k start
www-data 19141  0.0  2.3 232224 34456 ?        S    04:27   0:01 /usr/sbin/apache2 -k start
www-data 19145  0.0  2.5 232992 37924 ?        S    04:27   0:01 /usr/sbin/apache2 -k start
www-data 19147  0.1  2.7 229748 40160 ?        S    04:27   0:02 /usr/sbin/apache2 -k start
www-data 19150  0.1  2.6 230224 39800 ?        R    04:27   0:04 /usr/sbin/apache2 -k start
www-data 19152  0.2  2.5 230964 37460 ?        S    04:27   0:06 /usr/sbin/apache2 -k start
www-data 19156  0.1  2.8 233200 41296 ?        S    04:27   0:03 /usr/sbin/apache2 -k start
www-data 19159  0.3  2.4 232576 36644 ?        S    04:27   0:08 /usr/sbin/apache2 -k start
www-data 19161  0.1  2.5 229724 37428 ?        S    04:27   0:04 /usr/sbin/apache2 -k start
www-data 19162  0.0  2.5 233200 38100 ?        S    04:27   0:01 /usr/sbin/apache2 -k start
www-data 19164  0.0  2.4 232140 35756 ?        S    04:27   0:01 /usr/sbin/apache2 -k start
www-data 19168  0.3  2.9 232584 42852 ?        S    04:27   0:08 /usr/sbin/apache2 -k start
www-data 19172  0.1  2.9 232152 42964 ?        S    04:27   0:04 /usr/sbin/apache2 -k start
www-data 19183  0.1  2.8 237020 42460 ?        S    04:27   0:03 /usr/sbin/apache2 -k start
www-data 19184  0.1  2.4 232064 36280 ?        S    04:27   0:02 /usr/sbin/apache2 -k start
www-data 19185  0.3  2.9 232272 43188 ?        S    04:27   0:08 /usr/sbin/apache2 -k start
www-data 19187  0.1  2.3 229748 35236 ?        S    04:27   0:03 /usr/sbin/apache2 -k start
www-data 19188  0.1  2.4 232140 36340 ?        S    04:27   0:02 /usr/sbin/apache2 -k start
www-data 19189  0.0  2.8 231876 41956 ?        S    04:27   0:02 /usr/sbin/apache2 -k start
www-data 19190  0.0  2.7 232808 40504 ?        S    04:27   0:01 /usr/sbin/apache2 -k start
www-data 19192  0.2  2.4 232024 36564 ?        S    04:27   0:06 /usr/sbin/apache2 -k start
www-data 19196  0.1  2.6 229616 38388 ?        S    04:27   0:02 /usr/sbin/apache2 -k start
www-data 19201  0.1  2.5 232628 37756 ?        S    04:27   0:03 /usr/sbin/apache2 -k start
www-data 19204  0.1  2.7 233148 40004 ?        S    04:27   0:03 /usr/sbin/apache2 -k start
www-data 19205  0.1  2.9 232460 43488 ?        S    04:27   0:03 /usr/sbin/apache2 -k start
www-data 19207  0.1  2.6 231560 39196 ?        S    04:27   0:03 /usr/sbin/apache2 -k start
www-data 19208  0.1  3.2 240396 48628 ?        S    04:27   0:04 /usr/sbin/apache2 -k start
www-data 19217  0.0  2.7 232464 40092 ?        S    04:27   0:02 /usr/sbin/apache2 -k start
www-data 19219  0.1  2.8 232492 42444 ?        S    04:27   0:04 /usr/sbin/apache2 -k start
root     19285  0.0  0.2  66060  3092 ?        Ss   04:49   0:00 sshd: root@pts/0
root     19288  0.0  0.1  17524  1788 pts/0    Ss   04:49   0:00 -bash
www-data 19298  0.1  2.0 232804 30840 ?        S    04:50   0:01 /usr/sbin/apache2 -k start
www-data 19299  0.1  2.1 231264 31676 ?        S    04:50   0:02 /usr/sbin/apache2 -k start
www-data 19301  0.1  2.3 231964 35000 ?        S    04:50   0:02 /usr/sbin/apache2 -k start
www-data 19302  0.0  1.7 233900 26508 ?        S    04:50   0:00 /usr/sbin/apache2 -k start
www-data 19303  0.0  1.0 225864 15316 ?        S    04:50   0:00 /usr/sbin/apache2 -k start
www-data 19331  0.1  1.7 229336 25736 ?        S    04:59   0:00 /usr/sbin/apache2 -k start
www-data 19333  0.2  2.1 233380 31264 ?        S    05:00   0:01 /usr/sbin/apache2 -k start
www-data 19347  0.1  2.2 235680 32636 ?        S    05:02   0:00 /usr/sbin/apache2 -k start
www-data 19348  0.0  1.6 229428 24308 ?        S    05:02   0:00 /usr/sbin/apache2 -k start
www-data 19349  0.1  1.7 229624 26396 ?        S    05:02   0:00 /usr/sbin/apache2 -k start
www-data 19350  0.1  1.9 232640 28608 ?        S    05:02   0:00 /usr/sbin/apache2 -k start
www-data 19351  0.5  2.2 231856 33836 ?        S    05:02   0:03 /usr/sbin/apache2 -k start
www-data 19352  0.1  1.9 230904 28528 ?        S    05:02   0:00 /usr/sbin/apache2 -k start
www-data 19353  0.1  2.9 242064 43428 ?        S    05:02   0:00 /usr/sbin/apache2 -k start
www-data 19354  0.0  1.3 228548 19956 ?        S    05:02   0:00 /usr/sbin/apache2 -k start
www-data 19355  0.1  2.0 232464 30812 ?        S    05:02   0:00 /usr/sbin/apache2 -k start
www-data 19356  0.1  1.9 232636 28908 ?        S    05:02   0:00 /usr/sbin/apache2 -k start
www-data 19358  0.0  1.6 230500 25032 ?        S    05:02   0:00 /usr/sbin/apache2 -k start
www-data 19359  0.4  2.3 234456 34552 ?        S    05:02   0:02 /usr/sbin/apache2 -k start
root     19418  0.0  0.0  14720   992 pts/0    R+   05:10   0:00 ps auxwww

The spikes get so bad that I have to reboot the server in order for it to work again. :(

You're running too many processes, and that kills the server.

Search the forums for how to sanitize your Apache config and limit the number of processes Apache is allowed to spawn.

Maybe someone should make one of the "how to configure my Apache properly" threads sticky; this comes up about once a week.

@MsMimi:

The spikes get so bad that I have to reboot the server in order for it to work again. :(

As Oliver posted, you need to limit the number of simultaneous processes Apache is running to service requests.

To help explain the output you gathered, the 6th column for each process is the RSS or resident set size (in KB), which is a good indicator of the actual active memory being used by that process.

If you add up that column for all of your "apache2" processes, you'll find it sums to 2507768, or about 2449MB. Given that your Linode only has 1440MB of memory available to it, you can see how badly overallocated you are, which is why performance gets so bad. Alternatively, add up the 4th column, which is the percentage of total memory used by that process. You'll find your total is about 166%.
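The column sums can be reproduced mechanically instead of by hand. This is a minimal sketch, assuming the `ps auxwww` output has been saved as text; the function name `sum_apache_memory` is made up for illustration. Note that RSS counts pages shared between processes once per process, so the total is an upper bound on what Apache actually occupies.

```python
def sum_apache_memory(ps_text, needle="apache2"):
    """Sum RSS (KB) and %MEM over ps lines whose command contains needle.

    Expects `ps aux` style columns:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    count = 0
    rss_kb = 0
    mem_pct = 0.0
    for line in ps_text.splitlines():
        fields = line.split(None, 10)  # keep the whole command in fields[10]
        if len(fields) < 11 or not fields[5].isdigit():
            continue  # skip the header and any malformed line
        if needle in fields[10]:
            count += 1
            mem_pct += float(fields[3])  # 4th column: %MEM
            rss_kb += int(fields[5])     # 6th column: RSS in KB
    return count, rss_kb, mem_pct
```

Feeding it the full listing posted above should give the same RSS total quoted in this reply, since the filter also picks up the root-owned parent apache2 process.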

The machine is spending so much time accessing the disk in order to exchange bits of apache2 processes between disk and memory that it never gets much of anything done. A classic case of "thrashing".

Limiting how many Apache processes can run (the MaxClients config file parameter, I believe) will keep your memory usage reasonable (on a Linode you should avoid any swapping, or using more memory than you have, in your steady state). Hitting the limit will cause additional connections to wait momentarily until a free Apache process is available to service them, but the time you gain back by not swapping will easily let them be serviced in fractions of a second, and the additional delay is not something the clients will notice.
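A rough way to pick the limit is to divide the memory left over for Apache by the size of one child. The sketch below assumes hypothetical inputs: about 440MB reserved for everything else (the listing shows mysqld, clamd and clamav-milter together holding roughly 400MB resident) and about 40MB per Apache child. Both figures are estimates for illustration, not measurements of a tuned system.

```python
def estimate_max_clients(total_mb, reserved_mb, per_child_mb):
    """Return how many Apache children fit in RAM without swapping.

    total_mb     -- physical memory on the machine
    reserved_mb  -- memory set aside for the database, mail filters, OS
    per_child_mb -- typical resident size of one Apache child
    """
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    budget = total_mb - reserved_mb
    # Never return less than 1, otherwise Apache could serve nothing.
    return max(1, budget // per_child_mb)


# Illustrative numbers only (assumptions, see the paragraph above).
limit = estimate_max_clients(total_mb=1440, reserved_mb=440, per_child_mb=40)
```

With the prefork MPM the result would go into the MaxClients directive (named MaxRequestWorkers in Apache 2.4). Because per-child RSS overstates private memory, the estimate errs on the cautious side.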

– David

I'm not sure if it was mentioned yet, but don't forget to turn down your keep-alive and connection timeouts. You probably also have a bunch of Apache processes sitting on their thumbs not doing anything. Lowering the timeouts will help reduce the number of Apache connections required.

Something like 3 seconds should be enough for keep alive and connection time out.
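To see why the keep-alive timeout matters, a back-of-the-envelope estimate helps: each connection holds a worker for the time it is busy plus the idle keep-alive wait, so the average number of workers tied up is roughly the connection rate times that holding time (Little's law). The traffic figures below are hypothetical, chosen only to show the effect; 15 seconds was the stock KeepAliveTimeout in Apache 2.2.

```python
import math


def workers_held(new_connections_per_sec, busy_seconds, keepalive_timeout):
    """Average number of workers tied up (rate x holding time, rounded up)."""
    holding_time = busy_seconds + keepalive_timeout
    return math.ceil(new_connections_per_sec * holding_time)


# Hypothetical load: 2 new connections per second, each busy for 0.5 s.
with_default_timeout = workers_held(2, 0.5, 15)  # KeepAliveTimeout 15
with_short_timeout = workers_held(2, 0.5, 3)     # KeepAliveTimeout 3
```

Under these assumptions the short timeout needs only a fraction of the workers, which is the saving this post is pointing at.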

As an alternative to Apache, try the nginx webserver. It doesn't create hundreds of processes like Apache does.

Also lighttpd and LiteSpeed (although your level of usage would involve the commercial version).

The MaxClients directive does seem to be the one that a substantial number of people miss. The first configuration item my Linux mentor emphasized for Apache was MaxClients.

Definitely limit the max # of Apache processes to a sane number, but while Apache could be more memory efficient, it isn't really as big a pig as most people make it out to be.

For example, db3l went through and added up the resident set size of all your Apache processes and came up with ~2.5GB. That is an impossible number for a Linode 1440. There is simply no way to have 2.5GB resident at once. There are other hints that this number isn't saying what it's interpreted as saying: look at the earlier output from vmstat, which shows 176M memory free, 288M for cache, 12M for buffers, and only 122M of swap used.

The problem is that it is hard to express how much of a process's resident memory is actually shared memory or shared libraries. In the case of Apache, it is often a lot of the latter.
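One way to separate shared from private memory is the per-mapping accounting in /proc/PID/smaps, where the Pss ("proportional set size") field divides each shared page among the processes mapping it. This is a sketch under the assumption that the running kernel exposes smaps with a Pss field (added in Linux 2.6.25); the function name is made up for illustration.

```python
def summarize_smaps(text):
    """Total selected kB fields from the contents of /proc/<pid>/smaps."""
    totals = {
        "Rss": 0,
        "Pss": 0,
        "Shared_Clean": 0,
        "Shared_Dirty": 0,
        "Private_Clean": 0,
        "Private_Dirty": 0,
    }
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        # Mapping header lines and unrelated fields do not match a key.
        if sep and key in totals:
            totals[key] += int(rest.split()[0])
    return totals


def read_smaps(pid):
    """Read and summarize smaps for one process (needs permission to read it)."""
    with open("/proc/%d/smaps" % pid) as f:
        return summarize_smaps(f.read())
```

Summing Pss across all Apache children gives a figure that cannot exceed physical memory, unlike the sum of RSS.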

It could still be that you are getting slammed by too many active apache processes at times, and you should definitely see about cutting down the max number of Apache processes.

Nginx can also help. A lot of people replace Apache with Nginx. That's an obvious move to make if you are serving static files. If you have dynamic stuff that already runs in Apache, you can still win by using Nginx as a reverse proxy. Set it up to serve static content directly and proxy dynamic content to Apache. It helps both by keeping apache out of requests for static content, and it allows apache to move on to the next dynamic request more quickly because nginx takes the result quickly and then deals with feeding it out to slower clients. Some example config info here
