High resource usage on Drupal-based website

I'm running a Drupal multi-site with 3 sites and 2 embedded Gallery2 multi-site installations. I'm using a Linode 512 with Ubuntu 10.04, Apache 2, PHP 5, and MySQL, all installed using apt-get. I've also installed mod_evasive, mod_security, memcached, fail2ban, and Webmin. I'm currently getting about 40,000 page views per month, mostly from logged-in users.

I have the swap file set to a huge 4 GB. It generally runs at about 200-300 MB, but at times it jumps to almost 1 GB and then goes back down. It has even maxed out the swap a couple of times, and I've had to restart the server. This happens every couple of days. I've tried tracking memory usage with htop, and it seems all the memory usage is coming from Apache and MySQL. I've checked the logs but haven't found anything (although there is a good chance I just don't know what I'm looking for).

I've tried optimizing MySQL using tuning-primer.sh. Using the tuning primer and installing memcached helped the page load times significantly, but not the memory usage. CPU usage has stayed low, and there is no spike in traffic around the time the memory usage soars (in fact, traffic drops to nothing once the memory usage goes through the roof, because of course the pages don't load). The mod_evasive logs do show some blocking, so I know it's working, and I'm using the generic free rules from Atomic for mod_security.

This can't be "normal" memory usage, can it? As you can probably tell from this post, this is my first VPS. I've managed what I have so far with Google and tutorials.

I don't know what else to do or where else to look. Suggestions would be very much appreciated.

14 Replies

Odds are good you're using mpm_prefork. In your apache2.conf, what is MaxClients set to? I'd bump that down to 10 or 15, and reduce your swap to 256 MB.

These are the settings I have for mpm_prefork:

StartServers 1

MinSpareServers 3

MaxSpareServers 9

ServerLimit 24

MaxClients 24

MaxRequestsPerChild 3000

Ok I checked and it is mpm_prefork that is configured. I also took this list of running processes after it tried to max out the memory on me the other night. Does this tell anyone anything?

Display : PID | User | Memory | CPU | Search | Run..

Real memory: 497.48 MB total / 34.54 MB free Swap space: 3.91 GB total / 2.76 GB free

ID Owner Size Command

16886 www-data 294964 kB /usr/sbin/apache2 -k start

23478 www-data 243696 kB /usr/sbin/apache2 -k start

23479 www-data 234424 kB /usr/sbin/apache2 -k start

23473 www-data 229292 kB /usr/sbin/apache2 -k start

25471 www-data 217880 kB /usr/sbin/apache2 -k start

28755 www-data 214968 kB /usr/sbin/apache2 -k start

28759 www-data 212928 kB /usr/sbin/apache2 -k start

28756 www-data 212808 kB /usr/sbin/apache2 -k start

28082 www-data 203740 kB /usr/sbin/apache2 -k start

31357 www-data 196784 kB /usr/sbin/apache2 -k start

31349 www-data 195456 kB /usr/sbin/apache2 -k start

31358 www-data 194476 kB /usr/sbin/apache2 -k start

31332 www-data 194472 kB /usr/sbin/apache2 -k start

31335 www-data 194232 kB /usr/sbin/apache2 -k start

31333 www-data 193308 kB /usr/sbin/apache2 -k start

31337 www-data 192472 kB /usr/sbin/apache2 -k start

31353 www-data 192384 kB /usr/sbin/apache2 -k start

31350 www-data 192320 kB /usr/sbin/apache2 -k start

31348 www-data 192316 kB /usr/sbin/apache2 -k start

31355 www-data 192308 kB /usr/sbin/apache2 -k start

31336 www-data 192304 kB /usr/sbin/apache2 -k start

31356 www-data 191448 kB /usr/sbin/apache2 -k start

31334 www-data 190212 kB /usr/sbin/apache2 -k start

31351 www-data 189496 kB /usr/sbin/apache2 -k start

18902 root 154716 kB /usr/sbin/apache2 -k start

23786 mysql 151972 kB /usr/sbin/mysqld

2079 nobody 44948 kB /usr/bin/memcached -m 64 -p 11211 -u nobody -l 127.0.0.1

2015 syslog 28444 kB rsyslogd -c4

2648 root 25592 kB /usr/bin/python /usr/bin/fail2ban-server -b -s /var/run/fail2ban/fail2ban.sock

5405 root 19232 kB /usr/share/webmin/proc/index_size.cgi

2788 root 17520 kB /usr/bin/perl /usr/share/webmin/miniserv.pl /etc/webmin/miniserv.conf

5407 munin 14124 kB /usr/bin/perl /usr/share/munin/munin-update

5416 root 11548 kB sendmail: MTA: startup with localhost.localdomain

2565 root 11492 kB sendmail: MTA: accepting connections

5403 root 10208 kB /usr/bin/perl /usr/share/webmin/status/monitor.pl

5396 munin 9528 kB sendmail: MSP: ./p642tOL9005396 [127.0.0.1]: client greeting

2124 root 7332 kB /usr/sbin/munin-node

2471 root 5600 kB /usr/sbin/sshd -D

5282 alforddm 4752 kB -bash

2462 root 3600 kB /usr/sbin/ntpd

2463 ntpd 3496 kB /usr/sbin/ntpd

2790 root 2772 kB /bin/login --

1 root 2732 kB /sbin/init

5078 root 2604 kB CRON

5398 root 2604 kB CRON

5399 root 2604 kB CRON

5421 root 2484 kB ps --cols 2048 -eo user:80,ruser:80,group:80,rgroup:80,pid,ppid,pgid,pcpu,vsz,ni …

2048 root 2428 kB cron

1020 root 2368 kB upstart-udev-bridge --daemon

2267 root 2288 kB dhclient3 -e IF_METRIC=100 -pf /var/run/dhclient.eth0.pid -lf /var/lib/dhcp3/dhc …

1022 root 2236 kB udevd --daemon

1143 root 2232 kB udevd --daemon

1144 root 2232 kB udevd --daemon

5400 munin 1884 kB /bin/sh -c if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi

5401 root 1884 kB /bin/sh -c /etc/webmin/status/monitor.pl

5406 munin 1884 kB /bin/sh /usr/bin/munin-cron

5420 root 1884 kB sh -c ps --cols 2048 -eo user:80,ruser:80,group:80,rgroup:80,pid,ppid,pgid,pcpu, …

2 root 0 kB [kthreadd]

3 root 0 kB [ksoftirqd/0]

4 root 0 kB [kworker/0:0]

5 root 0 kB [kworker/u:0]

6 root 0 kB [migration/0]

7 root 0 kB [migration/1]

8 root 0 kB [kworker/1:0]

9 root 0 kB [ksoftirqd/1]

10 root 0 kB [migration/2]

11 root 0 kB [kworker/2:0]

12 root 0 kB [ksoftirqd/2]

13 root 0 kB [migration/3]

14 root 0 kB [kworker/3:0]

15 root 0 kB [ksoftirqd/3]

16 root 0 kB [khelper]

17 root 0 kB [kworker/u:1]

21 root 0 kB [xenwatch]

22 root 0 kB [xenbus]

142 root 0 kB [sync_supers]

144 root 0 kB [bdi-default]

146 root 0 kB [kblockd]

156 root 0 kB [md]

240 root 0 kB [rpciod]

242 root 0 kB [kworker/1:1]

271 root 0 kB [kswapd0]

272 root 0 kB [ksmd]

273 root 0 kB [fsnotify_mark]

277 root 0 kB [ecryptfs-kthrea]

279 root 0 kB [nfsiod]

282 root 0 kB [jfsIO]

283 root 0 kB [jfsCommit]

284 root 0 kB [jfsCommit]

285 root 0 kB [jfsCommit]

286 root 0 kB [jfsCommit]

287 root 0 kB [jfsSync]

288 root 0 kB [xfsmrucache]

289 root 0 kB [xfslogd]

290 root 0 kB [xfsdatad]

291 root 0 kB [xfsconvertd]

292 root 0 kB [crypto]

854 root 0 kB [khvcd]

967 root 0 kB [kpsmoused]

968 root 0 kB [kworker/0:1]

996 root 0 kB [kjournald]

1824 root 0 kB [kworker/3:1]

1827 root 0 kB [kworker/2:1]

2480 root 0 kB [flush-202:0]

5394 root 0 kB [miniserv.pl]

Thanks for the help.

@alforddm:

> Ok I checked and it is mpm_prefork that is configured. I also took this list of running processes after it tried to max out the memory on me the other night. Does this tell anyone anything?

Yeah, that you've set MaxClients way too high and should do what HoopyCat says.
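If you want to check this yourself without counting lines by eye, the listing can be grouped by owner and program. Here is a rough Python sketch, assuming the "PID owner size kB command" layout of the listing pasted above (the function names are mine, and only a few lines of the listing are copied in as a sample). One caveat: the ps command visible in that listing asks for `vsz`, so the Size column looks like virtual size, which overstates the RAM each process really occupies.

```python
# A few lines copied from the listing above: PID, owner, size in kB, command.
SAMPLE = """\
16886 www-data 294964 kB /usr/sbin/apache2 -k start
23478 www-data 243696 kB /usr/sbin/apache2 -k start
18902 root 154716 kB /usr/sbin/apache2 -k start
23786 mysql 151972 kB /usr/sbin/mysqld
2079 nobody 44948 kB /usr/bin/memcached -m 64 -p 11211 -u nobody -l 127.0.0.1
2 root 0 kB [kthreadd]
"""


def parse_listing(text):
    """Turn 'PID owner size kB command' lines into (pid, owner, size_kb, command)."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 4)
        # Skip headers and anything that does not match the expected layout.
        if len(parts) < 5 or parts[3] != "kB":
            continue
        if not (parts[0].isdigit() and parts[2].isdigit()):
            continue
        rows.append((int(parts[0]), parts[1], int(parts[2]), parts[4]))
    return rows


def summarize(rows):
    """Return {(owner, program): (process_count, total_size_kb)}."""
    summary = {}
    for _pid, owner, size_kb, command in rows:
        key = (owner, command.split()[0])
        count, total = summary.get(key, (0, 0))
        summary[key] = (count + 1, total + size_kb)
    return summary


for (owner, program), (count, total_kb) in sorted(summarize(parse_listing(SAMPLE)).items()):
    print(f"{owner:10} {program:25} x{count:<3} {total_kb} kB")
```

Run against the full first listing, this counts 24 www-data apache2 processes, which is exactly the MaxClients 24 from the config, so every slot was in use when the memory ran out.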

I lowered MaxClients to 15 and I can already see a difference.

Does MaxClients mean the maximum number of people that can connect at once? I clearly need to do some studying on these settings. Can anyone recommend an easy-to-understand breakdown?

Yes, more or less. MaxClients is the maximum number of connections that Apache will serve at the same time. With mpm_prefork, each of those connections is handled by its own Apache process.

If you have a high MaxClients, a lot of people can connect at once, but your server has to use a lot of resources in order to process all of those requests simultaneously. This is OK up to a certain limit, but after that everything will become slow because the server is trying to do too many things at the same time.

If you have a low MaxClients, only a few people can connect at once, and other people will have to wait in line. But since there are only a few requests being processed at any given time, each request gets processed quicker. This also uses much less RAM, because RAM usage of PHP scripts is proportional to the number of requests being processed at the same time. (CPU usage, on the other hand, is roughly proportional to the total number of requests processed in a given amount of time.)
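To put rough numbers on that, here is a small Python sketch of the usual back-of-the-envelope calculation: take the RAM that is left after everything else, and divide by the resident size of one Apache child. Only the 497 MB total comes from this thread. The reserved amount and the per-child size are placeholders I made up for illustration, so measure your own (RES in htop) before trusting the result.

```python
def max_clients_for(total_mb, reserved_mb, per_child_mb):
    """Largest number of Apache children that should fit in RAM without swapping.

    total_mb      -- physical RAM on the box
    reserved_mb   -- RAM set aside for everything that is not an Apache child
    per_child_mb  -- resident (not virtual) memory of one busy Apache child
    """
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    usable_mb = total_mb - reserved_mb
    # Never return less than 1, otherwise Apache could not serve anything.
    return max(1, usable_mb // per_child_mb)


# 497 MB is the "Real memory" total reported earlier in the thread.
# The other two numbers are placeholders, not measurements from this server.
TOTAL_MB = 497
RESERVED_MB = 200   # assumed: MySQL + memcached (-m 64) + OS and other daemons
PER_CHILD_MB = 30   # assumed: resident size of one Apache+PHP child

print(max_clients_for(TOTAL_MB, RESERVED_MB, PER_CHILD_MB))  # prints 9
```

With a 20 MB child instead of 30 MB the same formula gives 14, which is why suggestions in the 10 to 15 range are common for a box this size.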

It is not easy to find the right balance between serving more requests at the same time and serving each request quicker, but the latter is usually a better bet in RAM-constrained environments such as a VPS. It's usually OK to keep some users waiting for a fraction of a second longer, if the request itself will finish quicker.

To optimize performance in this case, you need to make sure that each request finishes very quickly, so that the line-up doesn't get too long. For example, you should install APC to make PHP scripts run faster, and also turn Apache's KeepAliveTimeout down to something like 5 seconds, or turn KeepAlive off altogether, so that users don't hold up the line once they're finished.
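The effect of the keep-alive timeout can be estimated with Little's law: the average number of busy workers equals the request rate times how long each request holds a worker. The Python sketch below assumes the worst case, where every connection sits idle for the full timeout after its page is served. The load figures (2 requests per second, 0.3 seconds per page) are hypothetical, not taken from this server.

```python
def busy_workers(requests_per_second, service_seconds, keepalive_seconds=0.0):
    """Average number of occupied workers, by Little's law (L = rate * time held).

    Worst case assumed: every connection sits idle for the full keep-alive
    timeout after its request has been served.
    """
    return requests_per_second * (service_seconds + keepalive_seconds)


# Hypothetical load: 2 requests per second, 0.3 s to build each page.
for timeout in (15, 5, 0):
    print(timeout, round(busy_workers(2, 0.3, timeout), 1))
```

With these made-up numbers, the average number of tied-up workers falls from about 31 with a 15 second timeout to about 11 with 5 seconds, and to under 1 with keep-alive off. Real clients often close connections sooner, so treat this as an upper bound.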

Well, the Apache settings helped (I've made all the changes suggested and restarted Apache), but this evening the memory usage keeps climbing. I'm wondering if there is a memory leak. With memory stats like this, if I reduce the swap, won't it crash the server?

Real memory: 497.48 MB total / 43.64 MB free Swap space: 3.91 GB total / 2.49 GB free

ID Owner Size Command

29417 www-data 560444 kB /usr/sbin/apache2 -k start

30061 www-data 493860 kB /usr/sbin/apache2 -k start

29440 www-data 491756 kB /usr/sbin/apache2 -k start

3507 www-data 412992 kB /usr/sbin/apache2 -k start

5468 www-data 317728 kB /usr/sbin/apache2 -k start

16007 www-data 190636 kB /usr/sbin/apache2 -k start

16005 www-data 189736 kB /usr/sbin/apache2 -k start

16006 www-data 182712 kB /usr/sbin/apache2 -k start

17325 www-data 178708 kB /usr/sbin/apache2 -k start

16485 root 154712 kB /usr/sbin/apache2 -k start

2074 mysql 140868 kB /usr/sbin/mysqld

2080 nobody 44948 kB /usr/bin/memcached -m 64 -p 11211 -u nobody -l 127.0.0.1

2004 syslog 28452 kB rsyslogd -c4

2582 root 25592 kB /usr/bin/python /usr/bin/fail2ban-server -b -s /var/run/fail2ban/fail2ban.sock

17985 root 20660 kB /usr/share/webmin/webmincron/webmincron.pl

17990 root 19208 kB /usr/share/webmin/proc/index_size.cgi

27088 root 17624 kB /usr/bin/perl /usr/share/webmin/miniserv.pl /etc/webmin/miniserv.conf

2252 root 11492 kB sendmail: MTA: accepting connections

30064 root 8408 kB sshd: alforddm [priv]

30133 alforddm 8408 kB sshd: alforddm@pts/1

2181 root 7332 kB /usr/sbin/munin-node

2533 root 5600 kB /usr/sbin/sshd -D

30134 alforddm 4728 kB -bash

2524 root 3600 kB /usr/sbin/ntpd

2525 ntpd 3496 kB /usr/sbin/ntpd

1 root 2732 kB /sbin/init

18005 root 2484 kB ps --cols 2048 -eo user:80,ruser:80,group:80,rgroup:80,pid,ppid,pgid,pcpu,vsz,ni …

2053 root 2428 kB cron

1020 root 2368 kB upstart-udev-bridge --daemon

1022 root 2368 kB udevd --daemon

1154 root 2364 kB udevd --daemon

1157 root 2364 kB udevd --daemon

2178 root 2288 kB dhclient3 -e IF_METRIC=100 -pf /var/run/dhclient.eth0.pid -lf /var/lib/dhcp3/dhc …

18004 root 1884 kB sh -c ps --cols 2048 -eo user:80,ruser:80,group:80,rgroup:80,pid,ppid,pgid,pcpu, …

2604 root 1840 kB /sbin/getty -8 38400 hvc0

2 root 0 kB [kthreadd]

3 root 0 kB [ksoftirqd/0]

4 root 0 kB [kworker/0:0]

5 root 0 kB [kworker/u:0]

6 root 0 kB [migration/0]

7 root 0 kB [migration/1]

8 root 0 kB [kworker/1:0]

9 root 0 kB [ksoftirqd/1]

10 root 0 kB [migration/2]

11 root 0 kB [kworker/2:0]

12 root 0 kB [ksoftirqd/2]

13 root 0 kB [migration/3]

14 root 0 kB [kworker/3:0]

15 root 0 kB [ksoftirqd/3]

16 root 0 kB [khelper]

17 root 0 kB [kworker/u:1]

21 root 0 kB [xenwatch]

22 root 0 kB [xenbus]

142 root 0 kB [sync_supers]

144 root 0 kB [bdi-default]

146 root 0 kB [kblockd]

156 root 0 kB [md]

240 root 0 kB [rpciod]

242 root 0 kB [kworker/1:1]

271 root 0 kB [kswapd0]

272 root 0 kB [ksmd]

273 root 0 kB [fsnotify_mark]

277 root 0 kB [ecryptfs-kthrea]

279 root 0 kB [nfsiod]

282 root 0 kB [jfsIO]

283 root 0 kB [jfsCommit]

284 root 0 kB [jfsCommit]

285 root 0 kB [jfsCommit]

286 root 0 kB [jfsCommit]

287 root 0 kB [jfsSync]

288 root 0 kB [xfsmrucache]

289 root 0 kB [xfslogd]

290 root 0 kB [xfsdatad]

291 root 0 kB [xfsconvertd]

292 root 0 kB [crypto]

854 root 0 kB [khvcd]

953 root 0 kB [kworker/0:1]

969 root 0 kB [kpsmoused]

996 root 0 kB [kjournald]

2039 root 0 kB [kworker/3:1]

2041 root 0 kB [kworker/2:1]

2569 root 0 kB [flush-202:0]

You're running Drupal with gallery software, which most likely means image processing. PHP image processing can easily take up over 100 MB of RAM.

My advice for that setup would be to split PHP and Apache: run PHP under FastCGI and set it to around 4 processes. Then use Apache in worker mode to serve static files and pass everything else to PHP (or use nginx or another web server; it doesn't really matter).

As for swap: swap lives on disk, so it is very slow. Swapping is when data moves in and out of swap because the server doesn't have enough memory. This will pretty much cause your server to lock up, because writing data to disk, reading it back into RAM for processing, and then writing it out again is orders of magnitude slower than just using RAM.

The main use for swap on a server is to store data that isn't used frequently.

Also, you have memcached running and set to use up to 64 MB of RAM. Make sure you take that into consideration when doing RAM calculations.

If image processing were causing the memory usage, wouldn't there be more CPU usage at that time as well? I'm not seeing anything like that in the Munin graphs.

![Munin weekly CPU graph](http://colorgenetics.info/munin/localdomain/localhost.localdomain/cpu-week.png)

The blank spots are where the server was basically frozen and Munin couldn't record information.

Anything but minimal swap use is bad configuration.

You didn't mention whether you have php-apc installed as suggested above. That can help a lot.

I assume you're using PuTTY. Run htop, sort by MEM%, hide kernel threads, and make the PuTTY window as large as possible. Then go to the window options, choose Clear Scrollback, then Copy All to Clipboard, and paste here. (I wish there were an easier way to output the data that htop provides.)

That's the best way I know to view memory use. Then you can see where your 512MB is going.
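If copying out of htop is awkward, one alternative is to ask `ps` for resident sizes and sort them. Below is a minimal Python sketch of that idea, assuming a Linux box with the procps `ps` and Python 3.7 or later. The helper names are mine, and it shows less than htop does (no SHR column, for instance).

```python
import subprocess


def parse_ps(text):
    """Parse `ps -eo pid,rss,comm` output into (pid, rss_kb, command) tuples."""
    rows = []
    for line in text.splitlines()[1:]:  # first line is the column header
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            rows.append((int(parts[0]), int(parts[1]), parts[2]))
    return rows


def top_by_rss(rows, n=15):
    """Return the n processes with the largest resident set size."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


def current_ps_output():
    """Run ps and return its raw text output (rss is reported in kB)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,rss,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def print_top(rows):
    """Print one line per process: PID, resident size in MB, command name."""
    for pid, rss_kb, command in rows:
        print(f"{pid:>7} {rss_kb / 1024:8.1f} MB  {command}")
```

To use it on the server, call `print_top(top_by_rss(parse_ps(current_ps_output())))` and paste the result.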

Please put the output between [ code ] and [ /code ] for readability. Here's an example:

1  [                                                            0.0%] Tasks: 40; 1 running
2  [                                                            0.0%] Load average: 0.00 0.01 0.05
3  [                                                            0.0%] Mem[||||||||||||||||||||||||||||||||||||||||||||           138/497MB]
4  [|                                                           0.3%] Swp[|||                                                     21/511MB]
  PID USER     PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
 2187 mysql     25   5 80336 42592  2820 S  0.0  8.4 20:00.43 /usr/sbin/mysqld
 9269 www-data  20   0  134M 33036 25644 S  0.0  6.5  0:08.57 php-fpm: pool www
 9268 www-data  20   0  132M 32072 26556 S  0.0  6.3  0:09.65 php-fpm: pool www
 9265 www-data  20   0  134M 30540 23604 S  0.0  6.0  0:07.14 php-fpm: pool www
 9267 www-data  20   0  132M 30152 24556 S  0.0  5.9  0:08.25 php-fpm: pool www
 9263 www-data  20   0  132M 29592 24224 S  0.0  5.8  0:06.95 php-fpm: pool www
 9266 www-data  20   0  134M 29528 22536 S  0.0  5.8  0:08.19 php-fpm: pool www
11200 myuser    20   0  8348  6584  1224 S  0.0  1.3  0:00.55 -bash
12721 root      35  15 66404  4636  3080 S  0.0  0.9  0:04.81 collectd -C /etc/collectd/collectd.conf -f
11232 root      20   0  5320  3852  1400 S  0.0  0.8  0:00.33 bash
 3869 www-data  20   0 21200  3500  1912 S  0.0  0.7  2:19.93 nginx: worker process
 3870 www-data  20   0 21200  3488  1860 S  0.0  0.7  2:12.56 nginx: worker process
 3867 www-data  20   0 21336  3376  1876 S  0.0  0.7  2:12.94 nginx: worker process
 3868 www-data  20   0 21200  3360  1892 S  0.0  0.7  2:11.71 nginx: worker process
11187 root      20   0  8452  2616  2048 S  0.0  0.5  0:00.10 sshd: myuser [priv]
12982 root      20   0  129M  1864   896 S  0.0  0.4  1:25.01 php-fpm: master process (/etc/php5/fpm/main.conf)
12709 postfix   20   0  5876  1696  1344 S  0.0  0.3  0:00.00 pickup -l -t fifo -u -c
 5338 postfix   20   0  6316  1556  1332 S  0.0  0.3  0:00.65 tlsmgr -l -t unix -u -c
11199 myuser    20   0  8560  1436   840 S  0.0  0.3  0:00.44 sshd: myuser@pts/0
12850 root      20   0  2676  1364  1052 R  0.0  0.3  0:02.68 htop
 3871 www-data  20   0 20944  1280   712 S  0.0  0.3  0:07.41 nginx: cache manager process
 2386 postfix   20   0  6016  1248  1024 S  0.0  0.2  0:01.33 qmgr -l -t fifo -u
 3651 root      20   0 20944  1244   800 S  0.0  0.2  0:00.68 nginx: master process /usr/sbin/nginx
 2097 root      20   0  5656  1240  1040 S  0.0  0.2  0:00.02 /usr/sbin/sshd -D
 2132 syslog    20   0 29528  1196   832 S  0.0  0.2  0:09.56 rsyslogd -c4
 2379 root      20   0  5864  1192  1036 S  0.0  0.2  0:06.50 /usr/lib/postfix/master
11231 root      20   0  2656  1116   872 S  0.0  0.2  0:00.00 su
11230 root      20   0  2288  1072   872 S  0.0  0.2  0:00.00 sudo su
    1 root      20   0  2920  1056   724 S  0.0  0.2  0:01.12 /sbin/init
 2160 root      20   0  2272   636   500 S  0.0  0.1  0:05.46 cron
 2244 ntpd      20   0  3512   608   472 S  0.0  0.1  0:02.85 /usr/sbin/ntpd -f /etc/openntpd/ntpd.conf
 2243 root      20   0  3620   580   464 S  0.0  0.1  0:00.05 /usr/sbin/ntpd -f /etc/openntpd/ntpd.conf
 2425 root      20   0  1872   420   376 S  0.0  0.1  0:00.00 /sbin/getty -8 38400 hvc0
 2393 root      35  15  1852   400   344 S  0.0  0.1  0:00.68 /usr/sbin/collectdmon -P /var/run/collectdmon.pid -- -C /etc/collectd/collectd
 2208 www-data  20   0  1984   372   332 S  0.0  0.1  0:00.02 /usr/sbin/fcgiwrap
 1259 root      18  -2  2552   352   184 S  0.0  0.1  0:00.00 udevd --daemon
 1260 root      18  -2  2552   348   184 S  0.0  0.1  0:00.00 udevd --daemon
 1038 root      16  -4  2556   332   192 S  0.0  0.1  0:00.13 udevd --daemon
 1027 root      20   0  2548   320   212 S  0.0  0.1  0:00.14 upstart-udev-bridge --daemon
 1925 root      20   0  2412   188   156 S  0.0  0.0  0:00.00 upstart-socket-bridge --daemon

F1Help  F2Setup F3SearchF4InvertF5Tree  F6SortByF7Nice -F8Nice +F9Kill  F10Quit

From this you can see that the largest piece of memory is used by MySQL, and the RESident number is about 43 MB. I have my.cnf tuned so that the query caches and such will only fill it to about that size.

RES is close to actual memory use. VIRT is the total virtual address space the process has mapped, which is almost always far more than it actually uses.

Then you can see that next I have six php-fpm backends and their RES is about 30MB each. However, they also use SHaRed memory of about 23MB each, so really only one is using the full 30MB, the rest are using only about 7MB each. That's APC at work caching the php code in a shared memory space for use by each thread. The rest of the processes only use a few MB each. Kernel is maybe 40MB. Voila, 138MB used out of 497MB total.
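The arithmetic for the php-fpm pool can be spelled out using the RES and SHR columns from the htop output above. This Python sketch counts each worker's private memory (RES minus SHR) and then adds the shared part once. Counting the shared part once, at the largest SHR value, is a simplifying assumption of mine, since the workers do not necessarily share exactly the same pages.

```python
# (RES, SHR) in kB for the six "php-fpm: pool www" workers in the htop output.
PHP_FPM = [
    (33036, 25644),
    (32072, 26556),
    (30540, 23604),
    (30152, 24556),
    (29592, 24224),
    (29528, 22536),
]


def naive_total_kb(workers):
    """Sum of RES, which counts the shared memory once per worker."""
    return sum(res for res, _shr in workers)


def estimated_total_kb(workers):
    """Private memory of every worker plus the shared memory counted once."""
    private = sum(res - shr for res, shr in workers)
    shared_once = max(shr for _res, shr in workers)
    return private + shared_once


print(naive_total_kb(PHP_FPM) / 1024)      # RES simply added up, in MB
print(estimated_total_kb(PHP_FPM) / 1024)  # shared memory counted once, in MB
```

Adding up RES gives 184920 kB (about 180 MB), while counting the shared memory once gives 64356 kB (about 63 MB), which is in line with the "one at 30 MB, the rest at about 7 MB each" reading above.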

How about yours?

I had XCache installed, but after doing some searching it seemed APC was recommended more, so I uninstalled XCache and installed APC. I also signed up for CloudFlare and reduced my swap. We'll see if anything makes a difference. This morning after installing APC it has done really well. I have had a bit higher traffic this morning than yesterday morning, with better stats. I will wait till this evening and see how it does before I post the results of htop.

Well, so far today things have been much, much better. It's quite possible that I had something in XCache misconfigured. The cache usage is stable at around 143 MB. I even uploaded some photos to the gallery and checked the memory usage. It remained stable; the server didn't even blink. If anything changes I'll post an update.

@alforddm:

> I had XCache installed, but after doing some searching it seemed APC was recommended more, so I uninstalled XCache and installed APC. I also signed up for CloudFlare and reduced my swap. We'll see if anything makes a difference. This morning after installing APC it has done really well. I have had a bit higher traffic this morning than yesterday morning, with better stats. I will wait till this evening and see how it does before I post the results of htop.

How is everything going with CloudFlare? I just wanted to advise that there is a Drupal Module available as well. More info here: http://drupal.org/project/cloudflare

Some other tips about CloudFlare: http://blog.cloudflare.com/top-tips-for-new-cloudflare-users

> How is everything going with CloudFlare? I just wanted to advise that there is a Drupal Module available as well. More info here: http://drupal.org/project/cloudflare

Thanks for checking up on how I was doing. It seems to help page load times and, according to the CloudFlare metrics, has already saved me over 1 GB of bandwidth (on a low-traffic site).

As for the rest of the site, everything seems to be running much more smoothly thanks to the changes suggested here. I did have one incident where the server slowed to a crawl. I couldn't even log in through SSH. I managed to load Webmin and restart Apache, and that immediately fixed the problem. I lowered Apache's MaxClients from 15 to 10 and made a few other changes, and it hasn't recurred.

I want to thank everyone who took the time to post and point me in the right direction.

Glad to help. Some other CloudFlare tips that might help you out: http://blog.cloudflare.com/top-tips-after-installing-cloudflare
