Huge http processes

I'm new to Linode and unmanaged servers. Any hints on decreasing the size of my httpd processes would be appreciated. I suspect the issue is with the initial configuration of Apache, but I assume most of that configuration is there for a reason, though I'm not really certain. I'm running a bloated Drupal community site on a Linode 768 slice with nginx as a reverse proxy and Apache on the backend. I've tuned MySQL and Apache as best I can from what I have read here and elsewhere. Still, free memory gets consumed rather quickly. An httpd reload or restart makes things fine for about a day. I think the bottom line is that my httpd processes are gigantic.

CentOS 5.5/Apache 2.x/MySQL 5.0x/PHP 5.2x/Drupal 6.19, using APC for the file cache with no fragmentation and 99.8% hits. I'm not using Drupal's Boost, since I encountered a problem with it in the past. The database has 326 tables and probably a million rows.

Here is top:

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    29648 apache    18   0  116m  71m 6544 S  0.0  9.3   0:40.39 httpd
    29645 apache    16   0  109m  63m 6728 S  0.0  8.3   0:36.51 httpd
    29210 mysql     18   0  166m  60m 4840 S  0.0  7.8   1:01.56 mysqld
    29646 apache    16   0  102m  56m 6504 S  0.0  7.4   0:39.09 httpd
    29644 apache    18   0 99.0m  53m 6556 S  0.0  6.9   0:36.58 httpd
    29711 apache    18   0 92080  42m 5228 S  0.0  5.6   0:10.46 httpd
    29651 apache    17   0 89808  42m 6632 S  0.0  5.5   0:37.42 httpd
    29567 root      18   0 56732 8620 5364 S  0.0  1.1   0:00.07 httpd

Here is the PHP configure command:

'./configure' '--build=i686-redhat-linux-gnu' '--host=i686-redhat-linux-gnu' '--target=i386-redhat-linux-gnu' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib' '--libexecdir=/usr/libexec' '--localstatedir=/var' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--cache-file=../config.cache' '--with-libdir=lib' '--with-config-file-path=/etc' '--with-config-file-scan-dir=/etc/php.d' '--disable-debug' '--with-pic' '--disable-rpath' '--without-pear' '--with-bz2' '--with-curl' '--with-exec-dir=/usr/bin' '--with-freetype-dir=/usr' '--with-png-dir=/usr' '--enable-gd-native-ttf' '--without-gdbm' '--with-gettext' '--with-gmp' '--with-iconv' '--with-jpeg-dir=/usr' '--with-openssl' '--with-png' '--with-pspell' '--with-expat-dir=/usr' '--with-pcre-regex=/usr' '--with-zlib' '--with-layout=GNU' '--enable-exif' '--enable-ftp' '--enable-magic-quotes' '--enable-sockets' '--enable-sysvsem' '--enable-sysvshm' '--enable-sysvmsg' '--enable-track-vars' '--enable-trans-sid' '--enable-yp' '--enable-wddx' '--with-kerberos' '--enable-ucd-snmp-hack' '--with-unixODBC=shared,/usr' '--enable-memory-limit' '--enable-shmop' '--enable-calendar' '--enable-dbx' '--enable-dio' '--without-mime-magic' '--without-sqlite' '--with-libxml-dir=/usr' '--with-xml' '--with-system-tzdata' '--with-apxs2=/usr/sbin/apxs' '--without-mysql' '--without-gd' '--without-odbc' '--disable-dom' '--disable-dba' '--without-unixODBC' '--disable-pdo' '--disable-xmlreader' '--disable-xmlwriter' '--disable-json'

Here are the loaded modules:

core prefork http_core mod_so mod_auth_basic mod_auth_digest mod_authn_file mod_authn_alias mod_authn_anon mod_authn_dbm mod_authn_default mod_authz_host mod_authz_user mod_authz_owner mod_authz_groupfile mod_authz_dbm mod_authz_default util_ldap mod_authnz_ldap mod_include mod_log_config mod_logio mod_env mod_ext_filter mod_mime_magic mod_expires mod_deflate mod_headers mod_usertrack mod_setenvif mod_mime mod_dav mod_status mod_autoindex mod_info mod_dav_fs mod_vhost_alias mod_negotiation mod_dir mod_actions mod_speling mod_userdir mod_alias mod_rewrite mod_proxy mod_proxy_balancer mod_proxy_ftp mod_proxy_http mod_proxy_connect mod_cache mod_suexec mod_disk_cache mod_file_cache mod_mem_cache mod_cgi mod_rpaf-2 mod_php5 mod_proxy_ajp

Thanks.

10 Replies

The size of your processes seems roughly on par with what I'd expect for Apache with so many modules loaded. I'd suggest that you:

1) Drop modules you don't use

2) Switch to fastcgi so that you can decouple PHP from your httpd processes

This may be implicit in your question, but you don't actually mention if you're seeing performance problems or not. There's nothing necessarily wrong with using up all your memory with the Apache processes (well, sans enough for other processes such as your db or admin access), even if that's only 4-5 processes, as long as your application stack is keeping up with your client load.

You do, however, want to ensure that you don't overload memory and start swapping, even if that means keeping your Apache process count down to single digits, since generally in a VPS environment swapping is likely to hurt performance more than just delaying a client request until there is a free worker process to handle it.
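As a rough sketch of that budgeting (not a tuning recipe): the arithmetic below takes the plan size, the mysqld resident size, and the largest httpd resident size from the top output in the original post. The 150 MB set aside for the OS, nginx, and admin access is an assumed figure, and `max_workers` is a hypothetical helper name. Resident size also overstates the per-process cost a little, since part of it is shared between children.

```python
def max_workers(total_mb, reserved_mb, per_worker_mb):
    """Largest whole number of workers that fits in the memory left over."""
    available = total_mb - reserved_mb
    if available <= 0 or per_worker_mb <= 0:
        return 0
    return int(available // per_worker_mb)


TOTAL_MB = 768        # Linode 768 plan
MYSQL_MB = 60         # mysqld RES from the top output
OTHER_MB = 150        # ASSUMPTION: OS, nginx, sshd, and some page cache
WORST_HTTPD_MB = 71   # largest httpd RES from the top output

limit = max_workers(TOTAL_MB, MYSQL_MB + OTHER_MB, WORST_HTTPD_MB)
print(limit)  # 7
```

With these inputs the result would be a starting point for the prefork MaxClients setting, to be lowered further if the machine still touches swap under load.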

– David

Thanks David for your reply.

Since using nginx as a front end proxy and adding apc and tuning mysql, I'm seeing really fast performance for about 24 hours. Then I watch the free memory go down to almost nothing and I just restart httpd before it kills the site. That works for about 24 hours.

I decided to try the low-memory Apache settings suggested on this site, but soon after I did that I saw a quantum jump in CPU use and a crescendo of I/O that is not returning to baseline anymore. It hasn't affected performance, but I am afraid of where that is heading. So I reverted to the default settings. I'm not sure why the jump in CPU happened. It almost looks like it did before I installed nginx, but the nginx status says it is running. The error files haven't been helpful, and neither has vmstat. htop yields no clues to me, except that I'm seeing nginx working, but not as much as I would have expected.

I'm really perplexed by this and am so tempted to just go with nginx and drop apache, but I think there is less experience with that in the drupal community so I probably should stay more mainstream with solutions.

@mpratt:

> Since using nginx as a front end proxy and adding apc and tuning mysql, I'm seeing really fast performance for about 24 hours. Then I watch the free memory go down to almost nothing and I just restart httpd before it kills the site. That works for about 24 hours.

This sort of behavior is almost always indicative of a configuration that lets too many Apache processes run simultaneously when your request load is high. That said, there's a difference between free memory that is "almost nothing" while you still aren't swapping (which could be fine) and "almost nothing" where the stats don't perfectly reflect usage and you're swapping a ton, which is really bad.

Pretty much, if you tune things adequately, no request load should be able to stop your Linode from being responsive (if only to your admin access), though you could certainly find your web site becoming very slow. Having to restart the web server should not be necessary.

I can't necessarily speak to the higher CPU usage without more data, including what "quantum" might mean (2x, 10x?) but one possibility is that with the better settings you weren't I/O bound as much and thus could actually use more CPU to process requests rather than just waiting on I/O. But that's a wild guess.

– David

Thank you all for your remarks, as they really help to give a better feel for where I am with this and the direction I need to take. Most appreciated.

I will try to keep a watchful eye and let this ride without reloading httpd and see what happens now. It seems like I never have a swap of zero. There is always at least a swap of 5 out of 250. So I don't know if that is to be expected or a swap of zero is what we are shooting for. I have noticed prior to tuning that any swap in the double digits seems to spell impending trouble.

I unfortunately was so focused on the error files and everything else that I forgot to look at the access files. Coincidentally, when I changed to the low-memory Apache configuration, I was hit by a massive onslaught of Googlebot, msnbot, and Yandex all at the same time. That was what caused the upper limits of my previous CPU usage to become the lower limits of my new CPU %. That has normalized this morning, even on the reverted default settings. The CPU % baseline is just a few percent instead of 12%.

So I am less perplexed and now more focused on lowering the httpd process amount. If I may ask another question, is the httpd process amount not only a function of the apache configuration and modules, but is it also a function of the software I am running as well–namely the bloated drupal site?

I have another specific question, but I don't know if I should ask it here or in another thread. But here it goes, anyways, since it is related to decreasing resource load. Since I am using nginx as reverse proxy to serve static files, is it redundant to use something like drupal's boost which serves html files to anonymous users?

Thanks again.

Mike

@mpratt:

> I'm really perplexed by this and am so tempted to just go with nginx and drop apache, but I think there is less experience with that in the drupal community so I probably should stay more mainstream with solutions.

While I'm normally all gung-ho for ditching Apache in favour of lighttpd or nginx, I would be remiss if I didn't point out that many of the benefits of moving to nginx can be had with Apache by switching to fastcgi (php-fpm normally, I believe?), mpm_worker, and an appropriate number of threads/processes.

The worker MPM is much faster and more memory-efficient than prefork (prefork is normally only used because mod_php is not thread-safe, a problem solved by fastcgi). fastcgi saves RAM by decoupling PHP from the web server, which lets you tune the number of PHP processes independently (so that PHP isn't loaded into the memory of a process serving static content). Tuning then ensures it all fits into a reasonable memory budget.

Personally, I think lighttpd is far easier to tune than Apache since the lighttpd defaults work well enough that all there really is to tune is the number of PHP processes, and lighttpd/nginx are still faster than Apache regardless of tuning, but you can probably get 80-90% of the way there with Apache alone.

I used fastcgi with eaccelerator before, but it was an easy setup through cpanel. Fastcgi makes sense as the logical next step. I'll look into it. Thanks.

@mpratt:

> I will try to keep a watchful eye and let this ride without reloading httpd and see what happens now. It seems like I never have a swap of zero. There is always at least a swap of 5 out of 250. So I don't know if that is to be expected or a swap of zero is what we are shooting for. I have noticed prior to tuning that any swap in the double digits seems to spell impending trouble.

I wouldn't worry about a small non-zero swap value - most of my machines have a small single digit value. In fact, even a larger value is fine as long as it's not causing actual swapping (e.g., you could have some process space forced out to swap but if it's not part of your working set, it'll just sit there). It's just that looking at the swap value is a quick way to judge.

You could run "vmstat 1" for a short while and check the swap columns. Unless you're getting large values there (which in turn will likely reflect large I/O wait percentages in the top display, though other disk access can contribute to that latter number) I wouldn't sweat the actual swap space usage.
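If you would rather not eyeball the columns, here is a minimal sketch that pulls the `si`/`so` (swap-in/swap-out) values out of captured `vmstat 1` output. The sample text is made up for illustration, not taken from the poster's machine, and the threshold of 100 is an arbitrary assumption about what counts as "large".

```python
def swap_activity(vmstat_text):
    """Return a list of (si, so) pairs, one per data row of vmstat output."""
    rows = [line.split() for line in vmstat_text.strip().splitlines()]
    # The column-name row is the one that contains both "si" and "so".
    header_idx = next(
        (i for i, cols in enumerate(rows) if "si" in cols and "so" in cols),
        None,
    )
    if header_idx is None:
        return []
    header = rows[header_idx]
    si, so = header.index("si"), header.index("so")
    samples = []
    for cols in rows[header_idx + 1:]:
        # Skip repeated header lines and anything that is not a data row.
        if len(cols) == len(header) and cols[si].isdigit() and cols[so].isdigit():
            samples.append((int(cols[si]), int(cols[so])))
    return samples


def heavy_swap_samples(samples, threshold):
    """Samples whose combined swap-in and swap-out exceeds the threshold."""
    return [s for s in samples if s[0] + s[1] > threshold]


# ILLUSTRATIVE sample only; real output comes from running `vmstat 1`.
SAMPLE_VMSTAT = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0   5120  40000  20000 300000    0    0    10    12  100  200  2  1 96  1  0
 1  2   5120   8000  15000 200000  450  900   500  1200  300  600  5  3 40 52  0
"""

samples = swap_activity(SAMPLE_VMSTAT)
print(samples)                           # [(0, 0), (450, 900)]
print(heavy_swap_samples(samples, 100))  # [(450, 900)]
```

Rows that stay at or near zero in both columns mean the swap space in use is just parked there; sustained non-zero rows are the actual swapping to worry about.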

For longer term monitoring, heavy swap will pump up the disk I/O graphs on the Linode manager, but those combine all disk accesses. If you use a monitoring tool (e.g., munin or equivalent) you can watch I/O specifically to your swap device over time.

But that's fine tuning, and you can go a long way by just setting conservative Apache parameters, hitting your site with enough parallel test queries with "ab" (use a URL that invokes your entire application stack) and observing memory usage with top to make sure you always have a little free (allocated to cache) and that your machine remains responsive for management purposes.
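For the "always have a little free (allocated to cache)" check, a small sketch that reads the same numbers top shows from `/proc/meminfo` might look like this. The sample values are invented for illustration, and `headroom_kb` is a hypothetical helper that simply adds MemFree, Buffers, and Cached.

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their values in kB."""
    info = {}
    for line in text.strip().splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def headroom_kb(info):
    """Memory that is free or easily reclaimed: MemFree + Buffers + Cached."""
    return info["MemFree"] + info["Buffers"] + info["Cached"]


# ILLUSTRATIVE values only; on a live machine read open("/proc/meminfo").read().
SAMPLE_MEMINFO = """\
MemTotal:       786432 kB
MemFree:         20480 kB
Buffers:         10240 kB
Cached:         102400 kB
SwapTotal:      262144 kB
SwapFree:       257024 kB
"""

info = parse_meminfo(SAMPLE_MEMINFO)
print(headroom_kb(info))                      # 133120
print(info["SwapTotal"] - info["SwapFree"])   # 5120 kB of swap in use
```

Sampling this every few seconds while ab is running gives a simple record of how close the box gets to running out of headroom.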

> So I am less perplexed and now more focused on lowering the httpd process amount. If I may ask another question, is the httpd process amount not only a function of the apache configuration and modules, but is it also a function of the software I am running as well–namely the bloated drupal site?

Depends on your configuration. In a default mod_php setup, yes, your entire application stack (memory and CPU) - well, sans database - is going to count against the httpd process, since mod_php runs inside of the Apache process. If you're using other mechanisms (such as fastcgi, which runs PHP in a separate process) then your drupal application stack will count against that process, and not Apache's httpd.

Also be aware that in both cases, such processes may come and go (Apache replaces a child process once it has served the configured maximum number of requests; I'm not sure about php-fpm's pool management), so over time there isn't necessarily a single process that will accumulate all your aggregate stats.
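To see how much memory is currently counted against httpd versus everything else, one option is to total the RES column per command. The sketch below does that for the top listing from the original post; it assumes the standard 12-column top layout, and because RES includes pages shared between children, the httpd total is an upper bound rather than an exact figure.

```python
from collections import defaultdict

# top rows copied from the original post (12 whitespace-separated columns).
TOP_LINES = """\
29648 apache 18 0 116m 71m 6544 S 0.0 9.3 0:40.39 httpd
29645 apache 16 0 109m 63m 6728 S 0.0 8.3 0:36.51 httpd
29210 mysql 18 0 166m 60m 4840 S 0.0 7.8 1:01.56 mysqld
29646 apache 16 0 102m 56m 6504 S 0.0 7.4 0:39.09 httpd
29644 apache 18 0 99.0m 53m 6556 S 0.0 6.9 0:36.58 httpd
29711 apache 18 0 92080 42m 5228 S 0.0 5.6 0:10.46 httpd
29651 apache 17 0 89808 42m 6632 S 0.0 5.5 0:37.42 httpd
29567 root 18 0 56732 8620 5364 S 0.0 1.1 0:00.07 httpd
"""


def to_mb(field):
    """Convert a top memory field ('71m', '1.2g', or plain KiB) to MiB."""
    if field.endswith("m"):
        return float(field[:-1])
    if field.endswith("g"):
        return float(field[:-1]) * 1024
    return float(field) / 1024


def res_by_command(top_text):
    """Sum the RES column (6th field) per COMMAND (12th field)."""
    totals = defaultdict(float)
    for line in top_text.strip().splitlines():
        cols = line.split()
        if len(cols) != 12:
            continue  # not a process row in the assumed layout
        totals[cols[11]] += to_mb(cols[5])
    return dict(totals)


totals = res_by_command(TOP_LINES)
for command, mb in sorted(totals.items()):
    print(f"{command}: {mb:.1f} MB")
```

With mod_php, the PHP and Drupal footprint is inside that httpd total; under fastcgi the same exercise would show it under the PHP process name instead.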

> I have another specific question, but I don't know if I should ask it here or in another thread. But here it goes, anyways, since it is related to decreasing resource load. Since I am using nginx as reverse proxy to serve static files, is it redundant to use something like drupal's boost which serves html files to anonymous users?

I'm not familiar with boost, but if nginx is handling the static requests without ever proxying them to Apache/drupal, then it seems like it would be redundant, yes. Of course, this assumes you are properly configured to capture all possible static files in nginx (e.g., a consistent URL prefix to identify them or something), so I'm not sure it hurts all that much to leave such support enabled on the drupal side - if it's never used I suspect its impact is small. Unless of course there's a bunch of configuration/management associated with it, or a large memory footprint required even when not used, that you'd like to avoid.

– David

David, thanks a lot for taking the time to help me.

I have a great deal to think about and test. I am updating my linode development slice to test out everything I learned here. I have intended to run ab, but didn't want to do it on the production server.

I delayed on installing munin, because I didn't know if it would consume much memory. I'll test it all on a small slice and see.

Thanks once again.

Mike
