A (hopefully useful) Summary of Memory Tweaks
The first thing is to remove bloat. Removing unneeded modules and such from apache is an obvious step, but what about ssh, dns, and other services? They are often overlooked, but all are areas that can be tuned.
Firstly, unless you have a reason to use openssh's sshd, look at alternatives. dropbear is a great alternative and uses about half the memory that the openssh daemon does. Sample ps listings from dropbear and sshd:
dropbear:
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
1013 ? Ss 0:00 0 111 1776 432 0.5 /usr/sbin/dropbear -d /etc/dropbear/dropbear_dss_host_key -r /etc/dropbear/dropbear_rsa_host_key -p 22
11195 ? Ss 0:00 0 111 2076 952 1.1 /usr/sbin/dropbear -d /etc/dropbear/dropbear_dss_host_key -r /etc/dropbear/dropbear_rsa_host_key -p 22
sshd:
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
27192 ? Ss 0:02 16 274 3433 900 0.1 /usr/sbin/sshd
9420 ? S 0:00 27 274 6225 1948 0.4 sshd: xxx@pts/11
*Note: %MEM listings are not comparable between dropbear and sshd; they are from different machines with different amounts of memory (dropbear: 80M, sshd: 512M).
The next thing worth looking at is your DNS server, if you run one. Consider running djbdns instead of bind. I do not have a machine to run a test setup of bind on, so I don't have real-world numbers to prove djbdns is better. However, here is a listing of a djbdns setup and a link to an article comparing bind and djbdns.
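If you would rather total up a listing than eyeball it, here is a minimal Python sketch that sums the RSS column per command from text in the same column layout as the listings in this post. The function name rss_by_command is made up for illustration, and it assumes the header row is present and that COMMAND is the last column.

```python
from collections import defaultdict


def rss_by_command(ps_output):
    """Sum the RSS column per command from `ps vax`-style text.

    Assumes a header row naming the columns (PID ... RSS %MEM COMMAND)
    and groups rows by the first word of the COMMAND column.
    """
    totals = defaultdict(int)
    rss_index = None
    command_index = None
    for line in ps_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "PID":
            # Header row: remember where the RSS and COMMAND columns are.
            rss_index = fields.index("RSS")
            command_index = fields.index("COMMAND")
            continue
        if rss_index is None or len(fields) <= command_index:
            # No header seen yet, or a line too short to be a process row.
            continue
        totals[fields[command_index]] += int(fields[rss_index])
    return dict(totals)


listing = """\
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
1013 ? Ss 0:00 0 111 1776 432 0.5 /usr/sbin/dropbear -d /etc/dropbear/dropbear_dss_host_key -p 22
11195 ? Ss 0:00 0 111 2076 952 1.1 /usr/sbin/dropbear -d /etc/dropbear/dropbear_dss_host_key -p 22
"""
print(rss_by_command(listing))
```

Keep in mind that summed RSS overstates real usage when processes share pages, so treat the totals as an upper bound.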
tinydns (x2)/dnscache/axfrdns:
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
1411 ? S 0:00 1 12 1367 304 0.3 supervise tinydns
1413 ? S 0:00 1 12 1371 308 0.3 supervise tinydns2
1415 ? S 0:00 0 12 1371 304 0.3 supervise dnscache
1423 ? S 0:00 1 12 1371 304 0.3 supervise axfrdns
1425 ? S 0:00 10 20 1611 384 0.4 /usr/bin/tinydns
1428 ? S 0:00 8 20 1611 380 0.4 /usr/bin/tinydns
1429 ? S 0:00 2 46 2765 1588 1.9 /usr/bin/dnscache
1430 ? S 0:00 0 36 1367 312 0.3 tcpserver -vDRHl0 -x tcp.cdb -- x.x.x.x 53 /usr/bin/axfrdns
Bind vs djbdns:
FTP daemons. I don't have a complete list of the various daemons to compare with, but here is a comparison of memory usage for vsftpd vs proftpd:
proftpd:
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
4742 ? Ss 0:00 8 752 3871 2376 0.4 proftpd: (accepting connections)
4775 ? S 0:00 127 752 4095 3052 0.6 proftpd: xxx - xxx.linux.bogus: IDLE
vsftpd:
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
5011 ? Ss 0:00 236 87 3276 916 0.1 /usr/sbin/vsftpd
5024 ? Ss 0:00 134 87 3400 1272 0.2 /usr/sbin/vsftpd
Other notes worth making. Spamassassin will NOT behave nicely on a linode80. No how, no way. This doesn't mean you have to abandon all spam filtering, though. So what can you do? Consider using RBLs. They have low overhead and help immensely. Also, it is still possible to run a virus scanner, even on a linode80, provided your email server is not extremely busy. On my node I've got clamd running fine, and per-email scans take about 0.5s through the RBL and virus scan.
Also, ask yourself if you really need to be running apache. There are alternatives such as lighttpd that have a significantly smaller footprint and will work perfectly well in many situations.
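Part of why RBLs are so cheap is that a check is just one DNS lookup: the octets of the connecting IPv4 address are reversed and the list's zone is appended. Here is a rough Python sketch of that. The function names and the zone dnsbl.example.org are placeholders, not a real list, and the sketch treats any lookup failure (including a resolver timeout) as "not listed".

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name used to look up an IPv4 address in a DNSBL zone."""
    # IPv4Address() validates the input and raises ValueError on garbage.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """Return True if the zone returns an A record for the address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No record (or the lookup failed): treat as not listed.
        return False
    return True


# dnsbl.example.org is a placeholder zone for illustration only.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

In practice the MTA or a wrapper such as rblsmtpd does this lookup for you; the sketch is only meant to show how little work is involved.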
One last thing I will mention is, if you are serving PHP websites, consider using a cacher like eAccelerator. Even on a non-overloaded linode the load times were noticeably faster with eAccelerator compared to without.
If there are any other services that someone would like to see tuning info for, just post a reply. At first I started this as a way to get my linode running as best I could, but now it's become fun tweaking everything to get the best performance, so I'd be more than happy, free time willing, to explore other areas.
13 Replies
Require that a connecting machine introduce itself with HELO. Require that it give you a FQDN (as in the RFC). Block if it tells you it's you; a lot of spam introduces itself as YOUR hostname or IP address.
Require that it wait for a greeting before beginning to send its message. Introduce a delay in order to trap poorly-written SMTP bots.
You'd be amazed how much spam doesn't follow the simple protocol rules.
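The HELO checks above can be sketched in a few lines of Python. This is only an illustration of the logic, not how any particular MTA implements it; the function name helo_rejection_reason and the example hostnames are made up, and the FQDN test is a simple syntax check rather than a DNS lookup.

```python
import re

# One DNS label: letters, digits and hyphens, not starting or ending with "-".
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def helo_rejection_reason(helo_name, my_names):
    """Return a reason to reject the HELO argument, or None to accept it.

    my_names holds the hostnames and IP addresses this server answers to.
    """
    name = (helo_name or "").strip().rstrip(".").lower()
    if not name:
        return "no HELO given"
    mine = {n.lower() for n in my_names}
    # Also catch address literals such as [192.0.2.10].
    if name in mine or name.strip("[]") in mine:
        return "HELO claims to be this server"
    labels = name.split(".")
    if len(labels) < 2 or not all(_LABEL.match(label) for label in labels):
        return "HELO is not a fully qualified domain name"
    if labels[-1].isdigit():
        return "HELO is a bare IP address"
    return None


me = {"mail.example.org", "192.0.2.10"}
for claimed in ["mail.example.org", "localhost", "mx1.sender.example"]:
    print(claimed, "->", helo_rejection_reason(claimed, me))
```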
Another thing you can do is SMTP callouts. This is easy on exim, which is Debian's default. I'm not sure about others. The idea here is that after you've heard the sender address (the MAIL FROM command), you make a connection to the sender's mail server and verify that the sender actually exists. If not, the incoming connection is dropped.
All these allow you to drop spam early in the cycle, before they've even sent you the body of the message. It's easy on the old CPU/RAM!
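For anyone curious what a callout does on the wire, here is a hedged Python sketch using the standard smtplib module. It is not how exim implements callouts. The function name sender_exists and its parameters are invented for illustration, and the caller has to supply the sender's mail host, since the standard library has no MX resolver.

```python
import smtplib


def sender_exists(sender, mx_host, helo_name,
                  smtp_factory=smtplib.SMTP, timeout=10):
    """Ask mx_host whether it would accept mail for `sender`.

    Returns True (accepted), False (permanently rejected) or None
    (could not tell: connection failure or a temporary error).
    """
    try:
        conn = smtp_factory(mx_host, 25, timeout=timeout)
    except (OSError, smtplib.SMTPException):
        return None
    try:
        conn.helo(helo_name)
        conn.mail("")  # null envelope sender, as used for bounces
        code, _reply = conn.rcpt(sender)
    except (OSError, smtplib.SMTPException):
        return None
    finally:
        try:
            conn.quit()
        except (OSError, smtplib.SMTPException):
            pass
    if 200 <= code < 300:
        return True
    if 500 <= code < 600:
        return False
    return None
```

No message body is ever sent; the conversation stops after RCPT. The smtp_factory parameter is only there so the function can be exercised without a real mail server.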
I'm on a Linode80 and currently using Qmail along with Courier for IMAP. Now, aside from the barrage of spam, I don't have a lot of email users, so the traffic isn't heavy. IMAP is only used for webmail. I'm running Debian, if that matters.
But it makes assumptions about the way email works that you really can't make. Why can't a re-send come from a different IP? Why can't there be a unique sender, if I'm running a list and want it that way?
So there's a number of sites you can't get mail from, and you can't really complain because they're not violating any spec. So you end up having to maintain a whitelist.
I'd definitely try the HELO / callout methods before I'd resort to greylisting (or to relying on blocklists, but that's another story).
> I've seen some other discussions now on spam-fighting techniques with limited resources. But, along the lines of the discussion in the original post, comparing alternative choices for applications, is there particular mail software (or array of software) that is uniquely good on limited resources?
I'm also running a qmail setup and it seems minimal enough resource-wise. I didn't post resource usage of various MTAs because I didn't have a spare box to set up non-qmail setups on to test with. Entirely without any proof, I'd guess qmail is going to be on the low end of resource usage compared to other MTAs because of the way it's designed and such, but like I said … no proof one way or the other. If anyone has a non-qmail setup and wouldn't mind posting some numbers, we could actually compare them (the above listings were done with a "ps vax").
> I'm not sure about greylisting, myself. It sounds like a great idea and I'm sure it works quite well, at least most of the time.
> But it makes assumptions about the way email works that you really can't make. Why can't a re-send come from a different IP? Why can't there be a unique sender, if I'm running a list and want it that way?
I also looked at greylisting, but had similar concerns. One thing that particularly bothered me was the need to keep a whitelist up to date with a list of IPs running broken MTAs that won't reattempt delivery. That is more trouble than I want to go through for the little email traffic my linode gets, and for a higher-traffic server I suspect you'd be running on hardware, or a linode, capable of running spamassassin.
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
25072 ? Ss 0:00 19 741 7882 2100 1.8 /usr/sbin/exim4 -bd -q30m
2596 ? S 0:00 138 741 7902 2888 2.5 /usr/sbin/exim4 -bd -q30m
top indicates that almost all of that memory is shared, so look at the bigger number rather than adding them up.
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
1143 ? S 0:00 1 31 1388 312 0.3 bin/qmail-inject -a -- xxx
1417 ? S 0:00 1 12 1371 308 0.3 supervise qmail-send
1419 ? S 0:00 1 12 1367 304 0.3 supervise qmail-smtpd
1421 ? S 0:00 1 12 1367 304 0.3 supervise qmail-pop3d
1432 ? S 0:00 0 18 1501 364 0.4 multilog t s100000 n20 /var/log/qmail/qmail-smtpd
1434 ? S 0:00 1 36 1523 484 0.5 qmail-send
1435 ? S 0:00 0 18 1501 364 0.4 multilog t s100000 n20 /var/log/qmail/qmail-send
1436 ? S 0:00 0 36 1547 540 0.6 /usr/local/bin/tcpserver -v -R -l xxx -x /etc/tcp.smtp.cdb -c 30 -u 1008 -g 1005 0 smtp rblsmtpd -r sbl-xbl.spamhaus.org -r bl.spamcop.net -r opm.blitzed.org /var/qmail/bin/qmail-smtpd xxx /home/vpopmail/bin/vchkpw /usr/bin/true
1437 ? S 0:00 1 36 1371 312 0.3 tcpserver -H -R -v -c100 0 110 qmail-popup xxx /home/vpopmail/bin/vchkpw qmail-pop3d Maildir
1438 ? S 0:02 0 18 1505 372 0.4 multilog t s100000 n20 /var/log/qmail/qmail-pop3d
1443 ? S 0:00 1 13 1503 348 0.4 qmail-lspawn ./Maildir
1444 ? S 0:00 1 9 1502 360 0.4 qmail-rspawn
1445 ? S 0:00 0 6 1501 344 0.4 qmail-clean
Of course it's hard to say just what difference the amount of system memory makes, too.
@Xan:
> I'm not sure about greylisting, myself. It sounds like a great idea and I'm sure it works quite well, at least most of the time.
> But it makes assumptions about the way email works that you really can't make. Why can't a re-send come from a different IP? Why can't there be a unique sender, if I'm running a list and want it that way?
No problem, at least with SQLgrey :
* it's smart enough to allow multiple IPs in the same class C network to try to send the same email,
* when IPs are not from the same network, default whitelists are available from a central repository (hosted on a Linode 80 :) ), which means you benefit from others' whitelists if they post new entries on the users' mailing list,
* it has provisions to handle most mailing-list unique sender schemes.
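To show what "same class C network" buys you, here is a small in-memory Python sketch of greylisting keyed on the /24 network instead of the exact IP. It is not SQLgrey's actual code. The class name Greylist, the 300 second delay, and the addresses in the usage lines are all assumptions for illustration, and it only handles IPv4.

```python
import ipaddress


class Greylist:
    """Defer the first attempt; accept retries made after a delay."""

    def __init__(self, delay_seconds=300):
        self.delay_seconds = delay_seconds
        self.first_seen = {}  # key -> time of first attempt

    @staticmethod
    def _key(ip, sender, recipient):
        # Collapse the client IP to its /24 so a retry from a neighbouring
        # machine in the same network still matches (IPv4 only).
        network = ipaddress.ip_network(ip + "/24", strict=False)
        return (str(network), sender.lower(), recipient.lower())

    def should_accept(self, ip, sender, recipient, now):
        """Return True to accept, False to answer with a temporary error."""
        key = self._key(ip, sender, recipient)
        first = self.first_seen.setdefault(key, now)
        return now - first >= self.delay_seconds


greylist = Greylist()
print(greylist.should_accept("192.0.2.17", "a@sender.example", "b@example.org", now=0))
print(greylist.should_accept("192.0.2.200", "a@sender.example", "b@example.org", now=400))
```

The second call comes from a different machine in the same /24 after the delay has passed, so it is treated as a retry of the first attempt.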
Relevant lighttpd conf bits:
fastcgi.server = ( ".php" => ((
"bin-path" => "/usr/bin/php-cgi",
"socket" => "/tmp/php.socket",
"max-procs" => 2,
"min-procs" => 2,
"bin-environment" => (
"PHP_FCGI_CHILDREN" => "1",
"PHP_FCGI_MAX_REQUESTS" => "1000"
),
"bin-copy-environment" => (
"PATH", "SHELL", "USER"
),
"broken-scriptfilename" => "enable"
)))
Relevant Apache2 conf bits:
StartServers 1
MinSpareServers 1
MaxSpareServers 3
ServerLimit 64
MaxClients 64
MaxRequestsPerChild 100
Both setups use eAccelerator and identical URL lists. The test set consists of http and https requests to a half dozen vhosts.
Tests were conducted using http_load with these parameters:
./http_load -verbose -parallel 100 -seconds 20 urls
lighttpd: 12.4793 fetches/sec, 154712 bytes/sec
apache2: 0.018649 fetches/sec, 21.6142 bytes/sec
Lighttpd reached a max load of .42 while apache2 hit a load of 16.51. Additionally, lighttpd used about 20M of RAM (the majority by php-cgi processes) and did not need to resort to swap space at all. Apache2, on the other hand, chewed into swap space by over 150M. Finally, lighttpd remained responsive the entire time, while apache2 served only timeout messages and thrashed.
To further test lighttpd I ran the above http_load line from 3 different machines simultaneously to ensure the client was not a bottleneck, and obtained results nearly identical to those shown above.
Worth mentioning is that the node these tests were run on is one of the beta xen nodes (80M ram).
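A quick back-of-the-envelope check helps explain why a high child limit hurts on a small node. The Python sketch below estimates how many worker processes fit in RAM before swapping starts. The function name max_children is made up, and the 30M reserved and 10M per-child figures in the usage line are guesses for illustration, not measurements from the test above.

```python
def max_children(ram_mb, reserved_mb, per_child_mb):
    """Largest number of worker processes that fits in RAM without swap.

    ram_mb       total memory on the node
    reserved_mb  memory kept back for the kernel and other daemons
    per_child_mb resident size of one worker process
    """
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    return max(0, int((ram_mb - reserved_mb) // per_child_mb))


# Hypothetical figures: 80M node, 30M reserved, 10M per worker.
print(max_children(80, 30, 10))
```

With those assumed figures only a handful of workers fit, far below a limit of 64, so measure your own per-child size with ps before picking a value.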